The idea
Zhima Credit (under Alibaba Group) is one of the most popular third-party online credit scoring services in China. It uses technologies such as AI and cloud computing to objectively present the credit status of individuals and companies. The Zhima Credit Score is applied in many business scenarios, such as leasing, shopping, business travel, and local life services, allowing merchants to provide better and more convenient services to more users. Credit thereby simplifies the relationship between people and businesses.
With the user's authorization, the Zhima Credit Score is calculated from the user's consumption and behavior data on the Internet, combined with Internet financial lending information. Cloud computing and machine learning algorithms, such as logistic regression, decision trees, and random forests, are used to process the data and objectively present personal credit status in five dimensions: the user's credit history, behavior preference, performance ability, identity traits, and personal relationships. The Zhima score ranges from 350 to 950; the higher the score, the better the credit. A higher Zhima Credit Score can help users obtain more efficient, higher-quality services. For example, users with high Zhima Credit Scores generally receive more favorable loan amounts and interest rates, because their default rates are lower.
CASE 1: THE INFLUENCE OF ZHIMA CREDIT SCORE ON DEFAULT PREDICTION FOR PERSONAL LOANS
The experiment in this paper has two stages: preprocessing and classification. In the preprocessing stage, the Synthetic Minority Oversampling Technique (SMOTE) is used to address class imbalance in the data. SMOTE synthesizes new samples from the existing minority-class samples and adds them to the dataset so that the classes are balanced. Two datasets, called ADD ZHIMA and REMOVE ZHIMA, are then generated by retaining and removing the Zhima Credit Score feature, respectively. Six well-known classification algorithms are applied in this experiment: C4.5, Random Forest, Naive Bayes, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Back Propagation Neural Network. Each classification algorithm learns from a set of training samples and their category labels and outputs a trained classifier, which can then assign unlabeled samples to a class.
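The preprocessing stage described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration of the SMOTE interpolation idea, not the paper's implementation: the function names `smote` and `drop_feature`, the choice of `k`, and the two-column toy data (a Zhima score and a made-up income figure) are all assumptions introduced here for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base (base itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def drop_feature(rows, index):
    """Return a copy of rows with the column at `index` removed."""
    return [row[:index] + row[index + 1:] for row in rows]


# Hypothetical toy data: [zhima_score, monthly_income]
majority = [[720, 9.0], [690, 7.5], [750, 11.0],
            [705, 8.2], [680, 6.9], [735, 10.1]]   # non-default loans
minority = [[520, 3.1], [560, 4.0], [540, 3.6]]    # default loans

# Oversample the minority class until both classes have the same size.
synthetic = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + synthetic

# ADD ZHIMA keeps the score column; REMOVE ZHIMA drops it.
add_zhima = majority + balanced_minority
remove_zhima = drop_feature(add_zhima, index=0)
```

In practice, a library implementation of SMOTE and of the six classifiers would normally be used; the sketch only shows how the balanced dataset and the two feature variants relate to each other.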
As shown in Fig. 1, the experiment yields two results. (1) The performance of every classifier drops when the Zhima Credit Score feature is removed. For example, after removing Zhima Credit Score, the accuracy of C4.5 decreases from 92.64% to 92.31%, and its AUC decreases from 0.942 to 0.937. (2) The classifiers differ in their sensitivity to Zhima Credit Score: the changes in accuracy and AUC between the two datasets vary from classifier to classifier. Among the six classifiers, SVM shows the most pronounced difference.
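The size of the drop quoted for C4.5 can be checked with a few lines of arithmetic. The helper name `drop_after_removal` is hypothetical; only the four C4.5 figures come from the text, and no values for the other classifiers are assumed.

```python
def drop_after_removal(with_feature, without_feature, digits=3):
    """Decrease in a metric when the Zhima Credit Score feature is removed."""
    return round(with_feature - without_feature, digits)


# C4.5 figures quoted in the text (ADD ZHIMA vs. REMOVE ZHIMA)
acc_drop = drop_after_removal(92.64, 92.31)   # in percentage points
auc_drop = drop_after_removal(0.942, 0.937)

print(f"C4.5 accuracy drop: {acc_drop} percentage points")
print(f"C4.5 AUC drop: {auc_drop}")
```

Applying the same difference to each classifier's metrics gives a simple per-classifier measure of sensitivity to the feature.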
Conclusion
Using Zhima Credit Score as a feature can improve classification performance in default prediction, and the degree of improvement differs across classifiers.