The invention discloses a big-data-based credit evaluation method for P2P borrowers. The invention comprises a data acquisition module, a data processing module and a model building module. In the era of big data, credit data sources are expanding and fall mainly into four categories: credit data generated by financial institutions, credit data generated by relevant government departments, credit data generated by other public utilities, and Internet credit data generated online. The data acquisition module is divided into two parts: the credit data generated by financial institutions, relevant government departments and public utilities are collected as structured data, while social media data within the Internet credit data, such as WeChat friends and Sina Weibo, are collected as unstructured data.
The data processing module mainly targets the structured data and includes data balancing and feature selection. Because the structured personal credit data are imbalanced, the invention uses the CART-SMOTE algorithm for data balancing. Against the background of big data, the features of personal credit evaluation data are complex, and irrelevant and redundant variables adversely affect the accuracy of model prediction; the invention therefore uses random forest and gradient boosting decision tree to select the evaluation features. The structured data model uses an improved LightGBM to produce a preliminary credit rating.
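The data balancing step can be illustrated with a minimal sketch of plain SMOTE-style oversampling. This is not the CART-SMOTE variant used by the invention, whose details are not given here; the function name `smote_oversample`, its parameters and the toy data are hypothetical and serve only to show how synthetic minority samples are interpolated between existing ones.

```python
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Plain SMOTE only (an assumption for illustration), not CART-SMOTE.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        # New point lies on the segment between base and its neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two numeric features for the minority (default) class
minority = [[1.0, 0.9], [1.2, 0.8], [0.9, 1.0], [1.1, 0.95]]
new_samples = smote_oversample(minority, n_new=6)
```

Because each synthetic sample is an interpolation between two real minority samples, it stays inside the region already occupied by the minority class, which is why this family of methods is preferred over simple duplication.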
For the unstructured social text data, deep learning is used for feature extraction, credit evaluation and sentiment tendency analysis. The sentiment tendencies in personal social media text data are then fed back into the credit evaluation of P2P borrowers to study the correlation between the two, providing a reference for the final credit evaluation structure.
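As one possible way to study the correlation between sentiment tendency and credit evaluation, the sketch below computes a Pearson correlation coefficient. The choice of Pearson correlation is an assumption, since no specific measure is named; the `pearson` helper and the sample scores are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical per-borrower scores: sentiment in [-1, 1], credit score in [0, 1]
sentiment = [0.8, 0.1, -0.4, 0.5, -0.7]
credit = [0.9, 0.6, 0.3, 0.7, 0.2]
r = pearson(sentiment, credit)
```

A coefficient close to 1 or -1 would suggest that sentiment tendency carries information relevant to the credit evaluation, while a value near 0 would suggest little linear relationship.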