Working Paper (PDF available)

Machine Learning Application in Online Lending Credit Risk Prediction

Abstract

[Abstract text unrecoverable from the source PDF extraction.]
Keywords: Online lending, Big data, Random forest, XGBoost
1. Introduction
         / 
 
   
0     
,#12.
'.

       + 
!+3
         

/!(! 

+4!



5
 +    
+6*
7       
*  ! ! '
!  8 9   
:8!+
(    !     
   + ; 7   
!
+,-%
 ( 
<  !  +      
 =+        
 '!        
>0+
!+
,,
       
!  !   
=  +      
,1< !$?!
        @ A  
!   #
    2      
   !    ,3 < ! 
    2
,4
       
         
?!
,5
B
"#
  %!+!
 &'        $
       8  
(      )#* ! ! 7C=!

2. Data and Variable Definition
[Text unrecoverable from the source PDF extraction.]
2.1 Lending Platform
[Description of the lending platform unrecoverable from the source PDF extraction.]
2.2 Phone Records
[Description of the phone-record variables unrecoverable from the source PDF extraction.]
2.3 Third Party Data
[Description of the third-party data sources unrecoverable from the source PDF extraction.]
3. Models developed
% !    &'    
:;;-@
    A =    
      #   
  D 9-H  
!3-H
1,,359
;-461

3.1. Random forest model parameter tuning
[Most of this passage is unrecoverable from the source PDF extraction.] Candidate parameter settings were compared using precision and recall,

precision = tp / (tp + fp),
recall = tp / (tp + fn),

where tp, fp, and fn denote the counts of true positives, false positives, and false negatives. The number of trees was swept over 50, 500, and 5000, the tree depth over 2, 4, and 6, and a third tuning parameter over 1, 3, and 5 (Figures 1-3).
<, G#    
D>J5-!5--!5---
<1G#I
1!4!6
<3G#I
,!3!5
3.2 XGBoost fine parameter tuning
&'&' 
KK
     

.&'
  < ! 
  + "   
%$
"7!
"+.
,9 &'+"

&'      7 !
.I @ .   A!  @0 A!
I @  L   
A! @     
 A! II @L     
(  A!  @0, "   A  
@& + (A
      
7C=@A
7C=!

M    .I      
!.I1-!7C=

*!1 
      --,! =I   -9!
  -9! 8II  ,!   --,!   -,!
7C=!-;4-64
1?&'
4. Experimental results
 ,-          
&'?!

  7  :;;-  ! "*!
III6I,:::3,:I@#  
6A!III,I,:::3,4I@
#     ,#  A!
III,1I,:::3
1-I @ #     ,1  A!
>I8I
I&! III9I,:::3,1I! >I8I*I&!
II
I6-I,:::316I! ==!
III14I,:::314I      ,-
<4
 ,-     &'   "*!
III,I
I,:::3,4I! ==!
III9I,:::3,1I!II
I3I,:::3,6I! III6I,:::3,:I!
*!=
=! IIIII6I!
III,1I,:::31-I
.
7"*
 !  369H!  ;  
-,6HG ,- &' 

*       &'   
  !
,-
           
!
+<!.
!(
:;;-I
&'  7   &' ! "   
"      +    
&'
0"        < 5   
     
 )#*   +     
   )#*  
.)#*
 -6494! &'  -91-3* )#*  
!
)#*!    '
"!&'"
     =! &'  
+
"*
<4,-@A@A&'

<5@A)#*@A)#*&'
5. Conclusion
 ! &'
        .   
          
 %   "     
,-
&'
'
 .    "*!   
!
   )#*      
+&'
6. Literature
@,A * * ?   G1G % ? '  1--:!
:6@1AD,9#45
@1A ' *=! & < %  <  
 D     1  ' ! 1--;N
1@,AD3;#65
@3A B!?8% &!OGD' 8 0
1-!P=7 *!M
1;!>,D143#15:!1-,,
@4A Q(!B!Q0R< @1-,5A,D,;E
,-,,:6F4-:54#-,5#--,:#,
@5A 0= Q<,6@1---A,4;#,91
@6A  7 Q  E B B! O7 8 *
7P!4%!G#?!C*!,;;:
@9A 0!R Q *!1-,1!
1!,:,#,:;
@:A 0 = ! O7      D 
+     P  Q 
<! M ,6! > 1! 1---!  ,4;#,91 D ,-,-,6F*-,6;#
1-9-@--A---34#-
@;A ER!??!S=QO=
P =>=P1--9
@,-A 7&  7 7  Q 
8"!M,!>3!71-,,
@,,A 0>70!O7.
+!P%.
 ! M 36! > 1! 1--;!  3-1:#3-33 D
,-,-,6F1--:-,-,:
@,1A B!7!%0Q0!O0(
!P7%!M;;!
>,D4;#:4!1--;
@,3A 0!!7 )!%<G0!)*!O*
 D   
TP?)*< BG*!
Q<)*&!?C!,#41!1--;
@,4A <! *!  & R Q! OE    
  T %  GP>%
BG!,#63!1--:
@,5A B ?! = )! R B! * R @1-,5A O7   G1G
!O< ,@,AD,#:
@,6A '!0@1--,A!8!45!5#31
@,9A * '! G *!  '! * ?"! 
?  Q  = 7 @-;95#:::9A
M,6,#>,,!81-,9
... Traditionally, lenders use statistical models, including the linear discriminant model (Lessmann et al., 2015), the logistic regression model (Lessmann et al., 2015), the decision tree model (Fitzpatrick et al., 2016), the neural network model (Zhou et al., 2016; Leong, 2016), and the genetic programming model (Bhatia et al., 2017; Yu, 2017), to predict a customer's creditworthiness. While traditional credit scoring methods are recognized as highly predictive, most previous studies focus mainly on improving the accuracy of credit decisions (Benyacoub et al., 2022) using traditional financial information such as loan repayment and credit bureau records. ...
Article (full-text available)
Mobile e-commerce has grown rapidly in the last decade because of the development of mobile network services, computing capabilities, and big-data applications. Financial institutions have been undergoing a fundamental transformation in credit risk management, specifically of traditional credit policy, which is now inadequate for accurately evaluating an individual's credit risk profile in a timely manner. A large-scale dataset representing deep mobile usage of 450,722 anonymous mobile users, with a 28-month loan history and mobile behavior on both iOS and Android, is designed; it can add value to credit scoring in terms of better accuracy and lower feature-acquisition cost by introducing a cost-based quantum-inspired evolutionary algorithm (QIEA) feature-selection method. The QIEA adopts a quantum-based individual representation and a quantum rotation gate operator to improve the feature-exploration capability of the conventional genetic algorithm (GA). The expected-feature-yield fitness function introduced in QIEA is able to identify cost-effective feature subsets. Experimental results show that the quantum-based method achieves good predictive performance even with only 70-80% of the number of features selected by GAs, and hence achieves lower feature-acquisition costs under budget constraints. Additionally, computational time can be reduced by 30-60% compared with GAs, depending on feature-set size.
... Ma and Lv (2019) proposed the MLIA algorithm, an improved machine learning model, for predicting financial credit risk (Ma and Lv, 2019). Yu (2017) constructed an integrated machine learning model based on historical transaction data to study online-lending credit risk prediction. Zhao et al. (2018) used the machine learning method of least-squares support vector machines (LSSVM) to predict systemic financial risk. ...
Article (full-text available)
The rapid development of financial technology not only provides much convenience to people's production and life, but also brings many risks to financial security. To prevent financial risks, a better approach is to build an accurate warning model before a financial risk occurs, rather than to find a solution after the risk breaks out. In the past decade, deep learning has made remarkable achievements in fields such as image recognition and natural language processing. Therefore, some researchers have tried to apply deep learning methods to financial risk prediction, and most of the results are satisfactory. The main work of this paper is to review prior work on deep learning for financial risk prediction according to three prominent characteristics of financial data: heterogeneity, multi-source origin, and imbalance. We first briefly introduce some classical deep learning models as the model basis of financial risk prediction. Then we analyze the reasons for these characteristics of financial data. Meanwhile, we study the differences among commonly used deep learning models with respect to different data characteristics. Finally, we point out some open issues of research significance in this field and suggest future implementations that might be feasible.
... XGBoost is a highly scalable and effective end-to-end tree boosting technique [30]. It has been applied to predict loan default probability, and its maximum K-S value reached 0.7203, suggesting very effective performance [31]. Random forest is another ensemble algorithm that uses decision trees as base classifiers. ...
... More complicated tree-structured models, such as the gradient boosted decision tree (GBDT) and the extreme gradient boosting (XGBoost) decision tree model, can be used to build fraud detection systems. [23] compared the performance of logistic regression, GBDT, and deep learning models on credit card fraud detection, and [24] compared the performance of random forest (RF) and XGBoost in detecting frauds conducted through a P2P lending platform. ...
Article (full-text available)
Credit risk has been a widespread and deeply penetrating problem for centuries, but not until various credit derivatives and products were developed and novel technologies began radically changing human society did fraud detection, credit scoring, and other risk management systems become so important, not only to specific firms but to industries and governments worldwide. Frauds and unpredictable defaults cost billions of dollars each year, forcing financial institutions to continuously improve their systems for loss reduction. In the past twenty years, many studies have proposed the use of data mining techniques to detect fraud, score credit, and manage risk, but issues such as data selection, algorithm design, and hyperparameter optimization affect the perceived ability of the proposed solutions, and it is difficult for auditors and researchers to explore and determine the highest level of general development in this area. In this survey we focus on a state-of-the-art review of recently developed data mining techniques for fraud detection and credit scoring. Several outstanding experiments are recorded and highlighted, and the corresponding techniques, which are mostly based on supervised learning algorithms, unsupervised learning algorithms, semi-supervised algorithms, ensemble learning, transfer learning, or some hybrid ideas, are explained and analysed. The goal of this paper is to provide a dense review of up-to-date techniques for fraud detection and credit scoring, a general analysis of the results achieved, and upcoming challenges for further research.
Article (full-text available)
Objective of the study: This article compares classical logistic regression with two machine learning methods for credit scoring, random forest and XGBoost, to identify which performs best in predicting default. Methodology/approach: The performance of the estimated models was compared on the basis of accuracy, the Kolmogorov-Smirnov statistic, and the ROC curve. Originality/relevance: An exclusive database was used, with information on 3,844 small and medium-sized enterprises that are customers of a car rental company operating throughout Brazil. Main results: The results suggest that the machine learning methods have greater predictive capacity than logistic regression. XGBoost performed best among the methods analysed. Theoretical/methodological contributions: This article supports the use of non-financial variables for default prediction and the superiority of more modern statistical methods over the classical approach.