Saturday, March 30, 2019
Multiple Churn Prediction Techniques And Algorithms Computer Science Essay
Abstract: Customer churn is the business term used to describe the loss of clients or customers. Banks, telecom companies, ISPs, insurance firms, etc. use customer churn analysis and customer churn rate as key business metrics, because retaining an existing customer costs far less than acquiring a new one. Corporates have dedicated departments that attempt to win back defecting clients, because recovered long-term customers can be worth much more to a company than newly recruited clients.

Customer churn can be categorized into voluntary churn and involuntary churn. In voluntary churn, the customer decides to switch to another service provider, whereas in involuntary churn, the customer leaves the service due to relocation, death, etc. Businesses usually exclude involuntary churn from churn prediction models and focus on voluntary churn, because it usually arises from the company-customer relationship, over which the company has full control.

Churn is usually measured as gross churn and net churn. Gross churn is calculated as the loss of previous customers and the recurring revenue associated with them. Net churn is measured as gross churn offset by the addition of new, similar customers. It is often measured as Recurring Monthly Revenue (RMR) in financial systems.

INTRODUCTION

Predicting and preventing customer churn is becoming the primary focus of many enterprises. Every enterprise wants to retain each and every customer in order to maximize the profit and revenue from them.
With the introduction of business and management systems and the automation of operation flows, corporates have gathered large amounts of customer and business-related data during their daily operating activities, which gives data mining techniques a good basis for working and predicting. Many data mining algorithms and models have emerged to address this problem of customer loss, and they have been widely used in this field over the past decades.

Many algorithms and models have been applied to the prediction of customer churn. The most common are Decision Tree [1], Artificial Neural Network [2] and Logistic Regression [8]. Other algorithms such as Bayesian Network [4], Support Vector Machine, Rough Set [5] and Survival Analysis [6] have also been used.

In addition to algorithms and models, other techniques such as input variable selection, feature selection and outlier detection have been applied to get better results out of the above algorithms.

The first three models, i.e. Decision Tree, Artificial Neural Network and Logistic Regression, have been applied maturely at multiple corporates. Each algorithm has been improved over multiple iterations and is now fairly stable. But as business operations and activities grow, solving the problem of customer churn is becoming a more and more complex challenge. This calls for new churn prediction models that are fast and robust, and that can quickly be trained and scored on large amounts of data.

LITERATURE REVIEW

Jiayin and Yuanquan [1] presented a step-by-step approach to selecting effective input variables for a customer churn prediction model in the telecommunication industry. In the telecommunication industry, a very large number of input variables is usually available for churn prediction models.
Of all these variables, some have a positive effect on the model and a few are redundant. The redundant variables overload the churn prediction model, so it is always better to select only the important features and remove redundant, noisy and less informative variables. In their study, they proposed the Area under ROC (AUC) method, where ROC stands for Receiver Operating Characteristics, for calculating the classifying ability of each variable, and then selecting the variables with the highest classifying ability. In addition, they proposed computing the mutual information among all selected variables and finally keeping the variables with a relatively low mutual information coefficient.

Huang and Kechadi [11] proposed a new technique for feature selection in churn prediction models. Their primary focus was the telecommunication industry, where the number of input variables (features) is very large, so it is always better to select a subset of features that have the most power to discriminate between the target classes. Otherwise, running the algorithm on all input variables consumes too much time and too many resources. The most commonly used feature selection techniques only judge whether an input feature is helpful for classifying the classes or not. The approach they proposed takes into account the relationship between a specific categorical value of the feature and a class when selecting or removing the feature.

Luo, Shoa and Lie [2] proposed customer churn prediction using a Decision Tree for the Personal Handyphone System Service (PHSS), where the number of variables in the input data set is very small. The Decision Tree is probably the most commonly used data mining algorithm. A Decision Tree model is a predictive model that predicts using a classification process. It is represented as an upside-down tree, in which the root is at the top and the leaves are at the bottom.
A Decision Tree is a representation of rules. This helps us understand why a record has been classified in a particular way, and the rules can be used to find records that fall into some specific category. In their work they determined the optimal values for the input dataset with respect to the time sub-period, the cost of misclassification and the sampling method. They concluded that a sub-period of 10 days, a misclassification cost of 15 and the random sampling method are the optimal parameters when training a model using decision trees, when the number of input variables is very small.

Ming, Huili and Yuwei [4] proposed a model for churn prediction using a Bayesian Network. The concept of the Bayesian Network was initially proposed by Judea Pearl (1986). It is a kind of graphical model used to represent the joint probability among different variables. It provides a natural way to describe causality information, which can be used to discover potential relations in data. This approach has been used successfully in the knowledge representation of expert systems, data mining and machine learning. Recently, it has also been applied in fields of artificial intelligence including causal reasoning, uncertain knowledge representation, pattern recognition and cluster analysis.

A Bayesian network consists of many nodes, representing attributes, connected by lines. It therefore deals with problems in which more than one attribute determines another, which involves the theory of multiple probability distributions.
Besides, since different Bayesian networks have different structures, and concepts from graph theory such as tree, graph and directed acyclic graph can describe these structures clearly, graph theory is an important theoretical foundation of Bayesian networks alongside probability theory. The results of customer churn prediction using Bayesian networks are very promising.

Jiayin, Yangming, Yingying and Shuang [10] proposed a new algorithm for churn prediction and called it TreeLogit. This algorithm is a combination of the ADTree and Logistic Regression models. It incorporates the advantages of both algorithms, making it as good as the TreeNet model, which won the best prize in the 2003 customer churn prediction contest. Because TreeLogit combines the advantages of both base algorithms, it becomes a very powerful tool for customer churn prediction.

The modeling process of TreeLogit starts by designing the customers' character variables based on prior knowledge. The character variables are then categorized into m sub-vectors, and a decision tree is created for each sub-vector. Once the decision tree for each sub-vector is available, a logistic regression model is developed for each sub-vector. Finally, the accuracy and interpretability of the model are evaluated. If they are acceptable, the customer retention process is started; otherwise the model is re-tuned for better results.

Jing and Xinghua [5], in their work on customer churn prediction, presented a model based on Support Vector Machines. Support Vector Machines were developed on the basis of statistical learning theory, which is regarded as the best theory for small-sample estimation and predictive learning. Studies on machine learning from finite samples were started by Vapnik in the 1960s, and a relatively complete theoretical system called statistical learning theory was established in the 1990s. After that, the Support Vector Machine, a new learning machine, was proposed.
SVM is built on the structural risk minimization principle, which aims to minimize the real error probability, and is mainly used to solve pattern recognition problems. Because of SVM's complete theoretical framework and its good results in practical applications, it has been widely valued in the machine learning field.

Rough Set

Xu E, Liangeshan Shao, Xuedong Gao and Zhai Baofeng introduced the Rough set algorithm for customer churn prediction [2]. Dengh Hu also studied the applications of rough set for customer churn prediction [5]. According to them, Rough set is a data analysis theory proposed by Z. Pawlak. Its main idea is to derive decision or classification rules by knowledge reduction while keeping the classification ability unchanged. The theory has some unique views, such as knowledge granularity, which make Rough set theory especially suitable for data analysis. Rough set is built on a classification mechanism, and the partition of the space made by an equivalence relation is regarded as knowledge. Generally speaking, it describes imprecise or uncertain knowledge using knowledge that has already been proved.

In this theory, knowledge is regarded as a kind of classification ability on data, and the objects in the universe are usually described by a decision table, a two-dimensional table in which each row represents an object and each column an attribute. The attributes consist of decision attributes and condition attributes. The objects in the universe can be distributed into decision classes with different decision attributes according to their condition attributes. One of the core concepts in rough set theory is reduction, a process in which unimportant or irrelevant knowledge is deleted while keeping the classification ability unchanged. A decision table may have several reductions, whose intersection is defined as the core of the decision table.
The attributes of the core are important because of their effect on classification.

Survival Analysis

Survival analysis is a statistical analysis method used to analyze and deduce the life expectancy of creatures or products from data that comes from surveys or experiments. It combines the outcomes of some events with the corresponding time spans to analyze a problem. It was initially used in medical science to study the influence of medicines on the life expectancy of research subjects. Survival time should be understood broadly, that is, as the duration of some condition in nature, society or a technical process. In this paper, the churn of a customer is regarded as the end of the customer's survival time. In the 1950s, statisticians began to study the reliability of industrial products, which advanced the development of survival analysis in theory and application. The proportional hazard regression model, first proposed by Cox in 1972, is a commonly used survival analysis technique.

CRITICAL REVIEW

Jiayin and Yuanquan [1] proposed a very simple method for variable selection. The proposed method is effective and practical, but more systematic methods are available, which use advanced neural networks, induction algorithms and rough set.

Huang and Kechadi's [11] concept of taking the categorical values into account when feature selection is performed is good. But their concept is limited to categorical values, and continuous values cannot be used with their approach directly. Continuous values need to be discretized into categorical values before their feature selection concept can be applied, and this conversion from continuous to discrete may result in loss of information.

Luo, Shoa and Lie [2] selected the Decision Tree as their data mining algorithm for churn prediction, which is the simplest and most understandable algorithm for classification.
Its simplicity also makes it the most widely used algorithm. But the decision tree has its own limitations: it is very unstable, and a very small change in the input variables, such as the addition of new ones, requires rebuilding and re-training the complete decision tree. In addition, they should also have focused on how to enrich the input variables by adding new derived variables that could enhance the efficiency of the model.

The Bayesian network model of Ming, Huili and Yuwei [4] has advantages and some shortcomings. It can produce good results even when the input datasets are incomplete. In addition, it can take connections into account when predicting churn and take prior knowledge into consideration. The algorithm can also effectively prevent overfitting. But if the dataset is large, the structure learning of the Bayesian network becomes too difficult. Thus this model is not a good fit for telecom, where the dataset is always very large.

The TreeLogit of Jiayin, Yangming, Yingying and Shuang [10] combines the advantages of both algorithms, i.e. ADTree and logistic regression. It is thus both data-driven and assumption-driven, and it is capable of analyzing objects with incomplete information. Moreover, its efficiency is not affected by poor-quality data, and it generates continuous output with relatively low complexity.

Jing and Xinghua [5] used the Support Vector Machine algorithm for churn prediction. This algorithm works best when only a limited number of sample records is available, but on the other hand its theory is very complex and it has many variations, so it is difficult to find the version that best suits a given problem.

CONCLUSION

There are multiple solutions available for customer churn prediction. Each has its own advantages and disadvantages. So a single solution might not be best for any organization.
The organization may have to use a combination of algorithms and techniques to get the best results for churn prediction.
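
As a rough illustration of what combining models can look like, the sketch below averages the churn scores of two models and compares the area under the ROC curve (AUC), the measure mentioned in the literature review, for each model and for the combination. The labels, the scores and the function names are all made up for this example and do not come from any of the reviewed papers. Score averaging is only one simple way to combine models, chosen here as an assumption.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (churner, non-churner) pairs in which the churner gets the
    higher score. Tied scores count as half a win."""
    churners = [s for y, s in zip(labels, scores) if y == 1]
    stayers = [s for y, s in zip(labels, scores) if y == 0]
    if not churners or not stayers:
        raise ValueError("need at least one churner and one non-churner")
    wins = 0.0
    for c, s in product(churners, stayers):
        if c > s:
            wins += 1.0
        elif c == s:
            wins += 0.5
    return wins / (len(churners) * len(stayers))


def average_scores(*model_scores):
    """Combine several models by averaging their per-customer scores."""
    return [sum(column) / len(column) for column in zip(*model_scores)]


# Hypothetical data: 1 = customer churned, 0 = customer stayed.
churned = [1, 1, 1, 0, 0, 0]
# Hypothetical churn scores from two different models for the same customers.
tree_scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]
logit_scores = [0.6, 0.8, 0.3, 0.2, 0.4, 0.1]

combined_scores = average_scores(tree_scores, logit_scores)

for name, scores in [("tree", tree_scores),
                     ("logit", logit_scores),
                     ("combined", combined_scores)]:
    print(f"{name}: AUC = {auc(churned, scores):.3f}")
```

On this toy data each model alone ranks one non-churner above one churner, while the averaged score ranks every churner above every non-churner. Real data will not necessarily behave this way, so any combination should be validated on held-out customers.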