2. REVIEW OF LITERATURE

2.1 Related works

[1] Jaehyun Ahn, Junsik Hwang, Doyoung Kim, Hyukgeun Choi, and Shinjin Kang (Member, IEEE), A Survey on Churn Analysis in Various Business Domains, IEEE Access, 2020, doi:10.1109/ACCESS.2020.3042657

This survey reviews the churn prediction strategies published to date. Churn prediction is applied in sectors such as Internet services, games, insurance, and management. Because it has been used so widely to improve predictability across industrial and academic domains, its definition and application vary considerably. The survey compares and contrasts the definitions of churn used in business administration, marketing, information technology, telecommunications, newspapers, insurance, and psychology. Based on this comparison, it identifies and explains churn loss, feature engineering, and prediction models. By bringing together churn studies scattered across industrial and academic sectors, the survey helps researchers pick the definition of churn, and the associated models, that fit the service field they are most interested in.

Keywords: Churn analysis, churn prediction, machine learning, log analysis, data mining, customer retention, customer relationship management.

Summary: The survey compares churn prediction strategies that analyze log data. Churn analysis is employed in Internet services, gaming, insurance, and management. Churn prediction research is usually undertaken to improve business outcomes. As a result, rather than evaluating whether a customer ever churns, a time frame is used to identify customers who are likely to churn. The cost of customer churn is determined using customer acquisition cost (CAC) or customer lifetime value (CLV). In the past, researchers used statistics, graph theory, and classical machine learning algorithms to forecast churn, typically through survival analysis or time series analysis. More recently, deep learning algorithms have been applied to churn prediction and have been found to outperform other algorithms. This is most likely because vast amounts of customer log data are now collected and the churn prediction model can use the whole set of data. (Ahn, Hwang, Kim, Choi, & Kang, 2020)
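The time-frame idea in the summary above can be illustrated with a minimal sketch. It assumes a simple rule (a customer counts as churned if no activity was logged in the last `window_days` days); the function name `label_churn`, the 30-day window, and the sample data are all hypothetical and are not taken from the survey.

```python
from datetime import date, timedelta


def label_churn(last_activity, as_of, window_days=30):
    """Return {customer_id: True/False}, True meaning 'churned'.

    Assumed rule: a customer is churned if their most recent logged
    event is older than `window_days` days before `as_of`.
    """
    cutoff = as_of - timedelta(days=window_days)
    return {cust: last < cutoff for cust, last in last_activity.items()}


# Hypothetical log summary: customer id -> date of most recent event.
last_activity = {
    "u1": date(2020, 11, 28),
    "u2": date(2020, 9, 15),
    "u3": date(2020, 11, 1),
}

labels = label_churn(last_activity, as_of=date(2020, 12, 1), window_days=30)
print(labels)
```

Changing `window_days` changes who is labeled a churner, which is one reason the definition of churn differs so much between service fields.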

[2] Hrithik Jha and Dr. Vijayarajan V, Mobile Internet Throughput Prediction using Machine Learning Techniques, Proceedings of the International Conference on Smart Electronics and Communication (ICOSEC 2020), IEEE Xplore Part Number: CFP20V90-ART, ISBN: 978-1-7281-5461-9, doi:10.1109/ICOSEC49089.2020.9215436

Data streaming and internet usage are two of the most popular activities in the twenty-first century. Streaming is used for a variety of purposes, including music, movies, and even video games. All of these activities depend on the user’s internet speed. While a broadband or cable connection is preferable, many consumers rely on mobile data to access services. People who use mobile data do not always have the same network conditions or speed, so the content they want to access should adapt dynamically to network changes. The research investigates strategies for predicting a user’s internet throughput from a variety of basic, readily available factors. This would allow the server to anticipate a user’s throughput and deliver content at a bit rate appropriate for that throughput, resulting in less buffering, shorter loading times and, as a result, a higher quality of experience (QoE).

Keywords: Mobile internet, machine learning, decision trees

Summary: Because predicting Internet throughput is such an open-ended problem, this study proposes a naive method based on a variety of algorithms and generally available user characteristics. The proposed system is simple to adapt and implement. With additional training data the model is expected to improve, as it will be able to account for more types of network and physical conditions. Vectorized inputs combined with a decision tree gave the best accuracy during the testing phase. The work still has a long way to go, including the tuning of multiple hyperparameters and changes to the model. In the current experiments, the decision tree model remains the most promising method. The authors plan to test neural networks in the near future, using the weight distribution as a guide to the specification of a decision tree. (Jha & Vijayarajan, 2020)

[3] Benjamin Ghaffari and Yasin Osman, Customer churn prediction using machine learning, Master of Science in Industrial Economics, June 2021.

The rapid development of technical infrastructure has changed the way businesses operate. With ongoing digitalization and more and more products and services to choose from, customer churn has become a major problem and a threat to all businesses. This study develops a machine learning-based churn prediction model for a subscription-based service provider in financial administration, in a business-to-business (B2B) setting. The goal of the research is to add to the body of knowledge on churn prediction. For the proposed model, the study compares two ensemble learners, XGBoost and Random Forest, with a single base learner, Naïve Bayes. The research follows the design science approach: the authors use machine learning to iteratively build and evaluate the model, measured by accuracy, precision, recall, and F1-score. The data was collected from a subscription-based service provider in the financial administration sector. Because the dataset is imbalanced, with a majority of non-churners, the authors evaluate three sampling strategies, SMOTE, SMOTEENN, and Random Under Sampler, to balance the dataset. From the results of the experiments, they conclude that machine learning is a useful approach for predicting churn. The results also show that ensemble learners perform better than single base learners and that a balanced training dataset is expected to improve the performance of the classifiers.

Keywords: Customer churning, Machine Learning, business-to-business, subscription-based companies.

Summary: The findings show that customer churn can be predicted with high accuracy using machine learning. The authors further conclude that ensemble learners, which use boosting and bagging, improve the performance of the churn prediction model compared with the single-learner Naïve Bayes model, which is consistent with past research. The XGBoost classifier in particular outperformed the others in overall accuracy, precision, recall, and F1-score.

It should be noted that the dataset used in the thesis was imbalanced, and the authors tried several sampling strategies to check whether balancing the training dataset would affect the outcome. They conclude that balancing with sampling approaches has a positive effect on the problem under consideration. They also observe that balancing leads to lower classifier precision, that is, the rate of correctly classified churners among those predicted to churn, and higher classifier recall, which corresponds to the model’s ability to detect churn. (Ghaffari & Osman, 2021)
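The precision and recall trade-off described above can be made concrete with a short sketch that computes the four reported metrics from scratch. The label vectors are hypothetical and chosen only for illustration; they are not results from the thesis.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth: 3 churners (1) among 10 customers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# A cautious model that flags few customers as churners.
y_cautious = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# A model that flags more customers, as may happen after balancing.
y_eager = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

cautious = classification_metrics(y_true, y_cautious)
eager = classification_metrics(y_true, y_eager)
print(cautious)
print(eager)
```

In this toy case both prediction vectors have the same accuracy, while the second one trades precision for recall. This is why accuracy alone says little on an imbalanced churn dataset.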

[4] M. Rahman and V. Kumar, Machine Learning Based Customer Churn Prediction In Banking, 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), 2020, doi:10.1109/iceca49313.2020.9297529

In every industry, the number of service providers is continually increasing. Customers in the banking sector now have many options when deciding where to place their money. As a result, customer churn and engagement have become a top priority for most banks. This study proposes a method for predicting customer churn in a bank using machine learning techniques, a branch of artificial intelligence. By studying customer behavior, the study supports the analysis of the likelihood of churn. The KNN, SVM, Decision Tree, and Random Forest classifiers are used. In addition, several feature selection algorithms are applied to identify the most relevant features and to assess system performance. The experiment was carried out on the churn modelling dataset from Kaggle. The results are compared to find the model with the highest accuracy and predictive power. Compared with the other models, the Random Forest model after oversampling is the most accurate.

Keywords: Customer churn in Bank, k-Nearest Neighbor, Support Vector Machine, Decision Tree, Random Forest.

Summary: Customer engagement has become one of the key priorities in the banking industry, as in other industries. Banks should detect potential customer churn as early as possible to address the problem. Several studies on bank churn prediction are currently in progress. Different organizations measure customer churn rate in different ways, using different kinds of data. It is essential to have a system that can predict customer churn in banking in a generalized way and at an early stage. The system should be able to work with both fixed and potential data sources that are not tied to a specific service provider. Furthermore, the model should use minimal information while delivering maximum prediction throughput. The study aims to meet these requirements; its objective is to develop the best model for predicting early-stage customer churn in a bank. The dataset was small (10,000 samples) and highly imbalanced, whereas real commercial bank data would be considerably larger. Both issues can be alleviated to some extent by oversampling. The study examined the KNN, SVM, Decision Tree, and Random Forest (RF) classifiers under different conditions. The best result (95.74 percent) is obtained when the RF classifier is combined with oversampling. The tree classifiers (Decision Tree and RF) do not benefit from feature selection methods: as the results show, feature reduction (feature selection) lowers their prediction score. It is also worth noting that, unlike the other classifiers, SVM scores lower after oversampling. The reason given is that the bank dataset is skewed, so SVM cannot handle the data effectively. (Rahman & Kumar, 2020)
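As an illustration of the oversampling step mentioned in the summary, the sketch below shows simple random oversampling, in which minority-class rows are duplicated until the classes are the same size. The exact oversampling method used by Rahman and Kumar is not stated here, so this is an assumed stand-in; the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Original rows of this class, used as the sampling pool.
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data: 8 non-churners (0) and 2 churners (1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.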

Compared with our proposed work, the previous studies used machine learning algorithms such as SVM, KNN, Decision Tree, Random Forest, the XGBoost classifier, and neural networks. These algorithms, however, fall short in accuracy and in classification quality. Our algorithms provide good accuracy, precision, and recall; in particular, our hybrid algorithm, which combines KNN and a Gradient Boosting Classifier with Logistic Regression as a supporting algorithm, achieves 99% accuracy.
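One possible way to combine these three algorithms is stacking, sketched below with scikit-learn. How the hybrid is actually wired together is not specified in this section, so the stacking layout (KNN and Gradient Boosting as base learners, Logistic Regression as the final estimator), the hyperparameters, and the synthetic data are all assumptions made for illustration. The scores it prints say nothing about the 99% figure reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data (NOT the dataset used in this work).
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Assumed layout: KNN and Gradient Boosting feed a Logistic Regression.
hybrid = StackingClassifier(
    estimators=[
        # KNN is distance based, so its inputs are scaled first.
        ("knn", make_pipeline(StandardScaler(),
                              KNeighborsClassifier(n_neighbors=5))),
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # out-of-fold base predictions train the final estimator
)
hybrid.fit(X_train, y_train)
y_pred = hybrid.predict(X_test)

scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
}
print(scores)
```

Reporting precision and recall next to accuracy, on a held-out test split, makes results easier to compare with the studies reviewed above, which all worked with imbalanced churn data.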