Style Sampler

Layout Style

Patterns for Boxed Mode

Backgrounds for Boxed Mode

Search News Posts

  • General Inquiries 1-888-555-5555

  • Support 1-888-555-5555

anaplatform Data Consultancy
Churn Detection

We are with you in your Digital Transformation Journey with our advanced analytics solutions.

Churn Detection

Customer Churn Detection

Case Study - Using Data Analysis for Detecting Customer Churn

The dataset includes 14 features about the customers and their products at a bank along with 10,000 customers (i.e. rows). Using the features, the goal is to determine whether a customer will churn (exited = 1). We therefore want to create a supervised learning method to carry out a classification task using machine learning.

  • CustomerId—contains random values and has no effect on customer leaving the bank.
  • Surname—the surname of a customer has no impact on their decision to leave the bank.
  • CreditScore—can have an effect on customer churn, since a customer with a higher credit score is less likely to leave the bank.
  • Geography—a customer’s location can affect their decision to leave the bank.
  • Gender—it’s interesting to explore whether gender plays a role in a customer leaving the bank.
  • Age—this is certainly relevant, since older customers are less likely to leave their bank than younger ones.
  • Tenure—refers to the number of years that the customer has been a client of the bank. Normally, older clients are more loyal and less likely to leave a bank.
  • Balance—Good indicator of customer churn, as people with a higher balance in their accounts are less likely to leave the bank compared to those with lower balances.
  • NumOfProducts—refers to the number of products that a customer has purchased through the bank.
  • HasCrCard—denotes whether or not a customer has a credit card. This column is also relevant, since people with a credit card are less likely to leave the bank.
  • IsActiveMember—active customers are less likely to leave the bank.
  • EstimatedSalary—as with balance, people with lower salaries are more likely to leave the bank compared to those with higher salaries.
  • Exited—whether or not the customer left the bank.

Churn prediction is likely to have an imbalance class distribution. The number of customers who churned (i.e. left) is usually much less than the number of customers who did not churn. We can check the distribution of values with the value_counts function. We see that there is imbalance in the target variable. There is an imbalance in the target variable (“Exited”). It is important to eliminate the imbalance.


Churn prediction is likely to have an imbalance class distribution. The number of customers who churned (i.e. left) is usually much less than the number of customers who did not churn. We can check the distribution of values with the value_counts function. We see that there is imbalance in the target variable. There is an imbalance in the target variable (“Exited”). It is important to eliminate the imbalance.


Churn prediction is likely to have an imbalance class distribution. The number of customers who churned (i.e. left) is usually much less than the number of customers who did not churn. We can check the distribution of values with the value_counts function. We see that there is imbalance in the target variable. There is an imbalance in the target variable (“Exited”). It is important to eliminate the imbalance.


The essential decision we need to make is how many units or Product X to produce each week. That's our decision variable which we denote as x. The weekly revenues are then $270x. The costs include the value of the raw materials and each form of labor. If we produce x units a week, then the total cost is $40x. which means there is a profit earned on each unit of X produced, so let's produce as many as possible.

There are three constraints that limit how many units can be produced. There is market demand for no more than 40 units per week. Producing x = 40 units per week will require 40 hours per week of Labor A, and 80 hours per week of Labor B. Checking those constraints we see that we have enough labor of each type, so the maximum profit will be $1600 per week. What we conclude is that market demand is the 'most constraining constraint.' Once we've made that deduction, the rest is a straightforward problem that can be solved by inspection.

Transaction Value

The graph below samples one such visualization that you would use to capture a trend hidden in the sample data set shared earlier on in the case study.

DATE

CAC

DAX

DJI

SP

FTSE

NIKKEI

NASDAQ

VIX

04.01.2010

4013,96

6048,29

10583,95

4013,96

6048,29

10583,95

4013,96

6048,29

05.01.2010

4013,96

6048,29

10583,95

4013,96

6048,29

10583,95

4013,96

6048,29

06.01.2010

4013,96

6048,29

10583,95

4013,96

6048,29

10583,95

4013,96

6048,29

07.01.2010

4013,96

6048,29

10583,95

4013,96

6048,29

10583,95

4013,96

6048,29

08.01.2010

4013,96

6048,29

10583,95

4013,96

6048,29

10583,95

4013,96

6048,29

...

...

...

...

...

...

...

...

...

08.01.2022

4013,96

6048,29

10583,95

4013,96

6048,29

10583,95

4013,96

6048,29


It shows the status of the S&P 500 stock index published by Standard & Poor's.


Click on Calculate Return button to see portfolio return results and maximum benefit.



Case Study 2 - Portfolio Return Calculation with Multiple Regression

In this case study, Prediction modeling was done with Multivariate regression.


Click on the calculate button to see the portfolio return result!





Result:


Case Study - Access to Financial Data

As an example, downloading financial data from Yahoo Finance and investing.com via Python for company A is shown.