Advanced Analysis

Time Series Analysis & Forecasting

[Figure: Sales Forecast]

Forecasting Results:

  • Model Type: Linear Regression
  • Test Set Performance: R² = -0.43, RMSE = £425,821.04
  • Interpretation: A negative R² means the model predicts the test set worse than a constant forecast at the test-set mean; likely causes are high variability in the data and too little historical data to fit a stable trend
  • Pattern Identified: Linear trend with potential seasonal patterns
  • Forecast Horizon: 6 months ahead

Note: The forecasting performance could be improved with more historical data and by incorporating seasonal components.
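To make the negative R² concrete, here is a minimal sketch of how R² and RMSE compare a forecast against a constant mean baseline. The sales figures are hypothetical and chosen only for illustration; they are not the report's data, and the helper names are assumptions.

```python
import math

def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical monthly sales for a held-out test period.
actual = [100.0, 140.0, 90.0, 130.0]

# A linear trend that overshoots the test period.
trend_forecast = [150.0, 160.0, 170.0, 180.0]

# Baseline: always predict the test-set mean.
mean_forecast = [sum(actual) / len(actual)] * len(actual)

r2_trend = r2_score(actual, trend_forecast)  # negative: worse than the baseline
r2_mean = r2_score(actual, mean_forecast)    # zero by construction
```

Any forecast whose squared error exceeds that of the mean baseline yields R² below zero, which is the situation reported above.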

Customer Segmentation Analysis

[Figure: Customer Clusters]
[Figure: Cluster Comparison]

K-means Clustering Results:

Cluster 0: High-Value Regular Customers

  • Medium recency, high frequency, high monetary value
  • 25.3% of customers
  • Best group to target with loyalty and upgrade campaigns

Cluster 1: Recent Low-Value Customers

  • High recency (purchased recently), low frequency, low monetary value
  • 31.7% of customers
  • Suitable for relationship-building campaigns

Cluster 2: Inactive High-Value Customers

  • Low recency (no recent purchases), medium frequency, high monetary value
  • 18.9% of customers
  • Suitable for reactivation and recovery campaigns

Cluster 3: Low-Value Occasional Customers

  • Low recency (no recent purchases), low frequency, low monetary value
  • 24.1% of customers
  • Suitable for reactivation or exclusion from costly campaigns
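The clusters above are described in terms of recency, frequency, and monetary value. As a rough sketch of how such features might be derived before clustering, the example below builds an RFM table from a transaction list. The transactions, the snapshot date, and the function name are hypothetical. Recency is measured here as days since the last order, so a smaller number means a more recent customer.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, order_date, order_value_gbp).
transactions = [
    ("C1", date(2011, 11, 20), 250.0),
    ("C1", date(2011, 12, 1), 300.0),
    ("C2", date(2011, 3, 15), 40.0),
    ("C3", date(2011, 12, 5), 20.0),
]
snapshot = date(2011, 12, 10)  # assumed analysis date

def rfm_table(rows, as_of):
    """Return {customer: (days_since_last_order, order_count, total_spend)}."""
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, value in rows:
        # Keep the most recent order date seen for each customer.
        last_order[customer] = max(last_order.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += value
    return {
        c: ((as_of - last_order[c]).days, frequency[c], monetary[c])
        for c in last_order
    }

rfm = rfm_table(transactions, snapshot)
```

The three columns would typically be scaled to comparable ranges before being passed to a K-means implementation, since monetary value is otherwise likely to dominate the distance calculation.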

Predictive Modeling

[Figure: Feature Importance for Customer Spending]
[Figure: Churn Prediction Confusion Matrix]

Customer Spending Prediction:

  • Model Type: Random Forest
  • Performance: R² = 0.68, RMSE = £5,279.98
  • Top Predictors: Order Count, Average Order Value, Total Quantity
  • Interpretation: The model explains 68% of the variance in customer spending

Customer Churn Prediction:

  • Model Type: Random Forest
  • Performance: 69% accuracy
  • Top Predictors: Order Count, Average Order Value, Purchase Frequency
  • Churn Definition: No purchase for 3+ months
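The churn definition and the accuracy figure can be illustrated with a short sketch. It assumes "3+ months" is approximated as 90 days and uses made-up confusion-matrix counts; neither the window length in days nor the counts come from the report.

```python
from datetime import date

CHURN_WINDOW_DAYS = 90  # assumption: "3+ months" approximated as 90 days

def is_churned(last_purchase, as_of, window_days=CHURN_WINDOW_DAYS):
    """Label a customer as churned if they have not purchased within the window."""
    return (as_of - last_purchase).days >= window_days

def accuracy(tn, fp, fn, tp):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tn + fp + fn + tp)

def majority_baseline(tn, fp, fn, tp):
    """Accuracy of always predicting the more common class."""
    negatives = tn + fp
    positives = fn + tp
    return max(negatives, positives) / (negatives + positives)

as_of = date(2011, 12, 10)  # hypothetical analysis date
lapsed = is_churned(date(2011, 8, 1), as_of)    # last order 131 days earlier
active = is_churned(date(2011, 11, 1), as_of)   # last order 39 days earlier

# Hypothetical confusion-matrix counts, for illustration only.
acc = accuracy(tn=40, fp=10, fn=20, tp=30)
base = majority_baseline(tn=40, fp=10, fn=20, tp=30)
```

Accuracy is easier to judge next to the majority-class baseline: if most customers fall in one class, a model can reach high accuracy without learning anything useful.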

Price Elasticity Analysis

[Figure: Price vs. Quantity Relationship]
[Figure: Product Elasticity Comparison]

Price Elasticity Findings:

  • Overall Price Elasticity: -0.36
  • Interpretation: Demand is relatively inelastic (absolute value less than 1): a 1% price increase is associated with roughly a 0.36% decrease in quantity demanded
  • Business Implication: Moderate price increases are likely to raise revenue, because quantity falls proportionally less than price rises
  • Product Variation: Individual products show varying elasticity, with some being highly elastic (price-sensitive) and others being inelastic
  • Pricing Strategy: Products with low elasticity can sustain higher prices, while elastic products should be competitively priced
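The report does not say how the elasticity was estimated. One common approach, shown here purely as an assumption, is to regress log quantity on log price and read the elasticity off the slope. The data below is synthetic, generated from a constant-elasticity demand curve so that the recovered slope can be checked; the function names are hypothetical.

```python
import math

def log_log_elasticity(prices, quantities):
    """OLS slope of log(quantity) on log(price), i.e. the price elasticity."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def revenue_change_pct(price_change_pct, elasticity):
    """Percent revenue change under constant elasticity: R is proportional to P^(1 + e)."""
    price_factor = 1 + price_change_pct / 100
    return (price_factor ** (1 + elasticity) - 1) * 100

# Synthetic demand curve Q = 1000 * P^(-0.36), for illustration only.
prices = [1.0, 2.0, 4.0, 8.0]
quantities = [1000.0 * p ** -0.36 for p in prices]

elasticity = log_log_elasticity(prices, quantities)  # recovers about -0.36

inelastic_gain = revenue_change_pct(10, -0.36)  # positive: revenue rises
elastic_loss = revenue_change_pct(10, -1.5)     # negative: revenue falls
```

Under this constant-elasticity assumption, a price increase raises revenue whenever the elasticity lies between -1 and 0 and lowers it when the elasticity is below -1, which is the distinction behind the per-product pricing strategy above.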

Customer Lifetime Value Analysis

[Figure: CLV vs. Purchase Frequency]
[Figure: CLV Distribution]

CLV Insights:

  • Average CLV: £4,237.91
  • Top 10% Contribution: The top 10% of customers by CLV account for 50.3% of total revenue
  • CLV Prediction Model: R² = 0.68
  • Top CLV Predictors: Order Count, Total Quantity, Average Order Value
  • CLV Distribution: Highly skewed, with many low-value customers and few high-value ones
  • CLV-Frequency Correlation: Strong positive correlation (0.78)
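The concentration and skew figures above can be reproduced with very little code. The sketch below computes the revenue share of the top decile and compares mean with median as a quick skew check. The CLV values are hypothetical and the function name is an assumption.

```python
import statistics

def top_share(values, top_fraction=0.10):
    """Share of the total contributed by the top `top_fraction` of values."""
    ranked = sorted(values, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:n_top]) / sum(ranked)

# Hypothetical CLV values: one large customer and nine small ones.
clv = [5000.0] + [500.0] * 9

share = top_share(clv)                  # top 10% share of the total
mean_clv = statistics.mean(clv)
median_clv = statistics.median(clv)
right_skewed = mean_clv > median_clv    # mean pulled up by the large customer
```

A mean well above the median is a simple sign of the right-skewed distribution described above, where a few high-value customers dominate the total.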

Cohort Analysis

[Figure: Customer Retention by Cohort]

Cohort Analysis Findings:

  • First Month Drop: Significant drop in customer retention after the first month (common pattern in retail)
  • Long-term Retention: Older cohorts show better long-term retention rates
  • Seasonal Patterns: Cohorts acquired during certain months (e.g., November) show better retention
  • Best Performing Cohort: December 2010 cohort has the highest long-term retention
  • Worst Performing Cohort: April 2011 cohort has the lowest retention rates
  • Business Implication: Focus on improving retention in the critical first 3 months after acquisition
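For readers who want to see how a retention-by-cohort table is typically built, here is a minimal sketch. It assumes a cohort is defined by the month of a customer's first order and that retention is the share of the cohort with at least one order in each later month. The orders and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical orders: (customer_id, (year, month)).
orders = [
    ("A", (2010, 12)), ("A", (2011, 1)), ("A", (2011, 3)),
    ("B", (2010, 12)),
    ("C", (2011, 1)), ("C", (2011, 2)),
]

def month_index(year_month):
    """Map (year, month) to a running month count so offsets can be subtracted."""
    year, month = year_month
    return year * 12 + month

def retention_by_cohort(rows):
    """Return {cohort_month: {months_since_first_order: share_of_cohort_active}}."""
    # The cohort of each customer is the month of their first order.
    first = {}
    for customer, ym in rows:
        if customer not in first or month_index(ym) < month_index(first[customer]):
            first[customer] = ym

    # Collect the distinct customers active at each offset within each cohort.
    active = defaultdict(lambda: defaultdict(set))
    for customer, ym in rows:
        cohort = first[customer]
        offset = month_index(ym) - month_index(cohort)
        active[cohort][offset].add(customer)

    table = {}
    for cohort, by_offset in active.items():
        size = len(by_offset[0])  # every cohort member is active at offset 0
        table[cohort] = {k: len(v) / size for k, v in sorted(by_offset.items())}
    return table

retention = retention_by_cohort(orders)
```

In this toy data the December 2010 cohort has two customers, one of whom returns in later months, so its retention drops to one half after the first month. Months with no activity are simply absent from the table rather than recorded as zero.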