Customer Loyalty Prediction Using RFM and Product Diversification Features Through Cluster-Derived NPS Proxy Labels
Keywords:
RFM, Clustering, NPS Proxy, Classification, Customer Loyalty
Abstract
Understanding and forecasting customer loyalty is vital for operational efficiency and effective business strategy in the fast-moving consumer goods (FMCG) distribution industry. This article combines behavioral clustering and supervised machine learning into a hybrid analytical model that predicts customer loyalty through a Net Promoter Score (NPS) Proxy derived from transactional behavior. Using real sales data from PT XYZ (Mojokerto), RFM (Recency, Frequency, Monetary) and product diversification features were extracted from the raw data. Principal Component Analysis (PCA) was employed for dimensionality reduction, and K-Means, Agglomerative, and DBSCAN clustering were then compared using internal validity metrics. Agglomerative Clustering performed best, with a Silhouette Score of 0.8919, a Calinski–Harabasz Index of 2585.11, and a low Davies–Bouldin Index of 0.5266, indicating compact and well-separated clusters. The three resulting clusters were mapped to NPS Proxy segments (Detractor, Passive, and Promoter) according to the behavior observed in each cluster. Because of strong class imbalance (98.8% Detractors, 1.06% Passives, 0.12% Promoters), SMOTE was applied to balance the training set before classification. Grid search was used to tune the hyperparameters of the KNN, SVM, Gradient Boosting, Logistic Regression, and Random Forest models. Among these classifiers, Random Forest achieved the highest scores (F1-macro = 1.00, Precision = 1.00, Recall = 1.00) on both training and testing data, which the authors interpret as strong generalization without signs of overfitting. Overall, the proposed framework translates transactional data into actionable loyalty insights and yields dependable predictions of customer NPS Proxy labels.
The findings underline the relevance of integrating RFM analysis, product-level behavior, clustering validation, and machine learning classification for scalable customer loyalty management in FMCG distribution environments.
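The abstract does not state how the RFM and product diversification features were computed, so the following is only a minimal sketch of one common way to derive them from transaction records. The record layout `(customer_id, invoice_id, purchase_date, product_id, amount)`, the function name `rfm_features`, the snapshot date, and the choice to measure diversification as the number of distinct products bought are all assumptions made for illustration, not details taken from the study.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, snapshot):
    """Aggregate per-customer RFM and product-diversity features.

    transactions: iterable of (customer_id, invoice_id, purchase_date,
    product_id, amount) tuples (hypothetical layout).
    snapshot: reference date used to measure recency.
    """
    last_purchase = {}
    invoices = defaultdict(set)
    monetary = defaultdict(float)
    products = defaultdict(set)

    for cust, invoice, day, product, amount in transactions:
        # Recency is measured from the most recent purchase date.
        if cust not in last_purchase or day > last_purchase[cust]:
            last_purchase[cust] = day
        invoices[cust].add(invoice)   # Frequency counts distinct invoices.
        monetary[cust] += amount      # Monetary sums all line amounts.
        products[cust].add(product)   # Diversity counts distinct products.

    return {
        cust: {
            "recency_days": (snapshot - last_purchase[cust]).days,
            "frequency": len(invoices[cust]),
            "monetary": monetary[cust],
            "product_diversity": len(products[cust]),
        }
        for cust in last_purchase
    }


# Made-up sample rows, for illustration only.
sample = [
    ("C001", "INV-1", date(2024, 1, 5), "SKU-A", 120.0),
    ("C001", "INV-1", date(2024, 1, 5), "SKU-B", 80.0),
    ("C001", "INV-2", date(2024, 3, 1), "SKU-A", 100.0),
    ("C002", "INV-3", date(2024, 2, 10), "SKU-C", 50.0),
]
features = rfm_features(sample, snapshot=date(2024, 3, 31))
```

In a pipeline like the one described, a feature table of this kind would typically be standardized before PCA and clustering, since monetary values and day counts sit on very different scales.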
License
Copyright (c) 2025 Alina Rusyda Hariadi, Wiyli Yustanti, Unung Istopo Hartanto (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

