Computational Cost Reduction Strategies for Business Cases

Document Type: Research Paper

Author

Faculty of Economics and Business, Universitas Pelita Harapan, Indonesia

Abstract

Feature selection and parameter optimization are vital techniques in the data mining process, and both significantly affect the computational cost of machine learning. Because computational cost is a critical consideration in business analytics, research on feature selection and parameter optimization is crucial for reducing operational costs. This study investigates the performance of 10 dimensionality reduction methods and 2 parameter optimization techniques across a range of business applications, evaluating each in terms of predictive accuracy and runtime. The analysis reveals distinctive tendencies among the filter methods; in particular, Weight by Rule (WRul) and Weight by Relief (WRel) prove time-consuming in several business scenarios. The study also proposes a cost-effective approach to parameter optimization that combines grid search with evolutionary algorithms, which is especially useful when the optimal parameter range is unknown.
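
To make the two optimization strategies concrete, the sketch below contrasts a fixed-grid search with a simple (1+1) evolution strategy for tuning an SVM's C parameter. This is an illustrative assumption on our part, not the paper's published code: the dataset (scikit-learn's breast cancer benchmark), the grid values, and the mutation settings are all placeholders.

# Illustrative sketch only, not the study's actual pipeline. Compares a grid
# search against a toy (1+1) evolution strategy for tuning an SVM's C
# parameter; dataset, parameter ranges, and ES settings are assumptions.
import time

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Grid search: exhaustive over a fixed set of candidates, so it is cheap and
# reliable when the plausible range of C is known in advance.
start = time.time()
grid = GridSearchCV(SVC(), {"C": [0.01, 0.1, 1, 10, 100]}, cv=3)
grid.fit(X, y)
print(f"grid search:  C={grid.best_params_['C']}, "
      f"score={grid.best_score_:.3f}, {time.time() - start:.1f}s")

# (1+1) evolution strategy: mutates a single candidate in log10(C) space and
# keeps the mutation if cross-validated accuracy does not drop. No grid is
# fixed up front, which helps when the optimal parameter range is unknown.
rng = np.random.default_rng(0)
start = time.time()
log_c = 0.0  # current candidate, log10(C); C = 1 initially
best = cross_val_score(SVC(C=10**log_c), X, y, cv=3).mean()
for _ in range(15):
    cand = log_c + rng.normal(scale=1.0)  # mutate the exponent
    score = cross_val_score(SVC(C=10**cand), X, y, cv=3).mean()
    if score >= best:  # accept if no worse
        log_c, best = cand, score
print(f"evolutionary: C={10**log_c:.3g}, score={best:.3f}, "
      f"{time.time() - start:.1f}s")

The trade-off mirrors the one the abstract highlights: the grid is bounded and predictable when its range is well chosen, while the evolutionary loop can wander through log-space without committing to candidate values in advance.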

