Big Data in Tourism and Hospitality Industry: Predictive Analytics of Hotel Room Trends
Abstract
This study investigates predictive analytics applications in the hospitality sector, specifically employing the XGBoost algorithm to predict room selection patterns based on guest data. Analysis of 900 booking records revealed that three variables—"Length of Stay," "Rating," and "Guest Type"—exhibited the strongest predictive power for room preferences. The implementation achieved 85% classification accuracy, revealing subtle correlations between customer characteristics and accommodation choices. Our findings suggest that hotels can leverage similar analytical frameworks to refine inventory management strategies, develop targeted promotional campaigns, and streamline operational workflows. The investigation also identified methodological limitations regarding class distribution in the dataset, suggesting that enhanced feature selection techniques could potentially reduce error rates in subsequent modeling approaches. This work contributes to the growing body of evidence demonstrating how advanced data analytics can drive competitive advantage and sustainability initiatives within tourism enterprises.
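The 85% accuracy reported above is easier to interpret alongside the class-distribution caveat the abstract raises. The following is a minimal sketch, not the study's evaluation code: the room-type labels ("Standard", "Deluxe", "Suite") and the 20 bookings are hypothetical values chosen so that overall accuracy comes out at 85%. It compares accuracy with a majority-class baseline and per-class recall, which is one common way to check whether a skewed label distribution is flattering the headline figure.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return accuracy, majority-class baseline, per-class recall, and balanced accuracy."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    total = len(y_true)
    # Number of bookings per true room type
    support = Counter(y_true)
    # Number of correctly predicted bookings per true room type
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    recall = {label: hits[label] / n for label, n in support.items()}
    return {
        "accuracy": sum(hits.values()) / total,
        # Accuracy obtained by always predicting the most frequent room type
        "majority_baseline": max(support.values()) / total,
        "recall": recall,
        # Unweighted mean of per-class recall
        "balanced_accuracy": sum(recall.values()) / len(recall),
    }


# Hypothetical labels for 20 bookings (illustrative only, NOT the study's data)
y_true = ["Standard"] * 14 + ["Deluxe"] * 4 + ["Suite"] * 2
y_pred = (
    ["Standard"] * 14                 # all Standard bookings predicted correctly
    + ["Deluxe"] * 2 + ["Standard"] * 2   # half of the Deluxe bookings missed
    + ["Standard"] * 1 + ["Suite"] * 1    # half of the Suite bookings missed
)

report = evaluate(y_true, y_pred)
for name, value in report.items():
    print(name, value)
```

In this made-up example, accuracy is 85% while always predicting the most frequent room type would already reach 70%, and the two minority room types are each recovered only half the time. Whether the study's own results show a similar pattern cannot be determined from the abstract alone.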
DOI: https://doi.org/10.36256/ijtl.v6i1.474
Copyright (c) 2025 Indonesian Journal of Tourism and Leisure

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.