The International Arab Journal of Information Technology (IAJIT)

A Decision Support System Using Demographic Issues: A Case Study in Turkey

The demographic distribution of people across cities is an important parameter for understanding people's behaviour, and distinguishing that behaviour helps companies understand their customers. This article presents a case study covering all 81 cities in Turkey, with 35 topics measured for each city. Using these topics, the article investigates how the cities cluster. Because of their efficiency, the agglomerative hierarchical clustering and K-medoids clustering methods in the RapidMiner data mining software are used to cluster the data. The Cophenetic Correlation Coefficient (CPCC) is used to measure the efficiency of the agglomerative clustering algorithm. After clustering, the results are loaded into a geographic information system and depicted on a map of Turkey. The results show that cities in the same geographical area fall into the same clusters, with some exceptions. On the other hand, some cities in different provinces show the same behaviour. The results of the study can also be used as a decision support system for customer relationship management.
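As an illustration of how the CPCC mentioned above can be computed, the following is a minimal sketch and not the RapidMiner workflow used in the study. It assumes Euclidean distance and average linkage, neither of which is stated in the abstract, and the function names and sample points are hypothetical. The CPCC is taken as the Pearson correlation between the original pairwise distances and the cophenetic distances (the height at which two items are first merged in the dendrogram).

```python
from itertools import combinations
from math import dist, sqrt


def cophenetic_distances(points):
    """Average-linkage agglomerative clustering (assumed linkage).

    Returns the original pairwise distances and the cophenetic distances,
    both keyed by (i, j) with i < j.
    """
    n = len(points)
    d = {(i, j): dist(points[i], points[j]) for i, j in combinations(range(n), 2)}
    clusters = {i: [i] for i in range(n)}  # cluster id -> member indices
    coph = {}

    def linkage(a, b):
        # Mean distance over all cross-cluster pairs of members.
        pairs = [(min(i, j), max(i, j)) for i in clusters[a] for j in clusters[b]]
        return sum(d[p] for p in pairs) / len(pairs)

    next_id = n
    while len(clusters) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        height = linkage(a, b)
        # Every cross pair is first joined at this merge height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[(min(i, j), max(i, j))] = height
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return d, coph


def pearson(xs, ys):
    """Pearson correlation; undefined if either input has zero variance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cpcc(points):
    """Cophenetic correlation coefficient of the dendrogram built above."""
    d, coph = cophenetic_distances(points)
    keys = sorted(d)
    return pearson([d[k] for k in keys], [coph[k] for k in keys])


# Hypothetical toy data: two well-separated groups of two points each.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(round(cpcc(points), 3))
```

A value close to 1 indicates that the dendrogram preserves the original pairwise distances well. In the study each city would be a 35-dimensional vector rather than a 2-D point.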

[1] Anselin L. and Getis A., Spatial Statistical Analysis and Geographic Information Systems, The Annals of Regional Science, vol. 26, no. 1, pp. 19-33, 1992.

[2] Chiang W., To Mine Association Rules of Customer Values via a Data Mining Procedure with Improved Model: An Empirical Case Study, Expert Systems with Applications, vol. 38, no. 3, pp. 1716-1722, 2011.

[3] Cranshaw J., Schwartz R., Hong J., and Sadeh N., The Livehoods Project: Utilizing Social Media to Understand the Dynamics of a City, in Proceedings of the 6th International AAAI Conference on Weblogs and Social Media, Dublin, pp. 58-65, 2012.

[4] Deng Z., Lu Y., Wei K., and Zhang J., Understanding Customer Satisfaction and Loyalty: An Empirical Study of Mobile Instant Messages in China, International Journal of Information Management, vol. 30, no. 4, pp. 289-300, 2010.

[5] Dvoroznak M., tvorba/rapidminer-clustering_performance_ plugin-average_silhouette- cophenetic_coefficient, Last Visited 2014.

[6] Felici G. and Vercellis C., Mathematical Methods for Knowledge Discovery and Data Mining, Idea Group Reference, 2007.

[7] Garla S., Chakraborty G., and Gaeth G., Comparison of K-means, Normal Mixtures and Probabilistic-D Clustering for B2B Segmentation using Customers Perceptions, in Proceedings of the SAS Global Forum, Las Vegas, pp. 1-8, 2012.

[8] Gorsevski P., Donevska K., Mitrovski C., and Frizado J., Integrating Multi-criteria Evaluation Techniques with Geographic Information Systems for Landfill Site Selection: A Case Study Using Ordered Weighted Average, Waste Management, vol. 32, no. 2, pp. 287-296, 2012.

[9] Han J., Kamber M., and Pei J., Data Mining: Concepts and Techniques, Morgan Kaufmann Press, 2006.

[10] Hossain M. and Leo S., Customer Perception on Service Quality in Retail Banking in Middle East: The Case of Qatar, International Journal of Islamic and Middle Eastern Finance and Management, vol. 2, no. 4, pp. 338-350, 2009.

[11] Kaur J. and Gupta G., Optimized Clustering Algorithm with Hybrid K-Means and Hierarchical Algorithms, International Journal for Multi Disciplinary Engineering and Business Management, vol. 2, no. 1, pp. 4-7, 2014.

[12] Khan K., Baharudin B., and Khan A., Identifying Product Features from Customer Reviews Using Hybrid Dependency Patterns, The International Arab Journal of Information Technology, vol. 11, no. 3, pp. 281-286, 2014.

[13] Kuo Y., Wu C., and Deng W., The Relationships Among Service Quality, Perceived Value, Customer Satisfaction, and Post-purchase Intention in Mobile Value-added Services, Computers in Human Behavior, vol. 25, no. 4, pp. 887-896, 2009.

[14] Madhulatha T., Comparison between K-Means and K-Medoids Clustering Algorithms, Advances in Computing and Information Technology, Communications in Computer and Information Science, vol. 198, pp. 472-481, 2011.

The International Arab Journal of Information Technology, Vol. 14, No. 3, May 2017

[15] Mavi B., Research on Livable Cities in Turkey, CNBC-E Business Magazine, vol. 9, pp. 64-98, 2011.

[16] Musso J., The Political Economy of City Formation in California: Limits to Tiebout Sorting, Social Science Quarterly, vol. 82, no. 1, pp. 139-153, 2001.

[17] Pamuk A., Geography of Immigrant Clusters in Global Cities: A Case Study of San Francisco, International Journal of Urban and Regional Research, vol. 28, no. 2, pp. 287-307, 2004.

[18] Prabhu S. and Venkatesan N., Data Mining and Warehousing, New Age International Publishers, 2006.

[19] Schwarz N., Urban Form Revisited-Selecting Indicators for Characterizing European Cities, Landscape and Urban Planning, vol. 96, no. 1, pp. 29-47, 2010.

[20] Teknomo K., Hierarchical Clustering Tutorial, ng/index.html, Last Visited 2014.

[21] Wu R. and Chou P., Customer Segmentation of Multiple Category Data in E-commerce Using a Soft-Clustering Approach, Electronic Commerce Research and Applications, vol. 10, no. 3, pp. 331-341, 2011.

Suat Secgin gained his BSc degree from the Department of Electrical and Electronics Engineering at Dokuz Eylul University in 1992. He also gained his MSc degree from the same university's Computer Engineering Department with a thesis titled Mobile Networks and Data Access Strategies. He is currently a PhD student in the Dokuz Eylul University Computer Engineering Department. He is a member of Electrical Engineering Camperships and has also been working for Turk Telekom. His research areas include traffic engineering in packet-based networks, wireless networking and data mining.

Gokhan Dalkilic received a BS degree in Computer Engineering from Ege University, Izmir, Turkey, in 1997, MS degrees in Computer Science from the University of Southern California, Los Angeles, USA, in 1999, and from Ege University International Computing Institute, Izmir, Turkey, in 2001, and a Ph.D. degree in Computer Engineering from Dokuz Eylul University, Izmir, Turkey, in 2004. He was a visiting lecturer at the University of Central Florida, Orlando, USA, from January 2003 to December 2003. He has been an Assistant Professor in the Department of Computer Engineering of Dokuz Eylul University, Izmir, Turkey, since 2004. His research areas are cryptography, statistical language processing and computer networks. His fields of study are lightweight authentication, cryptography, and NLP. He has over 50 papers and four books to his name.

Appendix

A. The CNBC-E Data

The 35 rows of the data set are listed below. Missing values for some cities were replaced with the average of the other cities' values. The table used in the study consists of the ranking values of the data.

1. Unemployment rate.
2. Amount of tax per capita.
3. Deposit amount per capita.
4. Public expenditure per capita.
5. Number of cars per adult.
6. Number of houses per capita.
7. Competitiveness.
8. Average per capita expenditure for rental.
9. Air utilization rate.
10. Household per capita consumption of electricity.
11. Rate of university graduates.
12. Literacy rate.
13. Rate of pre-school students per teacher.
14. Number of pre-school students per classroom.
15. Rate of primary school students per teacher.
16. Number of primary school students per classroom.
17. Rate of secondary school students per teacher.
18. Number of secondary school students per classroom.
19. Number of people per doctor.
20. Number of hospital beds per capita.
21. Crime rate.
22. Earthquake risk.
23. Rate of traffic accidents per vehicle.
24. Forest area ratio.
25. Air quality.
26. Divorce ratio.
27. Rate of shopping centers per urban area and population.
28. Rate of 5-star hotels per urban area and population.
29. Rate of licensed sportsmen per population.
30. Rate of number of library and art work per population.
31. Rate of number of visitors to museums per population.
32. Rate of theatre audience per population.
33. Theatre seat capacity rate per population.
34. Rate of cinema audience per population.
35. Customer satisfaction performance order.

B. Tree View of the Clusters for Different K Values

Figure 15. Cluster 147 (Tree depth is 2).
Figure 16. Cluster 149 (Tree depth is 2).
Figure 17. Cluster 155 (Tree depth is 2).
Figure 18. Cluster 158 (Tree depth is 2).

C. Further Clusters

Figure 19. Nested clusters for k=8.
Figure 20. The city distribution for k=8.
Figure 21. The city distribution for k=16.
Figure 22. The city distribution for k=27.
Figure 23. The city distribution for k=40.
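
The preprocessing described for the CNBC-E data (replacing a city's missing value with the average of the other cities, then working with ranking values) can be sketched as follows. This is only an illustration under stated assumptions: the source does not give the ranking direction or the tie-handling rule, so the sketch assumes rank 1 for the smallest value and lets tied values share the lowest rank. The function names and sample numbers are hypothetical.

```python
def impute_with_mean(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def to_ranks(values):
    """Assumed convention: rank 1 = smallest value, ties share the lowest rank."""
    ordered = sorted(values)
    # index() returns the first position of a value, so ties get the same rank.
    return [ordered.index(v) + 1 for v in values]


# Hypothetical values of one topic for three cities, one of them missing.
topic = [4.0, None, 8.0]
filled = impute_with_mean(topic)  # the missing entry becomes 6.0
ranks = to_ranks(filled)
print(filled, ranks)
```

Applying both steps to each of the 35 topics would yield one rank vector per city, which is the kind of table a clustering method can take as input.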