Bang Zhang, Lelin Zhang, Ting Guo, Yang Wang, Fang Chen
International Conference on Knowledge Discovery & Data Mining
Urbanization is a global trend that we have all witnessed in the past decades. It brings us both opportunities and challenges. On the one hand, urban system is one of the most sophisticated social-economic systems that is responsible for efficiently providing supplies meeting the demand of residents in various of domains, e.g., dwelling, education, entertainment, healthcare, etc. On the other hand, significant diversity and inequality exist in the development patterns of urban systems, which makes urban data analysis difficult. Different urban regions often exhibit diverse urbanization patterns and provide distinct urban functions, e.g., commercial and residential areas offer significantly different urban functions. It is desired to develop the data analytic capabilities for discovering the underlying cross-domain urbanization patterns, clustering urban regions based on their function similarity and predicting region popularity in specified domains. Previous studies in the urban data analysis area often just focus on individual domains and rarely consider cross-domain urban development patterns hidden in different urban regions. In this paper, we propose the infinite urbanization process (IUP) model for simultaneous urban region function discovery and region popularity prediction. The IUP model is a generative Bayesian nonparametric process that is capable of describing a potentially infinite number of urbanization patterns. It is developed within the supervised topic modelling framework and is supported by a novel hierarchical spatial distance dependent Bayesian nonparametric prior over the spatial region partition space. The empirical study conducted on the real-world datasets shows promising outcome compared with the state-of-the-art techniques.