Exploring the Uncertainty of Activity Zone Detection Using Digital Footprints with Multi-Scaled DBSCAN


When exploring mobility patterns based on digital footprints captured from social networks, the density-based spatial clustering of applications with noise (DBSCAN) method is often used to identify the activity zones that an individual regularly visits. However, DBSCAN is sensitive to its two parameters: the search radius of a cluster (eps) and the minimum number of points (minpts). This paper first discusses the uncertainty in detecting an individual’s activity zones from digital footprints. An improved density-based clustering algorithm for mobility analysis, Multi-Scaled DBSCAN (M-DBSCAN), is then presented to mitigate the detection uncertainty of clusters produced by DBSCAN at different scales of density and cluster size. Next, we demonstrate that M-DBSCAN iteratively calibrates suitable local eps and minpts values for detecting clusters of varying densities, instead of using one global parameter setting as DBSCAN does, and that it proves very effective for detecting potential activity zones (clusters) from the historic geo-tagged tweets of selected users. In addition, M-DBSCAN can significantly reduce the noise ratio (the proportion of trajectory points not included in any cluster) by identifying all points that capture the activities performed in each zone. Using the historic geo-tagged tweets of a large number of users in Madison, Wisconsin and Washington, D.C., the results of M-DBSCAN and DBSCAN with a minpts value of 4 and varying eps values reveal that: 1) M-DBSCAN can capture dispersed clusters with a low density of points, thereby detecting more activity zones for each user and yielding a lower noise ratio; 2) a value of 40 m or higher should be used for eps in order to reduce the possibility of collapsing distinctive activity zones and to ensure a relatively low noise ratio during the clustering process; and 3) a value between 200 m and 300 m is recommended for eps when using DBSCAN to detect activity zones.
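
To make the parameter sensitivity concrete, the following is a minimal sketch of plain DBSCAN with a single global eps (in meters) and minpts = 4, together with the noise ratio as defined above. It is not the M-DBSCAN algorithm, whose calibration procedure is not specified in this abstract. The function names (`haversine_m`, `dbscan`, `noise_ratio`) and the coordinates are hypothetical and chosen only for illustration; the points are synthetic, not tweets from the study.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in meters
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def dbscan(points, eps, minpts):
    """Plain DBSCAN with one global eps (meters) and minpts; returns a label per point."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force neighborhood query; the point itself counts as a neighbor.
        return [j for j in range(len(points))
                if haversine_m(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < minpts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= minpts:  # j is a core point, so keep expanding
                queue.extend(reachable)
    return labels


def noise_ratio(labels):
    """Proportion of points not included in any cluster."""
    return sum(1 for label in labels if label == NOISE) / len(labels)


# Synthetic points (assumed for illustration): a tight group a few meters apart,
# a dispersed group spaced roughly 100 m apart, and one isolated point.
tight = [(43.07310, -89.40120), (43.07315, -89.40120), (43.07320, -89.40120),
         (43.07310, -89.40125), (43.07315, -89.40125)]
dispersed = [(43.0900, -89.4012), (43.0909, -89.4012),
             (43.0918, -89.4012), (43.0927, -89.4012)]
isolated = [(43.0500, -89.4500)]
points = tight + dispersed + isolated

labels_40 = dbscan(points, eps=40, minpts=4)
labels_250 = dbscan(points, eps=250, minpts=4)
print("eps=40 m :", labels_40, "noise ratio =", noise_ratio(labels_40))
print("eps=250 m:", labels_250, "noise ratio =", noise_ratio(labels_250))
```

On this synthetic data, the dispersed group is treated entirely as noise at eps = 40 m and only forms a cluster at eps = 250 m, which illustrates why a single global setting can miss low-density activity zones. The brute-force neighbor search is quadratic in the number of points and is meant for small per-user point sets only.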