Department of Mathematics
Traffic data is exceedingly useful for road network management, but it is typically massive in size and full of errors, noise and abnormal traffic behaviors, which are regarded as outliers because they are inconsistent with the rest of the data. Hence the outlier detection (OD) problem is non-trivial. A novel method is presented for detecting outliers in large-scale traffic data by modeling it with a Dirichlet Process Mixture Model (DPMM). In essence, input traffic signals are truncated and mapped to a covariance signal descriptor, whose dimension is then reduced by Principal Component Analysis (PCA). The reduced signal vector is then modeled by a DPMM. Because traffic signals generally share strong spatial-temporal similarities, both within signals and among various categories of traffic signals, classical OD methods are unable to distinguish these similarities or to discern their differences. The contribution of this paper is a generic DPMM-based method that represents real-world traffic data (764,027 vehicles) and performs unsupervised OD, achieving a detection rate of 96.67% in a 10-fold cross validation.
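The pipeline outlined in the abstract (covariance descriptor, PCA, DPMM) can be illustrated with a minimal sketch. This is not the authors' implementation: the sliding-window descriptor layout, the use of scikit-learn's `BayesianGaussianMixture` (a truncated variational approximation to a DPMM), the low-weight-cluster outlier rule, and all names and parameter values (`covariance_descriptor`, `detect_outliers`, `window`, `min_weight`, and so on) are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import BayesianGaussianMixture


def covariance_descriptor(signal, window=8):
    """Map a 1-D signal to the upper triangle of a covariance matrix.

    Assumption: features are overlapping windows of raw signal values;
    the paper's actual descriptor may be built differently.
    """
    signal = np.asarray(signal, dtype=float)
    # Each row is one overlapping window of the signal.
    windows = np.lib.stride_tricks.sliding_window_view(signal, window)
    cov = np.cov(windows, rowvar=False)  # shape: (window, window)
    # The matrix is symmetric, so the upper triangle carries all of it.
    return cov[np.triu_indices(window)]


def detect_outliers(signals, n_components=3, max_clusters=10,
                    min_weight=0.1, seed=0):
    """Return a boolean array marking signals assigned to small clusters.

    Assumption: a signal is an outlier if its cluster's mixture weight
    is below `min_weight`. This rule is illustrative only.
    """
    features = np.array([covariance_descriptor(s) for s in signals])
    # Reduce the descriptor dimension with PCA.
    reduced = PCA(n_components=n_components,
                  random_state=seed).fit_transform(features)
    # Truncated variational approximation to a Dirichlet process mixture.
    dpmm = BayesianGaussianMixture(
        n_components=max_clusters,
        weight_concentration_prior_type="dirichlet_process",
        max_iter=500,
        random_state=seed,
    ).fit(reduced)
    labels = dpmm.predict(reduced)
    return dpmm.weights_[labels] < min_weight
```

Calling `detect_outliers(signals)` on a list of equal-length 1-D arrays returns one flag per signal. The upper bound `max_clusters` only truncates the mixture; the Dirichlet process prior lets the fit leave unneeded components with near-zero weight, which is why no cluster count has to be fixed in advance.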
Dirichlet process mixture models, outlier detection, unsupervised learning, traffic flow analysis
Source Publication Title
IET Intelligent Transport Systems
IET (Institution of Engineering and Technology)
© The Institution of Engineering and Technology 2015
Three grants have supported this research: the HKSAR Research Grant Council, China: Project HKU754109 and Project HKBU12201814, and the HKBU FRG: FRG1/12-13/075.
Ngan, Henry Y. T., Nelson H. C. Yung, and Anthony G. O. Yeh. "Outlier detection in traffic data based on the Dirichlet process mixture model." IET Intelligent Transport Systems 9.7 (2015): 773-781.