Extracting interrelated information from road-related social media data

Authors: Zhou, Shenghua; Ng, S. Thomas; Huang, Guanying; Dao, Jicao; Li, Dezhi*
Source: ADVANCED ENGINEERING INFORMATICS, 2022, 54: 101780.
DOI: 10.1016/j.aei.2022.101780

Abstract

Social media data (SMD) have been viewed as a promising source of information on road conditions. However, most existing SMD-based sensing approaches (SMDSAs) either ignore interrelations among information items (e.g., name, direction, and status of the road) or rely on rigid grammar rules to establish entities' interrelations. Additionally, current SMDSAs in the transportation domain are unable to link the extracted text-formatted information with domain-specific models (e.g., virtual road model, VRM). To fill these gaps, this work proposes an improved SMDSA of road conditions, which involves a three-stage (i.e., SMD classification, relation inference, and entity pair recognition) interrelated information extraction model, as well as a semantic converter to feed the SMD-provided text-formatted information into VRMs. The proposed SMDSA is demonstrated on newly annotated datasets of tweets from Lexington, USA. The three-stage interrelated information extraction model outperforms conventional rule-based methods and deep-learning algorithms (e.g., Text CNN, Bi-LSTM, Piecewise CNN, and Capsule Net). The SMD-enabled VRM also preliminarily shows its capacity to optimize signal timings during incidents that change the road network topology. This work contributes to circumventing the reliance on human-made rules during SMDSAs' development, bridging user-generated SMD with operable VRMs for potential real-world road management, and providing a standard tweet dataset annotated with interrelation triplets to help promote SMDSA studies.