Abstract

Prediction on large-scale multidimensional spatio-temporal data is widely used in applications such as business analytics and open source metrics, but the service data collected from these applications often contain missing values. Accurate prediction under these conditions requires a model that is both robust and scalable, properties that traditional prediction models typically lack. To this end, we propose a back temporal autoregressive matrix factorization (BTAMF) framework, which constructs an efficient prediction model for time series data with missing values by mining the trend and periodic characteristics of the data. It first uses the target data and its left neighbors to construct a trend autoregressive model, and then uses the target data and its associated periodic data to construct a periodic autoregressive model; together these capture the trend and periodicity of the time series. We then integrate tensor/matrix factorization and both autoregressive models into a single optimization model, so that missing value completion and prediction are obtained through optimization of a joint objective. Because this scheme exploits both global and local information of the target data, it can predict directly without pre-completing the missing values, which helps preserve the temporal integrity of the time series. To validate the completion and prediction performance of BTAMF in real-world scenarios, we run experiments on ten real-world behavioral datasets from GitHub against seven comparison models. The results show that BTAMF outperforms current state-of-the-art methods in both scalability and prediction accuracy.
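The abstract does not give the objective itself, so the following is only a minimal sketch of the general idea, assuming a squared-error loss over observed entries plus two autoregressive penalties on the temporal factors. The function name `masked_ar_loss`, the factor matrices `W` and `X`, the weight matrices `theta_trend` and `theta_period`, and the single regularization weight `lam` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def masked_ar_loss(Y, mask, W, X, theta_trend, theta_period, period, lam=1.0):
    """Illustrative loss: masked factorization error + trend and periodic AR penalties.

    Y: (n, T) observations, mask: (n, T) boolean, True where observed.
    W: (n, r) series factors, X: (r, T) temporal factors.
    theta_trend: (r, d) weights on lags 1..d (the "left neighbors").
    theta_period: (r, m) weights on lags period, 2*period, ..., m*period.
    Assumes d >= 1, m >= 1 and T > max(d, m * period).
    """
    # Reconstruction error on observed entries only; missing entries contribute 0,
    # so no pre-completion of Y is needed.
    resid = np.where(mask, Y - W @ X, 0.0)
    fit = np.sum(resid ** 2)

    T = X.shape[1]
    d = theta_trend.shape[1]
    m = theta_period.shape[1]
    start = max(d, m * period)  # first time step with every required lag available

    # Trend prediction from the d immediately preceding columns of X.
    trend_pred = sum(
        theta_trend[:, [k]] * X[:, start - (k + 1): T - (k + 1)]
        for k in range(d)
    )
    # Periodic prediction from the columns one or more whole periods back.
    period_pred = sum(
        theta_period[:, [k]] * X[:, start - (k + 1) * period: T - (k + 1) * period]
        for k in range(m)
    )

    target = X[:, start:]
    trend_pen = np.sum((target - trend_pred) ** 2)
    period_pen = np.sum((target - period_pred) ** 2)
    return fit + lam * (trend_pen + period_pen)
```

In this sketch the mask is what lets a single objective handle completion and prediction together: missing cells are simply excluded from the fit term, while the two penalties tie each temporal factor column to its recent and periodic predecessors. A solver (for example alternating least squares or gradient descent) would minimize this quantity over `W`, `X` and the two weight matrices; that part is omitted here.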

  • Affiliation
    Soochow University

Full text