Issue No.04 - April (2013 vol.25)
pp: 877-892
Lidan Shou , Zhejiang University, Hangzhou
Xuan Shang , Zhejiang University, Hangzhou
Ke Chen , Zhejiang University, Hangzhou
Gang Chen , Zhejiang University, Hangzhou
Chao Zhang , Zhejiang University, Hangzhou
ABSTRACT
Time series is an important form of data available in numerous applications and often contains a vast amount of private personal information. The need to protect privacy in time-series data while effectively supporting complex queries on them poses nontrivial challenges to the database community. We study the anonymization of time series while trying to support complex queries, such as range and pattern matching queries, on the published data. The conventional k-anonymity model cannot effectively address this problem, as it may suffer severe pattern loss. We propose a novel anonymization model called (k, P)-anonymity for pattern-rich time series. This model publishes both the attribute values and the patterns of time series in separate data forms. We demonstrate that our model can prevent linkage attacks on the published data while effectively supporting a wide variety of queries on the anonymized data. We propose two algorithms to enforce (k, P)-anonymity on time-series data. Our anonymity model supports customized data publishing, which allows a certain part of the values but a different part of the pattern of the anonymized time series to be published simultaneously. We present estimation techniques to support query processing on such customized data. The proposed methods are evaluated in a comprehensive experimental study. Our results verify the effectiveness and efficiency of our approach.
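The abstract does not define (k, P)-anonymity formally, so the following is only a minimal sketch of the idea of publishing values and patterns in separate forms. It rests on loudly stated assumptions: values are generalized to a per-timestamp range over a group of at least k series, each series' pattern is encoded by a hypothetical up/down/flat string, and every pattern in a group must be shared by at least P series. The function names and the pattern encoding are illustrative and are not taken from the paper.

```python
from collections import Counter


def envelope(group):
    """Per-timestamp (min, max) range covering every series in the group."""
    return [(min(col), max(col)) for col in zip(*group)]


def pattern(series):
    """Hypothetical pattern encoding: 'u' (up), 'd' (down), 'f' (flat) per step."""
    steps = []
    for a, b in zip(series, series[1:]):
        steps.append("u" if b > a else "d" if b < a else "f")
    return "".join(steps)


def publish(group):
    """Publish values and patterns of one group in separate forms."""
    return {
        "envelope": envelope(group),                     # generalized values
        "patterns": Counter(pattern(s) for s in group),  # pattern -> count
    }


def satisfies(groups, k, p):
    """Assumed check: each group has >= k series, each pattern >= p holders."""
    for g in groups:
        counts = Counter(pattern(s) for s in g)
        if len(g) < k or min(counts.values()) < p:
            return False
    return True


# Four equal-length series: two rising, two falling.
group = [[1, 2, 3], [2, 3, 5], [4, 3, 1], [5, 4, 2]]
published = publish(group)
```

Under these assumptions the value envelope alone hides whether a series rises or falls, which illustrates the pattern loss the abstract attributes to conventional k-anonymity; the separately published pattern counts retain that information for pattern matching queries.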
INDEX TERMS
Couplings, Databases, Publishing, Pattern matching, Data models, Data privacy, Correlation, time series, Privacy, anonymity, pattern
CITATION
Lidan Shou, Xuan Shang, Ke Chen, Gang Chen, Chao Zhang, "Supporting Pattern-Preserving Anonymization for Time-Series Data", IEEE Transactions on Knowledge & Data Engineering, vol.25, no. 4, pp. 877-892, April 2013, doi:10.1109/TKDE.2011.249