Issue No.06 - June (2010 vol.22)
pp: 884-899
Shuguo Han, Nanyang Technological University, Singapore
Wee Keong Ng, Nanyang Technological University, Singapore
Li Wan, Nanyang Technological University, Singapore
Vincent C.S. Lee, Monash University, Victoria
ABSTRACT
Gradient descent is a widely used paradigm for solving many optimization problems. It iteratively minimizes a target function until it reaches a local minimum. In machine learning and data mining, this function corresponds to the decision model that is to be discovered. In this paper, we propose a preliminary formulation of gradient descent with data privacy preservation. We present two approaches, the stochastic approach and the least square approach, under different assumptions. Four protocols are proposed for the two approaches, incorporating various secure building blocks for both horizontally and vertically partitioned data. We conduct experiments to evaluate the scalability of the proposed secure building blocks and the accuracy and efficiency of the protocols in four different scenarios. The experimental results show that the proposed secure building blocks are reasonably scalable and that the comparison of the proposed protocols allows us to determine the better secure protocol for the applications in each scenario.
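To make the setting concrete, the following is a minimal Python sketch of gradient descent on a least-squares objective where the records are horizontally partitioned between two parties. It is an illustration under stated assumptions, not the protocols proposed in the paper: the names `local_gradient`, `masked_sum`, and `fit`, the toy data, the step size, and the iteration count are all hypothetical, and `masked_sum` is only a toy stand-in for a secure sum (a single random mask, no cryptography, no security guarantee).

```python
import random


def local_gradient(rows, targets, w):
    """Sum of least-squares gradients (w.x - y) * x over one party's own records."""
    grad = [0.0] * len(w)
    for x, y in zip(rows, targets):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    return grad


def masked_sum(vectors, rng):
    """Toy stand-in for a secure sum (hypothetical, NOT the paper's building block).

    The first party hides its vector behind a random mask, every other party
    adds its own vector to the running total, and the first party finally
    removes the mask, so only the aggregate is ever seen unmasked.
    """
    mask = [rng.uniform(-1e3, 1e3) for _ in vectors[0]]
    running = [m + v for m, v in zip(mask, vectors[0])]
    for vec in vectors[1:]:
        running = [r + v for r, v in zip(running, vec)]
    return [r - m for r, m in zip(running, mask)]


def fit(parties, dim, lr=0.05, steps=2000, seed=0):
    """Batch gradient descent using only the aggregated gradient of all parties."""
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(steps):
        grads = [local_gradient(rows, targets, w) for rows, targets in parties]
        total = masked_sum(grads, rng)
        w = [wi - lr * g for wi, g in zip(w, total)]
    return w


# Toy data generated from y = 1 + 2 * x; the first feature is a constant bias term.
# Each party holds different records over the same attributes.
party_a = ([[1.0, 0.0], [1.0, 1.0]], [1.0, 3.0])
party_b = ([[1.0, 2.0], [1.0, 3.0]], [5.0, 7.0])

w = fit([party_a, party_b], dim=2)
print(w)
```

In this sketch each party touches only its own rows, and the weight update depends only on the combined gradient. In a vertically partitioned setting the parties would instead hold different attributes of the same records, so the inner product `w.x` itself would have to be computed jointly.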
INDEX TERMS
Privacy-preserving data mining, gradient-descent method, secure multiparty computation, stochastic approach, least square approach.
CITATION
Shuguo Han, Wee Keong Ng, Li Wan, Vincent C.S. Lee, "Privacy-Preserving Gradient-Descent Methods", IEEE Transactions on Knowledge & Data Engineering, vol.22, no. 6, pp. 884-899, June 2010, doi:10.1109/TKDE.2009.153