2013 IEEE 16th International Conference on Computational Science and Engineering (2013)
Sydney, Australia
Dec. 3, 2013 to Dec. 5, 2013
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSE.2013.200
Four main challenges (volume, velocity, variety, veracity) confront designers of computational algorithms for big data mining. Homomorphic cryptosystems combined with secure multi-party computation of matrix operations have been shown to preserve privacy well while data miners retrieve information from big data. This research concerns the computational complexity of big data mining, with a specific focus on reducing the computational load while preserving data privacy. We propose the Teo-Han-Lee (THL) algorithm, which uses various matrix operations to reduce the cryptographic cost significantly by eliminating at least one-third of the total computational operations. In THL, we also propose a pre-generated random key technique that decreases computation time: the random keys are retrieved from memory rather than generated on the fly. We further develop a collusion-resistant secure sum product protocol (CRSSPP), which is integrated into the THL algorithm and operates over arbitrarily partitioned data. Experimental results demonstrate that the THL-CRSSPP algorithm is more efficient than the SVM method of Vaidya et al. (a state-of-the-art SVM method) and hence is more applicable to cloud-based big data mining. The THL-CRSSPP algorithm can also be integrated into Hadoop Mahout with minimal effort.
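The two ideas named in the abstract, a secure sum over several parties and random values drawn from a pre-generated pool, can be illustrated with a generic additive secret-sharing sketch. This is not the THL or CRSSPP protocol, whose details the abstract does not give. The modulus, the function names (`pregenerate_pool`, `split_into_shares`, `secure_sum`), and the single shared pool used to simulate all parties in one process are assumptions made for illustration only.

```python
import secrets

# Assumed public modulus; all share arithmetic is done modulo this value.
MODULUS = 2**61 - 1


def pregenerate_pool(size):
    """Create random masks ahead of time so later steps only read from memory."""
    return [secrets.randbelow(MODULUS) for _ in range(size)]


def split_into_shares(value, n_parties, pool):
    """Split value into n_parties additive shares that sum to value mod MODULUS."""
    # Take n_parties - 1 masks from the pre-generated pool instead of
    # generating fresh randomness at this point.
    masks = [pool.pop() for _ in range(n_parties - 1)]
    last = (value - sum(masks)) % MODULUS
    return masks + [last]


def secure_sum(private_values):
    """Simulate a secure sum: only sums of shares are combined, never raw inputs."""
    n = len(private_values)
    # One pool for the whole simulation; in a real deployment each party
    # would hold its own pool.
    pool = pregenerate_pool(n * (n - 1))
    # shares[i][j] is the share that party i sends to party j.
    shares = [split_into_shares(v, n, pool) for v in private_values]
    # Party j adds up the shares it received and publishes only that partial sum.
    partials = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partials) % MODULUS
```

Calling `secure_sum([3, 5, 7])` recovers the total of the three private inputs, provided the true sum is smaller than the modulus. In this simple scheme an individual input stays hidden unless all other parties pool their information.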
Support vector machines, Data privacy, Protocols, Kernel, Encryption
S. G. Teo, S. Han and V. C. Lee, "Privacy Preserving Support Vector Machine Using Non-linear Kernels on Hadoop Mahout," 2013 IEEE 16th International Conference on Computational Science and Engineering (CSE), Sydney, Australia, 2013, pp. 941-948.