2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)
Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
ISBN: 978-1-4244-5445-7
pp: 1025-1036
Girish Venkatachaliah, IBM India Software Lab, Bangalore, India
L Venkata Subramaniam, IBM India Research Lab, New Delhi, India
Shrinivas Kulkarni, IBM India Software Lab, Bangalore, India
Tanveer A Faruquie, IBM India Research Lab, New Delhi, India
Mukesh Mohania, IBM India Research Lab, New Delhi, India
K Hima Prasad, IBM India Research Lab, New Delhi, India
Pramit Basu, IBM India Software Lab, Bangalore, India
ABSTRACT
Enterprises often have a transient need for data cleansing, which can be met by offering data cleansing as a transient service. Every time a data cleansing need arises, it should be possible to provision hardware, software and staff to accomplish the task and then dismantle the setup. In this paper we present such a system, which uses virtualized hardware and software for data cleansing, and we share actual experiences gained from building it. We use a cloud infrastructure to offer virtualized data cleansing instances that can be accessed as a service. The system is scalable, elastic and configurable. Each enterprise has unique needs, which makes it necessary to customize both the infrastructure and the cleansing algorithms. The system we present is easily configurable to suit the data cleansing needs of an enterprise.
CITATION
Girish Venkatachaliah, L Venkata Subramaniam, Shrinivas Kulkarni, Tanveer A Faruquie, Mukesh Mohania, K Hima Prasad, Pramit Basu, "Data cleansing as a transient service", 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), pp. 1025-1036, 2010, doi:10.1109/ICDE.2010.5447789