The sqlLoader Data-Loading Pipeline
January/February 2008 (vol. 10 no. 1)
pp. 38-48
Alex Szalay, Johns Hopkins University
Ani R. Thakar, Johns Hopkins University
Jim Gray, Microsoft Research
Using a database management system (DBMS) is essential for ensuring the integrity and reliability of large, multidimensional data sets. However, loading multiterabyte data sets into a DBMS is a time-consuming and error-prone task, which the authors have tried to automate by developing the sqlLoader pipeline, a distributed workflow system for data loading.

Index Terms:
Sloan Digital Sky Survey Science Archive, SDSS, astronomy, large-scale databases, database management systems
Alex Szalay, Ani R. Thakar, Jim Gray, "The sqlLoader Data-Loading Pipeline," Computing in Science and Engineering, vol. 10, no. 1, pp. 38-48, Jan.-Feb. 2008, doi:10.1109/MCSE.2008.18
