Issue No. 01 - January/February (2008 vol. 10)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MCSE.2008.18
Alex Szalay, Johns Hopkins University
Ani R. Thakar, Johns Hopkins University
Jim Gray, Microsoft Research
Using a database management system (DBMS) is essential to ensure the data integrity and reliability of large, multidimensional data sets. However, loading multiterabyte data into a DBMS is a time-consuming and error-prone task that the authors have tried to automate by developing the sqlLoader pipeline—a distributed workflow system for data loading.
Sloan Digital Sky Survey Science Archive, SDSS, astronomy, large-scale databases, database management systems
A. R. Thakar, J. Gray and A. Szalay, "The sqlLoader Data-Loading Pipeline," in Computing in Science & Engineering, vol. 10, no. 1, pp. 38-48, 2008.