2012 7th Open Cirrus Summit (2011)
Atlanta, Georgia USA
Oct. 12, 2011 to Oct. 13, 2011
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/OCS.2011.8
BookPrep is a Print-On-Demand service that takes raw scans and converts them to print-ready files. It requires a large amount of storage and takes an average of 5 hours of CPU time to process a single book of about 300 pages. The experiment we conducted involved moving the processing of books on Open Cirrus closer to the location of the data. We installed the BookPrep service at three Open Cirrus sites and pre-populated each site with region-specific scanned books. When requests come in to process a book, each request is routed to the compute node closest to the source data. The compute node is then expected to store the processed data on the same network. Compute nodes are allocated and deallocated based on demand. A cloud-based metadata repository is used to update the metadata associated with each book, regardless of the location of the source and derived data. The goal of this experiment is to determine whether performance can be improved by moving book processing closer to the location of the source data. The fundamental reason behind the success of MapReduce is the notion of moving compute close to data, and we would like to see whether that same principle can be applied to a pull-based scheduling model.
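The routing idea described above can be illustrated with a minimal sketch. It is not the BookPrep implementation: the site names, book identifiers, and the `LocalityRouter` class are all hypothetical, and the sketch assumes that each book's source scans live at exactly one site and that workers at a site pull work only from that site's queue.

```python
from collections import defaultdict, deque

# Hypothetical placement of scanned books; the source does not name the three sites.
BOOK_LOCATION = {
    "book-001": "site-a",
    "book-002": "site-b",
    "book-003": "site-c",
}


class LocalityRouter:
    """Queues each request at the site that already holds the book's scans."""

    def __init__(self, book_location):
        self.book_location = dict(book_location)
        # One FIFO queue per site; created lazily on first use.
        self.queues = defaultdict(deque)

    def submit(self, book_id):
        """Route a processing request to the site storing the source data."""
        site = self.book_location[book_id]
        self.queues[site].append(book_id)
        return site

    def pull(self, site):
        """Called by a worker at `site`: take the next local job, or None if idle."""
        queue = self.queues[site]
        return queue.popleft() if queue else None


router = LocalityRouter(BOOK_LOCATION)
router.submit("book-002")       # queued at site-b, where its scans are stored
job = router.pull("site-b")     # a site-b worker picks it up
idle = router.pull("site-a")    # site-a has no local work
```

In this sketch a worker never fetches scans across sites, which is the property the experiment sets out to evaluate. Demand-based allocation of compute nodes and the shared metadata repository are left out.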
distribution, Clouds, Web services, Imaging and printing
P. Reddy, S. Puthanveedu, D. Milojicic and S. Dudekula, "Globally Distributed BookPrep - Open Cirrus-Hosted Service for Book Preparation," 2012 7th Open Cirrus Summit (OCS), Atlanta, Georgia USA, 2011, pp. 11-16.