N.C. Rowe and A. Zaky, "Load Balancing of Parallelized Information Filters," IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 2, pp. 456-461, March/April 2002.
DOI: http://doi.ieeecomputersociety.org/10.1109/69.991730
ISSN: 1041-4347
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Keywords: information filtering, data parallelism, load balancing, information retrieval, conjunctions, optimality, Monte Carlo methods
We investigate the data-parallel implementation of a set of information filters used to rule out uninteresting data from a database or data stream. We develop an analytic model for the costs and advantages of load rebalancing for the parallel filtering processes, as well as a quick heuristic for its desirability. Our model uses binomial models of the filter processes and fits key parameters to the results of extensive simulations. Experiments confirm our model. Rebalancing should pay off whenever processor communications costs are high. Further experiments show that it can also pay off with low communications costs for 16-64 processes and 1-10 data items per processor; in that regime, imbalances can increase processing time by up to 52 percent in representative cases, and rebalancing can increase it by 78 percent, so our quick predictive model can be valuable. Results also show that our proposed heuristic rebalancing criterion gives close to optimal balancing. We also extend our model to handle variations in filter processing time per data item.
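To make the notion of imbalance concrete, the following is a minimal Monte Carlo sketch in Python. It is not the authors' analytic model or simulation code. It assumes that each processor starts with the same number of items, that each item passes a filter independently with a fixed probability (so the survivors per processor are binomially distributed), that the next filter costs one time unit per item, and that communication costs are ignored. The function name `simulate_imbalance` and all of its parameters are hypothetical.

```python
import math
import random


def simulate_imbalance(num_procs, items_per_proc, pass_prob, trials=2000, seed=0):
    """Estimate the mean ratio of unbalanced to ideally balanced time
    for the filter stage that follows one binomial filtering step.

    Assumptions (illustrative only): unit cost per item, no communication
    cost, independent items with identical pass probability.
    """
    rng = random.Random(seed)
    ratio_sum = 0.0
    counted = 0
    for _trial in range(trials):
        # Survivors on each processor ~ Binomial(items_per_proc, pass_prob).
        survivors = []
        for _proc in range(num_procs):
            kept = sum(1 for _item in range(items_per_proc)
                       if rng.random() < pass_prob)
            survivors.append(kept)
        total = sum(survivors)
        if total == 0:
            continue  # nothing left to process in this trial
        # Without rebalancing, the busiest processor sets the stage time.
        unbalanced_time = max(survivors)
        # With perfect rebalancing, items are spread as evenly as possible.
        balanced_time = math.ceil(total / num_procs)
        ratio_sum += unbalanced_time / balanced_time
        counted += 1
    return ratio_sum / counted if counted else 1.0


if __name__ == "__main__":
    # Example: 16 processors, 4 items each, half of the items pass the filter.
    print(simulate_imbalance(16, 4, 0.5))
```

A ratio well above 1 suggests that rebalancing could shorten the next stage, provided the redistribution itself is cheap enough. Weighing that saving against communication cost is the decision the paper's model and heuristic address, and this sketch does not attempt it.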

