Conference, International Asia-Pacific Web (2010)
Apr. 6, 2010 to Apr. 8, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/APWeb.2010.16
Given a set of lists, each sorted in ascending order of item value, the objective of this paper is to efficiently find the common items that appear in all of the lists. This problem is sometimes known as common items extraction from sorted lists. A common approach is to scan the items of all lists sequentially, in step with one another, until one of the lists is exhausted. However, we observe that when the overlap of items across the lists is not high, this sequential access approach can be significantly improved. In this paper, we propose two algorithms, MergeSkip and MergeESkip, that solve this problem by skipping as many list items as possible. As a result, a large number of comparisons among items can be saved, and hence efficiency improves. We conduct an extensive analysis of the proposed algorithms on one real dataset and two synthetic datasets with different data distributions, and report our findings in this paper.
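To make the contrast in the abstract concrete, the following is a minimal Python sketch of the two general strategies: a sequential scan that advances one item at a time, and a skipping variant that jumps ahead with binary search. It is not the paper's MergeSkip or MergeESkip, whose details the abstract does not give; the function names `intersect_scan` and `intersect_skip`, and the use of `bisect_left` as the skipping mechanism, are assumptions made purely for illustration. Each input list is assumed to be sorted in ascending order.

```python
from bisect import bisect_left


def intersect_scan(lists):
    """Baseline sketch: step through all lists one item at a time."""
    if not lists:
        return []
    pos = [0] * len(lists)
    common = []
    # Stop as soon as any list is exhausted.
    while all(p < len(lst) for p, lst in zip(pos, lists)):
        heads = [lst[p] for p, lst in zip(pos, lists)]
        target = max(heads)
        if all(h == target for h in heads):
            common.append(target)
            pos = [p + 1 for p in pos]
        else:
            # Advance every list whose head is too small by a single item.
            pos = [p + 1 if h < target else p for p, h in zip(pos, heads)]
    return common


def intersect_skip(lists):
    """Skipping sketch (illustrative, not the paper's algorithm)."""
    if not lists:
        return []
    pos = [0] * len(lists)
    common = []
    while all(p < len(lst) for p, lst in zip(pos, lists)):
        heads = [lst[p] for p, lst in zip(pos, lists)]
        target = max(heads)
        if all(h == target for h in heads):
            common.append(target)
            pos = [p + 1 for p in pos]
        else:
            # Jump each list directly to its first item >= target,
            # skipping everything smaller without comparing item by item.
            pos = [bisect_left(lst, target, p) for lst, p in zip(lists, pos)]
    return common


if __name__ == "__main__":
    data = [[1, 3, 5, 7, 9, 11], [3, 4, 5, 11], [0, 3, 5, 6, 11, 12]]
    print(intersect_scan(data))
    print(intersect_skip(data))
```

Both functions return the same result; the skipping variant only pays off when the lists share few items, which matches the low-overlap setting the abstract describes.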
W. Lu, X. Du, G. P. Fung, J. Chen, X. Zhou and C. Rong, "Efficient Common Items Extraction from Multiple Sorted Lists," Conference, International Asia-Pacific Web (APWEB), Busan, Korea, 2010, pp. 219-225.