Policies for Caching OLAP Queries in Internet Proxies
October 2006 (vol. 17 no. 10)
pp. 1124-1135

Abstract—The Internet now offers more than just simple information to users. Decision makers can now issue analytical, as opposed to transactional, queries that involve massive data (such as aggregations of millions of rows in a relational database) in order to identify useful trends and patterns. Such queries are often referred to as On-Line Analytical Processing (OLAP). Typically, pages carrying query results do not exhibit temporal locality and, therefore, are not considered for caching at Internet proxies. In OLAP processing, this is a major problem, as the cost of these queries is significantly larger than that of transactional queries. This paper proposes a technique to reduce the response time for OLAP queries originating from geographically distributed private LANs and issued through the Web toward the central data warehouse (DW) of an enterprise. An active caching scheme is introduced that enables the LAN proxies to cache some parts of the data, together with the semantics of the DW, in order to process queries and construct the resulting pages. OLAP queries arriving at the proxy are satisfied either locally or from the DW, depending on the relative access costs. We formulate a cost model for characterizing the respective latencies, taking into consideration the combined effects of both common Web access and query processing. We propose a cache admittance and replacement algorithm that operates on a hybrid Web-OLAP input, outperforming both pure-Web and pure-OLAP caching schemes.
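
The routing decision described in the abstract (answer at the proxy or forward to the DW, whichever is estimated to be cheaper) can be illustrated with a minimal Python sketch. This is not the paper's cost model: the names `QueryCosts` and `choose_source`, and the assumption that the remote cost is simply network time plus DW processing time, are hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryCosts:
    """Hypothetical latency estimates (in seconds) for one OLAP query."""
    # Time to compute the answer from data cached at the proxy;
    # None means the cached data cannot answer the query at all.
    local_processing: Optional[float]
    # Time to ship the request and the result page over the Web (assumed).
    network_transfer: float
    # Time for the data warehouse to process the query (assumed).
    dw_processing: float


def choose_source(costs: QueryCosts) -> str:
    """Return "proxy" if answering locally is estimated to be no slower
    than going to the data warehouse, otherwise "warehouse"."""
    remote = costs.network_transfer + costs.dw_processing
    if costs.local_processing is not None and costs.local_processing <= remote:
        return "proxy"
    return "warehouse"
```

For example, a query that takes an estimated 0.8 s to answer from cached data, against 0.5 s of network transfer plus 1.0 s of DW processing, would be served at the proxy under these assumptions; a query the cache cannot answer always goes to the warehouse.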

Index Terms:
Distributed systems, data communication aspects, Internet applications databases, Web caching, OLAP.
Thanasis Loukopoulos, Ishfaq Ahmad, "Policies for Caching OLAP Queries in Internet Proxies," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 10, pp. 1124-1135, Oct. 2006, doi:10.1109/TPDS.2006.143