The impact of spatial data redundancy on SOLAP query performance

Abstract

Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making processes and spatial analysis, and the literature proposes several conceptual and logical data models for GDW. However, little effort has been devoted to studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over GDW. In this paper, we investigate this issue. First, we compare redundant and non-redundant GDW schemas and conclude that redundancy is associated with high performance losses. We then analyze indexing as a means of improving SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by R-tree and the star-join aided by GiST indicate that the SB-index reduces the elapsed time in query processing by 25% to 99% for SOLAP queries defined over the spatial predicates of intersection, enclosure and containment and applied to roll-up and drill-down operations. We also investigate the impact of increasing data volume on performance. The increase did not impair the SB-index, which still greatly reduced the elapsed time in query processing. Performance tests also show that the SB-index is far more compact than the star-join, requiring at most 0.20% of its storage volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy. This enhancement improved performance by 80% to 91% for redundant GDW schemas.
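As an illustration of the three spatial predicates named above, the following minimal Python sketch filters a small array of (key, bounding box) entries against a rectangular query window. It is not the SB-index implementation: the names `MBR`, `filter_keys` and `PREDICATES` are hypothetical, the predicate semantics (containment means the object lies inside the window, enclosure means the object surrounds the window) are an assumption, and refinement against exact geometries is omitted.

```python
from typing import NamedTuple


class MBR(NamedTuple):
    """Minimum bounding rectangle (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: MBR, b: MBR) -> bool:
    # Two rectangles intersect when they overlap on both axes.
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def contains(outer: MBR, inner: MBR) -> bool:
    # 'outer' contains 'inner' when inner lies fully within outer's bounds.
    return (outer.xmin <= inner.xmin and outer.ymin <= inner.ymin
            and inner.xmax <= outer.xmax and inner.ymax <= outer.ymax)


# Assumed semantics: each predicate relates an object's MBR to the query window.
PREDICATES = {
    "intersection": lambda obj, window: intersects(obj, window),
    "containment": lambda obj, window: contains(window, obj),
    "enclosure": lambda obj, window: contains(obj, window),
}


def filter_keys(entries, window, predicate):
    """Sequentially scan (key, MBR) entries and keep the keys whose MBR
    satisfies the chosen predicate against the query window."""
    test = PREDICATES[predicate]
    return [key for key, mbr in entries if test(mbr, window)]


entries = [
    (1, MBR(0, 0, 2, 2)),
    (2, MBR(1, 1, 9, 9)),
    (3, MBR(20, 20, 30, 30)),
]

print(filter_keys(entries, MBR(0, 0, 5, 5), "intersection"))  # [1, 2]
print(filter_keys(entries, MBR(0, 0, 5, 5), "containment"))   # [1]
print(filter_keys(entries, MBR(2, 2, 3, 3), "enclosure"))     # [2]
```

In a SOLAP setting, the keys that survive such a filter would typically be turned into a conventional predicate on the fact table; that step is outside this sketch.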

Additional information

A previous version of this paper appeared at GEOINFO 2008 (X Brazilian Symposium on Geoinformatics).

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Siqueira, T.L.L., Ciferri, C.D.d.A., Times, V.C. et al. The impact of spatial data redundancy on SOLAP query performance. J Braz Comp Soc 15, 19–34 (2009). https://doi.org/10.1007/BF03194499

Keywords

  • geographic data warehouse
  • index structure
  • SOLAP query performance
  • spatial data redundancy