dc.contributor.author: Rudra, Amit
dc.contributor.author: Gopalan, Raj
dc.contributor.author: Achuthan, Narasimaha
dc.contributor.editor: José Cordeiro
dc.contributor.editor: Leszek Maciaszek
dc.contributor.editor: Alfredo Cuzzocrea
dc.date.accessioned: 2017-01-30T13:06:25Z
dc.date.available: 2017-01-30T13:06:25Z
dc.date.created: 2014-02-03T20:02:09Z
dc.date.issued: 2012
dc.identifier.citation: Rudra, Amit and Gopalan, Raj P. and Achuthan, N.R. 2012. An efficient sampling scheme for approximate processing of decision support queries, in Cordeiro, J., Maciaszek, L., Cuzzocrea, A. (ed), 14th International Conference on Enterprise Information Systems, Jun 28 2012, pp. 16-26. Wroclaw, Poland: INSTICC.
dc.identifier.uri: http://hdl.handle.net/20.500.11937/28648
dc.description.abstract:

Decision support queries usually involve accessing enormous amounts of data, which requires significant retrieval time. Faster retrieval of query results can often save precious time for the decision maker. Pre-computation of materialised views and sampling are two ways of achieving a significant speed-up. However, drawing random samples for queries on range-restricted attributes has two problems: small random samples may miss relevant records, and drawing larger samples from disk can be inefficient because of the large number of disk accesses required. In this paper, we propose an efficient indexing scheme for quickly drawing relevant samples for data warehouse queries, and we introduce the concepts of database and sample relevancy ratios. We describe a method for estimating the results of range-restricted queries using this index and experimentally evaluate the scheme on a relatively large real dataset. Further, we compute confidence intervals for the estimates to investigate whether the results can be guaranteed to be within the desired level of confidence. Our experiments on data from a retail data warehouse show promising results. We also report the levels of accuracy achieved for various types of aggregate queries and relate them to the database relevancy ratios of the queries.

dc.publisher: INSTICC
dc.subject: Data Warehousing
dc.subject: Approximate Query Processing
dc.subject: Sampling
dc.title: An efficient sampling scheme for approximate processing of decision support queries
dc.type: Conference Paper
dcterms.source.startPage: 16
dcterms.source.endPage: 26
dcterms.source.title: Proceedings of ICEIS
dcterms.source.series: Proceedings of ICEIS
dcterms.source.conference: 14th International Conference on Enterprise Information Systems
dcterms.source.conference-start-date: Jun 28 2012
dcterms.source.conferencelocation: Wroclaw, Poland
dcterms.source.place: Portugal
curtin.note:

Publisher: SciTePress. (2012). ISBN: 9789898565105

curtin.department:
curtin.accessStatus: Open access

