Approximate Query Processing on High Dimensionality Database Tables Using Multidimensional Cluster Sampling View

Inoue, T.; Krishna, Aneesh; Gopalan, Raj

doi:10.17706/jsw.11.1.80-93

Access Status

Fulltext not available

Authors

Inoue, T.

Krishna, Aneesh

Gopalan, Raj

Date

2016

Type

Journal Article

Citation

Inoue, T. and Krishna, A. and Gopalan, R. 2016. Approximate Query Processing on High Dimensionality Database Tables Using Multidimensional Cluster Sampling View. JSW. 11 (1): pp. 80-93.

Source Title

JSW

DOI

10.17706/jsw.11.1.80-93

ISSN

1796-217X

Faculty

Faculty of Science and Engineering

School

Department of Computing

URI

http://hdl.handle.net/20.500.11937/25472

Collection

Curtin Research Publications

Abstract

Approximate query processing based on random sampling is one of the most useful methods for efficient computation over large quantities of data kept in databases. However, small samples obtained through random sampling might lack data relevant to the query conditions, because such samples do not adequately represent the entire dataset. The Multidimensional Cluster Sampling View has been proposed to support efficient and effective approximate query processing on common database tables. This view allows random sample records to be drawn from a database efficiently and effectively using SQL. So far, the effectiveness of approximate query processing with this view has been demonstrated only on a large database table with four dimensions, whereas tables in decision support systems most commonly have more than ten. Further examination and evaluation focusing on dimensionality, using data with ten or more dimensions, are therefore required to demonstrate its practicality. This paper evaluates whether the number of dimensions has an impact on the accuracy of the approximation and on the performance of the Multidimensional Cluster Sampling View. The results of the evaluation show no visible effect of dimensionality.
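As a rough illustration of the sampling idea in the abstract, the sketch below estimates a filtered SUM from a uniform random sample and scales it up to the full table. It is plain uniform sampling, not the Multidimensional Cluster Sampling View itself, and the table layout, the column names (`region`, `amount`), and the function `approximate_sum` are all hypothetical. It also returns how many sampled rows matched the query condition, since a small sample may contain few or none of them.

```python
import random


def approximate_sum(rows, predicate, value, sample_size, rng):
    """Estimate SUM(value) over rows matching predicate from a uniform sample.

    Returns (estimate, number_of_matching_sample_rows).
    """
    sample = rng.sample(rows, sample_size)  # sampling without replacement
    matching = [value(r) for r in sample if predicate(r)]
    scale = len(rows) / sample_size  # inverse of the sampling fraction
    return scale * sum(matching), len(matching)


# Hypothetical table: 100,000 rows, 100 regions, every amount equal to 1.0.
rows = [{"region": i % 100, "amount": 1.0} for i in range(100_000)]

# Exact answer for comparison (a full scan of the table).
exact = sum(r["amount"] for r in rows if r["region"] == 7)

# Approximate answer from a 1% sample; only about ten sampled rows are
# expected to satisfy the condition, so the estimate can be far off.
rng = random.Random(0)
estimate, hits = approximate_sum(
    rows,
    lambda r: r["region"] == 7,
    lambda r: r["amount"],
    1_000,
    rng,
)
print(f"exact={exact}, estimate={estimate}, matching sample rows={hits}")
```

The number of matching sample rows gives a quick sense of how much evidence the estimate rests on. When that count is very low, the sample is under-representing the queried region, which is the weakness of small random samples that the abstract describes.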