Approximate Query Processing on High Dimensionality Database Tables Using Multidimensional Cluster Sampling View
Approximate query processing based on random sampling is one of the most useful methods for efficient computation over large quantities of data kept in databases. However, small samples obtained through random sampling might lack the data relevant to the query conditions, because such samples do not adequately represent the entire dataset. The Multidimensional Cluster Sampling View has been proposed to support efficient and effective approximate query processing on common database tables. This view allows random sample records to be drawn from a database efficiently and effectively using SQL. The effectiveness of approximate query processing with this view was previously demonstrated on a large database table with only four dimensions. This is fewer than the number of dimensions typical in decision support systems, which is most commonly over ten. Therefore, further examination and evaluation focusing on dimensionality, using data with ten or more dimensions, are required in order to demonstrate its practicality. This paper evaluates whether the number of dimensions has an impact on the accuracy of the approximation and on the performance of the Multidimensional Cluster Sampling View. The results of the evaluation show no visible effect of dimensionality.
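The sampling problem described in the abstract can be illustrated with a minimal sketch. It uses plain uniform random sampling, not the Multidimensional Cluster Sampling View itself, and the table, column names (`region`, `amount`) and helper `approximate_sum` are hypothetical names chosen for illustration. The sketch estimates a filtered SUM by scaling the sample total up to the table size, and reports how many sampled rows actually matched the query condition.

```python
import random


def approximate_sum(rows, predicate, value, sample_size, seed=0):
    """Estimate SUM(value) over rows matching predicate from a uniform sample.

    Returns (estimate, number of sampled rows that matched the predicate).
    """
    rng = random.Random(seed)
    sample = rng.sample(rows, sample_size)  # sampling without replacement
    matched = [value(r) for r in sample if predicate(r)]
    scale = len(rows) / sample_size         # each sampled row stands for `scale` rows
    return scale * sum(matched), len(matched)


# Hypothetical table: 100,000 rows, 100 distinct regions (assumed data).
data_rng = random.Random(42)
table = [
    {"region": data_rng.randrange(100), "amount": data_rng.randrange(1, 1000)}
    for _ in range(100_000)
]


def is_region_7(row):
    return row["region"] == 7   # a selective condition: roughly 1% of rows


def amount(row):
    return row["amount"]


exact = sum(amount(r) for r in table if is_region_7(r))
estimate, hits = approximate_sum(table, is_region_7, amount, sample_size=1_000)
print(f"exact={exact} estimate={estimate:.0f} matching sample rows={hits}")
```

With a 1% sample and a condition that matches about 1% of the table, only a handful of sampled rows satisfy the condition, so the estimate rests on very little data. This is the weakness of small uniform samples that the abstract refers to.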
Showing items related by title, author, creator and subject.
Techniques for improving clustering and association rules mining from very large transactional databases. Li, Yanrong (2009). Clustering and association rules mining are two core data mining tasks that have been actively studied by data mining community for nearly two decades. Though many clustering and association rules mining algorithms have ...
Inoue, Tomohiro (2015). This dissertation studies efficient and effective approximate query processing for decision support systems. A novel method that enables fast query processing and reliable approximation even in highly selective queries ...
Rudra, Amit; Gopalan, Raj; Achuthan, Narasimaha (2012). Decision support queries usually involve accessing enormous amount of data requiring significant retrieval time. Faster retrieval of query results can often save precious time for the decision maker. Pre-computation of ...