OLAPS: Online Load-Balancing in Range-Partitioned Main Memory Database with Approximate Partition Statistics
- Laboratoire de la Communication dans les Systmes Informatiques, Ecole nationale Sup´erieure d’Informatique
BP 68M, 16309, Oued-Smar, Algiers, Algeria
{d_belayadi,w_hidouci}@esi.dz - LIAS/ISAE-ENSMA – Poitiers University
86960 Futuroscope, France
bellatreche@ensma.fr
Abstract
Modern database systems can achieve high throughput main-memory query execution by being aware of the dynamics of highly parallel hardware. In such systems, data is partitioned into smaller pieces to reach a better parallelism. Unfortunately, data skew is one of the main problems faced during parallel processing in a parallel main memory database. In some data-intensive applications, parallel range queries over a dynamic range partitioned system are important. Continuous insertions/deletions can lead to a very high degree of data skew and consequently a poor performance of parallel range queries. In this paper, we propose an approach for maintaining balanced loads over a set of nodes as in a system of communicating vessels, by migrating tuples between neighboring nodes. These frequent (or even continuous) data transfers inevitably involve dynamic changes in the partition statistics. To avoid the performance degradation typically associated with this dynamism, we provide a solution based on an approximate Partition Statistics Table. The basic idea behind this table is that both clients and nodes may have an imperfect knowledge about the effective load distribution. They can nevertheless locate any data with almost the same efficiency as using exact partition statistics. Furthermore, maintaining load distribution statistics do not require exchanging additional messages as opposed to the cost of efficient solutions from the state-of-art (which requires at least O(logn) messages). We show through intensive experiments that our proposal supports efficient range queries, while simultaneously guaranteeing storage balance even in the presence of numerous concurrent insertions/deletions generating a heavy skewed data distribution.
Key words
Load balancing, Parallel main memory database systems (MMBD), Range partitioning, Data skew, Range query
Digital Object Identifier (DOI)
https://doi.org/10.2298/CSIS170320007B
Publication information
Volume 15, Issue 2 (June 2018)
Year of Publication: 2018
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium
Full text
Available in PDF
Portable Document Format
How to cite
Belayadi, D., Hidouci, K., Bellatreche, L.: OLAPS: Online Load-Balancing in Range-Partitioned Main Memory Database with Approximate Partition Statistics. Computer Science and Information Systems, Vol. 15, No. 2. (2018), https://doi.org/10.2298/CSIS170320007B