HMIBase: An Hierarchical Indexing System for Storing and Querying Big Data

Release Date: 2015-01-19 Author: Shengmei Luo, Di Zhao, Wei Ge, Rong Gu, Chunfeng Yuan, and Yihua Huang

[Abstract] Relational database management systems are usually deployed on single-node machines and impose strict constraints on data structure. As a result, they do not work well with big data, and NoSQL databases have been proposed as a solution. NoSQL databases use indexes and memory caching to make data querying more efficient. In this paper, we propose a hierarchical indexing mechanism and a prototype distributed data-storage system, called HMIBase, which builds hierarchical indexes on the non-primary keys of tables to make data querying more efficient. HMIBase uses HBase as the underlying data store and creates a memory cache for more efficient data transmission. HMIBase uses coprocessors to process update requests. It also provides a client that exposes query and update APIs and a server that handles RPCs from the client and carries out the requested jobs. To improve the cache hit ratio, we propose a memory cache replacement strategy for HMIBase, called the Hot Score algorithm. The experimental results show that the Hot Score algorithm performs better than other cache-replacement strategies.

[Keywords] NoSQL; In-Memory Index; HMIBase; Hot Score
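
The abstract describes a memory cache whose replacement strategy evicts entries by a score rather than by recency alone. The abstract does not give the Hot Score formula, so the following is only a minimal sketch of score-based replacement in general. The class name `ScoreCache`, the `decay` parameter, and the scoring rule (a hit count that is periodically scaled down) are assumptions made for illustration and are not taken from the paper.

```python
class ScoreCache:
    """Fixed-capacity cache that evicts the entry with the lowest score.

    Assumption: the score is a hit count that age() scales down over time.
    This is NOT the Hot Score formula from the paper, only a stand-in.
    """

    def __init__(self, capacity, decay=0.5):
        self.capacity = capacity
        self.decay = decay
        self.values = {}   # key -> cached index entry
        self.scores = {}   # key -> current score

    def get(self, key):
        """Return the cached value and raise its score, or None on a miss."""
        if key in self.values:
            self.scores[key] += 1.0
            return self.values[key]
        return None

    def put(self, key, value):
        """Insert a value, evicting the lowest-scored entry when full."""
        if key not in self.values and len(self.values) >= self.capacity:
            coldest = min(self.scores, key=self.scores.get)
            del self.values[coldest]
            del self.scores[coldest]
        self.values[key] = value
        self.scores.setdefault(key, 1.0)

    def age(self):
        """Scale all scores down so that old hits count for less."""
        for key in self.scores:
            self.scores[key] *= self.decay


# Hypothetical usage: cache index entries that map a non-primary-key
# value to the row keys of the matching rows.
cache = ScoreCache(capacity=2)
cache.put("city=Nanjing", ["row1", "row7"])
cache.put("city=Shenzhen", ["row3"])
cache.get("city=Nanjing")               # raises the score of this entry
cache.put("city=Beijing", ["row9"])     # evicts "city=Shenzhen"
```

In this sketch a frequently queried index entry survives eviction even if it was not the most recently inserted one, which is the general behavior that a score-based strategy aims for when trying to raise the cache hit ratio.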