CarbonData Distributed Cache Mechanism

Vikram Ahuja
4 min readFeb 26, 2021

CarbonData uses caching to increase the query performance by caching block/blocklet index information and prunes using the cache. Using caching, the number of files that are to be read are reduced thereby reducing the IO time and improving the overall query performance.

Cache Management System: Carbon prunes and caches all block/blocklet index information into the driver for table to increase the query performance by reducing the files which are read. This caching mechanism causes the driver to become a bottleneck in the following ways:

  1. If the cache size becomes huge(70–80% of the driver memory) then there can be excessive GC in the driver which can slow down the query and the driver may even go OutOfMemory.
  2. There will be a hard limit to how much LRU can be saved. In case when the cache is full and it needs to accommodate new objects it has to evict elements from the cache which would in turn slow down the queries.
  3. Multiple JDBC drivers need to maintain their own copy of the cache.

The idea behind a Distributed Index Cache Server is to solve the above mentioned problems using just one solution.

Distributed Index Server: is a separate Spark Application(with a driver and multiple executors) which can be used as a Distributed Cache sever. When it is enabled, it allows any query on a carbon table to be routed to the index server service in form of a request such that pruning happens in that centralised cache server. This will enable cache to be stored in a single place.

In the Index Server, the data or all the cache will be only stored in the executors. The segment details received from the CarbonData driver will be equally divided between the executors and a mapping would be maintained at the Index Server driver to determine the executor in which the Index would be cached. Since the Distributed Index Server is based on the Spark Driver and Executor model, it can be scaled as the user can always add more number of executors, thus solving all the mentioned problems in the previous sections as well as reducing chance of Out of Memory Issues.

JDBC connection with the Index Server

Since the JDBC Application has dynamic allocation of executors, the same application could not be extended to be used as Index server, as in the Index Server dynamic allocation of executors for the Index Server will not be allowed, as the cache information is stored in them and removing executors will actually remove the cache information. Thus is was necessary to use separate application for this task.

Hadoop RPC is used to send the information from the JDBC Server to the Index server as well as receive information from the Index Server. Since data is Serialised before sending and deserialised after receiving which can take up large amount of time for large data, the Distributed Index Server avoids sending large data back to JDBC Server in the form of Serialised data. Instead, we write the data to a file and send the path of the file instead back to the JDBC Server.

The JDBC server can connect to Index Server by using various number of APIs each designed for specific purpose depending on the query that has been fired by the user. The APIs are as follows:

  1. getSplits(): Will be used in case of full scan query or a query with a filter
  2. getCount(): Will be used in case of count(*) query and will directly return the total count.
  3. showCache(): Used to see the size of cache in the index server. Used in show cache command.
  4. invalidateCache() : Used to remove the cache of segments from the Index Server. Used during drop table command.

Using this solution would allow multiple JDBC drivers to connect to the index server to use the centralised cache.

Fallback Mechanism

In case of any failure, the index server would fallback to embedded mode which means that the JDBCServer would take care of distributed pruning. A similar job would be fired by the JDBCServer which would take care of pruning using its own executors. If for any reason the embedded mode also fails to prune the indexes then the job would be passed on to driver.

--

--