
K-BP02: Using Dictionary Encoding to save memory in Kinetica

This document describes how to use the dictionary encoding feature to save memory on charN columns.

Affects versions: 6.1, 6.2

OVERVIEW

Dictionary encoding is a feature that applies to charN columns that reside in memory. It works by encoding the values in these columns as smaller values to save memory.
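The general idea behind dictionary encoding can be sketched in a few lines of Python. This is only an illustration of the technique, not Kinetica's internal implementation; the function names and sample data are hypothetical.

```python
def dictionary_encode(values):
    """Replace each string with a small integer code plus a lookup table."""
    dictionary = {}  # distinct string -> integer code
    codes = []       # one integer code per row
    for value in values:
        if value not in dictionary:
            # First time this string is seen: assign the next free code.
            dictionary[value] = len(dictionary)
        codes.append(dictionary[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original strings from the codes."""
    lookup = {code: value for value, code in dictionary.items()}
    return [lookup[code] for code in codes]


rows = ["NY", "CA", "NY", "TX", "CA", "NY"]
dictionary, codes = dictionary_encode(rows)
print(dictionary)  # {'NY': 0, 'CA': 1, 'TX': 2}
print(codes)       # [0, 1, 0, 2, 1, 0]
```

Each distinct string is stored once in the dictionary, and every row holds only a small code that points to it.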

It is most effective on columns with relatively low cardinality (few distinct values) and has a very low impact on query performance; in some cases it can even improve query performance.
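A rough back-of-the-envelope estimate shows why cardinality matters. The sketch below assumes, purely for illustration, that an unencoded char32 value occupies 32 bytes and that each encoded row stores a 4-byte code; the actual storage sizes used by Kinetica are not stated in this article, and the function name is hypothetical.

```python
def estimate_bytes(row_count, distinct_count, value_width=32, code_width=4):
    """Return (plain_bytes, encoded_bytes) under the assumed widths."""
    # Unencoded: every row stores the full fixed-width value.
    plain = row_count * value_width
    # Encoded: every row stores a code, and each distinct value is stored once.
    encoded = row_count * code_width + distinct_count * value_width
    return plain, encoded


# One million rows with 50 distinct values (low cardinality).
low_plain, low_encoded = estimate_bytes(1_000_000, 50)
print(low_plain, low_encoded)    # 32000000 4001600

# One million rows with 900,000 distinct values (high cardinality).
high_plain, high_encoded = estimate_bytes(1_000_000, 900_000)
print(high_plain, high_encoded)  # 32000000 32800000
```

Under these assumptions the low-cardinality column shrinks to roughly one eighth of its size, while the high-cardinality column gains nothing because the dictionary itself becomes almost as large as the original data.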

Dictionary-encoded columns can be used normally in both SELECT and WHERE clauses. Dictionary encoding works only on charN types; it does not apply to other types stored as strings, such as date or datetime.

ACTION

Dictionary encoding can be achieved at ingestion time:

CSV Headers: To set dictionary encoding in the CSV headers prior to a data import, add the "dict" property to the column definition. For example:

text_column|string|data|char32|dict
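When header lines are generated by a script, a small helper can assemble the pipe-separated field. The sketch below only reproduces the single-column form shown above; the helper name is hypothetical, and how multiple columns are combined in one header line is not covered here.

```python
def header_field(name, *properties):
    """Join a column name and its properties with '|' as in the example above."""
    return "|".join((name,) + properties)


field = header_field("text_column", "string", "data", "char32", "dict")
print(field)  # text_column|string|data|char32|dict
```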

 

SQL: Dictionary encoding can be set on a table through SQL, either at table creation time or via an ALTER TABLE command. For example:

Create table:

CREATE REPLICATED TABLE dict_encode_table
(
text_col VARCHAR(30, DICT) -- char32 using dictionary encoding of values
)

Alter table:

ALTER TABLE dict_encode_table
ALTER COLUMN text_col VARCHAR(30, DICT)

 

APIs: Dictionary encoding can be set via any of the available APIs.

For example, using the REST API:

curl -X POST http://localhost:9191/alter/table --header "Content-Type: application/json" -d '{"table_name":"dict_encode_table","action":"change_column","value":"text_col","options":{"column_properties":"char32, dict"}}'
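The same request can be assembled with the Python standard library. This is a minimal sketch that mirrors the curl call above and uses only `json` and `urllib.request`; the helper name is hypothetical, and the host and port are taken from the example rather than from any particular deployment. The request is built but not sent.

```python
import json
import urllib.request


def build_alter_request(base_url, table_name, column_name, properties):
    """Build (but do not send) the /alter/table request from the curl example."""
    payload = {
        "table_name": table_name,
        "action": "change_column",
        "value": column_name,
        "options": {"column_properties": properties},
    }
    return urllib.request.Request(
        url=base_url + "/alter/table",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_alter_request(
    "http://localhost:9191", "dict_encode_table", "text_col", "char32, dict"
)
print(request.full_url)  # http://localhost:9191/alter/table

# To actually send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```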

 


Should you have any questions or concerns, please visit our support page or official documentation page, or email us at support@kinetica.com.

 
