Skip to main content

Data types binary encoding specification

This specification describes the binary format that can be used for binary encoding and decoding of ClickHouse data types. This format is used in Dynamic column binary serialization and can be used in input/output formats RowBinaryWithNamesAndTypes and Native under corresponding settings.

The table below describes how each data type is represented in binary format. Each data type encoding consist of 1 byte that indicates the type and some optional additional information. var_uint in the binary encoding means that the size is encoded using Variable-Length Quantity compression.

ClickHouse data typeBinary encoding
Nothing0x00
UInt80x01
UInt160x02
UInt320x03
UInt640x04
UInt1280x05
UInt2560x06
Int80x07
Int160x08
Int320x09
Int640x0A
Int1280x0B
Int2560x0C
Float320x0D
Float640x0E
Date0x0F
Date320x10
DateTime0x11
DateTime(time_zone)0x12<var_uint_time_zone_name_size><time_zone_name_data>
DateTime64(P)0x13<uint8_precision>
DateTime64(P, time_zone)0x14<uint8_precision><var_uint_time_zone_name_size><time_zone_name_data>
String0x15
FixedString(N)0x16<var_uint_size>
Enum80x17<var_uint_number_of_elements><var_uint_name_size_1><name_data_1><int8_value_1>...<var_uint_name_size_N><name_data_N><int8_value_N>
Enum160x18<var_uint_number_of_elements><var_uint_name_size_1><name_data_1><int16_little_endian_value_1>...><var_uint_name_size_N><name_data_N><int16_little_endian_value_N>
Decimal32(P, S)0x19<uint8_precision><uint8_scale>
Decimal64(P, S)0x1A<uint8_precision><uint8_scale>
Decimal128(P, S)0x1B<uint8_precision><uint8_scale>
Decimal256(P, S)0x1C<uint8_precision><uint8_scale>
UUID0x1D
Array(T)0x1E<nested_type_encoding>
Tuple(T1, ..., TN)0x1F<var_uint_number_of_elements><nested_type_encoding_1>...<nested_type_encoding_N>
Tuple(name1 T1, ..., nameN TN)0x20<var_uint_number_of_elements><var_uint_name_size_1><name_data_1><nested_type_encoding_1>...<var_uint_name_size_N><name_data_N><nested_type_encoding_N>
Set0x21
Interval0x22<interval_kind> (see interval kind binary encoding)
Nullable(T)0x23<nested_type_encoding>
Function0x24<var_uint_number_of_arguments><argument_type_encoding_1>...<argument_type_encoding_N><return_type_encoding>
AggregateFunction(function_name(param_1, ..., param_N), arg_T1, ..., arg_TN)0x25<var_uint_version><var_uint_function_name_size><function_name_data><var_uint_number_of_parameters><param_1>...<param_N><var_uint_number_of_arguments><argument_type_encoding_1>...<argument_type_encoding_N> (see aggregate function parameter binary encoding)
LowCardinality(T)0x26<nested_type_encoding>
Map(K, V)0x27<key_type_encoding><value_type_encoding>
IPv40x28
IPv60x29
Variant(T1, ..., TN)0x2A<var_uint_number_of_variants><variant_type_encoding_1>...<variant_type_encoding_N>
Dynamic(max_types=N)0x2B<uint8_max_types>
Custom type (Ring, Polygon, etc)0x2C<var_uint_type_name_size><type_name_data>
Bool0x2D
SimpleAggregateFunction(function_name(param_1, ..., param_N), arg_T1, ..., arg_TN)0x2E<var_uint_function_name_size><function_name_data><var_uint_number_of_parameters><param_1>...<param_N><var_uint_number_of_arguments><argument_type_encoding_1>...<argument_type_encoding_N> (see aggregate function parameter binary encoding)
Nested(name1 T1, ..., nameN TN)0x2F<var_uint_number_of_elements><var_uint_name_size_1><name_data_1><nested_type_encoding_1>...<var_uint_name_size_N><name_data_N><nested_type_encoding_N>
JSON(max_dynamic_paths=N, max_dynamic_types=M, path Type, SKIP skip_path, SKIP REGEXP skip_path_regexp)0x30<uint8_serialization_version><var_int_max_dynamic_paths><uint8_max_dynamic_types><var_uint_number_of_typed_paths><var_uint_path_name_size_1><path_name_data_1><encoded_type_1>...<var_uint_number_of_skip_paths><var_uint_skip_path_size_1><skip_path_data_1>...<var_uint_number_of_skip_path_regexps><var_uint_skip_path_regexp_size_1><skip_path_data_regexp_1>...

For type JSON byte uint8_serialization_version indicates the version of the serialization. Right now the version is always 0 but can change in future if new arguments will be introduced for JSON type.

Interval kind binary encoding

The table below describes how different interval kinds of Interval data type are encoded.

Interval kindBinary encoding
Nanosecond0x00
Microsecond0x01
Millisecond0x02
Second0x03
Minute0x04
Hour0x05
Day0x06
Week0x07
Month0x08
Quarter0x09
Year0x1A

Aggregate function parameter binary encoding

The table below describes how parameters of AggregateFunction and SimpleAggregateFunction are encoded. The encoding of a parameter consists of 1 byte indicating the type of the parameter and the value itself.

Parameter typeBinary encoding
Null0x00
UInt640x01<var_uint_value>
Int640x02<var_int_value>
UInt1280x03<uint128_little_endian_value>
Int1280x04<int128_little_endian_value>
UInt1280x05<uint128_little_endian_value>
Int1280x06<int128_little_endian_value>
Float640x07<float64_little_endian_value>
Decimal320x08<var_uint_scale><int32_little_endian_value>
Decimal640x09<var_uint_scale><int64_little_endian_value>
Decimal1280x0A<var_uint_scale><int128_little_endian_value>
Decimal2560x0B<var_uint_scale><int256_little_endian_value>
String0x0C<var_uint_size><data>
Array0x0D<var_uint_size><value_encoding_1>...<value_encoding_N>
Tuple0x0E<var_uint_size><value_encoding_1>...<value_encoding_N>
Map0x0F<var_uint_size><key_encoding_1><value_encoding_1>...<key_encoding_N><value_encoding_N>
IPv40x10<uint32_little_endian_value>
IPv60x11<uint128_little_endian_value>
UUID0x12<uuid_value>
Bool0x13<bool_value>
Object0x14<var_uint_size><var_uint_key_size_1><key_data_1><value_encoding_1>...<var_uint_key_size_N><key_data_N><value_encoding_N>
AggregateFunctionState0x15<var_uint_name_size><name_data><var_uint_data_size><data>
Negative infinity0xFE
Positive infinity0xFF