Data Serialization
Data serialization is the process of converting data into a format that can be easily stored or transmitted over a network. It typically involves encoding data into a sequence of bytes, which can later be decoded to reconstruct the original data structure. Pactus has two methods for data serialization: deterministic serialization and CBOR serialization.
Deterministic Serialization
Pactus uses deterministic serialization for deterministic data such as blocks and transactions. The serialization format for each data type is listed in the table below:
Data Type | Size (bytes) | Description |
---|---|---|
uint8 | 1 | An 8-bit unsigned integer |
int8 | 1 | An 8-bit signed integer |
uint16 | 2 | A 16-bit unsigned integer |
int16 | 2 | A 16-bit signed integer |
uint32 | 4 | A 32-bit unsigned integer |
int32 | 4 | A 32-bit signed integer |
uint64 | 8 | A 64-bit unsigned integer |
int64 | 8 | A 64-bit signed integer |
VarInt | Variable | A compact representation of an unsigned integer |
VarByte | Variable | A variable-length byte array |
VarString | Variable | A variable-length string |
Address | 21 | 21 bytes of address data |
Hash32 | 32 | 32 bytes of hash data |
VarInt
A variable-length integer (VarInt) is encoded in 7-bit chunks, least significant chunk first. The MSB of each byte indicates whether more octets follow (1) or whether it is the last one (0). This means that values from 0x00 to 0x7f are encoded in 1 byte, values from 0x80 to 0x3fff are encoded in 2 bytes, and so on.
Example:
0x0f -> 0f
0x1000 -> 8020
0xffff -> ffff03
0xffffff -> ffffff07
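The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the Pactus implementation; the function names `encode_varint` and `decode_varint` are hypothetical and chosen only for this example.

```python
def encode_varint(value: int) -> bytes:
    """Encode an unsigned integer as 7-bit chunks, least significant first."""
    if value < 0:
        raise ValueError("VarInt encodes unsigned integers only")
    out = bytearray()
    while True:
        chunk = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(chunk | 0x80)  # MSB = 1: more octets follow
        else:
            out.append(chunk)         # MSB = 0: this is the last octet
            return bytes(out)


def decode_varint(data: bytes) -> tuple[int, int]:
    """Decode a VarInt from the start of data.

    Returns the decoded value and the number of bytes consumed.
    """
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated VarInt")


# The examples listed above
print(encode_varint(0x1000).hex())             # 8020
print(decode_varint(bytes.fromhex("ffff03")))  # (65535, 3)
```

Returning the number of consumed bytes from the decoder makes it straightforward to read a VarInt that is followed by further fields in the same buffer.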
VarByte
A variable-length byte array (VarByte) is encoded as a variable-length integer (VarInt) containing the length of the array, followed by the bytes themselves: VarInt(len(bytes)) || bytes
VarString
A variable-length string (VarString) is encoded as a variable-length integer (VarInt) containing the length of the string, followed by the bytes that represent the string itself: VarInt(len(str)) || str
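Both length-prefixed types can be illustrated with a short Python sketch. The helper names are hypothetical, and the sketch assumes that strings are encoded as UTF-8 and that the length prefix counts bytes rather than characters; neither detail is stated above.

```python
def encode_varint(value: int) -> bytes:
    """Encode an unsigned integer as a VarInt (7-bit chunks, MSB = continuation)."""
    out = bytearray()
    while True:
        chunk = value & 0x7F
        value >>= 7
        if value:
            out.append(chunk | 0x80)
        else:
            out.append(chunk)
            return bytes(out)


def encode_var_bytes(data: bytes) -> bytes:
    """VarByte: VarInt(len(bytes)) || bytes"""
    return encode_varint(len(data)) + data


def encode_var_string(text: str) -> bytes:
    """VarString: VarInt(len(str)) || str

    Assumption: the string is UTF-8 encoded and the length is in bytes.
    """
    raw = text.encode("utf-8")
    return encode_varint(len(raw)) + raw


print(encode_var_bytes(b"\x01\x02\x03").hex())  # 03010203
print(encode_var_string("abc").hex())           # 03616263
```

For payloads of 128 bytes or more, the length prefix itself grows to two or more bytes, following the VarInt rule.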
Byte Order
All internal number representations are in little-endian byte order.
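As a quick illustration of little-endian layout for the fixed-size integer types, the following sketch uses Python's standard `struct` module, where the `<` prefix selects little-endian byte order. The values are arbitrary and chosen only to make the byte order visible.

```python
import struct

# uint16: the least significant byte comes first
print(struct.pack("<H", 0x1234).hex())              # 3412

# uint32
print(struct.pack("<I", 0x01020304).hex())          # 04030201

# int64: negative values use two's complement
print(struct.pack("<q", -2).hex())                  # feffffffffffffff

# Decoding reverses the process
(value,) = struct.unpack("<I", bytes.fromhex("04030201"))
print(hex(value))                                   # 0x1020304
```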
Example
Here is an example of block header data encoded using deterministic serialization:
CBOR Serialization
For non-deterministic data, such as networking messages, Pactus uses “Concise Binary Object Representation” or CBOR. CBOR is a binary data serialization format that is widely used in various applications, including IoT, web services, security, and automotive, due to its compact representation and efficient parsing.
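To give a feel for how compact CBOR is, here is a hand-rolled sketch of a tiny subset of the format (unsigned integers, byte strings, text strings, arrays, and maps). It is not the encoder Pactus uses, and the function names are hypothetical; in practice a full CBOR library would be used. Each item starts with a head byte holding a 3-bit major type and 5 bits of additional information, and multi-byte lengths are written in big-endian order.

```python
import struct


def _head(major: int, n: int) -> bytes:
    """Encode a CBOR head: major type (3 bits) plus an argument n."""
    if n < 24:
        return bytes([major << 5 | n])           # argument fits in the head byte
    if n < 1 << 8:
        return bytes([major << 5 | 24, n])       # 1-byte argument follows
    if n < 1 << 16:
        return bytes([major << 5 | 25]) + struct.pack(">H", n)
    if n < 1 << 32:
        return bytes([major << 5 | 26]) + struct.pack(">I", n)
    return bytes([major << 5 | 27]) + struct.pack(">Q", n)


def cbor_encode(obj) -> bytes:
    """Encode a small subset of Python values as CBOR."""
    if isinstance(obj, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(obj, int) and obj >= 0:
        return _head(0, obj)                     # major type 0: unsigned integer
    if isinstance(obj, bytes):
        return _head(2, len(obj)) + obj          # major type 2: byte string
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw          # major type 3: text string
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(cbor_encode(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v) for k, v in obj.items())
        return _head(5, len(obj)) + body         # major type 5: map
    raise TypeError(f"unsupported type: {type(obj).__name__}")


print(cbor_encode(1000).hex())          # 1903e8
print(cbor_encode({1: 2, 3: 4}).hex())  # a201020304
```

The values used here are generic examples rather than real Pactus messages.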
CBOR Me!
cbor.me is an online tool for encoding and decoding CBOR data, offering developers an easy way to test and validate their CBOR data without having to set up a local environment.
Example
Here is an example of a vote message encoded using CBOR: