Wire Format & Encoding
Understanding ZAP's zero-copy wire format and serialization
ZAP C++ provides zero-copy serialization, meaning data can be read directly from the wire without parsing or copying.
Zero-Copy Design
The ZAP wire format stores data in a way that maps directly to in-memory structures. When you receive a message:
- No parsing is required
- No memory is allocated for the message itself
- Fields can be accessed in any order
- Unused fields are never touched
This makes ZAP ideal for:
- Memory-mapped databases
- High-frequency trading
- Real-time systems
- Resource-constrained environments
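To make the zero-copy idea concrete, here is a minimal sketch of a non-owning view that reads fields straight out of a received buffer. It does not use the ZAP API: `RecordView`, the field names, and the 16-byte layout are hypothetical, and the sketch assumes the wire byte order matches the host's.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed layout, for illustration only (not ZAP's real layout):
//   bytes 0..7   : uint64 id
//   bytes 8..11  : uint32 quantity
//   bytes 12..15 : padding up to the next 8-byte word
constexpr std::size_t kIdOffset = 0;
constexpr std::size_t kQuantityOffset = 8;
constexpr std::size_t kRecordSize = 16;

// A non-owning view over a received buffer. Constructing it parses nothing
// and allocates nothing; each getter reads only the bytes of its own field.
class RecordView {
public:
  RecordView(const unsigned char* data, std::size_t size) : data_(data) {
    assert(size >= kRecordSize);  // caller must supply a complete record
    (void)size;
  }

  std::uint64_t id() const { return load<std::uint64_t>(kIdOffset); }
  std::uint32_t quantity() const { return load<std::uint32_t>(kQuantityOffset); }

private:
  // Assumes host byte order equals wire byte order.
  template <typename T>
  T load(std::size_t offset) const {
    T value;
    std::memcpy(&value, data_ + offset, sizeof(T));
    return value;
  }

  const unsigned char* data_;
};
```

Calling `quantity()` before `id()`, or never calling `id()` at all, works the same way: fields can be read in any order, and bytes belonging to unread fields are never touched.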
Message Structure
Messages consist of:
- Segments - Contiguous blocks of memory
- Pointers - References between objects
- Data - Primitive values and blob data
Segment Layout
Word Alignment
All data is 8-byte (64-bit) aligned. This ensures efficient access on all modern architectures.
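The alignment rule amounts to rounding every size up to a whole number of 8-byte words. The helpers below are an illustrative sketch; the names `wordsFor`, `paddedSize`, and `isWordAligned` are hypothetical and not part of the ZAP API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kWordBytes = 8;  // one 64-bit word

// Number of whole words needed to hold `bytes` bytes.
constexpr std::size_t wordsFor(std::size_t bytes) {
  return (bytes + kWordBytes - 1) / kWordBytes;
}

// Size in bytes after padding up to the next word boundary.
constexpr std::size_t paddedSize(std::size_t bytes) {
  return wordsFor(bytes) * kWordBytes;
}

// True if `p` sits on an 8-byte boundary.
inline bool isWordAligned(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % kWordBytes == 0;
}

static_assert(paddedSize(0) == 0, "empty data needs no words");
static_assert(paddedSize(5) == 8, "5 bytes pad up to one word");
static_assert(paddedSize(8) == 8, "already a whole word");
static_assert(paddedSize(13) == 16, "13 bytes pad up to two words");
```

A 13-byte blob therefore occupies two words (16 bytes) on the wire, with the last 3 bytes as padding.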
Building Messages
MallocMessageBuilder
The primary class for building messages:
Builder Methods
| Method | Description |
|---|---|
| setField(value) | Set a scalar field |
| initField() | Initialize a struct field |
| initField(size) | Initialize a list field with given size |
| getField() | Get a builder for a struct field |
| hasField() | Check if a pointer field is set |
| adoptField(orphan) | Adopt an orphan into this field |
| disownField() | Remove and return field as orphan |
Text and Data Fields
List Fields
Reading Messages
Reader Interface
Reader Methods
| Method | Description |
|---|---|
| getField() | Get field value (returns default if not set) |
| hasField() | Check if pointer field is set |
| isField() | Check whether a specific union member is the active one |
| which() | Get the active union member |
Serialization Formats
Standard Format
Best for local IPC and memory-mapped files:
Packed Format
Compresses zero bytes for smaller messages - best for network transfer:
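This page does not spell out the packed encoding, so the following is only a sketch of one common way to compress zero bytes in word-aligned data: each 8-byte word becomes a tag byte (one bit per byte, set when that byte is non-zero) followed by the non-zero bytes. `packWords` and `unpackWords` are hypothetical names, and ZAP's real packed format may differ, for example by adding run-length handling for long runs of zero words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Pack: for each 8-byte word emit one tag byte (bit i set when byte i of the
// word is non-zero), followed by only the non-zero bytes of that word.
Bytes packWords(const Bytes& input) {
  assert(input.size() % 8 == 0);  // input must consist of whole words
  Bytes out;
  for (std::size_t w = 0; w < input.size(); w += 8) {
    std::uint8_t tag = 0;
    for (unsigned i = 0; i < 8; ++i) {
      if (input[w + i] != 0) tag |= static_cast<std::uint8_t>(1u << i);
    }
    out.push_back(tag);
    for (unsigned i = 0; i < 8; ++i) {
      if (input[w + i] != 0) out.push_back(input[w + i]);
    }
  }
  return out;
}

// Unpack: the reverse of packWords. Returns an empty vector if the input
// ends in the middle of a word (truncated data).
Bytes unpackWords(const Bytes& packed) {
  Bytes out;
  std::size_t pos = 0;
  while (pos < packed.size()) {
    const std::uint8_t tag = packed[pos++];
    for (unsigned i = 0; i < 8; ++i) {
      if (tag & (1u << i)) {
        if (pos >= packed.size()) return {};  // truncated input
        out.push_back(packed[pos++]);
      } else {
        out.push_back(0);
      }
    }
  }
  return out;
}
```

In this sketch an all-zero word shrinks from 8 bytes to 1, while a word with no zero bytes grows to 9. That trade-off is why a packed encoding pays off for messages with many small integers and unset fields.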
Format Comparison
| Format | Size | Speed | Use Case |
|---|---|---|---|
| Standard | Larger | Fastest | Local IPC, memory-mapped files |
| Packed | Smaller | Fast | Network, storage |
Flat Arrays
For in-memory handling:
Memory Mapping
For maximum performance with files, use memory mapping:
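As a library-independent illustration, the sketch below maps a file read-only with POSIX `mmap`. `MappedFile` is a hypothetical helper, not a ZAP class, and the code assumes a POSIX system. How the mapped bytes are then handed to a ZAP reader is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Read-only mapping of a whole file. The bytes are paged in by the OS on
// first access, so nothing is copied into a user-space buffer up front.
class MappedFile {
public:
  explicit MappedFile(const char* path) {
    const int fd = ::open(path, O_RDONLY);
    if (fd < 0) return;  // ok() stays false
    struct stat st;
    if (::fstat(fd, &st) == 0 && st.st_size > 0) {
      void* p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                       PROT_READ, MAP_PRIVATE, fd, 0);
      if (p != MAP_FAILED) {
        data_ = static_cast<const unsigned char*>(p);
        size_ = static_cast<std::size_t>(st.st_size);
      }
    }
    ::close(fd);  // the mapping stays valid after the descriptor is closed
  }

  ~MappedFile() {
    if (data_ != nullptr) ::munmap(const_cast<unsigned char*>(data_), size_);
  }

  MappedFile(const MappedFile&) = delete;
  MappedFile& operator=(const MappedFile&) = delete;

  bool ok() const { return data_ != nullptr; }
  const unsigned char* data() const { return data_; }
  std::size_t size() const { return size_; }

private:
  const unsigned char* data_ = nullptr;
  std::size_t size_ = 0;
};
```

Because `mmap` returns a page-aligned address, the start of the mapping also satisfies the 8-byte alignment the wire format requires.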
Streaming Multiple Messages
Writing a Stream
Reading a Stream
Unions
Building Unions
Reading Unions
Orphans
Orphans are objects not yet attached to a message:
Security Considerations
Traversal Limits
Protect against malicious messages that could cause excessive memory use:
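The idea behind a traversal limit can be sketched as a budget that is charged each time a reader follows a pointer, so a small message that references the same large object many times cannot trigger unbounded work. `TraversalBudget` is a hypothetical helper written for this illustration. It is an assumption about how such a limit works in general, not ZAP's actual implementation or API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a budget of words that a reader may traverse.
class TraversalBudget {
public:
  explicit TraversalBudget(std::uint64_t limitWords) : remaining_(limitWords) {}

  // Charge `words` against the budget before following a pointer.
  // Returns false, and leaves the budget exhausted, once the limit is hit.
  bool charge(std::uint64_t words) {
    if (words > remaining_) {
      remaining_ = 0;
      return false;
    }
    remaining_ -= words;
    return true;
  }

  std::uint64_t remaining() const { return remaining_; }

private:
  std::uint64_t remaining_;
};
```

A reader built this way would call `charge()` with the size of each object it is about to dereference and reject the message as soon as the call returns `false`.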
Handling Untrusted Input
Text Format
For debugging and configuration, use the text format:
Performance Tips
- Reuse MessageBuilder - Call message.clear() instead of creating new builders
- Use packed format for network - Reduces bandwidth at minimal CPU cost
- Memory map large files - Avoids copying data entirely
- Pre-size lists - Use initList(size) to avoid reallocations
- Avoid dynamic API for hot paths - Compile-time types are faster
- Batch writes - Multiple small writes are slower than one large write
Next Steps
- Learn about the RPC System
- Explore the KJ Library
- See Examples