Lytok v3.0.1 – A high-density data protocol (JSON alternative)
LLM token savings angle is interesting, but binary formats are crowded.
Binary JSON with table reuse, but CBOR and MessagePack already own this space.
Backend developers building high-throughput APIs and data pipelines
MessagePack · CBOR · Protocol Buffers
It is not just an implementation; it is a full protocol that can be implemented in any language.
The major gains in performance and size come from highly compact binary data formats: packing numbers into fewer bytes, avoiding repeated strings and schemas, and applying other low-level optimizations.
Here are just a few optimizations I implemented:
Encoding integers:
- JSON represents integers (and everything else) as textual tokens, so parsers have to decode digits one at a time (for example, char - 48) and scan for delimiters.
- Bytery represents integers as LUINT: a single byte if the value is up to 246. Lead bytes 247..254 indicate that the following 1..8 bytes should be used to build a big-endian integer, and 255 means null.
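The LUINT rule described above can be sketched in a few lines of Python. This is an illustration, not the reference implementation: the function names `encode_luint` and `decode_luint` are hypothetical, and it assumes that lead byte 246 + n announces n payload bytes, that the payload holds the raw value with no offset, and that the encoder picks the shortest length. The full spec.md may define these details differently.

```python
from typing import Optional, Tuple

MAX_INLINE = 246   # values 0..246 fit in the lead byte itself
NULL_TAG = 255     # lead byte reserved for null


def encode_luint(value: Optional[int]) -> bytes:
    """Encode a non-negative integer (or None) as a LUINT-style byte string."""
    if value is None:
        return bytes([NULL_TAG])
    if value < 0:
        raise ValueError("LUINT is unsigned")
    if value <= MAX_INLINE:
        return bytes([value])
    # Assumption: use the smallest number of payload bytes that fits.
    n = (value.bit_length() + 7) // 8
    if n > 8:
        raise ValueError("value does not fit in 8 bytes")
    # Assumption: lead byte 247 means 1 payload byte, 254 means 8.
    return bytes([MAX_INLINE + n]) + value.to_bytes(n, "big")


def decode_luint(buf: bytes, pos: int = 0) -> Tuple[Optional[int], int]:
    """Decode one LUINT at buf[pos]; return (value, next position)."""
    tag = buf[pos]
    if tag <= MAX_INLINE:
        return tag, pos + 1
    if tag == NULL_TAG:
        return None, pos + 1
    n = tag - MAX_INLINE              # 1..8 payload bytes
    end = pos + 1 + n
    if end > len(buf):
        raise ValueError("truncated LUINT")
    return int.from_bytes(buf[pos + 1:end], "big"), end
```

Under these assumptions, 300 takes three bytes (lead byte 248 plus two payload bytes), and any value up to 246 takes exactly one.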
Encoding strings:
- JSON represents strings with a pair of quotes and escape sequences, so the parser spends CPU scanning for the closing quote while also handling escapes.
- Bytery represents strings as a pair [length:LUINT,data]. The decoder reads the length as a LUINT and then reads exactly that many bytes. Fast: no parsing, no delimiters, no quote scanning, no escape processing.
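A length-prefixed string in the [length:LUINT,data] shape could look roughly like the sketch below. The helper names are hypothetical, the length prefix follows the LUINT rule under the assumption that lead byte 246 + n announces n big-endian payload bytes, and UTF-8 is assumed for the data bytes because the post does not name an encoding.

```python
from typing import Tuple


def _encode_length(n: int) -> bytes:
    """LUINT-style length prefix (assumed layout, see lead-in)."""
    if n <= 246:
        return bytes([n])
    size = (n.bit_length() + 7) // 8
    return bytes([246 + size]) + n.to_bytes(size, "big")


def _decode_length(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a LUINT-style length at buf[pos]; return (length, next position)."""
    tag = buf[pos]
    if tag <= 246:
        return tag, pos + 1
    if tag == 255:
        raise ValueError("null length is not handled in this sketch")
    size = tag - 246
    return int.from_bytes(buf[pos + 1:pos + 1 + size], "big"), pos + 1 + size


def encode_string(text: str) -> bytes:
    """Emit [length][data]; the data bytes are copied as-is, no escaping."""
    data = text.encode("utf-8")       # assumption: UTF-8 payload
    return _encode_length(len(data)) + data


def decode_string(buf: bytes, pos: int = 0) -> Tuple[str, int]:
    """Read the length, then slice exactly that many bytes."""
    length, start = _decode_length(buf, pos)
    end = start + length
    if end > len(buf):
        raise ValueError("truncated string")
    return buf[start:end].decode("utf-8"), end
```

The decoder never inspects the payload for quotes or backslashes, so a string such as `a"b\n` costs the same to read as plain ASCII of equal length.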
These are just a few examples. There are many more, such as the string cache table, the schema cache, and field types. The full spec.md has around 4k lines of specification, all written with care.
The protocol is fully lossless and can handle any standalone JSON object without requiring prior knowledge of schemas or data structures.
Bytery can also transport files in native binary format, without converting them to Base64 and paying the roughly 33% size overhead.
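The Base64 cost is easy to check with the Python standard library: Base64 maps every 3 input bytes to 4 output characters, so the encoded form is about a third larger than the raw bytes. The helper name `base64_overhead` is only for illustration.

```python
import base64


def base64_overhead(raw: bytes) -> float:
    """Return the relative size increase from Base64-encoding raw."""
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw) - 1


# 3000 raw bytes become 4000 Base64 characters.
payload = bytes(3000)
print(f"{base64_overhead(payload):.1%}")
```

A format that carries the raw bytes behind a length prefix avoids this expansion, and also avoids the extra quotes and escaping needed to embed the Base64 text in a JSON string.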
Bytery can also be combined with GZIP and other post-processing tools to make the payload even smaller.
My focus here is the protocol and wire format itself.
The project is free to use. My goal is to free the internet from the heavy overhead of parsing, storing, and transporting JSON over the wire, while allowing data to be decoded at high speed on the client.