> Introducing the ‘packed’ data format, a binary format that allows using data as it is, without the need for a deserialisation step. A notable perk of this format is that traversals on packed trees is proven to be faster than on ‘unpacked’ trees: as the fields of data structures are inlines, there are no pointer jumps, thus making the most of the L1 cache.
That is, a "memory dump -> zero-copy memory read" of a subgraph of Haskell objects, allowing to pass such trees / subgraphs directly over a network. Slightly reminiscent of Cap'n Proto.
Zolomon 6 hours ago [-]
They mention this in the article.
spockz 5 hours ago [-]
It reminds me more of flat buffers though. Does protobuf also have zero allocation (beyond initial ingestion) and no pointer jumps?
90s_dev 8 hours ago [-]
We are always reinventing wheels. If we didn't, they'd all still be made of wood.
tlb 2 hours ago [-]
> the serialised version of the data is usually bigger than its in-memory representation
I don’t think this is common. Perhaps for arrays of floats serialized as JSON or something. But I can’t think of a case where binary serialization is bigger. Data types like maps are necessarily larger in memory to support fast lookup and mutability.
nine_k 2 hours ago [-]
I suppose all self-describing formats, like protobuf, or thrift or, well, JSON are bigger than the efficient machine representation, because they carry the schema in every message, one way or another.
gitroom 40 minutes ago [-]
honestly i wish more stuff worked this way - fewer hops in memory always makes me happy
That is, a "memory dump -> zero-copy memory read" of a subgraph of Haskell objects, allowing to pass such trees / subgraphs directly over a network. Slightly reminiscent of Cap'n Proto.
I don’t think this is common. Perhaps for arrays of floats serialized as JSON or something. But I can’t think of a case where binary serialization is bigger. Data types like maps are necessarily larger in memory to support fast lookup and mutability.