protostuff

A Java serialization library
- efficient, both in speed and memory
- flexible, supporting pluggable formats
- schema evolution support

built-in forward and backward compatibility with previous and next versions of your data
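To make the schema evolution idea concrete, here is a minimal sketch of how tagged fields let old and new versions of a message coexist. It is a toy tag/length/value codec written with the JDK only; it is not protostuff's API or wire format, and the class name `SchemaEvolutionSketch` and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy tag/length/value codec for illustration only.
// This is NOT protostuff's API or wire format.
final class SchemaEvolutionSketch {

    // Encodes each field as: 4-byte tag, 4-byte length, UTF-8 payload.
    static byte[] encode(Map<Integer, String> fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<Integer, String> field : fields.entrySet()) {
            byte[] payload = field.getValue().getBytes(StandardCharsets.UTF_8);
            byte[] header = ByteBuffer.allocate(8)
                    .putInt(field.getKey())
                    .putInt(payload.length)
                    .array();
            out.write(header, 0, header.length);
            out.write(payload, 0, payload.length);
        }
        return out.toByteArray();
    }

    // Decodes only the tags this reader's schema knows about.
    // Unknown tags (written by a newer schema) are skipped using their length.
    static Map<Integer, String> decode(byte[] data, Set<Integer> knownTags) {
        Map<Integer, String> fields = new LinkedHashMap<>();
        ByteBuffer in = ByteBuffer.wrap(data);
        while (in.hasRemaining()) {
            int tag = in.getInt();
            byte[] payload = new byte[in.getInt()];
            in.get(payload);
            if (knownTags.contains(tag)) {
                fields.put(tag, new String(payload, StandardCharsets.UTF_8));
            }
        }
        return fields;
    }
}
```

In this sketch, a reader that knows only tags 1 and 2 can still decode bytes written with an extra tag 3, and a reader that knows tags 1 to 3 can decode older bytes that lack tag 3 and fall back to a default for the missing field.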

Use cases
- messaging layer in RPC
- storage format in the datastore or cache
In-house deployments
- inventory and sales system (2012, early version of protostuffdb)
- payroll and timesheet on a face recognition system (2016, with flatbuffers and protostuffdb)
3rd-party users
The world's largest online gaming and sports betting software supplier
Code review tool that Understands your code
jadice document solutions
Flexible products and up-to-date viewing solutions for all document formats
The world leader in booking accommodation online

Opensource users
Object Relational Mapping, Persistence and Caching for Java
A distributed in-memory key/value data store with optional schema
A domain independent back-end framework for rolling out software updates to constrained edge devices as well as more powerful controllers and gateways connected to IP based networking infrastructure
Provides a cloud management platform for Redis Standalone, Redis Sentinel, Redis Cluster.
Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage

Media

protostuffdb

A C++ datastore based on LevelDB
Build both desktop and server apps on one common interface
- stored procedures via JNI

transactional, serialized operations

- http rpc

the stored procedures are exposed via http(s)

- cross platform

linux, windows

- replication

single master, multiple slaves (only linux machines can be master)

- hot-backup

linux-only, can be used to provision/fast-track a new slave by sending the checkpoint/backup as its initial data

- realtime updates

the changes/updates received by a slave can be forwarded to clients (desktop/mobile/browser) via websockets

- portable

can be deployed as a self-contained linux executable with a bundled 12MB JRE

- embeddable as a shared library

for migration tools and testing

- automatic views

visitors for your declared indexes are generated from the DSL (rpc included)

- automatic ops

simple documents can have their create/update/delete ops fully generated from the DSL (rpc included)

4 types of secondary indexes
- simple (based on a single field)
- compound (based on multiple fields)
- unique (simple or compound)
- clustered (simple or compound)

“A clustering index maintains a copy of the entire row, not just the primary key. As a result, when querying on a clustering key lookups in the primary index are always avoided.
A clustering index is a covering index for any query. Another way to think of a clustering index is that it is a materialization of the table sorted in another order.”
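
The difference between a simple and a clustered index can be sketched with in-memory maps. This is only an illustration of the quoted idea, not protostuffdb's implementation: the class `IndexSketch`, the `Item` type and the `primaryLookups` counter are hypothetical, and the sketch handles inserts of new ids only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: in-memory maps standing in for a key/value store.
final class IndexSketch {

    // Hypothetical row type.
    static final class Item {
        final long id;
        final String category;
        final String name;

        Item(long id, String category, String name) {
            this.id = id;
            this.category = category;
            this.name = name;
        }
    }

    private final Map<Long, Item> primary = new TreeMap<>();
    // Simple index: indexed field -> primary keys only.
    private final Map<String, List<Long>> simpleIndex = new TreeMap<>();
    // Clustered index: indexed field -> full copies of the rows.
    private final Map<String, List<Item>> clusteredIndex = new TreeMap<>();

    // Counts how often a query had to go back to the primary index.
    int primaryLookups = 0;

    // Inserts a new row and maintains both indexes (no update/delete handling).
    void put(Item item) {
        primary.put(item.id, item);
        simpleIndex.computeIfAbsent(item.category, k -> new ArrayList<>()).add(item.id);
        clusteredIndex.computeIfAbsent(item.category, k -> new ArrayList<>()).add(item);
    }

    // Simple index query: one extra primary lookup per matching row.
    List<Item> findViaSimple(String category) {
        List<Item> result = new ArrayList<>();
        for (Long id : simpleIndex.getOrDefault(category, Collections.emptyList())) {
            primaryLookups++;
            result.add(primary.get(id));
        }
        return result;
    }

    // Clustered index query: rows come straight from the index.
    List<Item> findViaClustered(String category) {
        return new ArrayList<>(clusteredIndex.getOrDefault(category, Collections.emptyList()));
    }
}
```

The trade-off shown here is the usual one: the clustered index avoids the second lookup at the cost of storing every row twice.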

History
Inspired by CouchDB, whose author Damien Katz talked about being able to "touch your data" in a 2009 presentation.
“I was so excited when I first started testing it out. Because a big goal with CouchDB was that you would be able to feel like you could touch your data, like it was right there in your hand. There’re certain tools where you have this responsiveness and you don’t feel like there’s all these layers between you that are obfuscating what you’re trying to get at.”

CouchDB was an Erlang stack with JavaScript stored procedures (business logic, validation, map-reduce secondary indexes) that stored your data in JSON format.

It had distributed multi-master sync, which meant it did not support multi-document transactions.

For my use case, multi-document transactions were required.

And so protostuffdb was born, with the goal of being able to touch your data (like CouchDB) while supporting multi-document transactions.

This stack is C/C++ with Java stored procedures and stores your data in protostuff binary format.

Both validation and secondary indexes are implemented declaratively using a custom DSL on top of .proto files, where the schema of your data is defined. These files are compiled by fbsgen-ds.

fbsgen-ds

A compiler for .proto files, which use a custom DSL that combines protobuf with flatbuffers semantics
Available targets
- cpp (flatbuffers)
- java (protostuffdb entities, messages and services)
- typescript (protostuffdb messages and services)
- dart (protostuffdb messages and services)
- json (service definition)
- nginx.conf (upstream service definition)