Repository MicroKernel API
In December 2010 I started drafting an API representing the core MVCC-based persistence engine of a next-gen Jackrabbit version:
API Design Goals
- sessionless (there's no concept of sessions; an implementation doesn't need to track/manage session state)
- small set of operations
- simple data model
- manage huge trees of nodes and properties efficiently
- MVCC-based
- highly scalable concurrent read & write operations
- easily portable to other programming languages
- easy to remote
- efficient support for large number of sibling child nodes
- integrated API for storing/retrieving large binaries
- human-readable data serialization (JSON)
Data Model
- simple JSON-inspired data model: just nodes and properties
- a node consists of an unordered set of name -> item mappings; each property and child node is uniquely named, and a single name can only refer to a property or a child node, not both at the same time
- properties are represented as name/value pairs
- supported property types: string, number, boolean, array
- a property value is stored and used as an opaque, unparsed character sequence
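The data model above can be illustrated with a minimal sketch. The class SketchNode and its methods are hypothetical names made up for this example; they are not the MicroKernel's actual classes. The sketch only shows the shared namespace rule (a name refers to either a property or a child node) and the storage of property values as opaque character sequences.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the data model; not part of the real API. */
final class SketchNode {
    // Properties and child nodes share one unordered namespace:
    // a name maps to either a String (raw property value) or a SketchNode.
    private final Map<String, Object> items = new HashMap<>();

    /** Sets a property; the value is kept as an unparsed character sequence. */
    void setProperty(String name, String rawValue) {
        if (items.get(name) instanceof SketchNode) {
            throw new IllegalArgumentException("name already refers to a child node: " + name);
        }
        items.put(name, rawValue);
    }

    /** Adds a child node; fails if the name is already in use. */
    SketchNode addChild(String name) {
        if (items.containsKey(name)) {
            throw new IllegalArgumentException("name already in use: " + name);
        }
        SketchNode child = new SketchNode();
        items.put(name, child);
        return child;
    }

    /** Returns the raw property value, or null if there is no such property. */
    String getProperty(String name) {
        Object item = items.get(name);
        return item instanceof String ? (String) item : null;
    }

    /** Returns the child node, or null if there is no such child. */
    SketchNode getChild(String name) {
        Object item = items.get(name);
        return item instanceof SketchNode ? (SketchNode) item : null;
    }
}
```

In this sketch, calling setProperty("count", "42") followed by addChild("count") is rejected, because the name "count" already refers to a property.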
Implementation
- Git/SVN-inspired revision model
- writes don't interfere with readers
- single and very narrow point of synchronization on commit: concurrent commits are persisted in parallel and only synchronized while the symbolic HEAD reference is rewritten; interfering commits are merged top-down using a 3-way merge algorithm. As a result, large concurrent commits at different paths should be processed in parallel with only minimal synchronization overhead.
- DAG-based content-addressable loose object store (DAG: Directed Acyclic Graph)
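The two storage ideas above (content-addressable objects and a narrowly synchronized HEAD reference) can be sketched as follows. This is a simplified in-memory illustration, not the actual implementation: the class SketchObjectStore, its method names, and the choice of SHA-1 as the hash function are assumptions made for this example. The 3-way merge is left out; a caller whose HEAD update is refused would have to merge and retry.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory sketch; names and hash choice are assumptions. */
final class SketchObjectStore {
    // Objects are keyed by the hash of their content, so writers can
    // persist new objects in parallel without coordinating.
    private final Map<String, byte[]> objects = new ConcurrentHashMap<>();

    private final Object headLock = new Object();
    private String head; // id of the current head revision, guarded by headLock

    /** Stores the content under the hex SHA-1 of its bytes and returns that id. */
    String put(byte[] content) {
        String id = sha1Hex(content);
        objects.putIfAbsent(id, content.clone());
        return id;
    }

    /** Returns a copy of the stored content, or null if the id is unknown. */
    byte[] get(String id) {
        byte[] content = objects.get(id);
        return content == null ? null : content.clone();
    }

    String getHead() {
        synchronized (headLock) {
            return head;
        }
    }

    /**
     * The only synchronized step of a commit: HEAD moves to newHead only if
     * it still equals expectedHead. On false, another commit won the race.
     */
    boolean advanceHead(String expectedHead, String newHead) {
        synchronized (headLock) {
            if (!Objects.equals(head, expectedHead)) {
                return false;
            }
            head = newHead;
            return true;
        }
    }

    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

Because readers resolve everything from an immutable revision id, they never take headLock for longer than it takes to read one reference, which matches the goal that writes don't interfere with readers.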
Source Code
[http://svn.apache.org/repos/asf/jackrabbit/oak/trunk/oak-mk/]
Documentation
Attachments:
MicroKernel Revision Model.pdf (application/pdf)