Apache Jackrabbit : RepositoryMicroKernel

Repository MicroKernel API

In December 2010 I started drafting an API representing the core MVCC-based (multiversion concurrency control) persistence engine of a next-generation Jackrabbit version:

MicroKernel.java

API Design Goals

  • sessionless (there's no concept of sessions; an implementation doesn't need to track/manage session state)
  • small set of operations
  • simple data model
  • manage huge trees of nodes and properties efficiently
  • MVCC-based
  • highly scalable concurrent read & write operations
  • easily portable to other programming languages
  • easy to expose remotely
  • efficient support for a large number of sibling child nodes
  • integrated API for storing/retrieving large binaries
  • human-readable data serialization (JSON)

Data Model

  • simple JSON-inspired data model: just nodes and properties
  • a node consists of an unordered set of name -> item mappings;
    each property and child node is uniquely named, and a single name
    can refer to either a property or a child node, but not both at the same time
  • properties are represented as name/value pairs
  • supported property types: string, number, boolean, array
  • a property value is stored and used as an opaque, unparsed character sequence
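
The naming rule above can be illustrated with a minimal Java sketch. The class and method names (DataModelSketch, Node, setProperty, addChild) are hypothetical and chosen for illustration only; they are not taken from MicroKernel.java. The sketch assumes that properties and child nodes share one name space per node and that property values are kept as raw, unparsed JSON text.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the data model; not the MicroKernel API. */
final class DataModelSketch {

    /** A node: an unordered set of uniquely named items. */
    static final class Node {
        // One name space for both kinds of item: a value is either a String
        // (raw property value) or a Node (child node), never both.
        private final Map<String, Object> items = new HashMap<>();

        /** Sets a property; the raw JSON text is stored as-is and never parsed. */
        void setProperty(String name, String rawValue) {
            if (items.get(name) instanceof Node) {
                throw new IllegalArgumentException("name refers to a child node: " + name);
            }
            items.put(name, rawValue);
        }

        /** Adds an empty child node under a name that is not yet in use. */
        Node addChild(String name) {
            if (items.containsKey(name)) {
                throw new IllegalArgumentException("name already in use: " + name);
            }
            Node child = new Node();
            items.put(name, child);
            return child;
        }

        /** Returns the raw property value, or null if there is no such property. */
        String getProperty(String name) {
            Object item = items.get(name);
            return item instanceof String ? (String) item : null;
        }

        /** Returns the child node, or null if there is no such child. */
        Node getChild(String name) {
            Object item = items.get(name);
            return item instanceof Node ? (Node) item : null;
        }
    }

    public static void main(String[] args) {
        Node root = new Node();
        root.setProperty("title", "\"Hello\"");      // string
        root.setProperty("count", "42");             // number
        root.setProperty("tags", "[\"a\",\"b\"]");   // array
        root.addChild("content").setProperty("visible", "true"); // boolean
        System.out.println(root.getProperty("count"));
        System.out.println(root.getChild("content").getProperty("visible"));
    }
}
```

Keeping both kinds of item in one map makes the "property or child node, not both" rule a simple key lookup; an implementation could equally use two maps and check both.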

Implementation

  • Git/SVN-inspired revision model
  • writes don't interfere with readers
  • single and very narrow point of synchronization on commit: concurrent commits are persisted in parallel and are synchronized only while the symbolic HEAD reference is rewritten. Interfering commits are merged top-down using a 3-way merge algorithm, so large concurrent commits at different paths should be processed in parallel with only minimal synchronization overhead.
  • DAG-based content-addressable loose object store (DAG: Directed Acyclic Graph)
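
As a rough illustration of two of the points above (content-addressable storage and the narrow synchronization on HEAD), here is a minimal in-memory Java sketch. All names (RevisionStoreSketch, put, commit, getHead) are hypothetical, SHA-1 is an assumed choice of hash, and each revision is simplified to a single JSON string. The 3-way merge of interfering commits is deliberately left out: a commit against a stale base is simply reported as failed.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not the actual MicroKernel implementation. */
final class RevisionStoreSketch {
    // Loose object store: id (hex SHA-1 of the content) -> content.
    private final Map<String, String> objects = new ConcurrentHashMap<>();
    private final Object headLock = new Object();
    private volatile String head;

    RevisionStoreSketch() {
        head = put("{}"); // initial revision: an empty root node
    }

    /** Stores content under its hash; identical content always gets the same id. */
    String put(String content) {
        String id = sha1Hex(content);
        objects.putIfAbsent(id, content);
        return id;
    }

    /** Reads an object by id; existing objects are never modified. */
    String get(String id) {
        return objects.get(id);
    }

    String getHead() {
        return head;
    }

    /**
     * Persists the new revision without holding any lock, then moves HEAD
     * only if it still points at baseId. Returns false on a stale base,
     * where a full implementation would merge and retry.
     */
    boolean commit(String baseId, String newContent) {
        String newId = put(newContent);
        synchronized (headLock) { // the only synchronized section
            if (!head.equals(baseId)) {
                return false;
            }
            head = newId;
            return true;
        }
    }

    private static String sha1Hex(String content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        RevisionStoreSketch store = new RevisionStoreSketch();
        String base = store.getHead();
        System.out.println(store.commit(base, "{\"a\":1}")); // HEAD moves
        System.out.println(store.commit(base, "{\"b\":2}")); // stale base
        System.out.println(store.get(base));                 // old revision still readable
    }
}
```

Because stored objects are immutable and addressed by their hash, a reader holding an older revision id keeps seeing a consistent snapshot while writers add new objects. A store closer to the description above would presumably split each revision into per-node objects that reference their children by id, which is what forms the DAG.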

Source Code

http://svn.apache.org/repos/asf/jackrabbit/oak/trunk/oak-mk/

Documentation

MicroKernel Revision Model.pdf
