Apache Jackrabbit : MicroKernelPrototype

MicroKernelPrototype

Source Code

http://svn.apache.org/repos/asf/jackrabbit/sandbox/jackrabbit-j3/

Overview

http://svn.apache.org/viewvc/jackrabbit/sandbox/jackrabbit-j3/src/site/resources/images/drawing.png?revision=939734&view=co

The Jackrabbit 3 MicroKernel prototype consists of the following components:

  • Test Application - uses the JCR 2.0 API to talk to Jackrabbit.
  • Package j3 contains the JCR API implementation and core classes.
  • Package j3.mc contains the MicroKernel implementation.
  • Packages j3.mc.jdbc and j3.mc.mem contain the storage engine implementations.

Core

  • Each RepositoryImpl has a read-only cache of NodeData objects. It also references the MicroKernel Storage and keeps a list of open SessionImpl objects.
  • Each SessionImpl has a weak-reference map of NodeState objects, which ensures that multiple NodeImpl objects point to the same data. Transient changes are currently kept in a (hard-reference) map of NodeState objects as well as in a list of Change objects. The plan is that, once there are too many transient changes, the least recently used ones are stored in the MicroKernel. They could go into the same storage as the regular persistent data, so that a commit only updates pointers; alternatively, if MicroKernel remoting is used, transient changes and other parts of the cache could be stored in a "temporary repository". A session references one StorageSession and the RepositoryImpl.
  • A NodeImpl object represents a node with a path. The object contains no data, but it references a parent (for the path) and a NodeState. Multiple NodeImpl objects can reference the same NodeState; this happens for shareable nodes, and when the client application requests the same node multiple times.
  • A NodeState is a wrapper for an (unmodified or modified) NodeData object. If the data gets modified, a new NodeData object is created (read-only NodeData objects are immutable). For each shareable node, there is at most one NodeState and one NodeData object per session. NodeState objects also keep a reference to a SessionImpl.
  • A PropertyImpl contains the NodeImpl reference and the property name. It does not contain data. Almost every method delegates to a method in NodeImpl or NodeState.
  • For each modification, a Change object is created. There are multiple Change subclasses (ChangeAddNode, ChangeRemoveNode, ChangeSetProperty). After each modification, the Change object is added to a list in the session, and then the change is applied. If storing (Session.save) fails because of a concurrent update, the list of changes can be re-applied automatically if required, so that the changes are merged. A Change object is also the basis for event logging and observation.
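
The change list described above can be sketched roughly as follows. This is a minimal illustration, not the prototype's code: the class SessionSketch, the rebase method, the constructor signatures, and the use of plain maps in place of NodeState and NodeData are all assumptions made for the example. Only the names Change, ChangeAddNode, ChangeRemoveNode, and ChangeSetProperty come from the text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in: a repository is modeled here as
// path -> (property name -> value). The real prototype uses NodeState/NodeData.
interface Change {
    void apply(Map<String, Map<String, String>> nodes);
}

final class ChangeAddNode implements Change {
    private final String path;
    ChangeAddNode(String path) { this.path = path; }
    public void apply(Map<String, Map<String, String>> nodes) {
        nodes.putIfAbsent(path, new HashMap<>());
    }
}

final class ChangeRemoveNode implements Change {
    private final String path;
    ChangeRemoveNode(String path) { this.path = path; }
    public void apply(Map<String, Map<String, String>> nodes) {
        nodes.remove(path);
    }
}

final class ChangeSetProperty implements Change {
    private final String path;
    private final String name;
    private final String value;
    ChangeSetProperty(String path, String name, String value) {
        this.path = path;
        this.name = name;
        this.value = value;
    }
    public void apply(Map<String, Map<String, String>> nodes) {
        Map<String, String> props = nodes.get(path);
        if (props == null) {
            throw new IllegalStateException("No such node: " + path);
        }
        props.put(name, value);
    }
}

// Hypothetical session: records each change first, then applies it
// to the session's transient view.
final class SessionSketch {
    private final List<Change> changes = new ArrayList<>();
    private Map<String, Map<String, String>> transientNodes;

    SessionSketch(Map<String, Map<String, String>> persistent) {
        this.transientNodes = deepCopy(persistent);
    }

    void record(Change change) {
        changes.add(change);
        change.apply(transientNodes);
    }

    // After a concurrent update, replay the recorded changes on top of the
    // newer persistent state so that both sets of changes are merged.
    Map<String, Map<String, String>> rebase(Map<String, Map<String, String>> newerBase) {
        Map<String, Map<String, String>> merged = deepCopy(newerBase);
        for (Change change : changes) {
            change.apply(merged);
        }
        transientNodes = merged;
        return merged;
    }

    private static Map<String, Map<String, String>> deepCopy(Map<String, Map<String, String>> src) {
        Map<String, Map<String, String>> copy = new HashMap<>();
        for (Map.Entry<String, Map<String, String>> e : src.entrySet()) {
            copy.put(e.getKey(), new HashMap<>(e.getValue()));
        }
        return copy;
    }
}
```

In this sketch, a session that added /a and set a property on it can call rebase with a base that meanwhile gained /b; the result contains both nodes. A replayed change may still fail (here with an IllegalStateException) if the concurrent update removed a node that the change depends on, and how the prototype handles that case is not covered by this page.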

Micro Kernel

  • There is one Storage object per repository.
  • Each StorageSession represents a session. A session may or may not have 'state'. For example, a session of a JDBC-based storage engine may reference a JDBC connection, but it doesn't have to (possibly there is only one connection, or each request could open a new connection from a connection pool).
  • NodeData objects contain the actual data in the form of Val objects. A NodeData object is mainly a collection of Val objects.
  • A Val object represents a value and a property type. It is immutable.
  • A Bundle object contains the serialized form (a byte array) of one or more NodeData objects. It is used by those storage engines that serialize the data, which is probably all of them except the in-memory storage and possibly a plain-text persistence, if somebody wants to implement that. If the MicroKernel is accessed remotely (which is just an idea for now), the Bundle will most likely also be used to serialize data between the Core and the MicroKernel.
  • There are currently two Storage / StorageSession implementations: an in-memory storage, and a proof-of-concept JDBC storage.
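
The relationship between immutable Val objects, read-only NodeData objects, and an in-memory storage can be sketched as below. The class names ValSketch, NodeDataSketch, and MemStorageSketch, the TYPE_STRING constant, and all method names are hypothetical; the actual classes in j3.mc and j3.mc.mem may look quite different.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable value: a property type plus the value itself.
final class ValSketch {
    static final int TYPE_STRING = 1; // illustrative type constant

    private final int type;
    private final String value;

    ValSketch(int type, String value) {
        this.type = type;
        this.value = value;
    }

    int getType() { return type; }
    String getValue() { return value; }
}

// Hypothetical read-only node data: mainly a collection of values.
// A modification never changes this object; it produces a new one.
final class NodeDataSketch {
    private final Map<String, ValSketch> properties;

    NodeDataSketch(Map<String, ValSketch> properties) {
        // Defensive copy, so later changes to the argument are not visible.
        this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
    }

    ValSketch get(String name) {
        return properties.get(name);
    }

    NodeDataSketch with(String name, ValSketch val) {
        Map<String, ValSketch> copy = new HashMap<>(properties);
        copy.put(name, val);
        return new NodeDataSketch(copy);
    }
}

// Hypothetical in-memory storage: keeps NodeData objects directly,
// without serializing them into a Bundle.
final class MemStorageSketch {
    private final Map<String, NodeDataSketch> nodes = new ConcurrentHashMap<>();

    NodeDataSketch read(String nodeId) {
        return nodes.get(nodeId);
    }

    void write(String nodeId, NodeDataSketch data) {
        nodes.put(nodeId, data);
    }
}
```

Because the data objects are immutable in this sketch, the storage can hand out the same instance to any number of sessions without copying, which fits the read-only NodeData cache described for RepositoryImpl.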