Apache Jackrabbit : Index Implementations

Oak Index Implementations

Lucene Index

  • State: works
  • Originally planned mainly as a full-text index, but might also be used to index properties
  • Theoretical limitations:
    • Not clear how it works in a clustered environment
    • Updates might need to be synchronized (no concurrent updates)

Property Index

  • State: works
  • Theoretical limitations:
    • All index data is stored in one node, and both oak-core and all current MicroKernel implementations keep nodes fully in memory, so an index has to fit in memory
    • Concurrent updates might be a problem (not yet confirmed)

Property Index with Child Nodes

  • State: doesn't exist yet
  • Uses a flat hierarchy (child nodes instead of properties)
  • Should solve the 'single node' problem of the Property Index
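
The flat-hierarchy idea can be sketched as follows. This is a minimal in-memory illustration, not the planned implementation: `SketchNode` and `ChildNodeIndexSketch` are hypothetical names, and the assumption is that each indexed value becomes one child node of the index root, so no single node has to hold the whole index.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for a repository node; not an Oak API.
class SketchNode {
    final Map<String, SketchNode> children = new TreeMap<>();
    final Set<String> paths = new TreeSet<>();

    SketchNode child(String name) {
        return children.computeIfAbsent(name, n -> new SketchNode());
    }
}

// Sketch of an index that stores each indexed value as its own child node
// (flat hierarchy) instead of as one property on a single index node.
class ChildNodeIndexSketch {
    private final SketchNode indexRoot = new SketchNode();

    // Record that the node at 'path' has the indexed value 'value'.
    void add(String value, String path) {
        indexRoot.child(value).paths.add(path);
    }

    // Remove one entry; drop the child node once it holds no paths.
    void remove(String value, String path) {
        SketchNode entry = indexRoot.children.get(value);
        if (entry == null) {
            return;
        }
        entry.paths.remove(path);
        if (entry.paths.isEmpty()) {
            indexRoot.children.remove(value);
        }
    }

    // Only the child node for 'value' is read, never the whole index.
    Set<String> lookup(String value) {
        SketchNode entry = indexRoot.children.get(value);
        return entry == null ? Collections.<String>emptySet() : new TreeSet<>(entry.paths);
    }

    // Number of child nodes below the index root (one per distinct value).
    int entryCount() {
        return indexRoot.children.size();
    }
}
```

A lookup touches only one child node, which is why this layout might avoid the 'single node' limitation. Whether the index root itself scales with a very large number of children would depend on the storage backend.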

'Old' B-Tree Index

  • State: works
  • Is a MicroKernel wrapper and does not understand Oak property types
  • Theoretical limitations:
    • Concurrent updates might be a problem (not yet confirmed)

MongoDB Index

  • State: doesn't exist yet
  • Theoretical limitations:
    • MongoDB indexes cannot be sharded

Solr

  • State: prototype
  • Source code is at: https://github.com/tteofili/jackrabbit-oak/tree/trunk/oak-solr

Virtual Index Implementations

Node Type Index

  • State: works
  • Internally uses a Property Index (see above), but could use any other index implementation
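
The delegation described above might look roughly like this. It is a sketch under stated assumptions: `ValueLookup`, `InMemoryPropertyLookup` and `NodeTypeLookupSketch` are hypothetical names, and the choice to consult both `jcr:primaryType` and `jcr:mixinTypes` is an assumption about which properties carry the node type, not a description of the actual oak-core code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical lookup contract; any index implementation could sit behind it.
interface ValueLookup {
    Set<String> find(String propertyName, String value);
}

// Minimal in-memory stand-in for a property index (not the oak-core class).
class InMemoryPropertyLookup implements ValueLookup {
    // property name -> value -> paths of nodes having that value
    private final Map<String, Map<String, Set<String>>> entries = new HashMap<>();

    void put(String propertyName, String value, String path) {
        entries.computeIfAbsent(propertyName, k -> new HashMap<>())
               .computeIfAbsent(value, k -> new TreeSet<>())
               .add(path);
    }

    @Override
    public Set<String> find(String propertyName, String value) {
        Map<String, Set<String>> byValue = entries.get(propertyName);
        if (byValue == null || !byValue.containsKey(value)) {
            return Collections.emptySet();
        }
        return byValue.get(value);
    }
}

// "Virtual" index: stores nothing itself and only rephrases the question
// as lookups against the delegate.
class NodeTypeLookupSketch {
    private final ValueLookup delegate;

    NodeTypeLookupSketch(ValueLookup delegate) {
        this.delegate = delegate;
    }

    // Assumption: a node "has" a type if it is its primary type or a mixin.
    Set<String> nodesOfType(String nodeTypeName) {
        Set<String> result = new TreeSet<>();
        result.addAll(delegate.find("jcr:primaryType", nodeTypeName));
        result.addAll(delegate.find("jcr:mixinTypes", nodeTypeName));
        return result;
    }
}
```

Because the node type lookup depends only on the `ValueLookup` interface, the in-memory delegate could be swapped for any other index implementation without changing `NodeTypeLookupSketch`.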

Traversing Index

  • State: works
  • Traverses the repository below a given node
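
As an illustration of the traversal approach, the following sketch walks a small in-memory tree and tests each node against a condition. `TreeNodeSketch` and `TraversalSketch` are hypothetical names, and including the start node itself in the walk is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical tree node used only for this sketch; not an Oak type.
class TreeNodeSketch {
    final String path;
    final Map<String, String> properties = new LinkedHashMap<>();
    final List<TreeNodeSketch> children = new ArrayList<>();

    TreeNodeSketch(String path) {
        this.path = path;
    }

    // Create a child and derive its path from this node's path.
    TreeNodeSketch addChild(String name) {
        String childPath = path.equals("/") ? "/" + name : path + "/" + name;
        TreeNodeSketch child = new TreeNodeSketch(childPath);
        children.add(child);
        return child;
    }
}

class TraversalSketch {
    // Depth-first walk of the start node and everything below it,
    // returning the paths of all nodes that satisfy the condition.
    static List<String> query(TreeNodeSketch start, Predicate<TreeNodeSketch> condition) {
        List<String> matches = new ArrayList<>();
        collect(start, condition, matches);
        return matches;
    }

    private static void collect(TreeNodeSketch node, Predicate<TreeNodeSketch> condition,
                                List<String> matches) {
        if (condition.test(node)) {
            matches.add(node.path);
        }
        for (TreeNodeSketch child : node.children) {
            collect(child, condition, matches);
        }
    }
}
```

No index data is kept, so the cost of a query grows with the number of nodes below the start node. That makes this approach a fallback when no other index applies.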

Test Cases

Concurrent Content Creation

  • See OAK-442

UUID Index

Currently, only saved (committed) data is indexed. Therefore, queries of the form "jcr:uuid = ?" return only saved nodes; transient nodes are not returned.
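
The behaviour described above can be modelled with a small sketch. `UuidLookupSketch` and its methods are hypothetical and are not Oak or JCR APIs; the assumption is simply that the index is updated only when pending changes are saved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical session model; class and method names are not Oak or JCR APIs.
class UuidLookupSketch {
    // uuid -> path, for committed nodes only (this is "the index")
    private final Map<String, String> committedIndex = new HashMap<>();
    // uuid -> path, for nodes added but not yet saved
    private final Map<String, String> transientNodes = new HashMap<>();

    // Adding a node only records it as a pending (transient) change.
    void addNode(String uuid, String path) {
        transientNodes.put(uuid, path);
    }

    // Saving commits the pending changes and thereby updates the index.
    void save() {
        committedIndex.putAll(transientNodes);
        transientNodes.clear();
    }

    // Models a query of the form "jcr:uuid = ?": it consults the index only,
    // so transient nodes are never found.
    Optional<String> findByUuid(String uuid) {
        return Optional.ofNullable(committedIndex.get(uuid));
    }
}
```

In this model, a node added with `addNode` is invisible to `findByUuid` until `save` is called, which mirrors the limitation noted above.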