Class RDBDocumentStore

  • All Implemented Interfaces:
    DocumentStore

    public class RDBDocumentStore
    extends Object
    implements DocumentStore
    Implementation of DocumentStore for relational databases.

    Supported Databases

    The code is supposed to be sufficiently generic to run with a variety of database implementations. However, the tables are created when required to simplify testing, and that code specifically supports these databases:

    • H2DB
    • Apache Derby
    • IBM DB2
    • PostgreSQL
    • MariaDB (MySQL)
    • Microsoft SQL Server
    • Oracle

    Table Layout

    Data for each of the DocumentStore's Collections is stored in its own database table (with a name matching the collection).

    The tables essentially implement key/value storage, where the key usually is derived from an Oak path, and the value is a serialization of a Document (or a part of one). Additional fields are used for queries, debugging, and concurrency control:

    Column        Type                               Description
    ID            varchar(512) not null primary key  The document's key (for databases that can not handle 512 character primary keys, such as MySQL, varbinary is possible as well).
    MODIFIED      bigint                             Low-resolution timestamp.
    HASBINARY     smallint                           Flag indicating whether the document has binary properties.
    DELETEDONCE   smallint                           Flag indicating whether the document has been deleted once.
    MODCOUNT      bigint                             Modification counter, used for avoiding overlapping updates.
    DSIZE         bigint                             The approximate size of the document's JSON serialization (for debugging purposes).
    VERSION       smallint                           The schema version the code writing to a row (or inserting it) was aware of (introduced with schema version 1). Not set for rows written by version 0 client code.
    SDTYPE        smallint                           Split Document type.
    SDMAXREVTIME  bigint                             Split document max revision time.
    DATA          varchar(16384)                     The document's JSON serialization (only used for small document sizes, in which case BDATA (below) is not set), or a sequence of JSON serialized update operations to be applied against the last full serialization.
    BDATA         blob                               The document's JSON serialization (usually GZIPped, only used for "large" documents).
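
    The split between DATA and BDATA can be pictured with the following minimal sketch. It is not the actual implementation: the class name RowPayloadSketch is hypothetical, and the threshold (16384 characters, borrowed from the DATA column type) is an assumption, since this page does not state how "small" and "large" are measured.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch: pick the DATA or BDATA column for a serialized document. */
final class RowPayloadSketch {
    /** Assumed limit, borrowed from the varchar(16384) type of the DATA column. */
    static final int DATA_LIMIT_CHARS = 16384;

    final String data;  // value for the DATA column, or null
    final byte[] bdata; // value for the BDATA column, or null

    private RowPayloadSketch(String data, byte[] bdata) {
        this.data = data;
        this.bdata = bdata;
    }

    /** Small documents go to DATA as text; larger ones are GZIPped into BDATA. */
    static RowPayloadSketch forJson(String json) {
        if (json.length() <= DATA_LIMIT_CHARS) {
            return new RowPayloadSketch(json, null);
        }
        return new RowPayloadSketch(null, gzip(json.getBytes(StandardCharsets.UTF_8)));
    }

    private static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer before the bytes are read back.
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```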

    The names of database tables can be prefixed; the purpose is mainly for testing, as tables can also be dropped automatically when the store is disposed (this only happens for those tables that have been created on demand).

    Versioning

    The initial database layout used in OAK 1.0 through 1.6 is version 0.

    Version 1 introduces an additional "version" column, which records the schema version of the code writing to the database (upon insert and update). This is in preparation for future layout changes that might introduce new columns.

    Version 2 introduces two additional columns, "sdtype" and "sdmaxrevtime".

    The code deals with version 0, version 1, and version 2 table layouts. By default, it tries to create version 2 tables, and also tries to upgrade existing version 0 and 1 tables to version 2.
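
    As an illustration of the versioning rules above, a table's layout version can be inferred from the columns it has. This is a sketch under the assumption that column presence alone identifies the version; the class SchemaVersionSketch and its method are hypothetical and not part of the store's API.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: infer the table layout version from the column names present. */
final class SchemaVersionSketch {
    static int detect(Set<String> columnNames) {
        // Databases differ in how they report identifier case, so normalize first.
        Set<String> upper = new HashSet<>();
        for (String name : columnNames) {
            upper.add(name.toUpperCase(Locale.ROOT));
        }
        if (!upper.contains("VERSION")) {
            return 0; // initial layout
        }
        if (upper.contains("SDTYPE") && upper.contains("SDMAXREVTIME")) {
            return 2; // split document columns present
        }
        return 1; // only the "version" column was added
    }
}
```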

    DB-specific information

    Databases need to be configured so that:

    • Text fields support all Unicode code points,
    • Collation of text fields happens by Unicode code point,
    • and BLOBs support at least 16 MB.

    See the RDBDocumentStore documentation for more information.

    Table Creation

    The code tries to create the tables when they are not present. Likewise, it tries to upgrade to a newer schema when needed.

    Users/Administrators who prefer to stay in control over table generation can create them "manually". The oak-run "rdbddldump" command can be used to print out the DDL statements that would have been used for auto-creation and/or automatic schema updates.

    Caching

    The cache borrows heavily from the MongoDocumentStore implementation.

    Queries

    The implementation currently supports only three indexed properties: "_bin", "deletedOnce", and "_modified". Attempts to use a different indexed property will cause a DocumentStoreException.
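
    The restriction can be sketched as a lookup from indexed property name to table column. The mapping shown here is inferred from the column descriptions in the table layout and is an assumption, as is the class name; IllegalArgumentException stands in for DocumentStoreException so that the sketch has no Oak dependency.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: resolve an indexed property to a column, rejecting all others. */
final class IndexedPropertySketch {
    private static final Map<String, String> COLUMNS = new HashMap<>();
    static {
        // Assumed mapping, inferred from the column descriptions.
        COLUMNS.put("_bin", "HASBINARY");
        COLUMNS.put("deletedOnce", "DELETEDONCE");
        COLUMNS.put("_modified", "MODIFIED");
    }

    static String columnFor(String indexedProperty) {
        String column = COLUMNS.get(indexedProperty);
        if (column == null) {
            // The real store is documented to throw DocumentStoreException here.
            throw new IllegalArgumentException("unsupported indexed property: " + indexedProperty);
        }
        return column;
    }
}
```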

    • Method Detail

      • find

        public <T extends Document> T find​(Collection<T> collection,
                                           String id)
        Description copied from interface: DocumentStore
        Get the document with the given key. This is a convenience method and equivalent to DocumentStore.find(Collection, String, int) with a maxCacheAge of Integer.MAX_VALUE.

        The returned document is immutable.

        Specified by:
        find in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        id - the key
        Returns:
        the document, or null if not found
      • find

        public <T extends Document> T find​(Collection<T> collection,
                                           String id,
                                           int maxCacheAge)
        Description copied from interface: DocumentStore
        Get the document with the key. The implementation may serve the document from a cache, but the cached document must not be older than the given maxCacheAge in milliseconds. An implementation must invalidate a cached document when it detects it is outdated. That is, a subsequent call to DocumentStore.find(Collection, String) must return the newer version of the document.

        The returned document is immutable.

        Specified by:
        find in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        id - the key
        maxCacheAge - the maximum age of the cached document (in ms)
        Returns:
        the document, or null if not found
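
        The maxCacheAge contract can be illustrated with a toy cache. This is only a sketch of the documented behavior, not the store's cache: MaxAgeCacheSketch, its backend function, and the explicit nowMillis parameter (used so the example needs no real clock) are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: serve from cache only while the entry is at most maxCacheAge old. */
final class MaxAgeCacheSketch {
    private static final class Entry {
        final String document;
        final long loadedAt;

        Entry(String document, long loadedAt) {
            this.document = document;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Function<String, String> backend; // stands in for the database read

    MaxAgeCacheSketch(Function<String, String> backend) {
        this.backend = backend;
    }

    String find(String id, int maxCacheAge, long nowMillis) {
        Entry cached = cache.get(id);
        if (cached != null && nowMillis - cached.loadedAt <= maxCacheAge) {
            return cached.document; // still fresh enough
        }
        // Too old or absent: re-read and replace the cached entry.
        String fresh = backend.apply(id);
        if (fresh == null) {
            cache.remove(id);
        } else {
            cache.put(id, new Entry(fresh, nowMillis));
        }
        return fresh;
    }
}
```

        With a maxCacheAge of Integer.MAX_VALUE, as used by the two-argument find, a cached entry is effectively always accepted.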
      • query

        @NotNull
        public <T extends Document> @NotNull List<T> query​(Collection<T> collection,
                                                           String fromKey,
                                                           String toKey,
                                                           int limit)
        Description copied from interface: DocumentStore
        Get a list of documents where the key is greater than a start value and less than an end value.

        The returned documents are sorted by key and are immutable.

        Specified by:
        query in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        fromKey - the start value (excluding)
        toKey - the end value (excluding)
        limit - the maximum number of entries to return (starting with the lowest key)
        Returns:
        the list (possibly empty)
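
        The range semantics (both bounds exclusive, results ordered by key, at most limit entries) can be modeled on an in-memory sorted map. This is a sketch only; RangeQuerySketch is hypothetical, and Java compares strings by UTF-16 code unit, which can differ from code point order for supplementary characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

/** Hypothetical sketch: exclusive key range query with a limit. */
final class RangeQuerySketch {
    static List<String> query(NavigableMap<String, String> docs,
                              String fromKey, String toKey, int limit) {
        List<String> result = new ArrayList<>();
        // subMap(..., false, ..., false) excludes both fromKey and toKey.
        for (String doc : docs.subMap(fromKey, false, toKey, false).values()) {
            if (result.size() >= limit) {
                break;
            }
            result.add(doc);
        }
        return result;
    }
}
```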
      • query

        @NotNull
        public <T extends Document> @NotNull List<T> query​(Collection<T> collection,
                                                           String fromKey,
                                                           String toKey,
                                                           String indexedProperty,
                                                           long startValue,
                                                           int limit)
        Description copied from interface: DocumentStore
        Get a list of documents where the key is greater than a start value and less than an end value and the given "indexed property" is greater than or equal to the specified value.

        The indexed property can either be a Long value, in which case numeric comparison applies, or a Boolean value, in which case "false" is mapped to "0" and "true" is mapped to "1".

        The returned documents are sorted by key and are immutable.

        Specified by:
        query in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        fromKey - the start value (excluding)
        toKey - the end value (excluding)
        indexedProperty - the name of the indexed property (optional)
        startValue - the minimum value of the indexed property
        limit - the maximum number of entries to return
        Returns:
        the list (possibly empty)
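
        The additional filter can be sketched as follows, including the Boolean to 0/1 mapping. The class IndexedQuerySketch is hypothetical, documents are modeled as plain maps, and skipping documents that lack the property is an assumption not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

/** Hypothetical sketch: key range query with an "indexed property >= startValue" filter. */
final class IndexedQuerySketch {
    /** Long values compare numerically; Boolean false maps to 0 and true to 1. */
    static long asLong(Object value) {
        if (value instanceof Boolean) {
            return ((Boolean) value) ? 1L : 0L;
        }
        if (value instanceof Long) {
            return (Long) value;
        }
        throw new IllegalArgumentException("unsupported value type: " + value);
    }

    /** Returns the matching keys in key order. */
    static List<String> query(NavigableMap<String, Map<String, Object>> docs,
                              String fromKey, String toKey,
                              String indexedProperty, long startValue, int limit) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> e
                : docs.subMap(fromKey, false, toKey, false).entrySet()) {
            if (ids.size() >= limit) {
                break;
            }
            if (indexedProperty != null) {
                Object value = e.getValue().get(indexedProperty);
                // Assumption: a document without the property does not match.
                if (value == null || asLong(value) < startValue) {
                    continue;
                }
            }
            ids.add(e.getKey());
        }
        return ids;
    }
}
```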
      • remove

        public <T extends Document> void remove​(Collection<T> collection,
                                                String id)
        Description copied from interface: DocumentStore
        Remove a document. This method does nothing if there is no document with the given key.

        In case of a DocumentStoreException, the document with the given key may or may not have been removed from the store. It is the responsibility of the caller to check whether it still exists. The implementation however ensures that the result of the operation is properly reflected in the document cache. That is, an implementation could simply evict the document with the given key.

        Specified by:
        remove in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        id - the key
      • remove

        public <T extends Document> void remove​(Collection<T> collection,
                                                List<String> ids)
        Description copied from interface: DocumentStore
        Batch remove documents with given keys. Keys for documents that do not exist are simply ignored. If this method fails with an exception, then only some of the documents identified by keys may have been removed.

        In case of a DocumentStoreException, the documents with the given keys may or may not have been removed from the store. It may also be possible that only some have been removed from the store. It is the responsibility of the caller to check which documents still exist. The implementation however ensures that the result of the operation is properly reflected in the document cache. That is, an implementation could simply evict documents with the given keys from the cache.

        Specified by:
        remove in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        ids - list of keys
      • remove

        public <T extends Document> int remove​(Collection<T> collection,
                                               Map<String,​Long> toRemove)
        Description copied from interface: DocumentStore
        Batch remove documents with given keys and corresponding equal conditions on NodeDocument.MODIFIED_IN_SECS values. Keys for documents that do not exist are simply ignored. A document is only removed if the corresponding condition is met.

        In case of a DocumentStoreException, the documents with the given keys may or may not have been removed from the store. It may also be possible that only some have been removed from the store. It is the responsibility of the caller to check which documents still exist. The implementation however ensures that the result of the operation is properly reflected in the document cache. That is, an implementation could simply evict documents with the given keys from the cache.

        Specified by:
        remove in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection.
        toRemove - the keys of the documents to remove with the corresponding timestamps.
        Returns:
        the number of removed documents.
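
        The conditional batch removal can be pictured with an in-memory model in which each key maps to its modification timestamp. The class ConditionalRemoveSketch is hypothetical and ignores caching and error handling.

```java
import java.util.Map;

/** Hypothetical sketch: remove a document only if its timestamp equals the expected one. */
final class ConditionalRemoveSketch {
    static int remove(Map<String, Long> modifiedById, Map<String, Long> toRemove) {
        int removed = 0;
        for (Map.Entry<String, Long> e : toRemove.entrySet()) {
            // Map.remove(key, value) removes the entry only when the current value matches;
            // unknown keys are simply ignored.
            if (modifiedById.remove(e.getKey(), e.getValue())) {
                removed++;
            }
        }
        return removed;
    }
}
```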
      • remove

        public <T extends Document> int remove​(Collection<T> collection,
                                               String indexedProperty,
                                               long startValue,
                                               long endValue)
                                        throws DocumentStoreException
        Description copied from interface: DocumentStore
        Batch remove documents where the given "indexed property" is within the given range (exclusive) - (startValue, endValue).

        The indexed property is a Long value and numeric comparison applies.

        In case of a DocumentStoreException, the documents with the given keys may or may not have been removed from the store. It may also be possible that only some have been removed from the store. It is the responsibility of the caller to check which documents still exist. The implementation however ensures that the result of the operation is properly reflected in the document cache. That is, an implementation could simply evict documents with the given keys from the cache.

        Specified by:
        remove in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection.
        indexedProperty - the name of the indexed property
        startValue - the minimum value of the indexed property (exclusive)
        endValue - the maximum value of the indexed property (exclusive)
        Returns:
        the number of removed documents.
        Throws:
        DocumentStoreException - if the operation failed, e.g. because of an I/O error.
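
        The exclusive range (startValue, endValue) can be illustrated on an in-memory model that maps each key to the value of its indexed property. RangeRemoveSketch is a hypothetical name, and the sketch leaves out caching and failure handling.

```java
import java.util.Map;

/** Hypothetical sketch: remove entries whose value lies strictly between the bounds. */
final class RangeRemoveSketch {
    static int remove(Map<String, Long> valueById, long startValue, long endValue) {
        int before = valueById.size();
        // Both bounds are exclusive: values equal to startValue or endValue are kept.
        valueById.values().removeIf(v -> v > startValue && v < endValue);
        return before - valueById.size();
    }
}
```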
      • create

        public <T extends Document> boolean create​(Collection<T> collection,
                                                   List<UpdateOp> updateOps)
        Description copied from interface: DocumentStore
        Try to create a list of documents. This method returns true iff none of the documents existed before and the create was successful. This method will return false if one of the documents already exists in the store. Some documents may still have been created in the store. An implementation does not have to guarantee an atomic create of all the documents described in the updateOps. It is the responsibility of the caller to check which documents were created and take appropriate action. The same is true when this method throws DocumentStoreException (e.g. when a communication error occurs). In this case only some documents may have been created.
        Specified by:
        create in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        updateOps - the list of documents to add (where UpdateOp.Conditions are not allowed)
        Returns:
        true if this worked (if none of the documents already existed)
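
        The non-atomic contract can be sketched as follows: the method reports false when any document already existed, yet other documents from the same call may have been created. CreateSketch is hypothetical, and continuing after the first conflict is one permitted behavior, not necessarily what this store does.

```java
import java.util.Map;

/** Hypothetical sketch: non-atomic batch create on an in-memory table. */
final class CreateSketch {
    static boolean create(Map<String, String> table, Map<String, String> newDocs) {
        boolean allCreated = true;
        for (Map.Entry<String, String> doc : newDocs.entrySet()) {
            // putIfAbsent leaves an existing document untouched and returns it.
            if (table.putIfAbsent(doc.getKey(), doc.getValue()) != null) {
                allCreated = false;
            }
        }
        return allCreated;
    }
}
```

        A caller that receives false therefore has to look up each key to learn which documents now exist.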
      • createOrUpdate

        public <T extends Document> T createOrUpdate​(Collection<T> collection,
                                                     UpdateOp update)
        Description copied from interface: DocumentStore
        Atomically checks if the document exists and updates it, otherwise the document is created (aka "upsert"), unless the update operation requires the document to be present (see UpdateOp.isNew()). The returned document is immutable.

        If this method fails with a DocumentStoreException, then the document may or may not have been created or updated. It is the responsibility of the caller to check the result e.g. by calling DocumentStore.find(Collection, String). The implementation however ensures that the result of the operation is properly reflected in the document cache. That is, an implementation could simply evict documents with the given keys from the cache.

        Specified by:
        createOrUpdate in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        update - the update operation (where UpdateOp.Conditions are not allowed)
        Returns:
        the old document or null if it either didn't exist before, or the UpdateOp required the document to be present but UpdateOp.isNew() was false.
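
        The upsert behavior can be modeled on an in-memory table. This sketch reflects one reading of the description above: when isNew is false and the document is missing, nothing is created and null is returned. UpsertSketch and the use of plain maps for documents and updates are hypothetical; the method is synchronized only to mimic atomicity within a single JVM.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: create-or-update returning the previous (immutable) document. */
final class UpsertSketch {
    static synchronized Map<String, Object> createOrUpdate(
            Map<String, Map<String, Object>> table,
            String id, Map<String, Object> changes, boolean isNew) {
        Map<String, Object> old = table.get(id);
        if (old == null && !isNew) {
            return null; // the update requires an existing document
        }
        Map<String, Object> updated = (old == null) ? new HashMap<>() : new HashMap<>(old);
        updated.putAll(changes);
        // Store an unmodifiable copy so that returned documents are immutable.
        table.put(id, Collections.unmodifiableMap(updated));
        return old;
    }
}
```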
      • createOrUpdate

        public <T extends Document> List<T> createOrUpdate(Collection<T> collection,
                                                           List<UpdateOp> updateOps)
        Description copied from interface: DocumentStore
        Create or unconditionally update a number of documents. An implementation does not have to guarantee that all changes are applied atomically, together.

        In case of a DocumentStoreException (e.g. when a communication error occurs) only some changes may have been applied. In this case it is the responsibility of the caller to check which UpdateOps were applied and take appropriate action. The implementation however ensures that the results of the operations are properly reflected in the document cache. That is, an implementation could simply evict documents related to the given update operations from the cache.

        Specified by:
        createOrUpdate in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        updateOps - the update operation list
        Returns:
        the list containing old documents or null values if they didn't exist before (see DocumentStore.createOrUpdate(Collection, UpdateOp)), where the order reflects the order in the "updateOps" parameter
      • findAndUpdate

        public <T extends Document> T findAndUpdate​(Collection<T> collection,
                                                    UpdateOp update)
        Description copied from interface: DocumentStore
        Performs a conditional update (e.g. using UpdateOp.Condition.Type.EXISTS) and only updates the document if the condition is true. The returned document is immutable.

        In case of a DocumentStoreException (e.g. when a communication error occurs) the update may or may not have been applied. In this case it is the responsibility of the caller to check whether the update was applied and take appropriate action. The implementation however ensures that the result of the operation is properly reflected in the document cache. That is, an implementation could simply evict the document related to the given update operation from the cache.

        Specified by:
        findAndUpdate in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        update - the update operation with the condition
        Returns:
        the old document or null if the condition is not met or if the document wasn't found
        See Also:
        DocumentStore.createOrUpdate(Collection, List)
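
        A conditional update can be sketched on an in-memory table, with a Predicate standing in for the UpdateOp conditions. ConditionalUpdateSketch and the map-based document model are hypothetical; the sketch returns null both when the document is missing and when the condition is not met, as described above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch: apply changes only if the document exists and the condition holds. */
final class ConditionalUpdateSketch {
    static synchronized Map<String, Object> findAndUpdate(
            Map<String, Map<String, Object>> table, String id,
            Predicate<Map<String, Object>> condition, Map<String, Object> changes) {
        Map<String, Object> old = table.get(id);
        if (old == null || !condition.test(old)) {
            return null; // not found or condition not met: nothing is changed
        }
        Map<String, Object> updated = new HashMap<>(old);
        updated.putAll(changes);
        table.put(id, Collections.unmodifiableMap(updated));
        return old;
    }
}
```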
      • invalidateCache

        public CacheInvalidationStats invalidateCache()
        Description copied from interface: DocumentStore
        Invalidate the document cache. Calling this method instructs the implementation to invalidate each cached document that is not up to date with the underlying storage at the time this method is called. A document is considered in the cache if DocumentStore.getIfCached(Collection, String) returns a non-null value for a key.

        An implementation is allowed to perform lazy invalidation and only check whether a document is up-to-date when it is accessed after this method is called. However, this also includes a call to DocumentStore.getIfCached(Collection, String), which must only return the document if it was up-to-date at the time this method was called. Similarly, a call to DocumentStore.find(Collection, String) must guarantee the returned document reflects all the changes done up to when invalidateCache() was called.

        In some implementations this method can be a NOP because documents can only be modified through a single instance of a DocumentStore.

        Specified by:
        invalidateCache in interface DocumentStore
        Returns:
        cache invalidation statistics or null if none are available.
      • determineServerTimeDifferenceMillis

        public long determineServerTimeDifferenceMillis()
        Specified by:
        determineServerTimeDifferenceMillis in interface DocumentStore
        Returns:
        the estimated time difference in milliseconds between the local instance and the (typically common, shared) document server system. The value can be zero if the times are estimated to be equal, positive when the local instance is ahead of the remote server and negative when the local instance is behind the remote server. An invocation is not cached and typically requires a round-trip to the server (but that is not a requirement).
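
        The sign convention can be illustrated with a small calculation. How the store actually obtains the server time is not described here; the sketch assumes a single round-trip whose local start and end times are known and uses their midpoint as the local reference, which is an assumption. TimeDifferenceSketch is a hypothetical name.

```java
/** Hypothetical sketch: estimate local-minus-server clock difference from one round-trip. */
final class TimeDifferenceSketch {
    static long differenceMillis(long localBeforeMillis, long serverMillis, long localAfterMillis) {
        // Assume the server sampled its clock halfway through the round-trip.
        long localMidpoint = localBeforeMillis + (localAfterMillis - localBeforeMillis) / 2;
        // Positive: local instance is ahead; negative: local instance is behind.
        return localMidpoint - serverMillis;
    }
}
```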
      • getDroppedTables

        public String getDroppedTables()
      • getTableNames

        public static List<String> getTableNames()
      • getIfCached

        public <T extends Document> T getIfCached​(Collection<T> collection,
                                                  String id)
        Description copied from interface: DocumentStore
        Fetches the cached document. If the document is not present in the cache null will be returned. This method is consistent with other find methods that may return cached documents and will return null even when the implementation has a negative cache for documents that do not exist. This method will never return NodeDocument.NULL.
        Specified by:
        getIfCached in interface DocumentStore
        Type Parameters:
        T - the document type
        Parameters:
        collection - the collection
        id - the key
        Returns:
        cached document if present. Otherwise null.
      • getStats

        @NotNull
        public @NotNull Map<String,​String> getStats()
        Statistics are generated for each table. The following fields are always added:
        tableName.ns
        fully qualified name of the database table
        tableName.schemaInfo
        DDL information for table, as obtained during startup
        tableName.indexInfo
        DDL information for associated indexes, as obtained during startup
        tableName.count
        exact number of rows
        In addition, some statistics information for Collection.CLUSTER_NODES is added:
        clusterNodes.updates
        Writes to the table, counted by cluster node ID
        Finally, additional database-specific statistics may be added; see descriptions in RDBDocumentStoreDB.getAdditionalStatistics(RDBConnectionHandler, String, String) for details.
        Specified by:
        getStats in interface DocumentStore
        Returns:
        statistics about this document store.
      • isReadOnly

        public boolean isReadOnly()
      • getTable

        @NotNull
        protected <T extends Document> @NotNull org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.RDBTableMetaData getTable​(Collection<T> collection)
      • asBytes

        public static byte[] asBytes​(@NotNull
                                     @NotNull String data)
      • setReadWriteMode

        public void setReadWriteMode​(String readWriteMode)
        Description copied from interface: DocumentStore
        Set the level of guarantee for read and write operations, if supported by this backend.
        Specified by:
        setReadWriteMode in interface DocumentStore
        Parameters:
        readWriteMode - the read/write mode
      • convertFromDBObject

        @NotNull
        protected <T extends Document> T convertFromDBObject​(@NotNull
                                                             @NotNull Collection<T> collection,
                                                             @NotNull
                                                             @NotNull RDBRow row)