FileVault Document View (DocView) Format
Overview
FileVault uses a slightly different format than the Document View specified by JCR 2.0.
In general, all nodes that cannot be serialized as plain directories or plain files are serialized into DocView XML files. If a node can only be partially mapped to a directory or file, it is accompanied by a .content.xml file containing the residual content.
For example, a full coverage content tree starting at the node example will be serialized into an example.xml file, using the (FileVault) DocView format.
For example, a sling:Folder node named libs will be serialized into a directory libs and a libs/.content.xml file, using the (FileVault) DocView format.
Also see the Vault FS article about this.
Deviations from the JCR Document View
Root Element
The root element of the FileVault DocView is always jcr:root, regardless of the name of the node it serializes.
Because the node name is implicitly given by either the file name or the directory name, it would be
redundant to repeat the node name in the document.
Example:
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
jcr:primaryType="sling:Folder"
title="Libraries"/>
Because the file name implicitly sets the name of the root node, any namespace prefix used in that name must also be declared in the XML, just as for every other element in the document.
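As a rough illustration of the rule above, the following Python sketch reads the sample document and derives the node name from the file path. It is not FileVault's implementation: the helpers `node_name_from_path` and `read_root` are hypothetical, and the sketch ignores any file name escaping that a real package may apply.

```python
import posixpath
import xml.etree.ElementTree as ET

JCR_NS = "http://www.jcp.org/jcr/1.0"

DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
    jcr:primaryType="sling:Folder"
    title="Libraries"/>"""


def node_name_from_path(path):
    """Hypothetical helper: the node name comes from the file or directory name."""
    directory, filename = posixpath.split(path)
    if filename == ".content.xml":
        # libs/.content.xml describes the node "libs"
        return posixpath.basename(directory)
    if filename.endswith(".xml"):
        # example.xml describes the node "example"
        return filename[: -len(".xml")]
    return filename


def read_root(doc):
    """Hypothetical helper: return the root attributes, insisting on jcr:root."""
    root = ET.fromstring(doc)
    # ElementTree reports qualified names as "{namespace-uri}local-name"
    if root.tag != "{%s}root" % JCR_NS:
        raise ValueError("root element must be jcr:root")
    return dict(root.attrib)


attributes = read_root(DOC)
print(node_name_from_path("libs/.content.xml"), attributes["title"])
```

The node name never appears in the XML itself, so the same document could describe a node of any name depending on where the file is placed.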
Empty Elements
The deserialization treats empty elements differently from the default JCR 2.0 DocView import: empty elements never create a new node in the repository but are merely used to define the child node sort order. Nodes and properties below empty elements are never removed during import (JCRVLT-251).
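The distinction can be sketched in Python as follows. This is only an illustration, not the importer's logic: it assumes that "empty" means an element with no attributes and no child elements, and the function `child_order` and the labels it returns are made up for this example.

```python
import xml.etree.ElementTree as ET

DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
    jcr:primaryType="nt:unstructured">
    <first/>
    <second jcr:primaryType="nt:unstructured" title="Second"/>
    <third/>
</jcr:root>"""


def child_order(doc):
    """Hypothetical helper: list child names in document order with their role."""
    root = ET.fromstring(doc)
    result = []
    for child in root:
        # Assumption: an element without attributes and children counts as empty
        is_empty = not child.attrib and len(child) == 0
        role = "order only" if is_empty else "create or update"
        result.append((child.tag, role))
    return result


for name, role in child_order(DOC):
    print(name, "->", role)
```

In this sketch, `first` and `third` would only contribute to the sort order of the children, while `second` carries actual content.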
Property Values
Probably the biggest difference from the JCR 2.0 DocView is the handling of property values. All properties are serialized as XML attributes, as in JCR, but their values have the property type encoded. The format of the attribute value is:
property-value := [ "{" property-type "}" ] ( value | "[" [ value { "," value } ] "]" )
If no type is specified, it defaults to STRING.
Valid types are all arguments accepted by PropertyType.valueFromName(String), that is, all strings defined by the constants in PropertyType whose names start with TYPENAME_, plus BinaryRef for binary reference values (see below).
Multi-value properties contain the values as a comma-separated list enclosed by [ and ]. The special value \0 must be used for a singleton multi-value property containing only the empty string.
Examples:
Type | Value | Serialized |
---|---|---|
String | "Hello, world!" | "Hello, world!" |
Long | 42 | "{Long}42" |
Boolean | true | "{Boolean}true" |
Double[] | {1.0, 2.5, 3.0} | "{Double}[1.0,2.5,3.0]" |
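To make the grammar concrete, here is a minimal Python sketch of a parser for raw attribute values. It is an illustration under stated assumptions rather than FileVault's actual parser: `parse_property_value` and `_unescape` are hypothetical names, values are returned as strings without conversion to the target type, malformed input is not handled gracefully, and the escape sequences follow the table in the Escaping section.

```python
# Type names defined by the TYPENAME_ constants of javax.jcr.PropertyType,
# plus BinaryRef for binary references.
VALID_TYPES = {
    "String", "Binary", "Long", "Double", "Decimal", "Date", "Boolean",
    "Name", "Path", "Reference", "WeakReference", "URI", "undefined",
    "BinaryRef",
}


def _unescape(text, split_on_comma):
    """Resolve escape sequences; optionally split on unescaped commas."""
    items, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "u" and i + 6 <= len(text):
                # \uXXXX: four hex digits
                current.append(chr(int(text[i + 2:i + 6], 16)))
                i += 6
            elif nxt == "0":
                i += 2  # \0 marks an empty string item and adds no characters
            else:
                current.append(nxt)  # \\ \, \[ \{ yield the escaped character
                i += 2
        elif ch == "," and split_on_comma:
            items.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    items.append("".join(current))
    return items


def parse_property_value(raw):
    """Return (type_name, value), where value is a str or a list of str."""
    type_name, rest = "String", raw  # no type prefix means String
    if raw.startswith("{"):
        end = raw.find("}")
        if end != -1:
            type_name, rest = raw[1:end], raw[end + 1:]
    if type_name not in VALID_TYPES:
        raise ValueError("unknown property type: " + type_name)
    if rest.startswith("[") and rest.endswith("]"):
        body = rest[1:-1]
        return type_name, (_unescape(body, True) if body else [])
    return type_name, _unescape(rest, False)[0]


print(parse_property_value("{Double}[1.0,2.5,3.0]"))
```

Note that the comma in "Hello, world!" is not a separator, because commas only split values inside the brackets of a multi-value property.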
Binary Properties
In contrast to the JCR 2.0 DocView, binary properties are not supported inline via Base64 encoding. Instead, either a dedicated .binary file must be used, or a regular file aggregate that sets the jcr:content/jcr:data binary property implicitly.
Only for binary values that also implement org.apache.jackrabbit.api.ReferenceBinary may the string identifier of the binary reference be given directly inside a FileVault DocView (JCR-3534). This happens only if the package property useBinaryReferences is set to true. Package imports use this approach whenever a binary reference property is found in the FileVault DocView XML (with property-type = BinaryRef). This only works if the source and destination repositories of the package share the same data store.
Empty binary attributes (with the value "{Binary}") always leave the corresponding property in the repository untouched during import (if such a property already exists). Every other value for binary properties leads to an exception.
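The import rules for binary attribute values can be summarized in a short Python sketch. The function `binary_import_action` and the action labels it returns are hypothetical and only mirror the behaviour described in this section; they are not part of any FileVault API.

```python
def binary_import_action(raw):
    """Hypothetical helper: classify a raw attribute value per the rules above."""
    if raw == "{Binary}":
        # Empty binary attribute: an existing property is left untouched
        return ("keep-existing", None)
    if raw.startswith("{BinaryRef}"):
        # Binary reference: requires a data store shared by source and destination
        return ("use-reference", raw[len("{BinaryRef}"):])
    if raw.startswith("{Binary}"):
        # Inline binary content (for example Base64) is not supported
        raise ValueError("inline binary values are not supported")
    return ("not-binary", None)


print(binary_import_action("{Binary}"))
```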
Escaping
The raw attribute value is escaped in order to preserve the special semantics:
Character | Escape Sequence | Comment |
---|---|---|
\ | \\ | |
, | \, | Only necessary for multi-value properties |
[ | \[ | Only necessary at the start of single-value properties |
{ | \{ | Only necessary at the start of single-value properties |
invalid XML character | \uXXXX | Unicode code points |
<empty string> | \0 | Only necessary within singleton multi-value properties to indicate an empty string item (JCRVLT-4). |
Note that this escaping concerns only the raw attribute value. If the value contains characters that cannot be used in XML attributes, such as & or <, the corresponding XML entities must be used.
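The two layers of escaping can be sketched in Python as follows. This is a minimal illustration, not FileVault's serializer: `serialize_property`, `_escape` and `_is_valid_xml_char` are hypothetical names, values are assumed to be already converted to strings, and the choice of lowercase hex digits in \uXXXX is an assumption. The standard library function `xml.sax.saxutils.quoteattr` then applies the XML entity escaping on top of the raw value.

```python
from xml.sax.saxutils import quoteattr


def _is_valid_xml_char(ch):
    """Character ranges allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def _escape(value, multi):
    """Apply the DocView escapes to one value (an item if multi is True)."""
    out = []
    for i, ch in enumerate(value):
        if ch == "\\":
            out.append("\\\\")
        elif ch == "," and multi:
            out.append("\\,")  # commas separate items only in multi-values
        elif ch in "[{" and i == 0 and not multi:
            out.append("\\" + ch)  # only at the start of a single value
        elif not _is_valid_xml_char(ch):
            out.append("\\u%04x" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def serialize_property(type_name, value):
    """Build the raw attribute value for a str or a list of str."""
    prefix = "" if type_name == "String" else "{%s}" % type_name
    if isinstance(value, list):
        if value == [""]:
            body = "\\0"  # singleton list holding only the empty string
        else:
            body = ",".join(_escape(item, True) for item in value)
        return prefix + "[" + body + "]"
    return prefix + _escape(value, False)


# First the DocView escaping, then the XML attribute escaping
raw = serialize_property("String", "Tom & Jerry <3")
print("title=" + quoteattr(raw))
```

Keeping the two steps separate mirrors the note above: the DocView escapes protect the type and multi-value syntax, while the XML entities are only needed to keep the attribute well-formed.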