Apache Jackrabbit : Oakathon September 2018

Where and When

  • September 3rd - 7th 2018
  • Location: Adobe Basel, Rhein meeting room (5th floor). Guests, please register at reception on the 2nd floor.

Attendees

Who                 When
Marcel Reutegger    3. - 6.
Michael Dürig       3. - 7. except for Wed.
Matt Ryan           3. - 6.
Robert Munteanu     5. - 7.
Andrei Dulceanu     3. - 7.
Axel Hanikel        3. - 7.
Bogdan Ieran        3. - 7.
Thomas Mueller      4. - 7.
Julian Reschke      3. - 7. (remotely, except Tuesday morning)

Topics/Discussions/Goals

CI Testing for cloud-based Oak components

  • Summary: While we currently have good test coverage for Oak cloud-based data stores (S3DataStore and AzureDataStore), these tests require that the person running them have an account with the service provider and that their environment be configured properly to use that account. This makes them ill-suited for CI. We should come up with a plan that allows CI testing of these components without risking exposure of any individual's cloud account credentials.
  • Effort: 1-2h discussion
  • Participants: Matt Ryan, Amit Jain, Tomek Rekawek, Bogdan Ieran, others??
  • Proposed by: Matt Ryan

Direct Binary Access contribution post-mortem

  • Summary: I'd like to discuss this latest contribution to Oak, to see how I can do this better in the future.
  • Effort: 1h discussion
  • Participants: Matt Ryan, Marcel Reutegger, Michael Dürig, Amit Jain, Julian Reschke, Bogdan Ieran, anyone else who wants to attend
  • Proposed by: Matt Ryan

Golden Master: variants and alternatives

  • Summary: Can we do away with the golden master entirely, removing a potential single point of failure? Can we leverage the pipeline to move segments around? Can we avoid copying "old" segments to new publish instances?
  • Effort: not stated
  • Participants: Francesco Mari, Andrei Dulceanu, Axel Hanikel, Michael Dürig, Bogdan Ieran, Tomek Rekawek
  • Proposed by: Michael Dürig

Modularization: single module release

  • Summary: Evaluate modularizing a single leaf module from the Oak codebase and start releasing it independently.
  • Effort: 1d
  • Participants: Robert Munteanu
  • Proposed by: Robert Munteanu

JCR Locking deprecation

  • Summary: State in Oak, plans for Sling and AEM?
  • Effort: 1h
  • Participants: ?
  • Proposed by: Julian Reschke

Agenda Proposal

Monday

  • 9:30 - 10:30 Present proposals and schedule sessions
  • 13:30 - 14:30 Golden Master: variants and alternatives kick off

Tuesday

  • 10:30 - 11:30 GM alternative topologies. Sync up and discussion of Tomek's ideas
  • 13:30 - 14:30 Offline text extraction POC (Matt & Thomas)

Wednesday

  • 13:30 - 16:00 Modularization: single module releases

Thursday

  • 9:15 - 10:30 Direct binary access post-mortem
  • 14:30 - 16:00 CI Testing for cloud-based Oak components
  • 16:00 - 17:00 Wrap-up part one

Friday

  • 16:00 - 17:00 Wrap-up part two

Prep Work

Notes from the Oakathon

CI Testing for cloud-based Oak components

Oak already has tests that use Docker and Azurite; see the oak-segment-azure module.

The Apache build infrastructure (builds.apache.org) seems to have Docker installed, though it is not clear whether all slave nodes support it. See the notes about old Jenkins node labels.

Since we already use Docker and Azurite for oak-segment-azure, the proposal is to also use Azurite for oak-blob-cloud-azure and something similar (s3mock) for oak-blob-cloud. The same approach would apply to other modules that require external infrastructure for complete test coverage, such as MongoMK and RDBMK, using appropriate Docker containers.

Tests would need to be modified to take advantage of this, and should detect whether the external infrastructure is available.

By default, tests should simply skip the external dependencies, so that new users running the tests don't get failures because the required infrastructure is not set up. We should define a flag that lets a user opt in to running all of the tests.
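
The skip-by-default behaviour described above could be sketched roughly as follows. This is only an illustration, not the approach chosen in OAK-7740: the class name ExternalInfra, its methods, and the system property oak.test.external are all hypothetical. Port 10000 is Azurite's default blob endpoint; other services would use different ports.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper; all names here are illustrative, not existing Oak code. */
public class ExternalInfra {

    /** Hypothetical opt-in flag, e.g. -Doak.test.external=true on the command line. */
    public static final String FLAG = "oak.test.external";

    /** True only when the user explicitly asked for the external tests. */
    public static boolean requested() {
        return Boolean.getBoolean(FLAG);
    }

    /** Probes a TCP endpoint, e.g. a local Azurite or s3mock container. */
    public static boolean reachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** External tests run only if requested AND the service answers. */
    public static boolean shouldRun(String host, int port) {
        return requested() && reachable(host, port, 500);
    }

    public static void main(String[] args) {
        // 10000 is Azurite's default blob port; adjust for other services.
        System.out.println("run external tests: " + shouldRun("localhost", 10000));
    }
}
```

In a JUnit 4 test, the result of shouldRun could be passed to org.junit.Assume.assumeTrue in a @Before or @BeforeClass method, so that the test is reported as skipped rather than failed when the flag is absent or the container is not running.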

OAK-7740 was created to track this issue.

Modularization

The proposal is to build on top of the current modularization work (OAK-6069 - Modularisation of Oak) and prototype releasing a single module independently. The aim is to get a sense of what is needed for single module releases and assess whether we want to do that for the whole Oak project.

The following topics were discussed:

  • Dependencies
    • Currently, dependencies on other Oak modules use the same version as the one in the reactor, e.g. 1.10-SNAPSHOT now. If we release modules independently, that will no longer work: they need to depend on released versions. The question is which version. We can use either a) the latest release, b) the first release of the latest stable branch, or c) the minimum version required from an API and implementation point of view.
    • A special case of the dependencies is the parent pom. We use it for plugin settings and configurations, but also for dependency versions via the dependencyManagement pom element. We can no longer depend on SNAPSHOT versions of the parent, so this should be a release. The downside is that for each and every version bump in the Oak project we would need to release a new parent pom version. A compromise would be to define versions only for truly global dependencies (e.g. junit/mockito).
  • Supported Oak versions
    • Once we modularise a project we need to decide which Oak versions it would support. A module may be tied to a certain Oak stream (e.g. 1.6.x) or may support all maintained streams (e.g. 1.2.x → trunk).
  • Which version is the first modularized release going to be?
    • Currently all Oak versions are in lock-step. With modularized releases, version numbers will probably drift apart, so we need to make a conscious decision regarding versioning policy. The options are:
      • Continue from the next version, e.g. after 1.9.8 release 1.9.9, 1.9.10, etc
      • Change the bundle name, start from 1.0
        • Additionally, encode the supported Oak stream name, e.g. 1.0.0-1.8
      • Bump the major version as a one-time change, e.g. start from 2.0
    • The discussion favoured the major version bump
  • Dependency on unstable releases
    • A point was raised that independently released modules should not depend on unstable Oak releases.
    • However, this means that for the whole lifetime of an unstable Oak release, such modules could not depend on new changes; they could do so only once a stable release has been made.
  • How to run integration tests with a separately released module?
    • Currently we exercise the whole Oak codebase in a reactor build, so for instance a change in oak-api will get tested with all other modules
    • We should be able to run tests with adjusted dependency versions, similar to the AEM evergreen setup
      • Package + deploy all test artifacts
      • Run the tests with a different classpath, e.g. run with latest SNAPSHOT versions
  • What is the future expected release cadence?
    • This ties in to the 'dependency on unstable releases' point.
    • What do we expect/want the Oak release schedule to be in the future?
      • More frequent releases?
      • Give up on unstable releases?
      • What would a fully modularized release look like?
    • How do we make it easier for projects outside Oak to consume API/feature changes? Currently the 'unstable' label discourages API dependencies from e.g. Sling
  • Which modules do we want to use for the m12n prototype?
    • modules with few outgoing dependencies, such as oak-api?
    • modules with few incoming dependencies, such as oak-blob-cloud-azure?
    • stable modules from Jackrabbit, such as jackrabbit-webdav?
  • How can we ensure that we can revert the m12n release work?
    • Current changes would be to remove a module from the reactor and change all dependencies to releases?
    • What to do if we release and then decide it was not a good idea?
      • one proposal was to change the artifactId
  • How can we make sure that contributing does not become harder after starting with modularized releases?
  • We need to update our CI environment to make sure independent modules are tested, since they will no longer be a part of the reactor.
  • The release check tooling needs to be updated
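
To make the three versioning options from "Which version is the first modularized release going to be" easier to compare, here is a minimal sketch that computes the first independent version under each policy. The class and method names are hypothetical, and the three-part major.minor.patch format is an assumption made for the example; the notes themselves only say "start from 1.0" and "start from 2.0".

```java
/** Hypothetical illustration of the versioning options; not part of Oak. */
public class VersionPolicy {

    /** Option 1: continue from the next version, e.g. 1.9.8 -> 1.9.9. */
    public static String continueFrom(String last) {
        String[] parts = last.split("\\.");
        // Parse numerically so that 1.9.9 is followed by 1.9.10, not 1.9.1.
        int patch = Integer.parseInt(parts[2]) + 1;
        return parts[0] + "." + parts[1] + "." + patch;
    }

    /** Option 2: restart at 1.0.0, optionally encoding the supported Oak stream. */
    public static String restart(String oakStream) {
        return oakStream == null ? "1.0.0" : "1.0.0-" + oakStream;
    }

    /** Option 3: one-time major version bump, e.g. 1.9.8 -> 2.0.0. */
    public static String majorBump(String last) {
        int major = Integer.parseInt(last.split("\\.")[0]) + 1;
        return major + ".0.0";
    }

    public static void main(String[] args) {
        System.out.println(continueFrom("1.9.8")); // 1.9.9
        System.out.println(continueFrom("1.9.9")); // 1.9.10
        System.out.println(restart("1.8"));        // 1.0.0-1.8
        System.out.println(majorBump("1.9.8"));    // 2.0.0
    }
}
```

With option 2 the bundle name would also change, which the sketch does not model.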