Class MongoDocumentFilter


  • public class MongoDocumentFilter
    extends Object
    A filter that decides, based on its path, whether a given Mongo document should be processed or skipped. The filter has two configuration parameters:
    • filteredPath - The path where the filter is applied. Only documents inside this path are considered for filtering. Documents in all other paths are accepted.
    • suffixesToSkip - A list of path suffixes to skip. That is, any document inside filteredPath whose path ends in one of these suffixes will be skipped.

    The intent of this filter is to be applied as close as possible to the download/decoding of the documents from Mongo, in order to filter unnecessary documents early and avoid spending resources processing them.
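The rule described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the actual implementation: the class name PathSuffixFilterSketch, the single-argument shouldSkip(String path), and the example paths are all hypothetical, and the sketch assumes that "inside filteredPath" means a strict descendant of that path.

```java
import java.util.List;

// Hypothetical stand-in that mirrors the documented rule; NOT the real MongoDocumentFilter.
final class PathSuffixFilterSketch {
    private final String filteredPath;
    private final List<String> suffixesToSkip;

    PathSuffixFilterSketch(String filteredPath, List<String> suffixesToSkip) {
        this.filteredPath = filteredPath;
        this.suffixesToSkip = List.copyOf(suffixesToSkip);
    }

    // Returns true only when the path lies inside filteredPath AND ends in a listed suffix.
    boolean shouldSkip(String path) {
        // Assumption: "inside" means a strict descendant of filteredPath.
        String prefix = filteredPath.endsWith("/") ? filteredPath : filteredPath + "/";
        if (!path.startsWith(prefix)) {
            return false; // outside the filtered path: always accepted
        }
        for (String suffix : suffixesToSkip) {
            if (path.endsWith(suffix)) {
                return true; // inside the filtered path and matches a suffix
            }
        }
        return false; // inside the filtered path but no suffix matched
    }

    public static void main(String[] args) {
        PathSuffixFilterSketch filter =
                new PathSuffixFilterSketch("/content/dam", List.of("/jcr:content/renditions"));
        System.out.println(filter.shouldSkip("/content/dam/a.png/jcr:content/renditions")); // true
        System.out.println(filter.shouldSkip("/content/dam/a.png/jcr:content/metadata"));   // false
        System.out.println(filter.shouldSkip("/content/site/jcr:content/renditions"));      // false
    }
}
```

The third call shows the filteredPath guard: the suffix matches, but the document is outside the filtered path, so it is accepted.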

    • Constructor Detail

      • MongoDocumentFilter

        public MongoDocumentFilter(String filteredPath,
                                   List<String> suffixesToSkip)
    • Method Detail

      • shouldSkip

        public boolean shouldSkip(String fieldName,
                                  String idOrPathValue)
        Parameters:
        fieldName - Name of the Mongo document field. Expected to be either _id or _path.
        idOrPathValue - The value of that field.
        Returns:
        true if the document should be skipped, false otherwise
      • isFilteringDisabled

        public boolean isFilteringDisabled()
      • getSkippedFields

        public long getSkippedFields()
      • getLongPathSkipped

        public long getLongPathSkipped()
      • formatTopK

        public String formatTopK(int k)
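Because shouldSkip takes two String arguments and returns a boolean, a method reference such as filter::shouldSkip fits java.util.function.BiPredicate<String, String>. The sketch below shows how a caller might consult such a check before doing further work on a document. The class ShouldSkipCallerSketch, the helper keepAccepted, the lambda standing in for a real filter, and the sample values are all hypothetical. The sketch also assumes the values are plain paths passed under the _path field; the format of _id values is not described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical caller; the predicate stands in for filter::shouldSkip.
final class ShouldSkipCallerSketch {

    // Returns the values that the skip check does not reject, preserving order.
    static List<String> keepAccepted(List<String> pathValues,
                                     BiPredicate<String, String> shouldSkip) {
        List<String> accepted = new ArrayList<>();
        for (String value : pathValues) {
            // Consult the filter first, so skipped documents cost no further work.
            if (!shouldSkip.test("_path", value)) {
                accepted.add(value);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Stand-in rule (an assumption): skip anything under /content/dam/ ending in /renditions.
        BiPredicate<String, String> shouldSkip = (fieldName, value) ->
                value.startsWith("/content/dam/") && value.endsWith("/renditions");

        List<String> accepted = keepAccepted(
                List.of("/content/dam/a/renditions", "/content/dam/a/metadata", "/content/site/b"),
                shouldSkip);
        System.out.println(accepted); // [/content/dam/a/metadata, /content/site/b]
    }
}
```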