Class CheckIndex

  • public class CheckIndex
    extends Object
    Basic tool and API to check the health of an index and write a new segments file that removes references to problematic segments.

    As this tool checks every byte in the index, on a large index it can take quite a long time to run.

    • Constructor Detail

      • CheckIndex

        public CheckIndex(Directory dir)
        Create a new CheckIndex on the directory.
    • Method Detail

      • setCrossCheckTermVectors

        public void setCrossCheckTermVectors(boolean v)
        If true, term vectors are compared against postings to make sure they are the same. This will likely drastically increase the time it takes to run CheckIndex!
      • setInfoStream

        public void setInfoStream(PrintStream out,
                                  boolean verbose)
        Set infoStream where messages should go. If null, no messages are printed. If verbose is true then more details are printed.
      • checkIndex

        public CheckIndex.Status checkIndex()
                                     throws IOException
        Returns a CheckIndex.Status instance detailing the state of the index.

        As this method checks every byte in the index, on a large index it can take quite a long time to run.

        WARNING: make sure you only call this when the index is not opened by any writer.

      • checkIndex

        public CheckIndex.Status checkIndex(List<String> onlySegments)
                                     throws IOException
        Returns a CheckIndex.Status instance detailing the state of the index.
        Parameters:
            onlySegments - list of specific segment names to check

        As this method checks every byte in the specified segments, on a large index it can take quite a long time to run.

        WARNING: make sure you only call this when the index is not opened by any writer.

      • fixIndex

        public void fixIndex(CheckIndex.Status result)
                      throws IOException
        Repairs the index using the result previously returned by checkIndex(). Note that this does not remove any of the unreferenced files after it is done; you must separately open an IndexWriter, which deletes unreferenced files when it is created.

        WARNING: this writes a new segments file into the index, effectively removing all documents in broken segments from the index. BE CAREFUL.

        WARNING: Make sure you only call this when the index is not opened by any writer.
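
        The check-then-repair sequence described above can be sketched as follows. This is a minimal illustration rather than a definitive procedure: it assumes the caller supplies an already opened Directory, that no writer has the index open, and that a backup exists. The public clean field of CheckIndex.Status is not described in this section and is used here as an assumption about that class; the names IndexRepair and checkAndMaybeFix are hypothetical.

```java
import java.io.IOException;

import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.store.Directory;

public class IndexRepair {

    /**
     * Checks the index in dir and, only if allowFix is true and problems
     * were found, writes a new segments file that drops the broken segments.
     * Returns true if the index was reported clean by the check.
     */
    public static boolean checkAndMaybeFix(Directory dir, boolean allowFix) throws IOException {
        CheckIndex checker = new CheckIndex(dir);
        // Send progress messages to stdout, without verbose detail.
        checker.setInfoStream(System.out, false);

        // Reads the whole index; this can be slow on a large index.
        CheckIndex.Status status = checker.checkIndex();
        if (status.clean) { // assumption: Status exposes a public boolean "clean"
            return true;
        }
        if (allowFix) {
            // Destructive: documents in broken segments are lost.
            checker.fixIndex(status);
        }
        return false;
    }
}
```

        Keeping the repair behind an explicit allowFix flag mirrors the command-line tool, which only reports problems unless -fix is given.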

      • main

        public static void main(String[] args)
                         throws IOException,
        Command-line interface to check and fix an index.

        Run it like this:

            java -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex pathToIndex [-fix] [-verbose] [-segment X] [-segment Y]
        • -fix: actually write a new segments_N file, removing any problematic segments
        • -segment X: only check the specified segment(s). This can be specified multiple times to check more than one segment, e.g. -segment _2 -segment _a. You can't use this with the -fix option.
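
        As a rough sketch of how the options above combine, the helper below assembles the argument list for launching the tool in a separate JVM and rejects the one combination the documentation forbids (-segment together with -fix). The class name CheckIndexCommand and the method build are hypothetical, and the sketch assumes the Lucene classes are already on the classpath of the launched JVM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CheckIndexCommand {

    /** Builds the command line for running CheckIndex in a separate JVM. */
    public static List<String> build(String pathToIndex, boolean fix, boolean verbose,
                                     List<String> segments) {
        if (fix && !segments.isEmpty()) {
            // The option list states that -segment cannot be used with -fix.
            throw new IllegalArgumentException("-segment cannot be combined with -fix");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-ea:org.apache.lucene...");            // enable Lucene assertions, as in the usage line
        cmd.add("org.apache.lucene.index.CheckIndex");
        cmd.add(pathToIndex);
        if (fix) {
            cmd.add("-fix");
        }
        if (verbose) {
            cmd.add("-verbose");
        }
        for (String segment : segments) {
            cmd.add("-segment");                         // repeated once per segment name
            cmd.add(segment);
        }
        return cmd;
    }

    public static void main(String[] args) {
        // Prints the command for a read-only, verbose check of two segments.
        System.out.println(String.join(" ",
                build("/path/to/index", false, true, Arrays.asList("_2", "_a"))));
    }
}
```

        Building the arguments as a list rather than a single string avoids quoting problems when the index path contains spaces.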

        WARNING: -fix should only be used on an emergency basis as it will cause documents (perhaps many) to be permanently removed from the index. Always make a backup copy of your index before running this! Do not run this tool on an index that is actively being written to. You have been warned!

        When run without -fix, this tool opens the index, reports version information, and reports any exceptions it hits along with the action it would take if -fix were specified. With -fix, this tool removes any segments that have issues and writes a new segments_N file. This means all documents contained in the affected segments will be removed.

        This tool exits with exit code 1 if the index cannot be opened or has any corruption, else 0.
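
        A calling script or program can act on that exit code. The sketch below is one possible way to do so from Java, assuming the full command line (JVM, classpath, CheckIndex arguments) is supplied by the caller; the class name CheckIndexRunner and its methods are hypothetical. Exit codes other than 0 and 1 are not described here, so they are reported as unexpected.

```java
import java.io.IOException;
import java.util.List;

public class CheckIndexRunner {

    /** Maps the documented exit codes to a short description. */
    public static String describe(int exitCode) {
        if (exitCode == 0) {
            return "index opened and no corruption reported";
        }
        if (exitCode == 1) {
            return "index could not be opened or has corruption";
        }
        // Anything else is outside what the tool documents (e.g. the JVM failed to start).
        return "unexpected exit code " + exitCode;
    }

    /** Runs the given command, forwarding its output, and returns its exit code. */
    public static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

        For example, System.out.println(CheckIndexRunner.describe(CheckIndexRunner.run(command))) prints a one-line summary once the check has finished.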
