Class MergePolicy
- java.lang.Object
-
- org.apache.lucene.index.MergePolicy
-
- All Implemented Interfaces:
Closeable, AutoCloseable, Cloneable
- Direct Known Subclasses:
LogMergePolicy, NoMergePolicy, TieredMergePolicy, UpgradeIndexMergePolicy
public abstract class MergePolicy extends Object implements Closeable, Cloneable
Expert: a MergePolicy determines the sequence of primitive merge operations.
Whenever the segments in an index have been altered by IndexWriter, whether by the addition of a newly flushed segment, the addition of many segments from addIndexes* calls, or a previous merge that may now need to cascade, IndexWriter invokes findMerges(org.apache.lucene.index.MergePolicy.MergeTrigger, org.apache.lucene.index.SegmentInfos) to give the MergePolicy a chance to pick merges that are now required. This method returns a MergePolicy.MergeSpecification instance describing the set of merges that should be done, or null if no merges are necessary. When IndexWriter.forceMerge is called, it calls findForcedMerges(SegmentInfos,int,Map) and the MergePolicy should then return the necessary merges.
Note that the policy can return more than one merge at a time. In this case, if the writer is using SerialMergeScheduler, the merges will be run sequentially, but if it is using ConcurrentMergeScheduler, they will be run concurrently.
The default MergePolicy is TieredMergePolicy.
-
-
Nested Class Summary
Modifier and type, class - description
static class MergePolicy.DocMap - A map of doc IDs.
static class MergePolicy.MergeAbortedException - Thrown when a merge was explicitly aborted because IndexWriter.close(boolean) was called with false.
static class MergePolicy.MergeException - Exception thrown if there are any problems while executing a merge.
static class MergePolicy.MergeSpecification - A MergeSpecification instance provides the information necessary to perform multiple merges.
static class MergePolicy.MergeTrigger - MergeTrigger is passed to findMerges(MergeTrigger, SegmentInfos) to indicate the event that triggered the merge.
static class MergePolicy.OneMerge - OneMerge provides the information necessary to perform an individual primitive merge operation, resulting in a single new segment.
-
Field Summary
Modifier and type, field - description
protected static long DEFAULT_MAX_CFS_SEGMENT_SIZE - Default max segment size in order to use compound file system.
protected static double DEFAULT_NO_CFS_RATIO - Default ratio for compound file system usage.
protected long maxCFSSegmentSize - If the size of the merged segment exceeds this value then it will not use compound file format.
protected double noCFSRatio - If the size of the merged segment exceeds this ratio of the total index size then it will remain in non-compound format.
protected SetOnce<IndexWriter> writer - The IndexWriter that contains this instance.
-
Constructor Summary
Modifier, constructor - description
MergePolicy() - Creates a new merge policy instance.
protected MergePolicy(double defaultNoCFSRatio, long defaultMaxCFSSegmentSize) - Creates a new merge policy instance with default settings for noCFSRatio and maxCFSSegmentSize.
-
Method Summary
Modifier and type, method - description
MergePolicy clone()
abstract void close() - Release all resources for the policy.
abstract MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos segmentInfos) - Determine what set of merge operations is necessary in order to expunge all deletes from the index.
abstract MergePolicy.MergeSpecification findForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, Map<SegmentCommitInfo,Boolean> segmentsToMerge) - Determine what set of merge operations is necessary in order to merge to <= the specified segment count.
abstract MergePolicy.MergeSpecification findMerges(MergePolicy.MergeTrigger mergeTrigger, SegmentInfos segmentInfos) - Determine what set of merge operations is now necessary on the index.
double getMaxCFSSegmentSizeMB() - Returns the largest size allowed for a compound file segment.
double getNoCFSRatio() - Returns the current noCFSRatio.
protected boolean isMerged(SegmentInfos infos, SegmentCommitInfo info) - Returns true if this single info is already fully merged (has no pending deletes, is in the same dir as the writer, and matches the current compound file setting).
void setIndexWriter(IndexWriter writer) - Sets the IndexWriter to be used by this merge policy.
void setMaxCFSSegmentSizeMB(double v) - If a merged segment will be more than this value, leave the segment as non-compound file even if compound file is enabled.
void setNoCFSRatio(double noCFSRatio) - If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled.
protected long size(SegmentCommitInfo info) - Return the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.
boolean useCompoundFile(SegmentInfos infos, SegmentCommitInfo mergedInfo) - Returns true if a new segment (regardless of its origin) should use the compound file format.
-
-
-
Field Detail
-
DEFAULT_NO_CFS_RATIO
protected static final double DEFAULT_NO_CFS_RATIO
Default ratio for compound file system usage. Set to 1.0, meaning the compound file system is always used.
- See Also:
Constant Field Values
-
DEFAULT_MAX_CFS_SEGMENT_SIZE
protected static final long DEFAULT_MAX_CFS_SEGMENT_SIZE
Default max segment size in order to use compound file system. Set to Long.MAX_VALUE.
- See Also:
Constant Field Values
-
writer
protected SetOnce<IndexWriter> writer
The IndexWriter that contains this instance.
-
noCFSRatio
protected double noCFSRatio
If the size of the merged segment exceeds this ratio of the total index size then it will remain in non-compound format.
-
maxCFSSegmentSize
protected long maxCFSSegmentSize
If the size of the merged segment exceeds this value then it will not use compound file format.
-
-
Constructor Detail
-
MergePolicy
public MergePolicy()
Creates a new merge policy instance. Note that if you intend to use it without passing it to IndexWriter, you should call setIndexWriter(IndexWriter).
-
MergePolicy
protected MergePolicy(double defaultNoCFSRatio, long defaultMaxCFSSegmentSize)
Creates a new merge policy instance with default settings for noCFSRatio and maxCFSSegmentSize. This constructor should be used by subclasses that use different defaults than MergePolicy.
-
-
Method Detail
-
clone
public MergePolicy clone()
-
setIndexWriter
public void setIndexWriter(IndexWriter writer)
Sets the IndexWriter to be used by this merge policy. This method may be called only once, and is usually called by IndexWriter. If it is called more than once, SetOnce.AlreadySetException is thrown.
- See Also:
SetOnce
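The call-once contract described above can be illustrated with a minimal sketch. The class `SetOnceSketch` is a hypothetical stand-in written for this example, not Lucene's `SetOnce`; it assumes a non-null value and throws a plain `IllegalStateException` where the documented method throws `SetOnce.AlreadySetException`.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a set-once holder; not Lucene's SetOnce class. */
final class SetOnceSketch<T> {
    private final AtomicReference<T> ref = new AtomicReference<>();

    /** Stores the value on the first call; every later call is rejected. */
    void set(T value) {
        Objects.requireNonNull(value, "value");
        // compareAndSet succeeds only while the holder is still empty.
        if (!ref.compareAndSet(null, value)) {
            throw new IllegalStateException("value was already set");
        }
    }

    /** Returns the stored value, or null if set() has not been called yet. */
    T get() {
        return ref.get();
    }

    public static void main(String[] args) {
        SetOnceSketch<String> writer = new SetOnceSketch<>();
        writer.set("writer-1");
        try {
            writer.set("writer-2"); // second call violates the contract
        } catch (IllegalStateException e) {
            System.out.println("second set rejected: " + e.getMessage());
        }
        System.out.println("stored value: " + writer.get());
    }
}
```

An atomic compare-and-set is used here so that two threads racing to set the value cannot both succeed; that choice is part of the sketch, not something the documentation above specifies.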
-
findMerges
public abstract MergePolicy.MergeSpecification findMerges(MergePolicy.MergeTrigger mergeTrigger, SegmentInfos segmentInfos) throws IOException
Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.
- Parameters:
mergeTrigger - the event that triggered the merge
segmentInfos - the total set of segments in the index
- Throws:
IOException
-
findForcedMerges
public abstract MergePolicy.MergeSpecification findForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, Map<SegmentCommitInfo,Boolean> segmentsToMerge) throws IOException
Determine what set of merge operations is necessary in order to merge to <= the specified segment count. IndexWriter calls this when its IndexWriter.forceMerge(int) method is called. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.
- Parameters:
segmentInfos - the total set of segments in the index
maxSegmentCount - requested maximum number of segments in the index (currently this is always 1)
segmentsToMerge - contains the specific SegmentInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is true for a given SegmentInfo, that means this segment was an original segment present in the to-be-merged index; else, it was a segment produced by a cascaded merge.
- Throws:
IOException
-
findForcedDeletesMerges
public abstract MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos segmentInfos) throws IOException
Determine what set of merge operations is necessary in order to expunge all deletes from the index.
- Parameters:
segmentInfos - the total set of segments in the index
- Throws:
IOException
-
close
public abstract void close()
Release all resources for the policy.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
-
useCompoundFile
public boolean useCompoundFile(SegmentInfos infos, SegmentCommitInfo mergedInfo) throws IOException
Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true iff the size of the given mergedInfo is less than or equal to getMaxCFSSegmentSizeMB() and the size is less than or equal to the TotalIndexSize * getNoCFSRatio(); otherwise it returns false.
- Throws:
IOException
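The decision rule stated above can be written out as a small sketch. `CompoundFileRuleSketch` and its parameter names are hypothetical and model only the documented condition; the sketch assumes all sizes are plain byte counts and does not call any Lucene API.

```java
/** Hypothetical model of the documented rule; all sizes are byte counts. */
final class CompoundFileRuleSketch {

    /**
     * Returns true when the merged segment is no larger than the size cap
     * and no larger than noCfsRatio times the total index size.
     */
    static boolean useCompoundFile(long mergedSize, long totalIndexSize,
                                   long maxCfsSegmentSize, double noCfsRatio) {
        return mergedSize <= maxCfsSegmentSize
            && mergedSize <= totalIndexSize * noCfsRatio;
    }

    public static void main(String[] args) {
        long noCap = Long.MAX_VALUE; // mirrors DEFAULT_MAX_CFS_SEGMENT_SIZE

        // 40 <= 100 * 0.5, so the compound format is used.
        System.out.println(useCompoundFile(40, 100, noCap, 0.5)); // prints true

        // 60 > 100 * 0.5, so the segment stays non-compound.
        System.out.println(useCompoundFile(60, 100, noCap, 0.5)); // prints false

        // The size cap alone can also rule out the compound format.
        System.out.println(useCompoundFile(40, 100, 30, 1.0)); // prints false
    }
}
```

With a ratio of 1.0 and no size cap, every segment that is not larger than the whole index passes both checks, which matches the "always use compound file system" defaults described in the field details.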
-
size
protected long size(SegmentCommitInfo info) throws IOException
Return the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.
- Throws:
IOException
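One way to read "pro-rated by the percentage of non-deleted documents" is sketched below. `ProRatedSizeSketch` and its parameters are hypothetical; the sketch assumes the raw byte size, the document count, and the deleted-document count are already known, and that an empty segment is reported at its raw size.

```java
/** Hypothetical illustration of pro-rating a segment size by live documents. */
final class ProRatedSizeSketch {

    /**
     * Scales byteSize by the fraction of documents that are not deleted.
     * Segments with no documents are returned unscaled (an assumption).
     */
    static long proRatedSize(long byteSize, int docCount, int deletedDocs) {
        if (docCount <= 0) {
            return byteSize;
        }
        double liveRatio = 1.0 - ((double) deletedDocs / docCount);
        return (long) (byteSize * liveRatio);
    }

    public static void main(String[] args) {
        // 25 of 100 documents deleted leaves 75% of 1000 bytes.
        System.out.println(proRatedSize(1000, 100, 25)); // prints 750

        // No deletions: the full size is reported.
        System.out.println(proRatedSize(1000, 100, 0)); // prints 1000
    }
}
```

Pro-rating makes a segment with many deletes look smaller to the policy, since merging it would reclaim the space held by deleted documents.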
-
isMerged
protected final boolean isMerged(SegmentInfos infos, SegmentCommitInfo info) throws IOException
Returns true if this single info is already fully merged (has no pending deletes, is in the same dir as the writer, and matches the current compound file setting).
- Throws:
IOException
-
getNoCFSRatio
public final double getNoCFSRatio()
Returns the current noCFSRatio.
- See Also:
setNoCFSRatio(double)
-
setNoCFSRatio
public final void setNoCFSRatio(double noCFSRatio)
If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled. Set to 1.0 to always use CFS regardless of merge size.
-
getMaxCFSSegmentSizeMB
public final double getMaxCFSSegmentSizeMB()
Returns the largest size, in megabytes, allowed for a compound file segment.
-
setMaxCFSSegmentSizeMB
public final void setMaxCFSSegmentSizeMB(double v)
If a merged segment will be more than this value, leave the segment as non-compound file even if compound file is enabled. Set this to Double.POSITIVE_INFINITY (default) and noCFSRatio to 1.0 to always use CFS regardless of merge size.
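Because the setter takes megabytes as a double while the maxCFSSegmentSize field is a long, some conversion has to take place. The sketch below shows one plausible mapping; `MaxCfsSizeSketch` and its method names are hypothetical, and the factor of 1024 * 1024 bytes per megabyte and the rejection of negative values are assumptions rather than documented behavior.

```java
/** Hypothetical conversion between a megabyte setting and a byte-valued field. */
final class MaxCfsSizeSketch {

    /** Converts megabytes to bytes, clamping overly large values to Long.MAX_VALUE. */
    static long megabytesToBytes(double mb) {
        if (mb < 0.0) {
            throw new IllegalArgumentException("size must be >= 0 MB, got " + mb);
        }
        double bytes = mb * 1024 * 1024;
        // Double.POSITIVE_INFINITY and other huge values map to "no limit".
        return bytes > Long.MAX_VALUE ? Long.MAX_VALUE : (long) bytes;
    }

    /** Converts a byte count back to megabytes. */
    static double bytesToMegabytes(long bytes) {
        return bytes / 1024.0 / 1024.0;
    }

    public static void main(String[] args) {
        System.out.println(megabytesToBytes(1.5)); // prints 1572864
        System.out.println(megabytesToBytes(Double.POSITIVE_INFINITY) == Long.MAX_VALUE); // prints true
        System.out.println(bytesToMegabytes(1572864L)); // prints 1.5
    }
}
```

Under this mapping, Double.POSITIVE_INFINITY corresponds to Long.MAX_VALUE, which is consistent with the default named for DEFAULT_MAX_CFS_SEGMENT_SIZE.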
-
-