Class XPostingsHighlighter

    • Field Summary

      Fields

      Modifier and Type Field and Description
      static int DEFAULT_MAX_LENGTH
      Default maximum content size to process.
    • Constructor Summary

      Constructors

      Constructor and Description
      XPostingsHighlighter()
      Creates a new highlighter with default parameters.
      XPostingsHighlighter(int maxLength)
      Creates a new highlighter, specifying maximum content length.
    • Method Summary

      Modifier and Type Method and Description
      protected BreakIterator getBreakIterator(String field)
      Returns the BreakIterator to use for dividing text into passages.
      protected int getContentLength(String field, int docId)
       
      protected org.apache.lucene.search.postingshighlight.Passage[] getEmptyHighlight(String fieldName, BreakIterator bi, int maxPassages)
      Called to summarize a document when no hits were found.
      protected org.apache.lucene.search.postingshighlight.PassageFormatter getFormatter(String field)
      Returns the PassageFormatter to use for formatting passages into highlighted snippets.
      protected char getMultiValuedSeparator(String field)
      Returns the logical separator between values for multi-valued fields.
      protected int getOffsetForCurrentValue(String field, int docId)
       
      protected org.apache.lucene.search.postingshighlight.PassageScorer getScorer(String field)
      Returns the PassageScorer to use for ranking passages.
      String[] highlight(String field, org.apache.lucene.search.Query query, org.apache.lucene.search.IndexSearcher searcher, org.apache.lucene.search.TopDocs topDocs)
      Highlights the top passages from a single field.
      String[] highlight(String field, org.apache.lucene.search.Query query, org.apache.lucene.search.IndexSearcher searcher, org.apache.lucene.search.TopDocs topDocs, int maxPassages)
      Highlights the top-N passages from a single field.
      protected Map<Integer,Object> highlightField(String field, String[] contents, BreakIterator bi, org.apache.lucene.util.BytesRef[] terms, int[] docids, List<org.apache.lucene.index.AtomicReaderContext> leaves, int maxPassages)
       
      Map<String,String[]> highlightFields(String[] fieldsIn, org.apache.lucene.search.Query query, org.apache.lucene.search.IndexSearcher searcher, int[] docidsIn, int[] maxPassagesIn)
      Highlights the top-N passages from multiple fields, for the provided int[] docids.
      Map<String,String[]> highlightFields(String[] fields, org.apache.lucene.search.Query query, org.apache.lucene.search.IndexSearcher searcher, org.apache.lucene.search.TopDocs topDocs)
      Highlights the top passages from multiple fields.
      Map<String,String[]> highlightFields(String[] fields, org.apache.lucene.search.Query query, org.apache.lucene.search.IndexSearcher searcher, org.apache.lucene.search.TopDocs topDocs, int[] maxPassages)
      Highlights the top-N passages from multiple fields.
      Map<String,Object[]> highlightFieldsAsObjects(String[] fieldsIn, org.apache.lucene.search.Query query, org.apache.lucene.search.IndexSearcher searcher, int[] docidsIn, int[] maxPassagesIn)
       
      protected String[][] loadFieldValues(org.apache.lucene.search.IndexSearcher searcher, String[] fields, int[] docids, int maxLength)
      Loads the String values for each field X docID to be highlighted.
    • Field Detail

      • DEFAULT_MAX_LENGTH

        public static final int DEFAULT_MAX_LENGTH
        Default maximum content size to process. Typically, snippets closer to the beginning of the document better summarize its content.
        See Also:
        Constant Field Values
    • Constructor Detail

      • XPostingsHighlighter

        public XPostingsHighlighter()
        Creates a new highlighter with default parameters.
      • XPostingsHighlighter

        public XPostingsHighlighter(int maxLength)
        Creates a new highlighter, specifying maximum content length.
        Parameters:
        maxLength - maximum content size to process.
        Throws:
        IllegalArgumentException - if maxLength is negative or Integer.MAX_VALUE
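        The argument check described above can be sketched as follows. This is a minimal illustration, not the class's actual source: MaxLengthCheck and isAccepted are hypothetical names, and the exception message is an assumption.

```java
// Hypothetical stand-in used only to illustrate the documented check;
// this is not the real XPostingsHighlighter constructor.
class MaxLengthCheck {
    final int maxLength;

    MaxLengthCheck(int maxLength) {
        // Reject the two cases the documentation names: negative values and Integer.MAX_VALUE.
        if (maxLength < 0 || maxLength == Integer.MAX_VALUE) {
            throw new IllegalArgumentException("maxLength must be >= 0 and < Integer.MAX_VALUE");
        }
        this.maxLength = maxLength;
    }

    // Helper for the sketch: reports whether a value passes the check.
    static boolean isAccepted(int maxLength) {
        try {
            new MaxLengthCheck(maxLength);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

        Under this sketch, 0 and any positive value below Integer.MAX_VALUE are accepted, while -1 and Integer.MAX_VALUE are rejected.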
    • Method Detail

      • getOffsetForCurrentValue

        protected int getOffsetForCurrentValue(String field,
                                               int docId)
      • getContentLength

        protected int getContentLength(String field,
                                       int docId)
      • getFormatter

        protected org.apache.lucene.search.postingshighlight.PassageFormatter getFormatter(String field)
        Returns the PassageFormatter to use for formatting passages into highlighted snippets. This returns a new PassageFormatter by default; subclasses can override to customize.
      • getScorer

        protected org.apache.lucene.search.postingshighlight.PassageScorer getScorer(String field)
        Returns the PassageScorer to use for ranking passages. This returns a new PassageScorer by default; subclasses can override to customize.
      • highlight

        public String[] highlight(String field,
                                  org.apache.lucene.search.Query query,
                                  org.apache.lucene.search.IndexSearcher searcher,
                                  org.apache.lucene.search.TopDocs topDocs)
                           throws IOException
        Highlights the top passages from a single field.
        Parameters:
        field - field name to highlight. Must have a stored string value and also be indexed with offsets.
        query - query to highlight.
        searcher - searcher that was previously used to execute the query.
        topDocs - TopDocs containing the summary result documents to highlight.
        Returns:
        Array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first sentence for the field will be returned.
        Throws:
        IOException - if an I/O error occurred during processing
        IllegalArgumentException - if field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
      • highlight

        public String[] highlight(String field,
                                  org.apache.lucene.search.Query query,
                                  org.apache.lucene.search.IndexSearcher searcher,
                                  org.apache.lucene.search.TopDocs topDocs,
                                  int maxPassages)
                           throws IOException
        Highlights the top-N passages from a single field.
        Parameters:
        field - field name to highlight. Must have a stored string value and also be indexed with offsets.
        query - query to highlight.
        searcher - searcher that was previously used to execute the query.
        topDocs - TopDocs containing the summary result documents to highlight.
        maxPassages - The maximum number of top-N ranked passages used to form the highlighted snippets.
        Returns:
        Array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first maxPassages sentences from the field will be returned.
        Throws:
        IOException - if an I/O error occurred during processing
        IllegalArgumentException - if field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
      • highlightFields

        public Map<String,String[]> highlightFields(String[] fields,
                                                    org.apache.lucene.search.Query query,
                                                    org.apache.lucene.search.IndexSearcher searcher,
                                                    org.apache.lucene.search.TopDocs topDocs)
                                             throws IOException
        Highlights the top passages from multiple fields.

        Conceptually, this behaves as a more efficient form of:

         Map<String,String[]> m = new HashMap<>();
         for (String field : fields) {
           m.put(field, highlight(field, query, searcher, topDocs));
         }
         return m;
         
        Parameters:
        fields - field names to highlight. Must have a stored string value and also be indexed with offsets.
        query - query to highlight.
        searcher - searcher that was previously used to execute the query.
        topDocs - TopDocs containing the summary result documents to highlight.
        Returns:
        Map keyed on field name, containing the array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first sentence from the field will be returned.
        Throws:
        IOException - if an I/O error occurred during processing
        IllegalArgumentException - if any field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
      • highlightFields

        public Map<String,String[]> highlightFields(String[] fields,
                                                    org.apache.lucene.search.Query query,
                                                    org.apache.lucene.search.IndexSearcher searcher,
                                                    org.apache.lucene.search.TopDocs topDocs,
                                                    int[] maxPassages)
                                             throws IOException
        Highlights the top-N passages from multiple fields.

        Conceptually, this behaves as a more efficient form of:

         Map<String,String[]> m = new HashMap<>();
         for (int i = 0; i < fields.length; i++) {
           // maxPassages[i] is the passage limit for fields[i]
           m.put(fields[i], highlight(fields[i], query, searcher, topDocs, maxPassages[i]));
         }
         return m;
         
        Parameters:
        fields - field names to highlight. Must have a stored string value and also be indexed with offsets.
        query - query to highlight.
        searcher - searcher that was previously used to execute the query.
        topDocs - TopDocs containing the summary result documents to highlight.
        maxPassages - The maximum number of top-N ranked passages per-field used to form the highlighted snippets.
        Returns:
        Map keyed on field name, containing the array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first maxPassages sentences from the field will be returned.
        Throws:
        IOException - if an I/O error occurred during processing
        IllegalArgumentException - if any field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
      • highlightFields

        public Map<String,String[]> highlightFields(String[] fieldsIn,
                                                    org.apache.lucene.search.Query query,
                                                    org.apache.lucene.search.IndexSearcher searcher,
                                                    int[] docidsIn,
                                                    int[] maxPassagesIn)
                                             throws IOException
        Highlights the top-N passages from multiple fields, for the provided int[] docids.
        Parameters:
        fieldsIn - field names to highlight. Must have a stored string value and also be indexed with offsets.
        query - query to highlight.
        searcher - searcher that was previously used to execute the query.
        docidsIn - array containing the document IDs to highlight.
        maxPassagesIn - The maximum number of top-N ranked passages per-field used to form the highlighted snippets.
        Returns:
        Map keyed on field name, containing the array of formatted snippets corresponding to the documents in docidsIn. If no highlights were found for a document, the first maxPassagesIn sentences from the field will be returned.
        Throws:
        IOException - if an I/O error occurred during processing
        IllegalArgumentException - if any field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
      • highlightFieldsAsObjects

        public Map<String,Object[]> highlightFieldsAsObjects(String[] fieldsIn,
                                                             org.apache.lucene.search.Query query,
                                                             org.apache.lucene.search.IndexSearcher searcher,
                                                             int[] docidsIn,
                                                             int[] maxPassagesIn)
                                                      throws IOException
        Throws:
        IOException
      • loadFieldValues

        protected String[][] loadFieldValues(org.apache.lucene.search.IndexSearcher searcher,
                                             String[] fields,
                                             int[] docids,
                                             int maxLength)
                                      throws IOException
        Loads the String values for each field X docID to be highlighted. By default this loads from stored fields, but a subclass can change the source. This method should allocate the String[fields.length][docids.length] and fill all values. The returned Strings must be identical to what was indexed.
        Throws:
        IOException
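        The contract above (allocate String[fields.length][docids.length] and fill every slot) can be sketched with a plain in-memory source. FieldValueLoaderSketch and its put method are hypothetical and stand in for whatever a subclass reads from. Filling missing values with an empty string and truncating at maxLength are assumptions of this sketch, not confirmed behavior of the class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory value source; the documented default reads stored fields
// through the IndexSearcher instead.
class FieldValueLoaderSketch {
    // Maps "field/docid" to the text of that field in that document (illustrative only).
    private final Map<String, String> store = new HashMap<>();

    void put(String field, int docid, String value) {
        store.put(field + "/" + docid, value);
    }

    // Returns contents[i][j] = value of fields[i] for docids[j].
    String[][] loadFieldValues(String[] fields, int[] docids, int maxLength) {
        String[][] contents = new String[fields.length][docids.length];
        for (int i = 0; i < fields.length; i++) {
            for (int j = 0; j < docids.length; j++) {
                // Assumption: a missing value becomes "" so that every slot is filled.
                String value = store.getOrDefault(fields[i] + "/" + docids[j], "");
                // Assumption: content past maxLength characters is not processed.
                contents[i][j] = value.length() > maxLength ? value.substring(0, maxLength) : value;
            }
        }
        return contents;
    }
}
```

        The outer index follows the order of fields and the inner index follows the order of docids, matching the String[fields.length][docids.length] shape named in the description.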
      • getMultiValuedSeparator

        protected char getMultiValuedSeparator(String field)
        Returns the logical separator between values for multi-valued fields. The default value is a space character, which means passages can span across values, but a subclass can override, for example with U+2029 PARAGRAPH SEPARATOR (PS) if each value holds a discrete passage for highlighting.
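        To see why the separator matters, the sketch below joins the values of a multi-valued field and counts the segments that a JDK sentence BreakIterator finds. SeparatorSketch, join and countPassages are hypothetical names, and the use of BreakIterator.getSentenceInstance(Locale.ROOT) is an assumption about how passages are split, chosen because this page describes fallback passages as sentences.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Illustration only: shows how the joining character affects passage boundaries.
class SeparatorSketch {
    // Joins the values of a multi-valued field with the given separator character.
    static String join(String[] values, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    // Counts the segments a sentence BreakIterator produces for the text.
    static int countPassages(String text) {
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
        bi.setText(text);
        int count = 0;
        bi.first();
        while (bi.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }
}
```

        For two values such as "red apples" and "green pears", joining with a space yields a single segment spanning both values, whereas joining with '\u2029' introduces a boundary at the separator so that each value forms its own passage.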
      • getEmptyHighlight

        protected org.apache.lucene.search.postingshighlight.Passage[] getEmptyHighlight(String fieldName,
                                                                                         BreakIterator bi,
                                                                                         int maxPassages)
        Called to summarize a document when no hits were found. By default this just returns the first maxPassages sentences; subclasses can override to customize.
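        The default fallback described above can be approximated as follows. This sketch returns plain strings, whereas the real method returns Passage objects; EmptyHighlightSketch and firstPassages are hypothetical names, and the real method's handling of offsets and scores is not shown.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

// Illustration only: collects the leading segments of a text, up to maxPassages.
class EmptyHighlightSketch {
    static List<String> firstPassages(String content, BreakIterator bi, int maxPassages) {
        bi.setText(content);
        List<String> passages = new ArrayList<>();
        int start = bi.first();
        int end = bi.next();
        // Walk the boundaries in order, stopping at the passage limit or the end of the text.
        while (end != BreakIterator.DONE && passages.size() < maxPassages) {
            passages.add(content.substring(start, end));
            start = end;
            end = bi.next();
        }
        return passages;
    }
}
```

        Called with BreakIterator.getSentenceInstance(Locale.ROOT) and maxPassages set to 2, the sketch returns the first two sentences of the text (each segment may carry trailing whitespace). A subclass overriding getEmptyHighlight could instead return an empty array to suppress the fallback summary.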