
        How to get the matching terms of a Lucene fuzzy search result?


                  Question

                  How do you get the matching fuzzy term and its offsets when using Lucene fuzzy search?

                      IndexSearcher mem = ....(some standard code)
                  
                      QueryParser parser = new QueryParser(Version.LUCENE_30, CONTENT_FIELD, analyzer);
                  
                      TopDocs topDocs = mem.search(parser.parse("wuzzy~"), 1);
                      // the ~ triggers the fuzzy search as per "Lucene In Action" 
                  

                  The fuzzy search works fine: a document that contains the term "fuzzy" or "luzzy" is matched. How do I find out which term matched, and what its offsets are?

                  I have made sure that every CONTENT_FIELD is added with term vectors stored, including positions and offsets.
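To make the matching behaviour in the question concrete, here is a minimal sketch of plain Levenshtein edit distance, the kind of measure a fuzzy query is based on. It is an illustration only and not Lucene's implementation; the class name `EditDistanceSketch` is made up for this example, and Lucene's actual similarity threshold and scoring are not modelled here.

```java
// Illustrative sketch only: plain Levenshtein distance, not Lucene's own code.
final class EditDistanceSketch {

    /** Counts the single-character inserts, deletes, or substitutions needed to turn a into b. */
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning an empty prefix of a into b[0..j) takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into an empty string takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("wuzzy", "fuzzy")); // 1
        System.out.println(levenshtein("wuzzy", "luzzy")); // 1
    }
}
```

Both "fuzzy" and "luzzy" are one substitution away from the query term "wuzzy", which is why both documents match. The query alone does not report which of the two terms was actually found in a given document.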

                  Recommended answer

                  There was no straightforward way of doing this. However, I reconsidered Jared's suggestion and was able to get a solution working.

                  I am documenting it here in case someone else has the same issue.

                  Create a class that implements org.apache.lucene.search.highlight.Formatter:

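                  import java.util.ArrayList;
                  import java.util.List;

                  import org.apache.lucene.search.highlight.Formatter;
                  import org.apache.lucene.search.highlight.TokenGroup;

                  public class HitPositionCollector implements Formatter
                  {
                      // MatchOffset is a simple DTO holding the term text and its start and end offsets
                      private final List<MatchOffset> matchList;

                      public HitPositionCollector()
                      {
                          matchList = new ArrayList<MatchOffset>();
                      }

                      // This is where the matched term and its start and end offsets are captured
                      @Override
                      public String highlightTerm(String originalText, TokenGroup tokenGroup)
                      {
                          // Only token groups with a score above zero are recorded as matches
                          if (tokenGroup.getTotalScore() > 0)
                          {
                              MatchOffset mo = new MatchOffset(tokenGroup.getToken(0).toString(),
                                      tokenGroup.getStartOffset(), tokenGroup.getEndOffset());
                              matchList.add(mo);
                          }

                          // Return the text unchanged: this formatter records matches, it does not mark them up
                          return originalText;
                      }

                      /**
                      * @return the matchList
                      */
                      public List<MatchOffset> getMatchList()
                      {
                          return matchList;
                      }
                  }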
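The answer refers to `MatchOffset` only as "a simple DTO" and does not show it. A minimal sketch of what it might look like follows; the constructor argument order and the getter names `getToken`, `getStartPos`, and `getEndPos` are inferred from how the class is used in the answer's code, and everything else is an assumption.

```java
// Hypothetical sketch of the MatchOffset DTO; the original answer does not include it.
final class MatchOffset {
    private final String token;  // text of the matched term
    private final int startPos;  // start offset of the match in the original text
    private final int endPos;    // end offset of the match in the original text

    public MatchOffset(String token, int startPos, int endPos) {
        this.token = token;
        this.startPos = startPos;
        this.endPos = endPos;
    }

    public String getToken() {
        return token;
    }

    public int getStartPos() {
        return startPos;
    }

    public int getEndPos() {
        return endPos;
    }

    @Override
    public String toString() {
        return token + "[" + startPos + "," + endPos + ")";
    }
}
```

Keeping the fields final makes each recorded match immutable, which seems a reasonable fit for a value that is only collected and read back.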
                  

                  Main code:

                  public void testHitsWithHitPositionCollector() throws Exception
                  {
                      System.out.println(" .... testHitsWithHitPositionCollector");
                      // Note: "bro*" is parsed as a prefix query; a fuzzy query string ending in ~
                      // goes through the same highlighter code path
                      String fuzzyStr = "bro*";

                      QueryParser parser = new QueryParser(Version.LUCENE_30, "f", analyzer);
                      Query fzyQry = parser.parse(fuzzyStr);
                      TopDocs hits = searcher.search(fzyQry, 10);

                      QueryScorer scorer = new QueryScorer(fzyQry, "f");

                      HitPositionCollector myFormatter = new HitPositionCollector();

                      // Highlighter(Formatter formatter, Scorer fragmentScorer)
                      Highlighter highlighter = new Highlighter(myFormatter, scorer);
                      highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));

                      Analyzer analyzer2 = new SimpleAnalyzer();

                      // Only the top hit is checked here; loop over hits.scoreDocs to process every hit
                      int docId = hits.scoreDocs[0].doc;
                      Document doc = searcher.doc(docId);
                      String title = doc.get("f");

                      TokenStream stream = TokenSources.getAnyTokenStream(searcher.getIndexReader(),
                                                  docId,
                                                  "f",
                                                  doc,
                                                  analyzer2);

                      // Running the highlighter calls HitPositionCollector.highlightTerm for each token group
                      String fragment = highlighter.getBestFragment(stream, title);

                      System.out.println(fragment);
                      assertEquals("the quick brown fox jumps over the lazy dog", fragment);
                      MatchOffset mo = myFormatter.getMatchList().get(0);

                      assertEquals(15, mo.getEndPos());
                      assertEquals(10, mo.getStartPos());
                      assertEquals("brown", mo.getToken());
                  }
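The assertions above expect "brown" at start offset 10 and end offset 15. As a small Lucene-free sketch of how such offsets map back onto the original text, the helper below brackets the matched span; the class name `OffsetMarkSketch` and the method `mark` are hypothetical and exist only for this illustration. The end offset is treated as exclusive, which is how Lucene reports it.

```java
// Illustrative sketch only: applies a start/end offset pair to the original text.
final class OffsetMarkSketch {

    /** Wraps text[startOffset, endOffset) in square brackets; the end offset is exclusive. */
    public static String mark(String text, int startOffset, int endOffset) {
        return text.substring(0, startOffset)
                + "[" + text.substring(startOffset, endOffset) + "]"
                + text.substring(endOffset);
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumps over the lazy dog";
        // prints: the quick [brown] fox jumps over the lazy dog
        System.out.println(mark(text, 10, 15));
    }
}
```

When a document has several matches, applying the offsets from last to first keeps the earlier offsets valid while the text is being modified.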
                  

