Problem description
I'm currently working with the Lucene.net 2.9.2 framework. I would like my search results page (asp.net) to show a highlighted text fragment for each hit. The selected fragment should be a whole sentence, not just a few words.
For example, if I have the text:
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
and I search for cupidatat, I would like to get the fragment:
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
My current code is:
var scorer = new QueryScorer(q);
var formatter = new SimpleHTMLFormatter("<div>", "</div>");
var highlighter = new Highlighter(formatter, scorer);
highlighter.SetTextFragmenter(new SimpleFragmenter(100));
var fragments = highlighter.GetBestFragments(stream, text, 1);
but it only returns a text range of size 100.
I would be thankful for any suggestions.
Recommended answer
You want to create a new Fragmenter (similar to SimpleFragmenter). The method you need to adjust is:
public virtual bool IsNewFragment(Token token)
{
    bool isNewFrag = token.EndOffset() >= (fragmentSize * currentNumFrags);
    if (isNewFrag)
    {
        currentNumFrags++;
    }
    return isNewFrag;
}
This will likely need some adjustment until you get the logic right, but it should give you a pretty good head start.
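One possible way to adjust that logic is to start a new fragment whenever a sentence terminator appears between the previous token and the current one, instead of counting characters. The sketch below is only an illustration of that idea and deliberately does not reference Lucene.net types: SentenceBoundaryTracker and its members are hypothetical names, and it assumes a custom Fragmenter would keep the original text from its Start method and forward each token's start and end offsets to it. Treating '.', '!' and '?' as the only sentence terminators is also an assumption; abbreviations such as "e.g." would be split incorrectly.

```csharp
using System;

// Hypothetical helper (not part of Lucene.net): decides whether a token
// begins a new sentence, using only character offsets into the original text.
public class SentenceBoundaryTracker
{
    private string text = "";
    private int previousEnd = 0;

    // Remember the text being highlighted and reset the state.
    public void Start(string originalText)
    {
        text = originalText ?? "";
        previousEnd = 0;
    }

    // Returns true when a sentence terminator lies between the end of the
    // previous token and the start of the current one.
    public bool IsNewFragment(int startOffset, int endOffset)
    {
        bool boundary = false;
        int limit = Math.Min(startOffset, text.Length);
        for (int i = previousEnd; i < limit; i++)
        {
            char c = text[i];
            if (c == '.' || c == '!' || c == '?')
            {
                boundary = true;
                break;
            }
        }
        // Never move backwards, in case tokens overlap.
        previousEnd = Math.Max(previousEnd, endOffset);
        return boundary;
    }
}
```

Inside a custom Fragmenter, IsNewFragment(Token token) could then delegate to this helper with the token's offsets (the snippet above already uses token.EndOffset()). The fixed size of 100 passed to SimpleFragmenter would no longer apply, so very long sentences would yield correspondingly long fragments.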