Question
I have some documents with two fields: text and count.
I've used Lucene to index the documents, and now I want to search in text and get the results sorted by count in descending order. How can I do that?
Recommended answer
By default, Apache Lucene returns search results sorted by score (most relevant first), then by document id (oldest first).
This behavior can be customized at query time with an additional Sort parameter:
TopFieldDocs Searcher#search(Query query, Filter filter, int n, Sort sort)
The Sort parameter specifies the fields or properties used for sorting. The default implementation is defined this way:
new Sort(new SortField[] { SortField.FIELD_SCORE, SortField.FIELD_DOC });
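To make the default ordering concrete, here is a minimal plain-Java sketch of "score first, then document id". It does not call Lucene: the `Hit` class and `orderedDocIds` method are hypothetical names made up for illustration, and the comparator only models the ordering described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of the default ordering; Hit is a hypothetical stand-in, not a Lucene class.
class DefaultOrderSketch {
    static final class Hit {
        final int doc;      // document id (lower = added earlier)
        final float score;  // relevance score
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Highest score first; ties are broken by ascending document id.
    static final Comparator<Hit> BY_SCORE_THEN_DOC =
            Comparator.comparingDouble((Hit h) -> h.score).reversed()
                      .thenComparingInt(h -> h.doc);

    // Returns the document ids in the order the hits would be listed.
    static List<Integer> orderedDocIds(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(BY_SCORE_THEN_DOC);
        List<Integer> ids = new ArrayList<>();
        for (Hit h : sorted) ids.add(h.doc);
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(new Hit(2, 0.5f), new Hit(0, 0.5f), new Hit(1, 0.9f));
        System.out.println(orderedDocIds(hits)); // [1, 0, 2]
    }
}
```

Documents 0 and 2 have the same score, so the document id decides their relative order.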
To change the sort order, replace these fields with the ones you want:
new Sort(new SortField[] {
SortField.FIELD_SCORE,
new SortField("field_1", SortField.STRING),
new SortField("field_2", SortField.STRING) });
This sounds simple, but it will not work unless the following conditions are met:
- You have to specify the type parameter of SortField(String field, int type) for Lucene to find your field, even though it is normally optional.
- The sort fields must be indexed but not tokenized:
document.add(new Field("byNumber", Integer.toString(x), Field.Store.NO, Field.Index.NOT_ANALYZED));
- The content of the sort fields must be plain text only. If even a single element has a special character or accent in one of the fields used for sorting, the whole search returns unsorted results.
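The type parameter matters for the count field in the question, because values indexed as text compare differently as strings than as numbers. The plain-Java sketch below shows the difference for a descending sort. It does not call Lucene, and `CountOrderSketch` and its methods are hypothetical names. In Lucene 3.x the numeric case would presumably correspond to using SortField.INT instead of SortField.STRING and passing reverse = true to the three-argument SortField constructor. Treat that mapping as an assumption and check it against the version you use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration only; it does not call Lucene.
class CountOrderSketch {
    // Descending order when the stored values are compared as strings.
    static List<String> descendingAsStrings(List<String> counts) {
        List<String> sorted = new ArrayList<>(counts);
        sorted.sort(Collections.reverseOrder());
        return sorted;
    }

    // Descending order when the stored values are parsed as integers first.
    static List<String> descendingAsInts(List<String> counts) {
        List<String> sorted = new ArrayList<>(counts);
        sorted.sort(Comparator.comparingInt((String s) -> Integer.parseInt(s)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> counts = Arrays.asList("9", "10", "100", "25");
        System.out.println(descendingAsStrings(counts)); // [9, 25, 100, 10]
        System.out.println(descendingAsInts(counts));    // [100, 25, 10, 9]
    }
}
```

With string comparison, "9" ranks above "100", which is rarely what a sort by count should produce.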
See this tutorial.