
Lucene.Net fuzzy search speed


Question

Sorry to bother you, but I hope to get some help from people with Lucene experience.

We currently use Lucene.Net 3.0.3 in our application to index and search about 2,500,000 items. Each entity contains 27 searchable fields, which are added to the index like this: new Field(key, value, Field.Store.YES, Field.Index.ANALYZED)

We have two search options:

1. Fuzzy search across only 4 fields
2. Exact search across 4 to 27 fields

We have a search service that automatically searches for about 53,000 people every week, such as "Bob Huston", "Sara Conor", "Sujan Hong Uin Ho", etc.

Option 1 is slow: searcher.Search takes 4-8 seconds on average, and this is our main problem.

Sample search code:

    // Open the index read-only and build a parser over the searched fields.
    var index = FSDirectory.Open(indexPath);
    var searcher = new IndexSearcher(index, true);
    // An empty stop-word set means no name token is discarded as a stop word.
    this.analyzer = new StandardAnalyzer(Version.LUCENE_30, new HashSet<string>());
    var queryParser = new MultiFieldQueryParser(Version.LUCENE_30, queryFields, this.analyzer);
    queryParser.AllowLeadingWildcard = false;
    Query query = queryParser.Parse(token);
    var results = searcher.Search(query, NumberOfResults); // NumberOfResults == 500
                  

Our fuzzy search query to find "bob cong hong" in 4 fields:

    (((PersonFirstName:bob~0.6) OR (PersonLastName:bob~0.6) OR (PersonAliases:bob~0.6) OR (PersonAlternativeSpellings:bob~0.6)) AND
     ((PersonFirstName:cong~0.6) OR (PersonLastName:cong~0.6) OR (PersonAliases:cong~0.6) OR (PersonAlternativeSpellings:cong~0.6)) AND
     ((PersonFirstName:hong~0.6) OR (PersonLastName:hong~0.6) OR (PersonAliases:hong~0.6) OR (PersonAlternativeSpellings:hong~0.6)))

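Improvements so far:

1. We merged these 4 fields into 1 search field
2. We decided to use a single IndexSearcher in the service instead of opening one for every search request
3. MergeFactor=2

Combined, these improvements gave roughly a 30-40% speedup.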
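
The second improvement, one long-lived IndexSearcher instead of one per request, could look roughly like the sketch below. SearcherHolder is a hypothetical name, not a Lucene.Net type, and the sketch assumes the index does not change while the service runs: a searcher only sees the index as it was when opened, so it would have to be reopened after new commits.

```csharp
using System;
using Lucene.Net.Search;

// Hypothetical helper (not a Lucene.Net type): creates one IndexSearcher
// on first use and hands the same instance to every caller.
public sealed class SearcherHolder
{
    private readonly Lazy<IndexSearcher> searcher;

    public SearcherHolder(Lucene.Net.Store.Directory directory)
    {
        // Lazy<T> is thread-safe by default, so concurrent first requests
        // still open the index only once. 'true' opens it read-only.
        searcher = new Lazy<IndexSearcher>(() => new IndexSearcher(directory, true));
    }

    public IndexSearcher Searcher
    {
        get { return searcher.Value; }
    }
}
```

A service would create one SearcherHolder at startup, for example from FSDirectory.Open(indexPath), and call holder.Searcher.Search(query, NumberOfResults) for each request. IndexSearcher is safe to share across threads.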

Following this article, we have made most of the possible optimizations:

• The index is stored on a very fast SAS drive: http://accessories.euro.dell.com/sna/productdetail.aspx?c=ie&l=en&s=dhs&cs=iedhs1&sku=400-AHWT#Overview
• We have enough RAM
• Merge factor of 2
• We tried moving the index into a RAMDirectory, but the test results were inconsistent; sometimes the speed was the same

Do you have any other suggestions for improving search speed in our situation?

Thanks.

Recommended answer

You can speed up fuzzy queries by setting their prefix length to a non-zero value. Lucene can then skip every indexed term that does not share that prefix with the query term, which narrows the set of candidate terms efficiently. The trade-off is that a match must agree with the query term exactly on those leading characters. For example:

                  queryParser.FuzzyPrefixLength = 2;
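
The same setting can be applied when the query is built in code rather than parsed. The sketch below assumes the four name fields have already been merged into one field, here hypothetically called PersonNames, and that the caller lowercases the tokens, because a FuzzyQuery built directly is not passed through the analyzer. FuzzyNameQuery and Build are illustrative names, not Lucene.Net API.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical helper: requires every token to fuzzy-match in one field.
public static class FuzzyNameQuery
{
    public static BooleanQuery Build(string field, string[] tokens,
                                     float minSimilarity, int prefixLength)
    {
        var query = new BooleanQuery();
        foreach (var token in tokens)
        {
            // prefixLength > 0 limits candidates to indexed terms that
            // start with the same leading characters as the token.
            var fuzzy = new FuzzyQuery(new Term(field, token), minSimilarity, prefixLength);
            query.Add(fuzzy, Occur.MUST); // MUST corresponds to AND
        }
        return query;
    }
}
```

For the example from the question, this would be called as FuzzyNameQuery.Build("PersonNames", new[] { "bob", "cong", "hong" }, 0.6f, 2). A longer prefix prunes more terms but also misses names whose first characters are misspelled, so the value is worth measuring against real data.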
                  

Also, although it does not affect the example query you provided: queryParser.AllowLeadingWildcard = false; only restates the parser's default, so the line is redundant. If you care at all about performance, make sure it is never set to true, because leading wildcards will absolutely kill performance.
