
                Issue with below snippet on boundary matchers regex (\b)



                Problem description

                My input:

                 1. end 
                 2. end of the day or end of the week 
                 3. endline
                 4. something 
                 5. "something" end
                

                Given the input above, if I replace a single string using this snippet, it successfully removes the matching whole words from each line:

                import java.io.BufferedReader;
                import java.io.File;
                import java.io.FileInputStream;
                import java.io.FileOutputStream;
                import java.io.InputStreamReader;
                import java.io.OutputStreamWriter;
                import java.io.PrintWriter;

                public class DeleteTest {

                    public static void main(String[] args) {
                        try {
                            File file = new File("C:/Java samples/myfile.txt");
                            File temp = File.createTempFile("myfile1", ".txt", file.getParentFile());
                            String delete = "end";
                            BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
                            PrintWriter writer = new PrintWriter(new OutputStreamWriter(new FileOutputStream(temp)));

                            for (String line; (line = reader.readLine()) != null;) {
                                // "\\b" in Java source is the regex word boundary \b,
                                // so only the whole word "end" is removed
                                line = line.replaceAll("\\b" + delete + "\\b", "");
                                writer.println(line);
                            }
                            reader.close();
                            writer.close();
                        } catch (Exception e) {
                            System.out.println("Something went Wrong");
                        }
                    }
                }
                

                My output with the above snippet (which is also my expected output):

                 1.  
                 2. of the day or of the week
                 3. endline
                 4. something
                 5. "something"
                

                But when I want to delete more than one word and use a Set for that purpose, with the code snippet below:

                // Requires the java.io classes used here plus java.util.HashSet and java.util.Set
                public static void main(String[] args) {
                    try {
                        File file = new File("C:/Java samples/myfile.txt");
                        File temp = File.createTempFile("myfile1", ".txt", file.getParentFile());
                        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
                        PrintWriter writer = new PrintWriter(new OutputStreamWriter(new FileOutputStream(temp)));

                        Set<String> toDelete = new HashSet<>();
                        toDelete.add("end");
                        toDelete.add("something");

                        for (String line; (line = reader.readLine()) != null;) {
                            // The Set itself is concatenated into the regex here
                            line = line.replaceAll("\\b" + toDelete + "\\b", "");
                            writer.println(line);
                        }
                        reader.close();
                        writer.close();
                    } catch (Exception e) {
                        System.out.println("Something went Wrong");
                    }
                }
                

                I get this output instead (it only removes the spaces between words):

                 1. end
                 2. endofthedayorendoftheweek
                 3. endline
                 4. something
                 5. "something" end 
                

                Can you help me with this?


                Recommended answer

                You need to create an alternation group out of the set with

                String.join("|", toDelete)
                

                and use it as

                line = line.replaceAll("\\b(?:" + String.join("|", toDelete) + ")\\b", "");
                

                The pattern will look like

                \b(?:end|something)\b
                

                Here, (?:...) is a non-capturing group: it groups the alternatives without storing the matched text in a capture buffer, which you do not need since you remove the matches.
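                To see why the original snippet only removed spaces, it may help to print the regex that concatenating the Set actually produces. The sketch below is illustrative only: the class name AlternationDemo and the helper wholeWords are hypothetical, and a LinkedHashSet is used (an assumption, not part of the question) so that the printed order is predictable.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

final class AlternationDemo {

    // Hypothetical helper: builds \b(?:w1|w2|...)\b from the words in the set.
    // Assumes the words contain no regex metacharacters.
    static Pattern wholeWords(Set<String> words) {
        return Pattern.compile("\\b(?:" + String.join("|", words) + ")\\b");
    }

    public static void main(String[] args) {
        Set<String> toDelete = new LinkedHashSet<>();
        toDelete.add("end");
        toDelete.add("something");

        // String concatenation calls Set.toString(), which yields "[end, something]".
        // Inside a regex that is a character class, not a list of words.
        System.out.println("\\b" + toDelete + "\\b");        // \b[end, something]\b
        System.out.println(wholeWords(toDelete).pattern());  // \b(?:end|something)\b

        String line = "end of the day or end of the week";

        // The character class contains a space, and a space between two words
        // has a word boundary on both sides, so only those spaces are removed.
        System.out.println(line.replaceAll("\\b" + toDelete + "\\b", ""));
        // endofthedayorendoftheweek

        // The alternation removes the whole words instead.
        System.out.println(wholeWords(toDelete).matcher(line).replaceAll(""));
    }
}
```

                With a plain HashSet the order of the alternatives may differ, which does not change what the pattern matches.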

                Or, better, compile the regex once before entering the loop:

                Pattern pat = Pattern.compile("\\b(?:" + String.join("|", toDelete) + ")\\b");
                ...
                    line = pat.matcher(line).replaceAll("");
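                As a rough, self-contained illustration of the compile-once approach, the sketch below applies the pattern to the sample lines from the question held in memory rather than read from a file. The class name PrecompiledDemo and the method deleteWords are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class PrecompiledDemo {

    // Hypothetical helper: compiles the alternation once, then reuses it for every line.
    // Assumes the words contain no regex metacharacters.
    static List<String> deleteWords(List<String> lines, Set<String> toDelete) {
        Pattern pat = Pattern.compile("\\b(?:" + String.join("|", toDelete) + ")\\b");
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            result.add(pat.matcher(line).replaceAll(""));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input taken from the question
        List<String> lines = Arrays.asList(
                "end",
                "end of the day or end of the week",
                "endline",
                "something",
                "\"something\" end");
        Set<String> toDelete = new HashSet<>(Arrays.asList("end", "something"));

        // "endline" is left untouched because "end" is not a whole word there.
        for (String line : deleteWords(lines, toDelete)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

                Calling line.replaceAll(regex, "") recompiles the regex on every call, so hoisting Pattern.compile out of the loop avoids that repeated work on large files.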
                

                Update:

                To allow matching whole "words" that may contain special chars, you need to apply Pattern.quote to those words to escape the special chars, and then you need to use unambiguous word boundaries: a (?<!\w) negative lookbehind instead of the initial \b to make sure there is no word char before the match, and a (?!\w) negative lookahead instead of the final \b to make sure there is no word char after the match.

                In Java 8, you may use this code:

                // Quote every word so that its special chars are matched literally
                Set<String> nToDel = toDelete.stream()
                    .map(Pattern::quote)
                    .collect(Collectors.toCollection(HashSet::new));
                String pattern = "(?<!\\w)(?:" + String.join("|", nToDel) + ")(?!\\w)";
                

                The regex will look like (?<!\w)(?:\Q+end\E|\Qsomething-\E)(?!\w). Note that the symbols between \Q and \E are parsed as literal symbols.
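                The difference between \b and the lookarounds shows up as soon as a word starts or ends with a non-word char. The following is a minimal sketch under stated assumptions: the class QuotedWordsDemo and its two helpers are hypothetical, the words "+end" and "something-" are taken from the regex shown above, and a LinkedHashSet is assumed so the alternatives print in insertion order.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class QuotedWordsDemo {

    // Hypothetical helper: quotes each word, then wraps the alternation in lookarounds.
    static Pattern quotedWholeWords(Set<String> words) {
        String alternation = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile("(?<!\\w)(?:" + alternation + ")(?!\\w)");
    }

    // Hypothetical helper for comparison: the same quoted word between plain \b anchors.
    static Pattern naiveBoundary(String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
    }

    public static void main(String[] args) {
        Set<String> toDelete = new LinkedHashSet<>();
        toDelete.add("+end");
        toDelete.add("something-");

        Pattern pat = quotedWholeWords(toDelete);
        System.out.println(pat.pattern());
        // (?<!\w)(?:\Q+end\E|\Qsomething-\E)(?!\w)

        String line = "start +end stop";

        // \b needs a word char on exactly one side; a space followed by '+'
        // has none, so the \b version finds nothing here.
        System.out.println(naiveBoundary("+end").matcher(line).find());  // false

        // The lookarounds only require that no word char touches the match.
        System.out.println(pat.matcher(line).replaceAll(""));            // start  stop

        // Still a whole-"word" match: a word char directly before or after blocks it.
        System.out.println(pat.matcher("x+end +endline").replaceAll("")); // x+end +endline
    }
}
```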
