How to get node contents from JDOM


Problem description

I'm writing a Java application that uses import org.jdom.*;

My XML is valid, but sometimes it contains HTML tags. For example, something like this:

  <program-title>Anatomy &amp; Physiology</program-title>
  <overview>
       <content>
              For more info click <a href="page.html">here</a>
              <p>Learn more about the human body.  Choose from a variety of Physiology (A&amp;P) designed for complementary therapies.&amp;#160; Online studies options are available.</p>
       </content>
  </overview>
  <key-information>
     <category>Health &amp; Human Services</category>

So my problem is with the <p> tags inside the overview.content node.

I was hoping that this code would work:

        Element overview = sds.getChild("overview");
        Element content = overview.getChild("content");

        System.out.println(content.getText());

But it returns blank.

How do I return all the text (nested tags and all) from the overview.content node?

Thanks

Recommended answer

content.getText() returns only the element's immediate text, so it is really only useful for leaf elements with text content.

The trick is to use org.jdom.output.XMLOutputter (with the compact format, Format.getCompactFormat()).

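import java.io.StringWriter;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

// The class name is a placeholder; the original answer showed only main().
public class ContentDump {
    public static void main(String[] args) throws Exception {
        // Parse the XML file into a JDOM document.
        SAXBuilder builder = new SAXBuilder();
        String xmlFileName = "a.xml";
        Document doc = builder.build(xmlFileName);

        // Navigate to the overview/content element.
        Element root = doc.getRootElement();
        Element overview = root.getChild("overview");
        Element content = overview.getChild("content");

        XMLOutputter outp = new XMLOutputter();

        // Compact format normalizes whitespace; the alternatives are left commented out.
        outp.setFormat(Format.getCompactFormat());
        //outp.setFormat(Format.getRawFormat());
        //outp.setFormat(Format.getPrettyFormat());
        //outp.getFormat().setTextMode(Format.TextMode.PRESERVE);

        // Serialize every child node of content (text and nested elements) as markup.
        StringWriter sw = new StringWriter();
        outp.output(content.getContent(), sw);
        System.out.println(sw.toString());
    }
}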

Output

For more info click<a href="page.html">here</a><p>Learn more about the human body. Choose from a variety of Physiology (A&amp;P) designed for complementary therapies.&amp;#160; Online studies options are available.</p>

Do explore the other formatting options and modify the code above to suit your needs.

From the Format class documentation: "Class to encapsulate XMLOutputter format options. Typical users can use the standard format configurations obtained by getRawFormat() (no whitespace changes), getPrettyFormat() (whitespace beautification), and getCompactFormat() (whitespace normalization)."

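The difference between reading an element's text and serializing its children as markup is not specific to JDOM. As a rough, dependency-free sketch of the same idea, the example below uses the JDK's built-in org.w3c.dom and javax.xml.transform APIs in place of JDOM's XMLOutputter. This is a stand-in for illustration, not the answer's method; the class name InnerXmlSketch, the helper names, and the shortened sample XML are all made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class: JDK DOM stand-in for the JDOM approach.
public class InnerXmlSketch {

    /** Parses the XML string and returns its first <content> element. */
    public static Element parseContent(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return (Element) doc.getElementsByTagName("content").item(0);
    }

    /** Serializes each child node of parent (text and elements) back to markup. */
    public static String innerXml(Element parent) throws Exception {
        // A Transformer created without a stylesheet performs an identity copy.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter sw = new StringWriter();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            t.transform(new DOMSource(n), new StreamResult(sw));
        }
        return sw.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<overview><content>For more info click "
                + "<a href=\"page.html\">here</a><p>Learn more.</p>"
                + "</content></overview>";
        Element content = parseContent(xml);

        // Text of all descendants with the tags dropped:
        // prints "For more info click hereLearn more."
        System.out.println(content.getTextContent());

        // Child nodes serialized as markup, so the <a> and <p> tags survive.
        System.out.println(innerXml(content));
    }
}
```

Note that getTextContent() here gathers the text of all descendants, whereas JDOM's getText() returns only the element's immediate text, so the two calls are not equivalent. The point of the sketch is only that nested tags survive when the child nodes are serialized rather than read as text.
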
