Printing first node

8 replies [Last post]
mohitanchlia
Offline
Joined: 2006-04-24
Points: 0

Here is my XML, just a sample (please don't try to make much sense out of it):

a

b
c
d

I am trying to print value "a". Below is the code:

import java.io.File;
import javax.xml.parsers.*;
import org.w3c.dom.*;

// using DOM
public class Dom {

    public static void main(String... s) {
        try {
            File f = new File("abc.xml");
            if (!f.exists()) { System.out.println("File doesn't exist\n"); return; }
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(f);

            NodeList nodes = doc.getElementsByTagName("ABC");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element titleElem = (Element) nodes.item(i);
                Node childNode = titleElem.getFirstChild();
                System.out.println("Book title is: " + childNode.getNodeValue());
                System.out.println("Book title is: " + childNode.getNodeName());
            }

        } catch (Exception e) { e.printStackTrace(); }

    }
}
-----
getNodeValue prints blank lines. getNodeName prints "#text". I would expect it to print "a".

mohitanchlia
Offline
Joined: 2006-04-24
Points: 0

Have more questions

Thanks. A couple of things I don't understand:
1. Why would I get a whitespace text node, and how?
2. I called setIgnoringElementContentWhitespace(true) but it made no difference. I am using java 1.5.0.03. Is there a bug?
3. I changed my XML to have everything on one line, and now getNodeValue returns null. Contents of the XML file:
--
11111aa11111b11111c11111
--

Message was edited by: mohitanchlia

joehw
Offline
Joined: 2004-12-15
Points: 0

To answer your question:
1. In XML, the carriage return (\r or ch(13)), the linefeed (\n or ch(10)), the tab (\t), and the space (' ') are considered whitespace. When a DOM document is constructed, whitespace between elements is added to the tree as text nodes.

2. Whitespace can be significant, that is, a legitimate part of the XML document, for example in the following element:

street
city, state zipcode

Without a DTD or an XML schema definition, the DOM parser has to treat all whitespace as a legitimate part of the document. That's why I mentioned that the parser needs to be validating for setIgnoringElementContentWhitespace(true) to work. Again, please take a look at this thread: http://forums.java.net/jive/thread.jspa?threadID=37658&tstart=0.

3. Refer to the table in the Node API documentation, which lists the node types for which nodeValue is null:
http://java.sun.com/javase/6/docs/api/org/w3c/dom/Node.html
For element nodes, you can also use getTextContent() to get the text content within the element.
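
Point 1 can be seen directly by listing the children of an element. The sketch below is only an illustration: the XML strings are made up (the element names ABC and DEF are assumptions, since the sample file's markup is not shown in the thread), and the helper names parse and childNames are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class WhitespaceChildren {

    // Parses an XML string with a default (non-validating) parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the node names of the root element's direct children, in order.
    static List<String> childNames(String xml) {
        List<String> names = new ArrayList<>();
        Node root = parse(xml).getDocumentElement();
        for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
            names.add(c.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Line breaks and indentation between the tags become "#text" children.
        System.out.println(childNames("<ABC>\n  <DEF>a</DEF>\n</ABC>"));
        // With everything on one line there is no whitespace to keep.
        System.out.println(childNames("<ABC><DEF>a</DEF></ABC>"));
    }
}
```

With the indented input the first child of ABC is a text node, not DEF, which matches what the original program printed.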

--Joe

Message was edited by: joehw

mohitanchlia
Offline
Joined: 2006-04-24
Points: 0

Thanks. Regarding point 3 above: when I use getTextContent it returns the text value, but when I use getNodeValue it returns null. I don't know why, because according to http://java.sun.com/javase/6/docs/api/org/w3c/dom/Node.html, for a node whose nodeName is "#text" only attributes is supposed to be null.

joehw
Offline
Joined: 2004-12-15
Points: 0

Yes, when you're on an Element node, getTextContent returns the text content of the node and its descendants. getNodeValue returns null because the node is an element, and an element has no value of its own.

For example, if the current node is the element "DEF", getNodeName returns "DEF", getTextContent returns "11111a", and getNodeValue returns null. If the current node is the Text node "11111a", getNodeName returns "#text", and getTextContent and getNodeValue both return "11111a".
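
The difference described above can be checked with a few lines. This is a minimal sketch, assuming a one-element document `<DEF>11111a</DEF>` (the real file's markup is not shown in the thread); the class name and the describe helper are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class ElementVsText {

    // Parses an XML string and returns its root element.
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Formats the three accessors discussed above as "name | value | textContent".
    static String describe(Node n) {
        return n.getNodeName() + " | " + n.getNodeValue() + " | " + n.getTextContent();
    }

    public static void main(String[] args) {
        Element def = root("<DEF>11111a</DEF>");
        System.out.println(describe(def));                 // the Element node
        System.out.println(describe(def.getFirstChild())); // its Text child
    }
}
```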

If your purpose is to display text by element, just walk through all of the nodes, check node type, get node name if it's ELEMENT_NODE, get node value if it's TEXT_NODE.
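
A rough sketch of that walk is below. The sample XML, the element names, and the helper names (parse, walk) are assumptions made for illustration, and skipping whitespace-only text is an extra choice on top of what is described above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class DomWalk {

    // Parses an XML string with a default (non-validating) parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Depth-first walk: record the name of each element and the value of
    // each text node that is not whitespace-only.
    static void walk(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add("element: " + node.getNodeName());
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.add("text: " + text);
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, out);
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        walk(parse("<ABC>\n  <DEF>a</DEF>\n  <GHI>b</GHI>\n</ABC>").getDocumentElement(), lines);
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```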

--Joe

mohitanchlia
Offline
Joined: 2006-04-24
Points: 0

thanks

joehw
Offline
Joined: 2004-12-15
Points: 0

You're welcome.

joehw
Offline
Joined: 2004-12-15
Points: 0

That's because the first child was a whitespace text node. You'd need to check for those text node(s) and skip them.
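
One way to do that check, as a sketch only: loop over the children and return the first one that is not a whitespace-only text node. The helper name firstNonBlankChild and the sample XML are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class SkipWhitespace {

    // Parses an XML string and returns its root element.
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the first child that is not a whitespace-only text node,
    // or null if there is no such child.
    static Node firstNonBlankChild(Node parent) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            boolean blankText = c.getNodeType() == Node.TEXT_NODE
                    && c.getNodeValue().trim().isEmpty();
            if (!blankText) {
                return c;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node first = firstNonBlankChild(root("<ABC>\n  <DEF>a</DEF>\n</ABC>"));
        System.out.println(first.getNodeName() + " = " + first.getTextContent());
    }
}
```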

If you are validating, you can tell the factory to ignore whitespace in element content by calling setIgnoringElementContentWhitespace(true). Refer to thread http://forums.java.net/jive/thread.jspa?threadID=37658&tstart=0.
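
As a sketch of the validating setup, assuming the document carries a DTD: the internal DTD subset and the element names below are invented for illustration and are not the poster's actual file. The DTD declares that ABC contains only elements, which is what lets the parser recognize the surrounding whitespace as ignorable.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class IgnoreWhitespace {

    // Hypothetical sample: ABC has element-only content according to its DTD.
    static final String XML =
            "<?xml version=\"1.0\"?>\n"
          + "<!DOCTYPE ABC [\n"
          + "  <!ELEMENT ABC (DEF)>\n"
          + "  <!ELEMENT DEF (#PCDATA)>\n"
          + "]>\n"
          + "<ABC>\n  <DEF>a</DEF>\n</ABC>";

    // Parses with validation on and element-content whitespace ignored.
    static Document parseValidating(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            factory.setIgnoringElementContentWhitespace(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Node first = parseValidating(XML).getDocumentElement().getFirstChild();
        // The first child should now be the DEF element rather than "#text".
        System.out.println(first.getNodeName() + " = " + first.getTextContent());
    }
}
```

If the file has no DTD, this setting has nothing to work from, and skipping whitespace text nodes by hand is the simpler route.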
