Reputation: 4106
I am trying to parse XML output that contains some Unicode characters. I can't read the complete string inside a tag, only one character of it.
Here is my XML output:
<item>
<id>1</id>
<name>ලොල්</name>
<cost>155</cost>
<description>ලො</description>
</item>
This is the Java code I use to parse the XML string:
public Document getDomElement(String xml) {
    Document doc = null;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource is = new InputSource();
        is.setEncoding("UTF-16");
        is.setCharacterStream(new StringReader(xml));
        doc = db.parse(is);
    } catch (ParserConfigurationException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    } catch (SAXException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    } catch (IOException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    }
    // return DOM
    return doc;
}
When I use normal English characters it gives the complete string.
Upvotes: 0
Views: 3093
Reputation: 4106
This is the code I used to solve my problem.
NodeList idlist = doc.getElementsByTagName(KEY_ID);
NodeList namelist = doc.getElementsByTagName(KEY_NAME);
NodeList costlist = doc.getElementsByTagName(KEY_COST);
NodeList desclist = doc.getElementsByTagName(KEY_DESC);
for (int i = 0; i < idlist.getLength(); i++) {
    Item item = new Item();
    item.setCost(costlist.item(i).getTextContent());
    item.setDescription(desclist.item(i).getTextContent());
    item.setName(namelist.item(i).getTextContent());
    itemarray.add(item);
}
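The post does not show how the values were read before this change, so the following is only a sketch of one common cause, under the assumption that the earlier code read the first child node's value. A DOM parser may split an element's text across several child nodes, and getFirstChild().getNodeValue() then returns only the first piece, while getTextContent() concatenates all of them. The class name TextContentDemo and its helper methods are hypothetical, and the CDATA section is used here only to force the split on a desktop JDK parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original post.
class TextContentDemo {

    // Parses an XML string into a DOM document (default, non-coalescing factory).
    static Document parse(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Reads only the first child node's value, so later text chunks are missed.
    static String firstChildValue(Element e) {
        return e.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        // The CDATA section makes the parser produce two child nodes under <name>.
        String xml = "<item><name>\u0DBD\u0DDC<![CDATA[\u0DBD\u0DCA]]></name></item>";
        Element name = (Element) parse(xml).getElementsByTagName("name").item(0);

        // Length of the first chunk only.
        System.out.println(firstChildValue(name).length());
        // Length of the full text, all chunks concatenated.
        System.out.println(name.getTextContent().length());
    }
}
```

With a stock JDK parser the first length should be smaller than the second, which matches the symptom of getting only part of the string. Whether the Android parser splits the text at the same places is not something this sketch establishes.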
Upvotes: 0
Reputation: 13841
I tried your code and found no problem. When I inspect the nodes with non-English characters, they exist and have the correct number of characters. I can't print them because the font I'm using lacks those glyphs, but value.codePointAt(j) returns the correct code point for each one.
NodeList list = doc.getDocumentElement().getChildNodes();
for (int i = 0; i < list.getLength(); i++) {
    String value = list.item(i).getTextContent();
    for (int j = 0; j < value.length(); j++)
        System.out.print(" " + value.codePointAt(j));
    System.out.println();
}
outputs:
49
3517 3548 3517 3530
49 53 53
3517 3548
which correspond to the decimal representation of your codepoints.
I created the XML string by hand. You already have it in memory, right?
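For anyone who wants to repeat the check, here is a minimal self-contained sketch, assuming the XML is already held in a Java String with no whitespace between the tags. The class name CodePointCheck and the method childCodePoints are hypothetical names chosen for this example. It leaves out the setEncoding("UTF-16") call, because the InputSource documentation states that setEncoding has no effect when a character stream is supplied.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original posts.
class CodePointCheck {

    // Returns one list of code points per child node of the root element.
    static List<List<Integer>> childCodePoints(String xml) {
        try {
            // A StringReader already delivers decoded characters to the parser.
            InputSource is = new InputSource(new StringReader(xml));
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(is);
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<List<Integer>> result = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                String value = children.item(i).getTextContent();
                // codePoints() yields whole code points, so a surrogate pair counts once.
                result.add(value.codePoints().boxed().collect(Collectors.toList()));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Same content as the question's sample, written with Unicode escapes.
        String xml = "<item><id>1</id><name>\u0DBD\u0DDC\u0DBD\u0DCA</name>"
                + "<cost>155</cost><description>\u0DBD\u0DDC</description></item>";
        for (List<Integer> points : childCodePoints(xml)) {
            System.out.println(points);
        }
    }
}
```

If the lists printed for name and description contain 4 and 2 code points respectively, the parser is reading the full strings, and the truncation is more likely to come from how the values are read or displayed afterwards.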
Upvotes: 1