Reputation: 1253
I want to display a list of entity names and values in a C#/.NET 4.0 application.
I can retrieve the entity names easily enough using XmlDocument.DocumentType.Entities, but is there a good way to retrieve the values of those entities?
I noticed that I can retrieve the value of text-only entities using InnerText, but this doesn't work for entities that contain XML tags.
Is resorting to a regex the best way?
Let's say that I have a document like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document [
<!ENTITY test "<para>only a test</para>">
<!ENTITY wwwc "World Wide Web Corporation">
<!ENTITY copy "©">
]>
<document>
<!-- The following image is the World Wide Web Corporation logo. -->
<graphics image="logo" alternative="&wwwc; Logo"/>
</document>
I want to present a list to the user containing the three entity names (test, wwwc, and copy), along with their values (the text in quotes following the name). I had not thought through the question of entities nested within other entities, so I would be interested in a solution that either completely expands the entity values or shows the text just as it is in the quotes.
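For reference, the part that already works can be sketched roughly as follows. This is only a minimal sketch of the name lookup described above, not a solution: the names EntityLister and ListEntities are made up for illustration, and the document is assumed to be available as a string rather than a file.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class EntityLister
{
    // Hypothetical helper: maps each declared entity name to its InnerText.
    // InnerText yields text only, so tags inside an entity value are not shown.
    public static Dictionary<string, string> ListEntities(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var result = new Dictionary<string, string>();
        if (doc.DocumentType == null)
        {
            return result; // no DOCTYPE, so no declared entities
        }

        foreach (XmlNode node in doc.DocumentType.Entities)
        {
            result[node.Name] = node.InnerText;
        }
        return result;
    }
}
```

With the sample document this should give the entity names as keys, but the value for test would lack the para tags, which is the gap I am trying to close.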
Upvotes: 3
Views: 4157
Reputation: 5245
I ran into problems using the accepted solution. In particular:
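// Requires: using System.Linq; using System.Xml; using System.Xml.Linq;
private IEnumerable<KeyValuePair<string, string>> AllEntityExpansions(XmlDocument doc)
{
    var entities = doc.DocumentType.Entities;
    foreach (var entity in entities.OfType<XmlEntity>()
        .OrderBy(e => e.Name, StringComparer.OrdinalIgnoreCase))
    {
        var xmlString = default(string);
        try
        {
            // Wrap a reference to the entity in a throwaway <e> element so that
            // the node reader expands it, then serialize it via LINQ to XML.
            var element = doc.CreateElement("e");
            element.AppendChild(doc.CreateEntityReference(entity.Name));
            using (var r = new XmlNodeReader(element))
            {
                var elem = XElement.Load(r);
                xmlString = elem.ToString();
            }
        }
        catch (XmlException) { } // skip entities that cannot be expanded

        // Strip the leading "<e>" (3 characters) and trailing "</e>" (4 characters).
        if (xmlString?.Length > 7)
            yield return new KeyValuePair<string, string>(entity.Name, xmlString.Substring(3, xmlString.Length - 7));
    }
}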
Upvotes: 0
Reputation: 1253
Although it’s not likely the most elegant solution possible, I came up with something that seems to work well enough for my purposes. First, I parsed the original document and retrieved the entity nodes from that document. Then I created a small in-memory XML document, to which I added all the entity nodes. Next, I added entity references to all of the entities within the temporary XML. Finally, I retrieved the InnerXml from all of the references.
Here's some sample code:
// parse the original document and retrieve its entities
XmlDocument parsedXmlDocument = new XmlDocument();
XmlUrlResolver resolver = new XmlUrlResolver();
resolver.Credentials = CredentialCache.DefaultCredentials;
parsedXmlDocument.XmlResolver = resolver;
parsedXmlDocument.Load(path);
// create a temporary xml document with all the entities and add references to them
// the references can then be used to retrieve the value for each entity
XmlDocument entitiesXmlDocument = new XmlDocument();
XmlDeclaration dec = entitiesXmlDocument.CreateXmlDeclaration("1.0", null, null);
entitiesXmlDocument.AppendChild(dec);
XmlDocumentType newDocType = entitiesXmlDocument.CreateDocumentType(parsedXmlDocument.DocumentType.Name, parsedXmlDocument.DocumentType.PublicId, parsedXmlDocument.DocumentType.SystemId, parsedXmlDocument.DocumentType.InternalSubset);
entitiesXmlDocument.AppendChild(newDocType);
XmlElement root = entitiesXmlDocument.CreateElement("xmlEntitiesDoc");
entitiesXmlDocument.AppendChild(root);
XmlNamedNodeMap entitiesMap = entitiesXmlDocument.DocumentType.Entities;
// build a dictionary of entity names and values
Dictionary<string, string> entitiesDictionary = new Dictionary<string, string>();
for (int i = 0; i < entitiesMap.Count; i++)
{
    XmlElement entityElement = entitiesXmlDocument.CreateElement(entitiesMap.Item(i).Name);
    XmlEntityReference entityRefElement = entitiesXmlDocument.CreateEntityReference(entitiesMap.Item(i).Name);
    entityElement.AppendChild(entityRefElement);
    root.AppendChild(entityElement);
    if (!string.IsNullOrEmpty(entityElement.ChildNodes[0].InnerXml))
    {
        // do not add parameter entities or invalid entities
        // this can be determined by checking for an empty string
        entitiesDictionary.Add(entitiesMap.Item(i).Name, entityElement.ChildNodes[0].InnerXml);
    }
}
Upvotes: 2
Reputation: 44916
You can easily display a representation of an XML document simply by walking the tree recursively.
This small class happens to use a Console, but you could easily modify it to your needs.
public static class XmlPrinter {
    private const Int32 SpacesPerIndent = 3;

    public static void Print(XDocument xDocument) {
        if (xDocument == null) {
            Console.WriteLine("No XML Document Provided");
            return;
        }
        PrintElementRecursive(xDocument.Root);
    }

    private static void PrintElementRecursive(XElement element, Int32 indentationLevel = 0) {
        if (element == null) return;
        PrintIndentation(indentationLevel);
        PrintElement(element);
        PrintNewline();
        foreach (var xAttribute in element.Attributes()) {
            PrintIndentation(indentationLevel + 1);
            PrintAttribute(xAttribute);
            PrintNewline();
        }
        foreach (var xElement in element.Elements()) {
            PrintElementRecursive(xElement, indentationLevel + 1);
        }
    }

    private static void PrintAttribute(XAttribute xAttribute) {
        if (xAttribute == null) return;
        Console.Write("[{0}] = \"{1}\"", xAttribute.Name, xAttribute.Value);
    }

    private static void PrintElement(XElement element) {
        if (element == null) return;
        Console.Write("{0}", element.Name);
        if (!String.IsNullOrWhiteSpace(element.Value))
            Console.Write(" : {0}", element.Value);
    }

    private static void PrintIndentation(Int32 level) {
        Console.Write(new String(' ', level * SpacesPerIndent));
    }

    private static void PrintNewline() {
        Console.Write(Environment.NewLine);
    }
}
Using the class is trivial. Here is an example that prints out your current config file:
static void Main(string[] args) {
    XmlPrinter.Print(XDocument.Load(
        ConfigurationManager.OpenExeConfiguration(ConfigurationUserLevel.None).FilePath
    ));
    Console.ReadKey();
}
Try it for yourself, and you should be able to quickly modify it to get what you want.
Upvotes: 0
Reputation: 2256
This is one way (untested). It uses XmlReader and that class's ResolveEntity() method:
private Dictionary<string, string> GetEntities(XmlReader xr)
{
    Dictionary<string, string> entityList = new Dictionary<string, string>();
    while (xr.Read())
    {
        HandleNode(xr, entityList);
    }
    return entityList;
}

StringBuilder sbEntityResolver;
int extElementIndex = 0;
int resolveEntityNestLevel = -1;
string dtdCurrentTopEntity = "";

private void HandleNode(XmlReader inReader, Dictionary<string, string> entityList)
{
    if (inReader.NodeType == XmlNodeType.Element)
    {
        if (resolveEntityNestLevel < 0)
        {
            while (inReader.MoveToNextAttribute())
            {
                HandleNode(inReader, entityList); // for namespaces
                while (inReader.ReadAttributeValue())
                {
                    HandleNode(inReader, entityList); // recursive for resolving entity refs in attributes
                }
            }
        }
        else
        {
            extElementIndex++;
            sbEntityResolver.Append(inReader.ReadOuterXml());
            resolveEntityNestLevel--;
            if (!entityList.ContainsKey(dtdCurrentTopEntity))
            {
                entityList.Add(dtdCurrentTopEntity, sbEntityResolver.ToString());
            }
        }
    }
    else if (inReader.NodeType == XmlNodeType.EntityReference)
    {
        if (inReader.Name[0] != '#' && !entityList.ContainsKey(inReader.Name))
        {
            if (resolveEntityNestLevel < 0)
            {
                sbEntityResolver = new StringBuilder(); // start building entity
                dtdCurrentTopEntity = inReader.Name;
            }
            // entityReference can have contents that contains other
            // entityReferences, so keep track of nest level
            resolveEntityNestLevel++;
            inReader.ResolveEntity();
        }
    }
    else if (inReader.NodeType == XmlNodeType.EndEntity)
    {
        resolveEntityNestLevel--;
        if (resolveEntityNestLevel < 0)
        {
            if (!entityList.ContainsKey(dtdCurrentTopEntity))
            {
                entityList.Add(dtdCurrentTopEntity, sbEntityResolver.ToString());
            }
        }
    }
    else if (inReader.NodeType == XmlNodeType.Text)
    {
        if (resolveEntityNestLevel > -1)
        {
            sbEntityResolver.Append(inReader.Value);
        }
    }
}
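Note that the code above only sees anything if the reader reports EntityReference nodes instead of silently expanding them. A minimal sketch of one way to get such a reader, assuming XmlTextReader with EntityHandling.ExpandCharEntities is acceptable for your input (the names EntityReferenceProbe and ReferencedEntityNames are hypothetical, and this only collects the names of entities referenced in element content):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class EntityReferenceProbe
{
    // Hypothetical helper: returns the names of general entities that are
    // referenced in element content, in document order.
    public static List<string> ReferencedEntityNames(string xml)
    {
        var names = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            // Leave general entities unexpanded so they are reported as
            // EntityReference nodes; character entities are still expanded.
            reader.EntityHandling = EntityHandling.ExpandCharEntities;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.EntityReference)
                {
                    names.Add(reader.Name);
                }
            }
        }
        return names;
    }
}
```

A reader configured like this could then be passed to GetEntities. Keep in mind that this approach only finds entities that are actually referenced somewhere in the document, not every entity declared in the DTD.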
Upvotes: 1
Reputation: 721
If you have an XmlDocument object, perhaps it would be easier to recursively step through each XmlNode object (from XmlDocument.ChildNodes), and for each node you can use the Name property to get the name of the node. Then "getting the value" depends on what you want (InnerXml for a string representation, ChildNodes for programmatic access to the XmlNode objects, which can be cast to XmlEntity/XmlAttribute/XmlText).
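A rough sketch of that recursive walk, in case it helps. The names NodeWalker, Walk, and Describe are made up for this example, and it only records each node's type and Name; swap in InnerXml or whatever value you need.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class NodeWalker
{
    // Records "indent + NodeType + Name" for the node and all its descendants.
    public static void Walk(XmlNode node, int depth, List<string> lines)
    {
        lines.Add(new string(' ', depth * 2) + node.NodeType + " " + node.Name);
        foreach (XmlNode child in node.ChildNodes)
        {
            Walk(child, depth + 1, lines);
        }
    }

    // Loads the XML from a string and walks every top-level node.
    public static List<string> Describe(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var lines = new List<string>();
        foreach (XmlNode child in doc.ChildNodes)
        {
            Walk(child, 0, lines);
        }
        return lines;
    }
}
```

One caveat: as far as I know, entity declarations are exposed through DocumentType.Entities rather than through ChildNodes, so a walk like this reaches the DOCTYPE node itself but not the individual entity nodes.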
Upvotes: 0