dmark

Reputation: 320

Is there a well-defined way to measure size and/or complexity of XML files?

Lines of code (LOC) is one of the most widely used metrics for measuring the size of program source code, and it works well for Java or C. However, in one of our current research projects, we need to measure the size of code in XML files. LOC does not seem to be a good fit for this purpose because the XML format is so flexible: the same document can be written on a single line or spread over many.

I was wondering whether there is a good way to measure the size or complexity of XML code. I have searched online, and most published research focuses on defining the complexity of XML schemas or DTDs rather than of XML files themselves, for example: Metrics for XML Document Collections

I also found tools and libraries that can count or list nodes or elements by tag name, for example: Counting number of element in xml file and Simplest way to get XML node count

However, our research does not care about names of tags or elements. We only need a well-defined metric to measure size or complexity of code in XML files, especially Android layout files and AndroidManifest.xml files.

Upvotes: 1

Views: 1410

Answers (1)

kjhughes

Reputation: 111686

Well-defined ways to measure XML files

Size

  • XML file byte count
  • Text content character count
  • {Element|Attribute|DOM node} count
  • Aggregates of above measures
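
As a rough illustration of the size measures above, here is a minimal Python sketch using only the standard library's xml.etree.ElementTree. The function name xml_size_metrics and the sample layout are hypothetical, chosen for this example. It assumes that whitespace-only text should not count as text content and that namespace declarations (xmlns:...) are not counted as attributes, which is how ElementTree reports them.

```python
import xml.etree.ElementTree as ET


def xml_size_metrics(data: bytes) -> dict:
    """Compute simple size measures for one XML document (hypothetical helper)."""
    root = ET.fromstring(data)
    elements = list(root.iter())  # all elements, root included
    return {
        # Raw size of the serialized file.
        "bytes": len(data),
        # Characters of text content, ignoring surrounding whitespace
        # (an assumption: indentation is treated as formatting, not content).
        "text_chars": sum(len(t.strip()) for t in root.itertext()),
        "elements": len(elements),
        # ElementTree does not expose xmlns declarations in .attrib,
        # so they are not included in this count.
        "attributes": sum(len(e.attrib) for e in elements),
    }


# Made-up Android-style layout used only for demonstration.
SAMPLE = (
    b'<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android" '
    b'android:orientation="vertical">'
    b'<TextView android:text="Hi"/>'
    b'<Button android:text="OK">Go</Button>'
    b'</LinearLayout>'
)

if __name__ == "__main__":
    print(xml_size_metrics(SAMPLE))
```

Unlike LOC, the element, attribute and text counts do not change when the file is re-indented or its line breaks are moved; only the byte count does.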

Complexity

  • Unique {element|attribute} name count
  • Maximum or average {depth|width} of element tree hierarchy
  • Directed Acyclic Graph measures for ID/IDREF DAG structures
  • Size of smallest schema that would validate the XML
    • Limited to a specific schema standard {XSD|DTD|RelaxNG|...}
    • Limited to a specific schema feature subset (e.g., no xsd:any, ...)
  • Kolmogorov complexity of XML file as a string
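
Some of the complexity measures above can be sketched in Python with the standard library. This is only one possible reading of the list, under stated assumptions: the name xml_complexity_metrics is hypothetical, "width" is taken to mean the largest number of direct children of any element, and the zlib-compressed size stands in for Kolmogorov complexity, which is not computable, so the compressed size is only a crude upper-bound proxy. The ID/IDREF and smallest-schema measures are not covered.

```python
import xml.etree.ElementTree as ET
import zlib


def xml_complexity_metrics(data: bytes) -> dict:
    """Compute simple structural complexity measures (hypothetical helper)."""
    root = ET.fromstring(data)
    element_names, attribute_names = set(), set()
    depths = []
    max_width = 0

    # Iterative depth-first walk; the root element has depth 1.
    stack = [(root, 1)]
    while stack:
        elem, depth = stack.pop()
        element_names.add(elem.tag)
        attribute_names.update(elem.attrib)  # namespace-qualified names
        depths.append(depth)
        # Assumption: width = number of direct children of an element.
        max_width = max(max_width, len(elem))
        stack.extend((child, depth + 1) for child in elem)

    return {
        "unique_element_names": len(element_names),
        "unique_attribute_names": len(attribute_names),
        "max_depth": max(depths),
        "avg_depth": sum(depths) / len(depths),
        "max_width": max_width,
        # Proxy only: true Kolmogorov complexity cannot be computed.
        "compressed_bytes": len(zlib.compress(data, 9)),
    }
```

Because tag names only enter through the unique-name counts, the depth, width and compressed-size figures can be reported on their own if names are irrelevant to the study.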

Upvotes: 1
