milad moradi

Reputation: 1

How can I find all possible element hierarchies in an XSD file?

I am working with some domain-specific GML files, which are XML files that can store the geometries of spatial features in the GIS domain.

I have these two XSD files that describe the S-100 data model of the International Hydrographic Organization (IHO):

https://schemas.s100dev.net/schemas/S100/5.0.0/S100GML/20220620/S100_gmlProfile.xsd

https://schemas.s100dev.net/schemas/S100/5.0.0/S100GML/20220620/s100gmlbase.xsd

In these files, the permitted child elements, which are defined through extension and sequence tags, are sometimes inherited from other complex types or abstract classes.

I want to derive a tree-like model that shows only the hierarchy of elements. For example, I want to see that tag1 can have tag2 and tag3 as children, that tag2 can in turn have ...

That is, I want all inheritance to be applied to the possible children, and also to the attributes, all the way down to the elements.

Then I want a diagram or a tree showing every possible path from the root to the leaf elements.

I have access to an Enterprise Architect (EA) license. What is the best way, with or without EA, to create that diagram or tree?

Something similar to what we see in this example:

https://sparxsystems.com/enterprise_architect_user_guide/16.1/modeling_domains/abstract_xsd_models.html

That is, going from the XSD to a hierarchy of elements only, with all inheritance applied.

I imported the XSD files into EA, but the resulting diagram displays everything, including all abstract classes and enumerations.

I want to see only the elements, with all inheritance applied.

I have also tried some online tools, such as https://myxml.in/xsd-treeview.html

but it also shows abstract classes, types, and everything else.

This is an example of a valid hierarchy of S100 geometry tags:

<geometry>
    <s100:surfaceProperty xlink:href="#s94">
        <s100:Surface srsName="http://www.opengis.net/def/crs/EPSG/0/4326" gml:id="s94">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:Ring>
                            <gml:curveMember xlink:href="#cc142">
                                <s100:CompositeCurve srsName="http://www.opengis.net/def/crs/EPSG/0/4326" gml:id="cc142">
                                    <gml:curveMember xlink:href="#oc222">
                                        <s100:OrientableCurve srsName="http://www.opengis.net/def/crs/EPSG/0/4326" gml:id="oc222" orientation="+">
                                            <gml:baseCurve xlink:href="#c212">
                                                <s100:Curve srsName="http://www.opengis.net/def/crs/EPSG/0/4326" gml:id="c212">
                                                    <gml:segments>
                                                        <gml:LineStringSegment>
                                                            <gml:posList></gml:posList>
                                                        </gml:LineStringSegment>
                                                    </gml:segments>
                                                </s100:Curve>
                                            </gml:baseCurve>
                                        </s100:OrientableCurve>
                                    </gml:curveMember>
                                </s100:CompositeCurve>
                            </gml:curveMember>
                        </gml:Ring>
                    </gml:exterior>
                    <gml:interior>
                        <gml:Ring>
                            <gml:curveMember xlink:href="#cc143">
                                <s100:CompositeCurve srsName="http://www.opengis.net/def/crs/EPSG/0/4326" gml:id="cc143">
                                    <gml:curveMember xlink:href="#oc223">
                                        <s100:OrientableCurve srsName="http://www.opengis.net/def/crs/EPSG/0/4326" gml:id="oc223" orientation="-">
                                            <gml:baseCurve xlink:href="#c213">
                                                <s100:Curve srsName="http://www.opengis.net/def/crs/EPSG/0/4326" gml:id="c213">
                                                    <gml:segments>
                                                        <gml:LineStringSegment>
                                                            <gml:posList></gml:posList>
                                                        </gml:LineStringSegment>
                                                    </gml:segments>
                                                </s100:Curve>
                                            </gml:baseCurve>
                                        </s100:OrientableCurve>
                                    </gml:curveMember>
                                </s100:CompositeCurve>
                            </gml:curveMember>
                        </gml:Ring>
                    </gml:interior>
                </gml:PolygonPatch>
            </gml:patches>
        </s100:Surface>
    </s100:surfaceProperty>
</geometry>

To read these files, I need to know the possible hierarchies. That is why I need to build a tree from the schema, or otherwise find all the hierarchies.


Upvotes: 0

Views: 62

Answers (1)

Michael Kay

Reputation: 163262

It's a tough challenge. You don't mention some of the problems: there can be wildcards, and substitution groups, and types derived by extension, and of course complex types can be recursive. And that's without considering xs:redefine.

You're probably best off trying to find a schema processor that offers an API giving access to the compiled schema model. Saxon [my company's product] allows you to dive in and extract such information, though it's not a well-documented public API. For example, from the Configuration object you can find a complex type (class UserComplexType), and that class has a method gatherAllPermittedChildren which allows you to determine the names of all the elements that can appear in the content model, with caveats about wildcards.

That's a bit rough-and-ready because these elements might be globally or locally declared, so knowing the element name isn't enough to determine the element's type and therefore its permitted children. Ideally you would go from the complex type to its particles, and from the particles to the element declarations, which would give you the types of the children.
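To make the "complex type, then particles, then element declarations" walk concrete, here is a minimal Python sketch using only the standard library. It is not Saxon's or Xerces' API and it is nowhere near a real schema processor: the toy schema, the type names and the helper names (child_decls, tree, render) are all made up for illustration, and it assumes a single schema document with no target namespace, no wildcards, no substitution groups, no xs:group references and no anonymous nested types. It only shows the idea of inheriting children from an extension base, resolving ref= to a global declaration, and stopping on recursive types.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical toy schema (NOT the S-100 schemas): CurveType extends an abstract base,
# and SegmentsType refers back to Curve, so the content model is recursive.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="geometry" type="GeometryType"/>
  <xs:element name="Curve" type="CurveType"/>
  <xs:complexType name="GeometryType">
    <xs:sequence><xs:element ref="Curve"/></xs:sequence>
  </xs:complexType>
  <xs:complexType name="AbstractCurveType" abstract="true">
    <xs:sequence><xs:element name="description" type="xs:string"/></xs:sequence>
  </xs:complexType>
  <xs:complexType name="CurveType">
    <xs:complexContent>
      <xs:extension base="AbstractCurveType">
        <xs:sequence><xs:element name="segments" type="SegmentsType"/></xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="SegmentsType">
    <xs:sequence>
      <xs:element name="posList" type="xs:string"/>
      <xs:element ref="Curve" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

root = ET.fromstring(SCHEMA)
# Global (top-level) complex types and element declarations, indexed by name.
TYPES = {t.get("name"): t for t in root.findall(XS + "complexType")}
ELEMENTS = {e.get("name"): e for e in root.findall(XS + "element")}


def child_decls(type_name):
    """Element declarations allowed by a named complex type, inherited ones first."""
    ctype = TYPES.get(type_name)
    if ctype is None:  # built-in, simple, or unknown type: no element children
        return []
    decls = []
    ext = ctype.find(f"{XS}complexContent/{XS}extension")
    scope = ctype
    if ext is not None:
        decls += child_decls(ext.get("base"))  # children inherited from the base type
        scope = ext
    decls += scope.iter(XS + "element")  # particles declared on this type itself
    return decls


def tree(elem_name, type_name, seen=()):
    """Nested (name, children) tuples; a type already on the current path is not re-expanded."""
    if type_name in seen:
        return (elem_name + " (recursive)", [])
    children = []
    for decl in child_decls(type_name):
        if decl.get("ref"):  # reference to a global element declaration
            decl = ELEMENTS[decl.get("ref")]
        children.append(tree(decl.get("name"), decl.get("type"), seen + (type_name,)))
    return (elem_name, children)


def render(node, depth=0):
    """Flatten the nested tuples into indented text lines."""
    name, children = node
    lines = ["  " * depth + name]
    for child in children:
        lines += render(child, depth + 1)
    return lines


print("\n".join(render(tree("geometry", "GeometryType"))))
```

In this toy case the abstract type never shows up as a node: its description child simply appears under Curve, ahead of segments, and the second Curve is marked as recursive instead of being expanded forever. For the real S-100 and GML schemas, namespaces, imports, substitution groups and wildcards all matter, which is why a schema processor's compiled model is the safer starting point.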

As an alternative to using the Java API, Saxon also allows you to generate an SCM file, which is an XML representation of the schema component model that you can then process with XSLT or XQuery.

I don't know the Enterprise Architect product.

I believe Xerces also has some kind of API giving access to the schema structure, but I'm not familiar with the details.

Upvotes: 0
