Egon Willighagen

Reputation: 1690

How to automatically extract Shape Expressions from RDF triples?

I've started using Shape Expressions (ShEx) to describe Resource Description Framework (RDF) models. How can I extract a draft ShEx schema from an RDF sample such as the following?

<http://identifiers.org/ensembl/ENSG00000174358>
        a                   wp:DataNode , wp:GeneProduct ;
        rdfs:label          "SLC6A19"^^xsd:string ;
        dc:identifier       <http://identifiers.org/ensembl/ENSG00000174358> ;
        dc:source           "Ensembl"^^xsd:string ;
        dcterms:identifier  "ENSG00000174358"^^xsd:string ;
        dcterms:isPartOf    <http://rdf.wikipathways.org/Pathway/WP4846_r111364/Complex/dca52> , <http://identifiers.org/wikipathways/WP4846_r111364> ;
        wp:bdbEnsembl       <http://identifiers.org/ensembl/ENSG00000174358> ;
        wp:bdbEntrezGene    <http://identifiers.org/ncbigene/340024> ;
        wp:bdbHgncSymbol    <http://identifiers.org/hgnc.symbol/SLC6A19> ;
        wp:bdbUniprot       <http://identifiers.org/uniprot/E9PD72> , <http://identifiers.org/uniprot/Q695T7> ;
        wp:isAbout          <http://rdf.wikipathways.org/Pathway/WP4846_r111364/DataNode/b57e7> .

Upvotes: 1

Views: 140

Answers (1)

You could use sheXer. There is an online demo available at http://shexer.weso.es/. Your example won't work as is because its prefixes are undefined, but it should work once you add the prefix declarations.
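If you would rather script the prefix fix than edit the sample by hand, here is a minimal sketch using only the Python standard library. It is not part of sheXer. The helper names (`used_prefixes`, `with_prefixes`) are made up for illustration, and the namespace IRIs in `PREFIXES` are assumptions (the `wp:` one in particular), so check them against your own data. The regex scan is a rough heuristic, not a Turtle parser.

```python
import re

# Assumed namespace IRIs for the prefixes used in the question's sample.
# The wp: IRI in particular is an assumption; verify it against your dataset.
PREFIXES = {
    "wp": "http://vocabularies.wikipathways.org/wp#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def used_prefixes(turtle):
    """Return the prefix names that occur as prefix:local in the snippet."""
    # Drop full IRIs and string literals first so their colons are ignored.
    stripped = re.sub(r"<[^>]*>", "", turtle)
    stripped = re.sub(r'"[^"]*"', "", stripped)
    return set(re.findall(r"(?<![\w:])([A-Za-z][\w-]*):[A-Za-z_]", stripped))


def with_prefixes(turtle, prefixes=PREFIXES):
    """Prepend an @prefix declaration for every prefix the snippet uses."""
    needed = sorted(used_prefixes(turtle))
    missing = [p for p in needed if p not in prefixes]
    if missing:
        raise KeyError(f"no namespace known for prefixes: {missing}")
    header = "\n".join(f"@prefix {p}: <{prefixes[p]}> ." for p in needed)
    return header + "\n\n" + turtle


# Shortened version of the sample from the question.
sample = '''<http://identifiers.org/ensembl/ENSG00000174358>
        a                   wp:DataNode , wp:GeneProduct ;
        rdfs:label          "SLC6A19"^^xsd:string ;
        dc:source           "Ensembl"^^xsd:string ;
        dcterms:identifier  "ENSG00000174358"^^xsd:string .
'''

print(with_prefixes(sample))
```

The printed text can then be pasted into the demo's input box.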

sheXer, by default, builds a shape for each class in the graph provided. If you want to get the shape of a single node, you may want to mark "Shape map" in the "Target Shapes" section, and provide something like this: <http://identifiers.org/ensembl/ENSG00000174358>@<ShapeLabelForYourNode>.
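To make the shape map format concrete, here is a small Python sketch that builds such an entry. The function name `shape_map_entry` and the label `GeneProductShape` are hypothetical, chosen only for illustration. You can use any label you like for the shape.

```python
def shape_map_entry(node_iri, shape_label):
    """Build a shape map entry of the form <node IRI>@<shape label>."""
    return f"<{node_iri}>@<{shape_label}>"


# The node IRI comes from the question; the shape label is a made-up example.
entry = shape_map_entry(
    "http://identifiers.org/ensembl/ENSG00000174358",
    "GeneProductShape",
)
print(entry)
```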

You can find instructions at the bottom of the page. The Python library that the demo uses can be found in the sheXer repository. I am the main developer and maintainer. Please contact me if you find any issues.

Upvotes: 3
