Howard Dierking

Reputation: 1224

How to query for cyclical relationships in an RDF graph?

I have the following sample graph of services that includes dependency information:

@base <https://meta.acme.com/> .
@prefix : <http://schema.meta.acme.com/> .
@prefix dc: <http://purl.org/dc/terms/> .

</service/c84acffd-944a-43c1-8f06-956a4a6033da>
  a :Service ;
  dc:title "Service 1" ;
  :serviceDependency 
    </service/70987802-9157-4881-ab0c-049b04b7798d>, 
    </service/2b47109e-26e3-4d06-b245-98730bdb7d43>.

</service/70987802-9157-4881-ab0c-049b04b7798d>
  a :Service ;
  dc:title "Service 2" ;
  :serviceDependency 
    </service/c84acffd-944a-43c1-8f06-956a4a6033da>, 
    </service/f45b998c-b496-4318-be40-46c1aafaf6cd> .

</service/2b47109e-26e3-4d06-b245-98730bdb7d43>
  a :Service ;
  dc:title "Service 3" ;
  :serviceDependency </service/> .

</service/f45b998c-b496-4318-be40-46c1aafaf6cd>
  a :Service ;
  dc:title "Service 4" ;
  :serviceDependency </service/> .

I'm writing a SPARQL query to find cycles in the dependency relationships. My query produces correct results, but it yields pseudo-duplicates: each cycle is reported once per direction.

For example:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix c: <http://schema.meta.acme.com/>
prefix dc: <http://purl.org/dc/terms/>

select ?a ?aname ?b ?bname
where 
{ 
  {
      ?a a c:Service ;
        dc:title ?aname ;
        c:serviceDependency ?b .

      ?b dc:title ?bname .

  } filter ( EXISTS { ?b c:serviceDependency ?a } )
}

yields the following output:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| a                                                                    | aname       | b                                                                    | bname       |
===========================================================================================================================================================================
| <https://meta.acme.com/service/70987802-9157-4881-ab0c-049b04b7798d> | "Service 2" | <https://meta.acme.com/service/c84acffd-944a-43c1-8f06-956a4a6033da> | "Service 1" |
| <https://meta.acme.com/service/c84acffd-944a-43c1-8f06-956a4a6033da> | "Service 1" | <https://meta.acme.com/service/70987802-9157-4881-ab0c-049b04b7798d> | "Service 2" |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

This is as expected. However, I want the query to report the cycle itself once, not both sides of the cycle.

My idea for solving this is to compute an identifier by sorting and concatenating the two IDs, then grouping by that identifier. Is there a more natural way to do this in SPARQL?

Thanks!

Upvotes: 0

Views: 264

Answers (1)

TallTed

Reputation: 9444

Are these always one-step loops?

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix c: <http://schema.meta.acme.com/>
prefix dc: <http://purl.org/dc/terms/>

select ?a ?aname ?b ?bname
where 
  {
    ?a a                   c:Service ;
       dc:title            ?aname ;
       c:serviceDependency ?b .

    ?b dc:title            ?bname ;
       c:serviceDependency ?a 
  }
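
Note that a pattern like this still matches each two-service cycle twice, once per direction. One common way to keep a single row per cycle, assuming it is acceptable to compare the service IRIs as plain strings, is to add a filter that imposes an order on the pair. A minimal sketch, not tested against your store:

```sparql
prefix c: <http://schema.meta.acme.com/>
prefix dc: <http://purl.org/dc/terms/>

select ?a ?aname ?b ?bname
where
  {
    ?a a                   c:Service ;
       dc:title            ?aname ;
       c:serviceDependency ?b .

    ?b dc:title            ?bname ;
       c:serviceDependency ?a .

    # Keep only the direction in which ?a's IRI sorts before ?b's,
    # so each mutual dependency is reported once.
    filter ( str(?a) < str(?b) )
  }
```

With the sample data, this should leave only the row where ?a is "Service 2" and ?b is "Service 1", because the IRI ending in 70987802-... sorts before the one ending in c84acffd-.... Because the comparison is strict, a service that lists itself as a dependency would be dropped as well; use a separate pattern if you need to catch that case.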

Upvotes: 0
