Peter Krauss

Reputation: 13930

How to filter label by regular expression?

The regex filter clause FILTER(!REGEX(STR(?aLabel), "^Q[0-9]+$")) does not work as expected: it is either ignored or removes every item. How can I filter by label?


Real case

SELECT ?a ?aLabel ?lat ?long WHERE {
  ?a wdt:P31 wd:Q274393 .   # bakery or school or etc.
  ?a p:P625 ?statement .    # that has coordinate-location statement

  ?statement psv:P625 ?coordinate_node .
  ?coordinate_node wikibase:geoLatitude ?lat .
  ?coordinate_node wikibase:geoLongitude ?long .

  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" .
  }
  #FILTER(!REGEX(STR(?aLabel), "^Q[0-9]+$")) # does not work: returns no items!
  #FILTER(!REGEX(STR(?a), "^Q[0-9]+$")) # does not work: the filter is ignored!
}
ORDER BY (?aLabel)  # need to eliminate ugly items with no name



PS: this is not the question itself, but another solution to the real-life problem, also worth commenting on, would be a clause that checks for "no label in the language" or "empty label".

Upvotes: 2

Views: 343

Answers (1)

Peter Krauss

Reputation: 13930

As commented by @UninformedUser,

the labels like your ?aLabel are magic vars that come from some special non-standard service, thus, happen after the query has been evaluated

So, to avoid the magic, we can isolate the label service in a subquery and apply the filter in the outer query. This works fine:

SELECT *
WHERE {
  # no graph patterns here in the main query; it only wraps the subquery
  { # subquery:
    SELECT ?a ?aLabel ?lat ?long 
    WHERE {
      ?a wdt:P31 wd:Q274393 .   # bakery or school or etc.
      ?a p:P625 ?statement .    # that has coordinate-location statement
      ?statement psv:P625 ?coordinate_node .
      ?coordinate_node wikibase:geoLatitude ?lat .
      ?coordinate_node wikibase:geoLongitude ?long .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" . }
    }
    ORDER BY (?aLabel)  
  }
  FILTER(!REGEX(STR(?aLabel), "^Q[0-9]+$")) # to eliminate ugly items with no name
}


Optional solution to filter no-names

As mentioned at the end of the question, another solution to the real-life problem is a clause that checks for "no label in the language" or "empty label". No regex and no subquery are needed; just add the FILTER EXISTS shown below to the original query:

SELECT ?a ?aLabel ?lat ?long WHERE {
  ?a wdt:P31 wd:Q274393 .   # bakery or school or etc.
  ?a p:P625 ?statement .    # that has coordinate-location statement

  ?statement psv:P625 ?coordinate_node .
  ?coordinate_node wikibase:geoLatitude ?lat .
  ?coordinate_node wikibase:geoLongitude ?long .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" .
  }
  FILTER EXISTS {
    ?a rdfs:label ?someLabel filter(langmatches(lang(?someLabel), "[AUTO_LANGUAGE]"))
  } 
}
ORDER BY (?aLabel)
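
A related variant, given only as a sketch: if a single fixed language is enough, the label can be bound explicitly with rdfs:label instead of going through the label service. The label is then an ordinary variable, so items without a label in that language simply do not match. This assumes the Wikidata Query Service, where the wd:, wdt:, p:, psv:, wikibase: and rdfs: prefixes are predeclared, and it hard-codes "en" as the language, so there is no fallback to other languages.

```sparql
SELECT ?a ?aLabel ?lat ?long WHERE {
  ?a wdt:P31 wd:Q274393 .   # same class as in the queries above
  ?a p:P625 ?statement .    # that has coordinate-location statement

  ?statement psv:P625 ?coordinate_node .
  ?coordinate_node wikibase:geoLatitude ?lat .
  ?coordinate_node wikibase:geoLongitude ?long .

  # Explicit label binding (assumption: an English label is what is wanted).
  # Items with no English label do not match and are dropped.
  ?a rdfs:label ?aLabel .
  FILTER(LANG(?aLabel) = "en")
}
ORDER BY (?aLabel)
```

Because ?aLabel is bound by a normal triple pattern here, any further FILTER on it (including a regex) is evaluated like a filter on any other variable.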


Upvotes: 2
