user4436707

Most common datatypes in DBpedia

I need to find out which datatypes are the most common in DBpedia, so I am running a query like this against Virtuoso:

SELECT datatype(?d) (COUNT(?d) as ?dCount)
WHERE
{
  ?s ?p ?d
}
GROUP BY ?d
ORDER BY DESC(?dCount)

I am not sure whether the query is correct and, above all, it timed out. How can I get my answer, or narrow the search space to "something relevant"? Or, for example, is there a way to still get a result when the query times out?

Upvotes: 0

Views: 73

Answers (1)

UninformedUser

Reputation: 8465

The query is not correct. You must group by the datatype, not the literal value:

SELECT (datatype(?d) as ?dt) (COUNT(?d) as ?dCount)
WHERE
{
  ?s ?p ?d
  FILTER(isLiteral(?d))
}
GROUP BY datatype(?d)
ORDER BY DESC(?dCount)

The query might still time out, since it has to touch every literal in the dataset.
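One way to narrow the search space is to aggregate over a bounded sample instead of the whole dataset. The following is only a sketch: the LIMIT of 1000000 is an arbitrary assumption, and the sample is whatever the store happens to return first, not a random one, so the counts only indicate relative frequency within that sample.

```sparql
SELECT ?dt (COUNT(*) AS ?dCount)
WHERE
{
  {
    # Inner query: take a bounded slice of literal objects.
    # The LIMIT value is an arbitrary assumption; lower it if the query still times out.
    SELECT ?d
    WHERE
    {
      ?s ?p ?d
      FILTER(isLiteral(?d))
    }
    LIMIT 1000000
  }
  # Compute the datatype once per sampled literal, then group by it.
  BIND(datatype(?d) AS ?dt)
}
GROUP BY ?dt
ORDER BY DESC(?dCount)
```

Because the grouping runs only over the rows produced by the subquery, the cost is bounded by the LIMIT rather than by the size of DBpedia.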

You could restrict it to the datatype properties of the DBpedia ontology, i.e.

SELECT (datatype(?d) as ?dt) (COUNT(*) as ?dCount)
WHERE
{
  ?p a owl:DatatypeProperty .
  ?s ?p ?d
}
GROUP BY datatype(?d)
ORDER BY DESC(?dCount)

but you would miss the triples whose properties are in the http://dbpedia.org/property/ namespace.

Alternatives:

  1. Load the data into a more powerful local server.
  2. Simply use the DBpedia ontology, although it probably doesn't contain all the datatypes used in the instance data.
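The second alternative could look roughly like the query below. It is a sketch that assumes the ontology is loaded in the endpoint and that datatype properties declare their datatype through rdfs:range. Note that it counts how many properties declare each datatype, not how many literals use it, so it approximates only which datatypes exist, not how frequent they are in the instance data.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?dt (COUNT(?p) AS ?pCount)
WHERE
{
  # Only schema-level triples are touched, so this stays cheap.
  ?p a owl:DatatypeProperty ;
     rdfs:range ?dt .
}
GROUP BY ?dt
ORDER BY DESC(?pCount)
```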

Upvotes: 2
