Reputation: 11
The result is not the same if I use Jena vs the query form at http://dbpedia.org/sparql
My code in Jena (I am trying to build two lists containing the types of the resources whose label contains the search text):
String s1 = "Ketolide";
String s2 = "Aminocoumarin";
String sparqlQueryString1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>" +
"SELECT distinct ?type1 " +
"WHERE { ?data rdfs:label ?label1. ?data rdf:type ?type1. FILTER contains(lcase(str(?label1)),'" + s1.toLowerCase() + "'). }";
Query query = QueryFactory.create(sparqlQueryString1);
QueryEngineHTTP objectToExec = QueryExecutionFactory.createServiceRequest("http://dbpedia.org/sparql", query);
objectToExec.addParam("timeout","3000");
ResultSet results = objectToExec.execSelect();
List<QuerySolution> s = ResultSetFormatter.toList(results);
ResultSetFormatter.out(System.out, results, query);
sparqlQueryString1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
"SELECT distinct ?type1 " +
"WHERE {?data rdfs:label ?label1. ?data rdf:type ?type1. FILTER contains(lcase(str(?label1)),'" + s2.toLowerCase() + "'). }";
query = QueryFactory.create(sparqlQueryString1);
objectToExec = QueryExecutionFactory.createServiceRequest("http://dbpedia.org/sparql", query);
objectToExec.addParam("timeout","3000");
results = objectToExec.execSelect();
List<QuerySolution> s22 = ResultSetFormatter.toList(results);
ResultSetFormatter.out(System.out, results, query);
When I use the same query in the query form at http://dbpedia.org/sparql it gets results:
SELECT distinct ?type1 WHERE{ ?data rdf:type ?type1. ?data rdfs:label ?label1 . FILTER contains(lcase(str(?label1)), 'ketolide') .}
This returns:
type1
http://dbpedia.org/ontology/ChemicalCompound
http://dbpedia.org/class/yago/WikicatKetolideAntibiotics
http://dbpedia.org/class/yago/Agent114778436
http://dbpedia.org/class/yago/Antibacterial102716205
http://dbpedia.org/class/yago/Antibiotic102716866
http://dbpedia.org/class/yago/CausalAgent100007347
http://dbpedia.org/class/yago/Drug103247620
http://dbpedia.org/class/yago/Matter100020827
http://dbpedia.org/class/yago/Medicine103740161
http://dbpedia.org/class/yago/PhysicalEntity100001930
http://dbpedia.org/class/yago/Substance100020090
http://dbpedia.org/class/yago/WikicatAntibiotics
What is the cause of this difference?
Upvotes: 1
Views: 83
Reputation: 9482
I can spot two differences.
Use of default graph IRI: First, the query form at http://dbpedia.org/sparql sets the default graph IRI to http://dbpedia.org. Your code doesn't do that, so it runs against all graphs in the database, not just the DBpedia graph. To add the same default graph to your query, this should work:
objectToExec.addDefaultGraph("http://dbpedia.org");
(I don't know what other graphs the endpoint has, so I don't know how much of a difference this actually makes.)
Different timeouts: Second, your code sets the timeout to 3000 (milliseconds), while the query form sets it to 30000. This particular endpoint is configured to return whatever it has found so far when it hits the timeout, so if it hasn't found anything after 3 seconds, it returns no results. The query form lets the query run for 30 seconds.
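To make the two differences concrete, here is a minimal sketch, using only the Java standard library, of the GET request URL that would match the query form's settings. `default-graph-uri` is the standard SPARQL protocol parameter name, and `timeout` is the parameter name already used in your code. The class and method names are made up for illustration and are not part of Jena.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class EndpointUrlSketch {

    // Hypothetical helper: assembles a SPARQL protocol GET URL with the
    // default graph and timeout that the DBpedia query form sends.
    public static String buildUrl(String endpoint, String query,
                                  String defaultGraph, int timeoutMillis) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("default-graph-uri", defaultGraph);
        params.put("query", query);
        params.put("timeout", String.valueOf(timeoutMillis));

        StringBuilder url = new StringBuilder(endpoint);
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(p.getKey())
               .append('=')
               .append(URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // 30000 ms is the value the query form uses; the code in the question sends 3000.
        System.out.println(buildUrl("http://dbpedia.org/sparql", "ASK {}",
                                    "http://dbpedia.org", 30000));
    }
}
```

With Jena you would not build the URL by hand. The point is only that both settings travel as plain request parameters, so a request without them is not the same request the form sends.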
That being said, full-text search can be done much more efficiently by using bif:contains:
FILTER bif:contains(?label1, 'ketolide')
This uses a full-text index, which is much faster than scanning all the strings in the database.
And finally, you should consider fixing the code's vulnerability to SPARQL injection.
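The injection risk comes from concatenating the search text straight into a quoted SPARQL literal: a value containing `'` can close the literal and append its own query text. As a rough sketch of one way to guard against that, assuming you keep building the query by string concatenation, the search text can be escaped first. The class and method names below are hypothetical, and the escapes follow the string escape sequences of the SPARQL grammar.

```java
import java.util.Locale;

public class SparqlLiteralSketch {

    // Hypothetical helper: escapes characters that could terminate or
    // corrupt a quoted SPARQL string literal.
    public static String escapeLiteral(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '\'': out.append("\\'");  break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Builds the FILTER clause from the question with the search text escaped.
    public static String labelFilter(String searchText) {
        String needle = escapeLiteral(searchText.toLowerCase(Locale.ROOT));
        return "FILTER contains(lcase(str(?label1)), '" + needle + "')";
    }

    public static void main(String[] args) {
        System.out.println(labelFilter("Ketolide"));
        // A hostile value stays inside the literal instead of closing it.
        System.out.println(labelFilter("x') . } #"));
    }
}
```

Jena also ships a `ParameterizedSparqlString` class intended for this kind of substitution, which is probably the better choice in real code than a hand-rolled escaper.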
Upvotes: 1