Reputation: 51
I want to set up the DBpedia dataset locally, but I'm not sure how to do it. I have downloaded mappingbased_objects_en.ttl and infobox_properties_mapped_en.ttl.bz2. Is there anything else I need to download?
How can I query this using SPARQL? Do I need to install anything to make it queryable from SPARQL? Is there database software for SPARQL, like MySQL?
I tried http://dbpedia.org/sparql, but because of its 10000 limit on query results I want to set up DBpedia on my own system.
Any lead would be appreciated. Thanks
PS: These two files (mappingbased_objects_en.ttl, infobox_properties_mapped_en.ttl.bz2) don't seem to have all the entity information. For example, Steve Jobs is not in those files but Tim Cook is, and I'm certain Steve Jobs is present in DBpedia.
Upvotes: 4
Views: 1731
Reputation: 21
Although @firefly's answer is still correct, there is a much simpler way to set up DBpedia locally, provided by DBpedia itself:
git clone https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart.git
cd virtuoso-sparql-endpoint-quickstart
COLLECTION_URI=https://databus.dbpedia.org/dbpedia/collections/latest-core VIRTUOSO_ADMIN_PASSWD=password docker-compose up
Source: https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart
Upvotes: 2
Reputation: 51
You need to load DBpedia into a local triplestore, such as Virtuoso. I explain this in this article, but here is the gist of how to install and query DBpedia locally with the Virtuoso triplestore:
The Virtuoso Open Source Edition can be downloaded from here. Once Virtuoso is installed, run it and start the VOS Database.
Go to the Virtuoso admin page in the browser (you may have to give it a bit of time to start): http://localhost:8890/conductor/
Log in with the default credentials (dba/dba).
For testing, you can use the “Quad Store Upload” tab to upload a ttl file to a specified named graph IRI, such as “http://localhost:8890/DBPedia”.
Next, you can test the triplestore in the SPARQL tab or directly at the local endpoint. For example:
SELECT count(*) WHERE
{?category skos:broader <http://dbpedia.org/resource/Category:Environmental_issues>}
However, the upload might fail for bigger files. For bigger files, and also for uploading multiple files, it is best to use the bulk upload.
In order to bulk upload files from anywhere (and not just the Virtuoso import folder), you must add your folder to the DirsAllowed property in the Virtuoso configuration file virtuoso.ini. For example, assuming that the dumps are in /tmp/virtuoso_db/dbpedia/ttl, you can add the path /tmp/virtuoso_db to DirsAllowed. You must restart Virtuoso for the changes in virtuoso.ini to take effect.
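If you would rather script the DirsAllowed change than edit the file by hand, here is a minimal sketch. It assumes DirsAllowed sits on a single line as a comma-separated list; the function name allow_dir is mine, and the location of virtuoso.ini depends on your installation.

```python
import re

# Matches a single-line "DirsAllowed = a, b, c" entry, keeping the key
# and its spacing exactly as written in the file.
DIRS_ALLOWED = re.compile(r"^([ \t]*DirsAllowed[ \t]*=[ \t]*)(.*)$", re.MULTILINE)


def allow_dir(ini_text, new_dir):
    """Return ini_text with new_dir appended to DirsAllowed (if not already there)."""

    def add(match):
        dirs = [d.strip() for d in match.group(2).split(",") if d.strip()]
        if new_dir not in dirs:
            dirs.append(new_dir)
        return match.group(1) + ", ".join(dirs)

    new_text, count = DIRS_ALLOWED.subn(add, ini_text)
    if count == 0:
        raise ValueError("no DirsAllowed line found")
    return new_text


# Usage (the path to virtuoso.ini is an assumption, adjust it to your install):
#   with open("virtuoso.ini") as f:
#       text = allow_dir(f.read(), "/tmp/virtuoso_db")
#   with open("virtuoso.ini", "w") as f:
#       f.write(text)
```

Calling it twice with the same folder leaves the file unchanged, so it is safe to rerun.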
Once Virtuoso is back up and running, go to the Interactive SQL (ISQL) window and register the files to be loaded by typing in:
ld_dir('/tmp/virtuoso_db/dbpedia/ttl/','*.ttl','http://localhost:8890/DBPedia');
You can then perform the bulk load of all the registered files by typing in:
rdf_loader_run();
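The same two statements can also be run from a script through Virtuoso's isql command-line client instead of the ISQL window. This is only a sketch under assumptions: the client is on your PATH as isql (some packages install it as isql-v), it listens on the default port 1111, and the credentials are still dba/dba. The helper names are mine.

```python
import subprocess


def isql_command(statement, port=1111, user="dba", password="dba", binary="isql"):
    """Build the argument list that runs one statement through the isql client."""
    return [binary, str(port), user, password, f"exec={statement}"]


def load_statements(folder, mask, graph):
    """The two bulk-load statements from the steps above, in order."""
    return [
        f"ld_dir('{folder}', '{mask}', '{graph}');",  # register the files
        "rdf_loader_run();",                          # load everything registered
    ]


def bulk_load(folder, mask, graph, **conn):
    """Run the bulk load; raises CalledProcessError if isql exits with an error."""
    for statement in load_statements(folder, mask, graph):
        subprocess.run(isql_command(statement, **conn), check=True)


# Usage (needs a running Virtuoso and the folder listed in DirsAllowed):
#   bulk_load("/tmp/virtuoso_db/dbpedia/ttl/", "*.ttl", "http://localhost:8890/DBPedia")
```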
You can monitor the number of triples being uploaded by performing the following SPARQL query on the local endpoint:
SELECT (COUNT(*) AS ?count) WHERE { ?s ?p ?o }
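Once data is in the store, the local endpoint can also be queried from code rather than the browser. A minimal sketch using only the Python standard library follows. It assumes the endpoint is at Virtuoso's default http://localhost:8890/sparql and that it returns the standard SPARQL JSON results format; the function names are mine.

```python
import json
import urllib.parse
import urllib.request

# Assumption: default SPARQL endpoint of a local Virtuoso install.
ENDPOINT = "http://localhost:8890/sparql"

# Counts every triple in the store, written in standard SPARQL 1.1 syntax.
COUNT_QUERY = "SELECT (COUNT(*) AS ?count) WHERE { ?s ?p ?o }"


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks the endpoint for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_count(results):
    """Pull the integer out of the single ?count binding of a JSON result."""
    return int(results["results"]["bindings"][0]["count"]["value"])


def triple_count(endpoint=ENDPOINT):
    """Ask the endpoint how many triples it currently holds."""
    with urllib.request.urlopen(build_request(COUNT_QUERY, endpoint)) as resp:
        return parse_count(json.load(resp))


# Usage (needs a running Virtuoso):
#   print(triple_count())
```

Calling triple_count() repeatedly while the bulk load runs gives a rough progress indicator.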
Upvotes: 3