Grzegorz Wierzowiecki

Reputation: 10843

CLI tool à la csvsql for SPARQL over TTL, N3, ... files: Hello World example for teaching purposes

EDIT: To make this question more specific: please provide a "hello world" example of executing a SPARQL query over a .ttl file locally on Linux, using a tool of your choice.

csvsql lets you query .csv files directly (i.e., without importing them) via SQL; for example:

$ csvsql --query  "select m.usda_id, avg(i.sepal_length) as
mean_sepal_length from iris as i join irismeta as m on (i.species =
m.species) group by m.species" examples/iris.csv examples/irismeta.csv

I would love to have a similar ability to query Turtle (.ttl) or other common RDF files with SPARQL.

How can I get similar one-off, direct-query functionality for SPARQL over Turtle or similar files? For example, a small script that loads a given file into the memory of a running Blazegraph instance, runs the query, returns the result, and discards what was loaded; or something using librdf (e.g., Rasqal/Redland), Neo4j, or any other SPARQL implementation. Preferably something that needs no background instance: one-off, KISS.

IMHO such a tool would be great for hobbyists and enthusiasts who want to play with storing data as triples and querying it without launching a full server. It would also be VERY beneficial for educational purposes.

Could you provide a specific example, backed by a snippet, showing how to do this (locally on Linux)?

Upvotes: 3

Views: 752

Answers (3)

Apache Jena SPARQL CLI setup

On Ubuntu 23.10 there is a package, libapache-jena-java, but annoyingly it does not expose the CLI tools.

To install them, we have to download the prebuilt JAR binaries from https://jena.apache.org/download/index.cgi as partly documented at https://jena.apache.org/documentation/tools/

sudo apt install openjdk-22-jre
wget https://dlcdn.apache.org/jena/binaries/apache-jena-4.10.0.zip
unzip apache-jena-4.10.0.zip
cd apache-jena-4.10.0
export JENA_HOME="$(pwd)"
export PATH="$PATH:$(pwd)/bin"

and we can confirm it works with:

sparql -version

which outputs:

Apache Jena version 4.10.0

Just make sure you have a recent enough Java to run those binaries, and that JAVA_HOME points to it.
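To see which Java would be picked up, something like the following can help. It is only a sketch: the function name which_java is made up for this answer, and it assumes the usual convention that a JVM lives at $JAVA_HOME/bin/java.

```shell
# which_java: print the first line of `java -version` for the JVM that
# JAVA_HOME (if set) or otherwise PATH would select.
# Hypothetical helper name, not part of Jena or the JDK.
which_java() {
    if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
        java_bin=$JAVA_HOME/bin/java
    elif command -v java >/dev/null 2>&1; then
        java_bin=java
    else
        echo "no java found via JAVA_HOME or PATH" >&2
        return 1
    fi
    # `java -version` writes to stderr, hence the redirect.
    "$java_bin" -version 2>&1 | head -n 1
}
```

Run which_java after the exports above; if it reports an old JVM or nothing at all, fix that before trying the Jena tools.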

Then to actually make a query, we can look at the SPARQL tutorial: https://jena.apache.org/tutorials/sparql.html

Given a data file in Turtle syntax:

mydata.ttl

@prefix vCard:   <http://www.w3.org/2001/vcard-rdf/3.0#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix :        <#> .

<http://somewhere/MattJones/>
    vCard:FN    "Matt Jones" ;
    vCard:N     [ vCard:Family
                              "Jones" ;
                  vCard:Given
                              "Matthew"
                ] .

<http://somewhere/RebeccaSmith/>
    vCard:FN    "Becky Smith" ;
    vCard:N     [ vCard:Family
                              "Smith" ;
                  vCard:Given
                              "Rebecca"
                ] .

<http://somewhere/JohnSmith/>
    vCard:FN    "John Smith" ;
    vCard:N     [ vCard:Family
                              "Smith" ;
                  vCard:Given
                              "John"
                ] .

<http://somewhere/SarahJones/>
    vCard:FN    "Sarah Jones" ;
    vCard:N     [ vCard:Family
                              "Jones" ;
                  vCard:Given
                              "Sarah"
                ] .

and a SPARQL query file that selects users with the full name "John Smith":

myquery.rq

SELECT ?x
WHERE { ?x <http://www.w3.org/2001/vcard-rdf/3.0#FN> "John Smith" . }

we can run the query directly on the .ttl as:

sparql --data=mydata.ttl --query=myquery.rq

which outputs the desired:

---------------------------------
| x                             |
=================================
| <http://somewhere/JohnSmith/> |
---------------------------------
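To get closer to the csvsql feel, with the query given inline instead of in a file, the same call can be wrapped in a small shell function. This is a sketch only: the name ttlquery is made up here, and it relies on nothing but the --data= and --query= options used above, writing the inline query to a temporary .rq file first.

```shell
# ttlquery FILE 'SPARQL QUERY'
# Hypothetical helper (not part of Jena): runs one inline query against one
# RDF file using the `sparql` command from the Jena bin/ directory.
ttlquery() (
    # The body runs in a subshell, so variables and the trap stay local.
    data_file=$1
    query_text=$2

    if ! command -v sparql >/dev/null 2>&1; then
        echo "ttlquery: 'sparql' not found on PATH" >&2
        exit 127
    fi

    # Put the inline query into a temporary .rq file for --query=
    tmp_dir=$(mktemp -d) || exit 1
    trap 'rm -rf "$tmp_dir"' EXIT
    printf '%s\n' "$query_text" > "$tmp_dir/query.rq"

    sparql --data="$data_file" --query="$tmp_dir/query.rq"
)
```

With Jena on the PATH as set up above, this call should give the same table as before:

ttlquery mydata.ttl 'SELECT ?x WHERE { ?x <http://www.w3.org/2001/vcard-rdf/3.0#FN> "John Smith" . }'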

If you mistakenly download just the source without building it, or if the JAR libraries are not found for some other reason, the error message is:

Error: Could not find or load main class arq.sparql

The arq.sparql class is probably the one found in the file lib/jena-arq-4.10.0.jar of the prebuilt distribution. I've also put this error message in a question title to help Google a bit: How to solve "Error: Could not find or load main class arq.sparql" when trying to run the `sparql` CLI tool from Apache Jena?
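A quick sanity check for that situation, before running any query, could look like this. It is a sketch that assumes only the layout described above (a lib/ directory containing a jena-arq-*.jar under JENA_HOME); the function name check_jena_home is made up.

```shell
# check_jena_home: verify that $JENA_HOME looks like the prebuilt binary
# distribution, i.e. that lib/ contains a jena-arq JAR.
# Hypothetical helper, not part of Jena.
check_jena_home() {
    if [ -z "${JENA_HOME:-}" ]; then
        echo "JENA_HOME is not set" >&2
        return 1
    fi
    # If nothing matches, the glob stays unexpanded and the -e test fails.
    for jar in "$JENA_HOME"/lib/jena-arq-*.jar; do
        if [ -e "$jar" ]; then
            echo "found $jar"
            return 0
        fi
    done
    echo "no jena-arq JAR under $JENA_HOME/lib; is this the source download?" >&2
    return 1
}
```

If check_jena_home fails, the likely fix is to download the binary zip rather than the source archive and re-export JENA_HOME.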

Tested on Ubuntu 23.10.

Upvotes: 1

dajobe

Reputation: 5036

Rasqal's command-line query tool roqet does this; see http://librdf.org/rasqal/roqet.html.

Or online at http://triplr.org/query.

Edit, with an example of how it works and packages:

Rasqal example packages:

Let's run the "hello world" query from the Blazegraph Quick Start tutorial: https://wiki.blazegraph.com/wiki/index.php/Quick_Start

Here is an example data.ttl file:

PREFIX : <http://blazegraph.com/>
PREFIX schema: <http://schema.org/>

:systap a schema:Organization ;
        schema:owns :blazegraph .
:blazegraph a schema:Product ;
            schema:brand :systap;
            :productOf <http://systap.com/>;
            :implements <http://rdf4j.org>,<http://blueprints.tinkerpop.com> .

And example "hello world" SPARQL queries:

$ roqet -i sparql -e 'SELECT * WHERE { <http://blazegraph.com/blazegraph> ?p ?o }' -D data.ttl
roqet: Running query 'SELECT * WHERE { <http://blazegraph.com/blazegraph> ?p ?o }'
roqet: Query has a variable bindings result
row: [p=uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, o=uri<http://schema.org/Product>]
row: [p=uri<http://schema.org/brand>, o=uri<http://blazegraph.com/systap>]
row: [p=uri<http://blazegraph.com/productOf>, o=uri<http://systap.com/>]
row: [p=uri<http://blazegraph.com/implements>, o=uri<http://rdf4j.org>]
row: [p=uri<http://blazegraph.com/implements>, o=uri<http://blueprints.tinkerpop.com>]
roqet: Query returned 5 results

or, even more generically:

$ roqet -i sparql -e 'SELECT * WHERE { ?s ?p ?o }' -D data.ttl
roqet: Running query 'SELECT * WHERE { ?s ?p ?o }'
roqet: Query has a variable bindings result
row: [s=uri<http://blazegraph.com/systap>, p=uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, o=uri<http://schema.org/Organization>]
row: [s=uri<http://blazegraph.com/systap>, p=uri<http://schema.org/owns>, o=uri<http://blazegraph.com/blazegraph>]
row: [s=uri<http://blazegraph.com/blazegraph>, p=uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, o=uri<http://schema.org/Product>]
row: [s=uri<http://blazegraph.com/blazegraph>, p=uri<http://schema.org/brand>, o=uri<http://blazegraph.com/systap>]
row: [s=uri<http://blazegraph.com/blazegraph>, p=uri<http://blazegraph.com/productOf>, o=uri<http://systap.com/>]
row: [s=uri<http://blazegraph.com/blazegraph>, p=uri<http://blazegraph.com/implements>, o=uri<http://rdf4j.org>]
row: [s=uri<http://blazegraph.com/blazegraph>, p=uri<http://blazegraph.com/implements>, o=uri<http://blueprints.tinkerpop.com>]
roqet: Query returned 7 results
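For scripting, the row: lines are usually the interesting part. A minimal sketch that keeps only those lines and strips the prefix, assuming the output format shown above (the function name rows_only is made up here):

```shell
# rows_only: read roqet output on stdin and print only the result rows,
# without the leading "row: " prefix. Hypothetical helper, not part of Rasqal.
rows_only() {
    sed -n 's/^row: //p'
}

# Example with two lines copied from the output above:
printf '%s\n' \
    'roqet: Query has a variable bindings result' \
    'row: [p=uri<http://schema.org/brand>, o=uri<http://blazegraph.com/systap>]' \
    | rows_only
# prints: [p=uri<http://schema.org/brand>, o=uri<http://blazegraph.com/systap>]
```

In practice it goes at the end of the pipeline, e.g. roqet -i sparql -e 'SELECT * WHERE { ?s ?p ?o }' -D data.ttl | rows_only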

Upvotes: 3

AndyS

Reputation: 16700

The specific Apache Jena command is sparql.

The commands come in the binary download from http://apache.org/dist/jena/binaries/. Unpack it, and you will find bin/ and bat/ directories of scripts to run from the command line.

Upvotes: 1
