S M Shamimul Hasan

Reputation: 6684

Virtuoso Large RDF Graph Removal Difficulty

I am using Virtuoso, installed on a server machine, and I am trying to remove a large RDF graph from it. The graph contains 2,590,994,053 triples. I tried to delete it with the following command:

SPARQL DROP SILENT GRAPH <http://ndssl.bi.vt.edu/chicago/>

However, after running for a long time, Virtuoso gives me the following error:

*** Error 08S01: [Virtuoso Driver]CL065: Lost connection to server at line 6 of Top-Level:SPARQL DROP SILENT GRAPH <http://ndssl.bi.vt.edu/chicago/>

This delete command also brings down my Virtuoso server. I also tried the SPARQL CLEAR command; after running for a long time, it terminates as well.

BTW, I have increased the memory size to 128 GB and set the following configuration values, but that did not help.

NumberOfBuffers          = 10900000
MaxDirtyBuffers          = 8000000
MaxCheckpointRemap       = 650000000

Please let me know how I can remove this large graph from my Virtuoso triple store. I have some other graphs in Virtuoso as well, and I do not want to remove those.

Upvotes: 3

Views: 1366

Answers (1)

TallTed

Reputation: 9444

As documented in "How can I delete graphs containing large numbers of triples from the Virtuoso Quad Store?":

By default, triple deletion is performed as part of a transaction, which is stored in memory until the operation is completed and committed to the database. During typical server operation, deleting one or more graphs containing large numbers of triples (generally millions or more) can consume available memory to the point where the operation cannot be completed, and thus the graph(s) cannot be deleted.

Such large graphs can be cleared by changing the transaction log mode to autocommit before deleting the graph(s) or triples. This is easily done using the Virtuoso log_enable() function, with the settings log_enable(3,1).

This function may be called on its own, prior to the delete operation, via iSQL (either command-line or the Conductor variant), as shown:

SQL> log_enable(3,1);
SQL> SPARQL CLEAR GRAPH <graph-name>;
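
Applied to the graph from the question, an iSQL session might look like the sketch below. The CLEAR GRAPH line only substitutes the asker's graph IRI for the placeholder. The trailing checkpoint and log_enable(1,1) steps are not part of the quoted documentation; they are assumptions that you want the result checkpointed right away and that the session's usual log mode is 1 (logging on, row-by-row autocommit off).

```sql
-- Switch this session to row-by-row autocommit with logging (3 = 1 + 2);
-- the second argument (1) suppresses the error if the mode is already set.
log_enable(3,1);

-- Clear only the large graph; other graphs in the quad store are untouched.
SPARQL CLEAR GRAPH <http://ndssl.bi.vt.edu/chicago/>;

-- Assumption: write the changes to the database file once the clear finishes.
checkpoint;

-- Assumption: return the session to the usual mode (logging on, no autocommit).
log_enable(1,1);
```

Because each row is committed as it is deleted, an interrupted run should leave the graph partially cleared rather than rolled back, so the same statements can be run again to finish the job.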

log_enable() may also be called as a pragma specified in a SPARQL/Update query (note: this query is written for execution through the SPARQL interface; if executed through an SQL interface, you must prepend the SPARQL keyword):

DEFINE sql:log-enable 3
CLEAR GRAPH <graph-name>

Triples can also be deleted directly from the RDF_QUAD table via SQL, but note that this method will not remove any free-text index data that might be associated with the graph, which CLEAR GRAPH ... would do automatically. The SQL operation would look something like this:

SQL> log_enable(3,1);
SQL> DELETE FROM rdf_quad WHERE g = iri_to_id ('http://mygraph.org');

Upvotes: 2
