Reputation: 11
I would like to find which values from a list exist in a specific ontology using the BERTMap algorithm.
One of the first steps is to load the ontology so that it can be used in the process.
As I am new to this, I asked ChatGPT to create an example and it gave me the code below. However, as far as I can tell, rdflib has no Graph() function.
Here is the example:
install.packages(c("textTinyR", "rdflib"))
library(textTinyR)
library(rdflib)
uco_ontology <- "C:/Users/User/Desktop/uco_1_5.owl" # Replace with the path to your UCO ontology file
graph <- rdflib::Graph()
rdflib::parse(graph, file = uco_ontology)
Is there any way to load an ontology into R? The ontology (.owl) file that the snippet expects can be downloaded from here.
Upvotes: 0
Views: 239
Reputation: 56
Maybe try import_owl() from the Bioconductor package simona.
Upvotes: 0
Reputation: 581
There are a few ways to parse the .owl file that you have; the vignette of the package might include more information.
require(jsonld)
require(rdflib)
require(xml2)
require(jsonlite)
require(textTinyR)
# the expected format based on the example .rdf file
doc <- system.file("extdata/example.rdf", package="redland")
tmp_doc = readLines(doc, warn = FALSE)
tmp_doc
# [1] "<?xml version=\"1.0\"?>" " <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
# [3] " xmlns:dc=\"http://purl.org/dc/elements/1.1/\">" " <rdf:Description rdf:about=\"http://www.johnsmith.com/\">"
# [5] " <dc:title>John Smith's Home Page</dc:title>" " <dc:creator>John Smith</dc:creator>"
# [7] " <dc:description>The generic home page of John Smith</dc:description>" " </rdf:Description> "
# [9] "</rdf:RDF>"
# use the raw GitHub URL of the .owl file you linked to
url_pth = 'https://raw.githubusercontent.com/Ebiquity/Unified-Cybersecurity-Ontology/master/uco_1_5.owl'
# The following does not work as it returns 0 'triples'
# read with xml2 and convert to json
dat_xml = xml2::read_xml(url_pth) |>
  xml2::as_list() |>
  jsonlite::toJSON()
# use format 'jsonld'
rdf = rdflib::rdf_parse(doc = dat_xml, format = "jsonld")
rdf
# Total of 0 triples, stored in hashes
# -------------------------------
# The following approach returns parsed data
# download the .owl to a temporary file
tmp_xml <- tempfile(fileext = '.xml')
download.file(url = url_pth, destfile = tmp_xml)
# each line of the file becomes a separate item in the vector; trim every item on both sides
# and concatenate the items with a single space (a newline works too)
dat_xml = textTinyR::read_rows(input_file = tmp_xml) |>
  sapply(function(x) {
    x = trimws(x, which = 'both')
    x
  }) |>
  paste(collapse = ' ')
# this returns triples, but I'm not sure the total number is correct; you will have to verify it, as there are many 'librdf errors'
rdf_out = rdflib::rdf_parse(doc = dat_xml, format = "guess")
# librdf error - Using property attribute 'ontologyIRI' without a namespace is forbidden.
# librdf error - Using property attribute 'versionIRI' without a namespace is forbidden.
# librdf error - Using an attribute 'name' without a namespace is forbidden.
# ....
rdf_out
# Total of 57 triples, stored in hashes
# -------------------------------
# _:r1710140791r63392r12 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r17 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r25 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r18 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r32 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r30 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r29 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r13 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r24 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
# _:r1710140791r63392r15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
#
# ... with 47 more triples
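Once a file parses cleanly, the triples can be pulled into a data frame with rdflib::rdf_query() and compared against a list of values. The following is only a minimal sketch: it uses the small example.rdf file shipped with redland rather than the UCO ontology, the `candidates` vector is a hypothetical list made up for illustration, and the comparison is plain exact string matching, not BERTMap.

```r
library(rdflib)

# Small RDF/XML file shipped with the 'redland' package (the same file shown above)
doc <- system.file("extdata/example.rdf", package = "redland")
rdf <- rdf_parse(doc, format = "rdfxml")

# Pull every triple out of the graph as a data frame with columns s, p and o
triples <- rdf_query(rdf, "SELECT ?s ?p ?o WHERE { ?s ?p ?o }")

# Hypothetical list of values to look up (an assumption, not part of the question)
candidates <- c("John Smith", "Jane Doe")

# Keep the candidates that appear verbatim as an object value
found <- candidates[candidates %in% as.character(triples$o)]
found
```

Exact matching like this only finds identical strings, so it is a sanity check on the parsed data and not a substitute for an ontology-matching model.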
You might have better luck with BERTMap using the PyTorch-based DeepOnto package (https://krr-oxford.github.io/DeepOnto/). If R is required, you can use the 'reticulate' R package to work with it from within R.
Upvotes: 0