Reputation: 156
I am new to PyTorch Geometric and want to know how to load my own knowledge-graph dataset into the PyTorch Geometric DataLoader. My data is in a CSV file which looks like this, for example:
The dataset consists of thousands of such triples.
I went through the PyTorch documentation but couldn't work out how this kind of data can be used with PyTorch Geometric.
I was previously using this data with ampligraph for link prediction and thought of giving it a try with a GNN (PyTorch Geometric).
Any help would be appreciated!
Upvotes: 3
Views: 984
Reputation: 19
I don't understand your data format, but if a knowledge graph is what you are looking for, it can be implemented as follows:
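import torch
from torch_geometric.data import Data

# String labels stay in plain Python lists; the tensors only store indices into them.
example_node_labels = ["cat", "dog", "horse"]
example_edge_labels = ["example0", "example1", "example2", "example3", "example4"]

# One entry per node: index into example_node_labels.
example_node_label_references = torch.tensor([0, 2, 1])

# One entry per edge: index into example_edge_labels.
example_edge_label_references = torch.tensor([1, 3, 4])

# Shape [2, num_edges]: column i is the edge from example_edges[0, i] to example_edges[1, i].
example_edges = torch.tensor([
    [0, 1, 2],  # start nodes
    [1, 2, 0],  # end nodes
])

knowledge_graph = Data(
    x=example_node_label_references,
    edge_index=example_edges,
    edge_attr=example_edge_label_references,
)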
Note that PyTorch tensors cannot hold strings, only numerical data. You therefore need to encode the strings as numbers. In this example, the numbers are indices that point to the corresponding string in the label lists (example_node_labels and example_edge_labels).
Also, read this for more information about the data object in PyTorch Geometric that represents a graph.
Also see this tutorial on converting CSV files to PyTorch Geometric graphs.
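For a CSV of triples, the string-to-index encoding described above could be sketched roughly as follows. This is only a minimal illustration using the standard library: the column names head, relation and tail, the sample rows, and the helper name encode_triples are all assumptions, since the actual file layout isn't shown in the question.

```python
import csv
import io

# Hypothetical CSV contents. The column names (head, relation, tail) and the
# rows are assumptions for illustration, not the asker's actual data.
CSV_TEXT = """head,relation,tail
cat,chases,dog
dog,fears,horse
horse,ignores,cat
cat,fears,horse
"""


def encode_triples(csv_file):
    """Map entity and relation strings to integer ids (order of first appearance)."""
    entity_to_id = {}
    relation_to_id = {}
    sources, targets, edge_type = [], [], []
    for row in csv.DictReader(csv_file):
        # setdefault assigns the next free id only if the string is new.
        head = entity_to_id.setdefault(row["head"], len(entity_to_id))
        tail = entity_to_id.setdefault(row["tail"], len(entity_to_id))
        rel = relation_to_id.setdefault(row["relation"], len(relation_to_id))
        sources.append(head)
        targets.append(tail)
        edge_type.append(rel)
    # edge_index follows the [2, num_edges] layout: start nodes, then end nodes.
    edge_index = [sources, targets]
    return entity_to_id, relation_to_id, edge_index, edge_type


# For a real file, pass open("your_file.csv", newline="") instead of StringIO.
entity_to_id, relation_to_id, edge_index, edge_type = encode_triples(
    io.StringIO(CSV_TEXT)
)
```

The resulting lists can then be wrapped with torch.tensor(..., dtype=torch.long) and passed to Data as edge_index and edge_attr, in the same way as the hand-written tensors above. Keeping entity_to_id and relation_to_id around lets you translate predicted indices back into the original strings.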
Upvotes: 0