Reputation: 375
Goal: I am trying to import a graph FROM networkx into PyTorch geometric and set labels and node features.
(This is in Python)
Question(s): How do I do this (presumably with the from_networkx function)?
I have seen some previous posts with this question, but they weren't answered (correct me if I am wrong).
Attempt: (I have just used an unrealistic example below, as I cannot post anything real on here)
Let us imagine we are trying to do a graph learning task (e.g. node classification) on a group of cars (not very realistic as I said). That is, we have a group of cars, an adjacency matrix, and some features (e.g. price at the end of the year). We want to predict the node label (i.e. brand of the car).
I will be using the following adjacency matrix: (apologies, cannot use latex to format this)
A = [(0, 1, 0, 1, 1), (1, 0, 1, 1, 0), (0, 1, 0, 0, 1), (1, 1, 0, 0, 0), (1, 0, 1, 0, 0)]
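For reference, here is a small sketch in plain Python (nothing PyG-specific) that reads the undirected edges off this matrix. It assumes that row/column i corresponds to node i + 1, which is how the nodes are numbered in the rest of this post.

```python
# Adjacency matrix from above; row/column i is assumed to be node i + 1.
A = [
    (0, 1, 0, 1, 1),
    (1, 0, 1, 1, 0),
    (0, 1, 0, 0, 1),
    (1, 1, 0, 0, 0),
    (1, 0, 1, 0, 0),
]

# The matrix is symmetric, so scanning the upper triangle
# lists each undirected edge exactly once.
edges = [
    (i + 1, j + 1)
    for i in range(len(A))
    for j in range(i + 1, len(A))
    if A[i][j] == 1
]
print(edges)
# [(1, 2), (1, 4), (1, 5), (2, 3), (2, 4), (3, 5)]
```

So the graph has 6 undirected edges, which becomes 12 directed entries once it is converted to an edge_index.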
Here is the code (for Google Colab environment):
# Install PyTorch Geometric and its dependencies first (Colab shell command),
# otherwise the torch_geometric import below fails
!pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.10.0+cpu.html
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
import torch
from torch_geometric.utils.convert import to_networkx, from_networkx
# Make the networkx graph
G = nx.Graph()
# Add some cars (just do 5 for now)
G.add_nodes_from([
(1, {'Brand': 'Ford'}),
(2, {'Brand': 'Audi'}),
(3, {'Brand': 'BMW'}),
(4, {'Brand': 'Peugeot'}),
(5, {'Brand': 'Lexus'}),
])
# Add some edges
G.add_edges_from([
(1, 2), (1, 4), (1, 5),
(2, 3), (2, 4),
(3, 2), (3, 5),
(4, 1), (4, 2),
(5, 1), (5, 3)
])
# Convert the graph into PyTorch geometric
pyg_graph = from_networkx(G)
So this correctly converts the networkx graph to PyTorch Geometric. However, I still don't know how to properly set the labels.
The brand values for each node have been converted and are stored within:
pyg_graph.Brand
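Since a classifier needs numeric labels rather than strings, one option is to map each brand name to an integer class id. A minimal sketch, assuming the brands are available as a list of strings in node order (the names brands, brand_to_id and labels are made up for illustration and are not part of PyTorch Geometric):

```python
# Brand of each node, in the order the nodes were added (assumed).
brands = ['Ford', 'Audi', 'BMW', 'Peugeot', 'Lexus']

# Give every distinct brand an integer id, starting at 0.
# Sorting makes the mapping independent of the node order.
brand_to_id = {brand: i for i, brand in enumerate(sorted(set(brands)))}

# One integer label per node.
labels = [brand_to_id[brand] for brand in brands]
print(brand_to_id)
# {'Audi': 0, 'BMW': 1, 'Ford': 2, 'Lexus': 3, 'Peugeot': 4}
print(labels)
# [2, 0, 1, 4, 3]
```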
Below, I have just made some random numpy arrays of length 5 for each node (just pretend that these are realistic).
ford_prices = np.random.randint(100, size=5)
lexus_prices = np.random.randint(100, size=5)
audi_prices = np.random.randint(100, size=5)
bmw_prices = np.random.randint(100, size=5)
peugeot_prices = np.random.randint(100, size=5)
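If it helps, per-car arrays like these can be stacked into a single matrix with one row per node, which is the [num_nodes, num_features] layout usually used for node features. A sketch, assuming the rows should follow the node order 1 to 5 (Ford, Audi, BMW, Peugeot, Lexus); the seeded generator and the names prices, node_order and features are only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# One length-5 price vector per car, mirroring the random arrays above.
prices = {
    'Ford': rng.integers(100, size=5),
    'Lexus': rng.integers(100, size=5),
    'Audi': rng.integers(100, size=5),
    'BMW': rng.integers(100, size=5),
    'Peugeot': rng.integers(100, size=5),
}

# Row i must describe node i + 1, so stack in node order, not dict order.
node_order = ['Ford', 'Audi', 'BMW', 'Peugeot', 'Lexus']
features = np.stack([prices[brand] for brand in node_order]).astype(np.float32)
print(features.shape)
# (5, 5)
```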
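This brings me to the main question: how do I set these prices as the node features and the brands as the node labels (and what should happen to pyg_graph.Brand when training the network)?
Thanks in advance and happy holidays.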
Upvotes: 12
Views: 19063
Reputation: 4892
The easiest way is to add all the information to the networkx graph and create it directly in the form you need. I guess you want to use a Graph Neural Network, in which case you want something like the code below (note that it uses integer ids for the labels instead of brand names).
Name your input features x and your labels/ground truth y.
See the PyTorch Geometric introduction for an example, which uses the Cora dataset.
import networkx as nx
import numpy as np
import torch
from torch_geometric.utils.convert import from_networkx
# Make the networkx graph
G = nx.Graph()
# Add the 5 cars: y is the label (brand id), x is a single feature value
G.add_nodes_from([
(1, {'y': 1, 'x': 0.5}),
(2, {'y': 2, 'x': 0.2}),
(3, {'y': 3, 'x': 0.3}),
(4, {'y': 4, 'x': 0.1}),
(5, {'y': 5, 'x': 0.2}),
])
# Add some edges
G.add_edges_from([
(1, 2), (1, 4), (1, 5),
(2, 3), (2, 4),
(3, 2), (3, 5),
(4, 1), (4, 2),
(5, 1), (5, 3)
])
# Convert the graph into PyTorch geometric
pyg_graph = from_networkx(G)
print(pyg_graph)
# Data(edge_index=[2, 12], x=[5], y=[5])
print(pyg_graph.x)
# tensor([0.5000, 0.2000, 0.3000, 0.1000, 0.2000])
print(pyg_graph.y)
# tensor([1, 2, 3, 4, 5])
print(pyg_graph.edge_index)
# tensor([[0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4],
# [1, 3, 4, 0, 2, 3, 1, 4, 0, 1, 0, 2]])
# Split the nodes into train and test sets using boolean masks
train_ratio = 0.2
num_nodes = pyg_graph.x.shape[0]
num_train = int(num_nodes * train_ratio)
idx = list(range(num_nodes))
np.random.shuffle(idx)
train_mask = torch.full_like(pyg_graph.y, False, dtype=bool)
train_mask[idx[:num_train]] = True
test_mask = torch.full_like(pyg_graph.y, False, dtype=bool)
test_mask[idx[num_train:]] = True
print(train_mask)
# e.g. tensor([ True, False, False, False, False]) (depends on the shuffle)
print(test_mask)
# e.g. tensor([False, True, True, True, True])
Upvotes: 14