Classifying graphs with a DGL GNN without node attributes

I'm following this guide to create the dataset for graph classification from my own data: https://docs.dgl.ai/en/0.6.x/new-tutorial/6_load_data.html

The guide doesn't create any node features, since they aren't strictly needed to predict a graph's class. My case is the same: I don't want to use any node features (yet) for my classification.

To train the GNN I'm following this tutorial: https://docs.dgl.ai/tutorials/blitz/5_graph_classification.html#sphx-glr-tutorials-blitz-5-graph-classification-py

Both are from the official documentation, but they seem to be incompatible: when I tried to use them together I got this error:

KeyError                                  Traceback (most recent call last)
<ipython-input-39-8a94f1fa250d> in <module>
      4 for epoch in range(20):
      5     for batched_graph, labels in train_dataloader:
----> 6         pred = model(batched_graph, batched_graph.ndata['attr'].float())
      7         loss = F.cross_entropy(pred, labels)
      8         optimizer.zero_grad()

~/anaconda3/lib/python3.8/site-packages/dgl/view.py in __getitem__(self, key)
     64             return ret
     65         else:
---> 66             return self._graph._get_n_repr(self._ntid, self._nodes)[key]
     67 
     68     def __setitem__(self, key, val):

~/anaconda3/lib/python3.8/site-packages/dgl/frame.py in __getitem__(self, name)
    391             Column data.
    392         """
--> 393         return self._columns[name].data
    394 
    395     def __setitem__(self, name, data):

KeyError: 'attr'

I can't find another example of training a GNN with DGL without using node features. Is it possible? Do I have to create fake attributes?

Thanks!

Upvotes: 2

Answers (2)

Lukman E. ISMAILA

Reputation: 1

I got this error when the node feature in my dataset was either missing or named differently from what the dataset module defines.

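epoch_losses = []
for epoch in range(200):
    epoch_loss = 0
    # The model takes only the batched graph, so any node features
    # must be obtained inside its forward() method.
    for step, (bg, label) in enumerate(data_loader):
        prediction = model(bg)
        loss = loss_func(prediction, label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_loss += loss.detach().item()
    # Average the accumulated loss over the number of batches.
    epoch_loss /= (step + 1)
    print('Epoch {}, loss {:.4f}'.format(epoch, epoch_loss))
    epoch_losses.append(epoch_loss)  # keep the per-epoch loss for plotting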

Based on the tutorial you followed, I assume your graph node features are stored under g.ndata['h'], not batched_graph.ndata['attr']. The problem is specifically the name of the attribute.

[Figure: model training loss curve]

You might find this helpful

Upvotes: 0

A DGL model like this needs at least one input node feature, so I resolved it by using the node degree as the feature inside the classifier:

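import dgl
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn.pytorch import GraphConv

class Classifier(nn.Module):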
    def __init__(self, in_dim, hidden_dim, n_classes):
        super(Classifier, self).__init__()
        self.conv1 = GraphConv(in_dim, hidden_dim)
        self.conv2 = GraphConv(hidden_dim, hidden_dim)
        self.classify = nn.Linear(hidden_dim, n_classes)

    def forward(self, g):
        # Use node degree as the initial node feature. For undirected graphs, the in-degree
        # is the same as the out_degree.
        h = g.in_degrees().view(-1, 1).float()
        # Perform graph convolution and activation function.
        h = F.relu(self.conv1(g, h))
        h = F.relu(self.conv2(g, h))
        g.ndata['h'] = h
        # Calculate graph representation by averaging all the node representations.
        hg = dgl.mean_nodes(g, 'h')
        return self.classify(hg)
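To illustrate what the degree feature looks like, here is a small library-free sketch. The helper name in_degree_features and the toy edge list are made up for this example. It builds the same kind of one-column matrix that g.in_degrees().view(-1, 1).float() is meant to produce: one row per node, holding that node's in-degree as a float.

```python
def in_degree_features(num_nodes, edges):
    """Return one [in_degree] row per node, as floats."""
    degrees = [0] * num_nodes
    for _src, dst in edges:
        degrees[dst] += 1
    return [[float(d)] for d in degrees]

# Undirected triangle (0, 1, 2) plus a pendant node 3,
# stored as directed edges in both directions.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 0), (0, 2), (2, 3), (3, 2)]
features = in_degree_features(4, edges)
print(features)  # [[2.0], [2.0], [3.0], [1.0]]
```

Since each node gets exactly one value, the classifier above would be created with in_dim=1.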

Upvotes: 2
