Reputation: 35
I have a dataset (pickle format) containing float('nan'), and I need to remove it.
It is possible to add float('nan') to a graph as a node in networkx. However, I don't know how to remove it.
import networkx as nx
G = nx.Graph()
G.add_node(float('nan'))
print(G.nodes) # [nan], so there is float('nan') in the graph
G.remove_node(float('nan')) # this statement raises a NetworkXError saying that nan is not in the graph
Code and data in CoReRank-WSDM-2019 and BirdNest.
Could anyone help me with this problem? Thank you in advance.
Upvotes: 3
Views: 1111
Reputation: 88275
We can test this on a simple dictionary, which is the underlying data structure of a NetworkX graph. Say you have:
d = {'a':3, float('nan'):4}
If we try accessing the NaN key, as you're trying to do:
d[float('nan')]
> KeyError: nan
The root cause is that a NaN is not equal to itself:
>>> float("nan") == float("nan")
False
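Why this makes the lookup fail is nicely explained here. In short, a dictionary matches a key only if it is the same object as the stored key or compares equal to it, and a newly created float('nan') is neither.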
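A minimal sketch of that lookup behavior, using a plain dict rather than a graph. The variable names are only illustrative. It assumes the usual CPython rule that a key matches when it is the same object or compares equal:

```python
import math

nan = float("nan")
d = {"a": 3, nan: 4}

# Looking up with the very same object succeeds, thanks to the identity check.
same_object_found = nan in d

# A freshly created NaN is a different object and is not equal to the stored key.
fresh_object_found = float("nan") in d

# Scanning the keys recovers the stored NaN object, which can then be used for lookup.
stored_key = next(k for k in d if isinstance(k, float) and math.isnan(k))
value = d[stored_key]

print(same_object_found, fresh_object_found, value)
```

So the key is still reachable, but only through a reference to the exact object that was stored.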
A workaround is to loop over the graph's nodes, collect the NaN nodes, and then remove them using those same object references:
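import math
import networkx as nx
G = nx.Graph()
G.add_node(float('nan'))
G.add_node(3)
print(G.nodes)
# [nan, 3]
nan_nodes = []
for node in G.nodes():
    # check the type first: math.isnan raises TypeError on non-numeric nodes such as strings
    if isinstance(node, float) and math.isnan(node):
        nan_nodes.append(node)
# the list holds the very objects stored in the graph, so the lookup succeeds
G.remove_nodes_from(nan_nodes)
G.nodes()
# NodeView((3,))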
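If this has to be done for several graphs, the same idea can be wrapped in a small helper. This is only a sketch: remove_nan_nodes is a hypothetical name, not part of networkx, and it assumes the NaN nodes are Python floats (or float subclasses).

```python
import math
import networkx as nx


def remove_nan_nodes(graph):
    """Remove every float NaN node from graph in place and return the removed nodes.

    Hypothetical helper, not a networkx function.
    """
    # Collect first so the node view is not modified while it is being iterated.
    nan_nodes = [n for n in graph.nodes if isinstance(n, float) and math.isnan(n)]
    # These are the exact objects stored in the graph, so they can be looked up.
    graph.remove_nodes_from(nan_nodes)
    return nan_nodes


G = nx.Graph()
G.add_edge("a", float("nan"))  # a NaN node that also has an edge
G.add_edge("a", "b")

removed = remove_nan_nodes(G)
remaining = sorted(G.nodes)
print(removed, remaining)
```

Removing a node also drops its edges, so here only the edge between "a" and "b" is left.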
Upvotes: 4