user9499285

Creating embeddings using node2vec

I'm trying to create embeddings for an edge list I have using networkx and node2vec. My edge list looks as follows:

1 2
1 6
...
450 230
...
601 602 

It's an unweighted, undirected graph stored in a text file. I'm trying to convert it to a graph and train embeddings using the following:

nx_G = nx.read_edgelist(args.input, delimiter=' ', create_using=nx.DiGraph())
G = node2vec.Graph(nx_G, args.directed, args.p, args.q, args.seed)
G.preprocess_transition_probs()
walks = G.simulate_walks(args.num_walks, args.walk_length)
walks = [str(walk) for walk in walks]
model = Word2Vec(walks, size=args.dimensions, window=args.window_size, min_count=0, sg=1, workers=args.workers, iter=args.iter)
model.wv.save_word2vec_format(args.output)

where args.input is the path to the text file. The read_edgelist function reads the nodes correctly, and embeddings are created. However, the embeddings file does not contain the nodes numbered 1 through 606 (the node values in my edge list). Instead, it contains only 14 entries: the digits 0-9 and a few special characters. That is, instead of treating a multi-digit number (say 29) as a single node, it reads 2 as a node and computes an embedding for it. I do not understand why this is happening and would appreciate some insight.

Upvotes: 3

Views: 1962

Answers (2)

Marco Cerliani

Reputation: 22031

I suggest the stellargraph library, which provides a good set of graph algorithms for machine learning. For example, a basic Node2Vec looks like this:

from stellargraph.data import BiasedRandomWalk
from stellargraph import StellarGraph
from gensim.models import Word2Vec

rw = BiasedRandomWalk(StellarGraph(g_nx))

walks = rw.run(
      nodes=list(g_nx.nodes()), # root nodes
      length=100,  # maximum length of a random walk
      n=10,        # number of random walks per root node 
      p=0.5,       # Defines (unnormalised) probability, 1/p, of returning to source node
      q=2.0        # Defines (unnormalised) probability, 1/q, for moving away from source node
)

model = Word2Vec(walks, size=128, window=5, min_count=0, sg=1, workers=2, iter=1)

model.wv['29']

Upvotes: 1

user9499285

I solved this by commenting out the following line of code in the main.py file of the node2vec repository:

walks = [map(str, walk) for walk in walks]
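
For anyone wondering why the string conversion matters, here is a minimal sketch of what I believe was going on. The walks below are made up for illustration (they stand in for the output of G.simulate_walks), and the vocabulary is built by hand instead of by Word2Vec, on the assumption that Word2Vec treats each "sentence" as an iterable of tokens, so a plain string gets split into single characters.

```python
# Hypothetical walks, standing in for the output of G.simulate_walks(...)
walks = [[1, 29, 450], [29, 1, 6]]

# str(walk) turns the whole list into ONE string such as "[1, 29, 450]".
# Iterating over that string yields single characters, not node ids.
broken = [str(walk) for walk in walks]
broken_vocab = sorted({token for sentence in broken for token in sentence})

# Converting each node separately keeps every walk a list of node tokens.
fixed = [[str(node) for node in walk] for walk in walks]
fixed_vocab = sorted({token for sentence in fixed for token in sentence}, key=int)

print(broken_vocab)  # single digits plus brackets, commas and spaces
print(fixed_vocab)   # whole node ids as strings
```

So if the walks still need converting to strings (for example when the nodes are ints), converting node by node as in the second version should keep multi-digit nodes such as 29 intact.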

Upvotes: 2
