Masyaf

Reputation: 841

Import data from a file into a dictionary in Python

I want to import a file into a dictionary for further processing. The file contains embedding vectors for NLP and looks like this:

the 0.011384 0.010512 -0.008450 -0.007628 0.000360 -0.010121 0.004674 -0.000076 
of 0.002954 0.004546 0.005513 -0.004026 0.002296 -0.016979 -0.011469 -0.009159 
and 0.004691 -0.012989 -0.003122 0.004786 -0.002907 0.000526 -0.006146 -0.003058
one 0.014722 -0.000810 0.003737 -0.001110 -0.011229 0.001577 -0.007403 -0.005355

The code I used is:

embeddingTable = {}

with open("D:\\Embedding\\test.txt") as f:
    for line in f:
       (key, val) = line.split()
       d[key] = val
print(embeddingTable)

The error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-22-3612e9012ffe> in <module>()
 24 with open("D:\\Embedding\\test.txt") as f:
 25     for line in f:
---> 26        (key, val) = line.split()
 27        d[key] = val
 28 print(embeddingTable)

ValueError: too many values to unpack (expected 2)

I understand that it expects 2 values rather than 9, but is it possible to store the word as the key and the vector as the value?

Upvotes: 3

Views: 2857

Answers (3)

Rein

Reputation: 3351

If you cannot use the * operator because you're using Python 2, you could do it like below:

embeddingTable = {}
with open('test.txt') as f:
    for line in f:
        values = line.split()
        embeddingTable[values[0]] = values[1:]
print(embeddingTable)

If you are using Python 3, however, prefer the more elegant * operator.
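Note that the loop above stores the vector components as strings. A minimal sketch of the same slicing approach with a float conversion added, which should work in both Python 2 and Python 3. The parse_embeddings helper name and the in-memory sample are illustrative assumptions, not part of the question:

```python
import io

def parse_embeddings(lines):
    """Map each word to its list of float components (hypothetical helper)."""
    table = {}
    for line in lines:
        values = line.split()
        if not values:  # skip blank lines
            continue
        # First field is the word, the rest are the vector components.
        table[values[0]] = [float(v) for v in values[1:]]
    return table

# In-memory stand-in for the real file; replace with open('test.txt').
sample = io.StringIO(u"the 0.011384 0.010512 \nof 0.002954 0.004546\n")
embeddingTable = parse_embeddings(sample)
print(embeddingTable)
```

Because split() with no arguments splits on any run of whitespace, trailing spaces at the end of a line do not produce empty fields.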

Upvotes: 3

Padraic Cunningham

Reputation: 180411

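Use the csv module to parse the file: unpack each row into the word and its values, and convert the values to floats in a dict comprehension. The "if v" condition skips the empty fields that trailing or repeated spaces produce:

import csv

with open("D:/Embedding/test.txt") as f:
    d = {k: [float(v) for v in vals if v] for k, *vals in csv.reader(f, delimiter=" ")}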

Upvotes: 4

Sede

Reputation: 61225

You need to use the * operator (extended iterable unpacking, available in Python 3):

embeddingTable = {}
with open("D:\\Embedding\\test.txt") as f:
    for line in f:
        key, *values = line.split()  # key gets the word, values gets the remaining fields
        embeddingTable[key] = [float(value) for value in values]
print(embeddingTable)
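To see what the starred target does on its own, here is a small sketch using one line from the question's sample (Python 3 only). The variable names vector and rest are illustrative:

```python
line = "the 0.011384 0.010512 -0.008450 \n"

# The first field binds to key; the starred name collects everything else.
key, *values = line.split()
vector = [float(v) for v in values]

print(key)     # the
print(vector)  # [0.011384, 0.010512, -0.00845]

# A line holding only a word still unpacks; the starred list is just empty.
word, *rest = "word".split()
print(rest)    # []
```

A completely blank line would still raise ValueError, because there is no first field to assign to key, so it may be worth skipping empty lines before unpacking.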

Upvotes: 7
