lenakmeth

Reputation: 91

Populate dictionary from list

I have a list of strings (from a .tt file) that looks like this:

list1 = ['have\tVERB', 'and\tCONJ', ..., 'tree\tNOUN', 'go\tVERB']

I want to turn it into a dictionary that looks like:

dict1 = { 'have':'VERB', 'and':'CONJ', 'tree':'NOUN', 'go':'VERB' }

I was thinking of substitution, but it doesn't work that well. Is there a way to tag the tab string '\t' as a divider?

Upvotes: 5

Views: 2122

Answers (4)

Pitto

Reputation: 8579

A short way to solve the problem, since str.split called with no arguments splits on any whitespace, including '\t' (as pointed out by Jim Fasarakis-Hilliard), could be:

dictionary = dict(item.split() for item in list1)
print(dictionary)

I also wrote down a simpler, more classic approach.

It is not very Pythonic, but it is easy for beginners to understand:

list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']
dictionary1 = {}

for item in list1:
    splitted_item = item.split('\t')
    word = splitted_item[0]
    word_type = splitted_item[1]
    dictionary1[word] = word_type

print(dictionary1)

Here I wrote the same code with very verbose comments:

# Let's start with our word list; we'll call it 'list1'

list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']

# Here's an empty dictionary, 'dictionary1'

dictionary1 = {}

# Iterate through 'list1', using the variable 'item' for each element

for item in list1:

    # Split 'item' into two parts by passing the '\t' character
    # to the split method, and store the resulting two-element list
    # in the 'splitted_item' variable.
    # To learn more about split, see the Python documentation
    # for str.split.

    splitted_item = item.split('\t')

    # To make the code more readable, put the 1st part
    # of the split item (index 0, because counting starts
    # at 0) in the 'word' variable

    word = splitted_item[0]

    # Use the same approach to save the 2nd part of the
    # split item in the 'word_type' variable
    # (index 1, again because counting starts at 0)

    word_type = splitted_item[1]

    # Finally, add the key 'word' with the value 'word_type' to 'dictionary1'

    dictionary1[word] = word_type

# After the for loop has completed, print the now
# complete dictionary1 to check that the result is correct

print(dictionary1)


Upvotes: 3

ettanany

Reputation: 19806

Try the following:

dict1 = dict(item.split('\t') for item in list1)

Output:

>>> dict1
{'and': 'CONJ', 'go': 'VERB', 'tree': 'NOUN', 'have': 'VERB'}
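One caveat worth sketching: dict() raises a ValueError if any line does not split into exactly two parts. If the real .tt file might contain such lines (an assumption on my part; the 'malformed line' entry below is made up for illustration), a loop with str.partition is one possible way to skip them:

```python
# 'malformed line' is a hypothetical entry with no tab, added for illustration
list1 = ['have\tVERB', 'and\tCONJ', 'malformed line', 'tree\tNOUN', 'go\tVERB']

dict1 = {}
for item in list1:
    # partition splits at the first tab only and always returns a 3-tuple:
    # (text before the tab, the tab itself, text after the tab)
    word, sep, tag = item.partition('\t')
    if sep:  # sep is an empty string when the line contains no tab
        dict1[word] = tag

print(dict1)
```

Whether skipping bad lines is the right behaviour depends on your data; raising an error may be preferable if every line is supposed to be well formed.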

Upvotes: 16

Dimitris Fasarakis Hilliard

Reputation: 160447

Since str.split also splits on '\t' by default ('\t' counts as whitespace), you can take a functional approach by passing a map to dict, which looks quite elegant:

d = dict(map(str.split, list1))

With the dictionary d now being in the wanted form:

print(d)
{'and': 'CONJ', 'go': 'VERB', 'have': 'VERB', 'tree': 'NOUN'}

If you need a split only on '\t' (while ignoring ' ' and '\n') and still want to use the map approach, you can create a partial object with functools.partial that only uses '\t' as the separator:

from functools import partial 

# only splits on '\t', ignoring newlines, spaces, etc.
tabsplit = partial(str.split, sep='\t')
d = dict(map(tabsplit, list1)) 

This, of course, yields the same result for d with the sample list of strings.
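To illustrate where the two variants differ, here is a small sketch using a made-up entry whose word contains a space ('New York' is not in the original list; it is only an assumption about what such data could look like). The tab-only split keeps the word intact, while the whitespace split produces three pieces and dict rejects it:

```python
from functools import partial

# Hypothetical sample: the first word contains a space
sample = ['New York\tNOUN', 'go\tVERB']

# Tab-only split: each item yields exactly two parts
tabsplit = partial(str.split, sep='\t')
d = dict(map(tabsplit, sample))
print(d)

# Whitespace split: 'New York\tNOUN' yields three parts,
# so dict() cannot build key/value pairs from it
whitespace_split_failed = False
try:
    dict(map(str.split, sample))
except ValueError as exc:
    whitespace_split_failed = True
    print("whitespace split failed:", exc)
```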

Upvotes: 7

Jean-François Fabre

Reputation: 140186

Do that with a simple dict comprehension and str.split (without arguments, split splits on whitespace):

list1 = ['have\tVERB', 'and\tCONJ',  'tree\tNOUN', 'go\tVERB']
dict1 = {x.split()[0]:x.split()[1] for x in list1}

result:

{'and': 'CONJ', 'go': 'VERB', 'tree': 'NOUN', 'have': 'VERB'}

EDIT: x.split()[0]:x.split()[1] splits each string twice, which is not optimal. Other answers here avoid that without a dict comprehension.
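If you still prefer a dict comprehension, one possible variant (a sketch, not a benchmarked improvement) splits each string only once by unpacking from an inner generator expression:

```python
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']

# The inner generator splits each item once; the outer
# comprehension unpacks the two parts into key and value
dict1 = {word: tag for word, tag in (x.split() for x in list1)}

print(dict1)
```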

Upvotes: 4
