Reputation: 91
I have a list of strings (from a .tt file) that looks like this:
list1 = ['have\tVERB', 'and\tCONJ', ..., 'tree\tNOUN', 'go\tVERB']
I want to turn it into a dictionary that looks like:
dict1 = { 'have':'VERB', 'and':'CONJ', 'tree':'NOUN', 'go':'VERB' }
I was thinking of substitution, but it doesn't work that well. Is there a way to treat the tab character '\t' as a divider?
Upvotes: 5
Views: 2122
Reputation: 8579
A short way to solve the problem, since the split method splits on '\t' by default (as pointed out by Jim Fasarakis-Hilliard), could be:
dictionary = dict(item.split() for item in list1)
print(dictionary)
I also wrote down a simpler, more classic approach.
Not very Pythonic, but easy to understand for beginners:
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']
dictionary1 = {}
for item in list1:
    splitted_item = item.split('\t')
    word = splitted_item[0]
    word_type = splitted_item[1]
    dictionary1[word] = word_type
print(dictionary1)
Here I wrote the same code with very verbose comments:
# Let's start with our word list, we'll call it 'list1'
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']
# Here's an empty dictionary, 'dictionary1'
dictionary1 = {}
# Let's start to iterate using variable 'item' through 'list1'
for item in list1:
    # Here I split item into two parts, passing the '\t' character
    # to the split function, and put the resulting list of two elements
    # into the 'splitted_item' variable.
    # If you want to know more about the split function, check the link
    # available at the end of this answer
    splitted_item = item.split('\t')
    # Just to make the code more readable, I now put the 1st part
    # of the split item (part 0, because we start counting
    # from number 0) in the 'word' variable
    word = splitted_item[0]
    # I use the same approach to save the 2nd part of the
    # split item into the 'word_type' variable
    # Yes, you're right: we use 1 because we start counting from 0
    word_type = splitted_item[1]
    # Finally I add the 'word' key to 'dictionary1' with a value of 'word_type'
    dictionary1[word] = word_type
# After the for loop has completed, I print the now
# complete dictionary1 to check that the result is correct
print(dictionary1)
Useful links:
Upvotes: 3
Reputation: 19806
Try the following:
dict1 = dict(item.split('\t') for item in list1)
Output:
>>> dict1
{'and': 'CONJ', 'go': 'VERB', 'tree': 'NOUN', 'have': 'VERB'}
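If the file might contain lines with no tab or with more than one tab, dict() raises a ValueError because it needs exactly two items per entry. Here is a minimal sketch of one way to guard against that with str.partition. The sample data in raw_lines and the policy of skipping lines without a tab are assumptions for illustration, not part of the question:

```python
# Hypothetical input: one line without a tab and one line with an extra tab
raw_lines = ['have\tVERB', 'broken-line', 'tree\tNOUN\textra']

dict1 = {}
for item in raw_lines:
    # partition splits at the first '\t' only and always returns three parts
    word, sep, tag = item.partition('\t')
    if sep:  # sep is '' when the line has no tab, so such lines are skipped
        dict1[word] = tag

print(dict1)
```

With this approach, everything after the first tab is kept as the value, so the extra tab in the last entry ends up inside the tag.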
Upvotes: 16
Reputation: 160447
Since str.split also splits on '\t' by default ('\t' is considered white space), you could get a functional approach by feeding dict with a map that looks quite elegant:
d = dict(map(str.split, list1))
With the dictionary d now being in the wanted form:
print(d)
{'and': 'CONJ', 'go': 'VERB', 'have': 'VERB', 'tree': 'NOUN'}
If you need a split only on '\t' (while ignoring ' ' and '\n') and still want to use the map approach, you can create a partial object with functools.partial that only uses '\t' as the separator:
from functools import partial
# only splits on '\t', ignoring newlines, spaces, etc.
tabsplit = partial(str.split, sep='\t')
d = dict(map(tabsplit, list1))
This, of course, yields the same result for d using the sample list of strings.
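To see where the two separators actually differ, here is a small sketch assuming Python 3. The entry 'ice cream\tNOUN' is made up for illustration and does not come from the original list:

```python
from functools import partial

# Hypothetical entry whose word contains a space (not from the original data)
sample = ['ice cream\tNOUN', 'go\tVERB']

tabsplit = partial(str.split, sep='\t')

# Splitting only on tabs keeps 'ice cream' together as one key
d = dict(map(tabsplit, sample))
print(d)

# The default split() breaks 'ice cream\tNOUN' into three pieces,
# so dict() cannot build a key/value pair from it
try:
    dict(map(str.split, sample))
except ValueError as exc:
    print('default split failed:', exc)
```

So the default split is fine as long as neither the words nor the tags contain spaces; otherwise the explicit '\t' separator is the safer choice.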
Upvotes: 7
Reputation: 140186
Do that with a simple dict comprehension and str.split (without arguments, split splits on whitespace):
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']
dict1 = {x.split()[0]:x.split()[1] for x in list1}
result:
{'and': 'CONJ', 'go': 'VERB', 'tree': 'NOUN', 'have': 'VERB'}
EDIT: x.split()[0]:x.split()[1] calls split twice per item, which is not optimal. Other answers here do it better without a dict comprehension.
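One way to keep the dict comprehension while splitting each string only once is to nest a generator expression inside it. This is a minimal sketch; the names word and tag are chosen here for illustration:

```python
list1 = ['have\tVERB', 'and\tCONJ', 'tree\tNOUN', 'go\tVERB']

# The inner generator splits each string once; the outer comprehension
# unpacks the resulting pair into a key and a value
dict1 = {word: tag for word, tag in (x.split() for x in list1)}
print(dict1)
```

For plain two-field input this builds the same dictionary as dict(x.split() for x in list1); the comprehension form is mainly useful if you want to transform the key or value along the way.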
Upvotes: 4