Reputation: 7
I want to create a dictionary from a list of strings. For example, I have this list:
AAAA
AAAA
AAAA
BBBB
BBBB
CCCC
CCCC
CCCC
....
Then I want to create a dictionary that maps each distinct string to a number. How can I do that?
I have tried some code but am still stuck:
import os
path = "directoryA"
dirList = os.listdir(path)
with open("check.txt", "w") as a:
    for path, subdirs, files in os.walk(path):
        for filename in files:
            # I have split the text and now I want to create a dictionary
            # from it
            mylist = filename.split("_")  # the text format is AAAA_0 and I split
            # it so I can have a list of 'AAAA' and '0'
            k = mylist[0]  # I only take the 'AAAA' string after splitting
            print(k)  # here the output is only text. From this I want to
            # build the dictionary
This is the output of print(k); these are separate strings, not a list:
AAAA
AAAA
AAAA
BBBB
BBBB
CCCC
CCCC
CCCC
....
This is my expected result:
myDic = {
    'AAAA': 0,
    'BBBB': 1,
    'CCCC': 2,
    'DDDD': 3,
    # ... and so on
}
Upvotes: 0
Views: 146
Reputation: 542
Assuming the keys of the dictionary are:
keys = ['A', 'B', 'C']
Then:
ids = range(len(keys))  # named ids to avoid shadowing the built-in id()
d = dict(zip(keys, ids))
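Note that this assumes the keys are already unique. A minimal sketch of what happens with the duplicated list from the question (the variable names here are only illustrative): later duplicates overwrite earlier ones, so the list has to be de-duplicated first.

```python
names = ["AAAA", "AAAA", "AAAA", "BBBB", "BBBB", "CCCC", "CCCC", "CCCC"]

# Zipping the raw list: each repeated key keeps the LAST index paired with it.
naive = dict(zip(names, range(len(names))))
print(naive)  # {'AAAA': 2, 'BBBB': 4, 'CCCC': 7}

# De-duplicate (and sort) first, then zip with consecutive numbers.
keys = sorted(set(names))
d = dict(zip(keys, range(len(keys))))
print(d)  # {'AAAA': 0, 'BBBB': 1, 'CCCC': 2}
```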
Upvotes: 0
Reputation: 20490
Assuming the contents of check.txt look like li, start by getting all the unique elements in your list of strings with a set, then sort the unique list alphabetically. After that, use a dictionary comprehension and enumerate to generate your dictionary:
li = [
"AAAA",
"AAAA",
"AAAA",
"BBBB",
"BBBB",
"CCCC",
"CCCC",
"CCCC"]
# Get the list of unique strings by converting to a set
li = list(set(li))
# Sort the list lexicographically
li = sorted(li)
# Create your dictionary via a dictionary comprehension and enumerate
dct = {item: idx for idx, item in enumerate(li)}
print(dct)
The output will be
{'AAAA': 0, 'BBBB': 1, 'CCCC': 2}
We should be able to create the list of strings li like so:
import os
path = "directoryA"
dirList = os.listdir(path)
li = []
with open("check.txt", "w") as a:
    for path, subdirs, files in os.walk(path):
        for filename in files:
            # The file names look like AAAA_0; splitting on "_" gives
            # a list such as ['AAAA', '0']
            mylist = filename.split("_")
            # Keep only the 'AAAA' part
            k = mylist[0]
            # Append the item to li
            li.append(k)
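Putting the two steps together, a minimal sketch could look like the following. The function name prefix_index is hypothetical, and it assumes every file name has the form PREFIX_suffix as described in the question.

```python
import os


def prefix_index(root):
    """Map each distinct file-name prefix under root to a number."""
    prefixes = set()
    for _dirpath, _subdirs, files in os.walk(root):
        for filename in files:
            # "AAAA_0" -> "AAAA"; a name without "_" is kept whole
            prefixes.add(filename.split("_")[0])
    # Sort so the numbering is alphabetical and reproducible
    return {prefix: idx for idx, prefix in enumerate(sorted(prefixes))}
```

Calling prefix_index("directoryA") would then return the dictionary directly, without going through check.txt.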
Upvotes: 2
Reputation: 774
First, remove the duplicates as described in this answer: How do you remove duplicates from a list whilst preserving order?
Then it looks like this:
def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]
l = [
"AAAA",
"AAAA",
"AAAA",
"BBBB",
"BBBB",
"CCCC",
"CCCC",
"CCCC"]
# First remove the duplicates (order of first appearance is kept)
s = f7(l)
# Create the desired dict
dict(zip(s, range(len(s))))
# {'AAAA': 0, 'BBBB': 1, 'CCCC': 2}
Upvotes: 0
Reputation: 17794
You can use dict.fromkeys() to build a dict and count() to fill in the values:
from itertools import count
lst = ["AAAA", "AAAA", "AAAA", "BBBB", "BBBB", "CCCC", "CCCC", "CCCC"]
dct = dict.fromkeys(lst)
c = count()
for key in dct:
    dct[key] = next(c)
print(dct)
# {'AAAA': 0, 'BBBB': 1, 'CCCC': 2}
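The same idea can be sketched as a single expression by enumerating the keys that dict.fromkeys() produces. This assumes Python 3.7 or later, where dicts keep insertion order; the sample list is deliberately unsorted to show that numbering follows first appearance.

```python
lst = ["BBBB", "AAAA", "BBBB", "CCCC", "AAAA"]

# dict.fromkeys() drops duplicates while keeping first-seen order;
# enumerate() then numbers the remaining keys from 0.
dct = {key: idx for idx, key in enumerate(dict.fromkeys(lst))}
print(dct)  # {'BBBB': 0, 'AAAA': 1, 'CCCC': 2}
```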
Upvotes: 1
Reputation: 92440
You can use itertools.groupby to group the strings, assuming they are sorted as you have them (if not, sort them first). Then enumerate() over the groups, which gives you the index of each group:
from itertools import groupby
l = [
"AAAA",
"AAAA",
"AAAA",
"BBBB",
"BBBB",
"CCCC",
"CCCC",
"CCCC"]
d = {key:i for i, (key, group) in enumerate(groupby(l))}
# {'AAAA': 0, 'BBBB': 1, 'CCCC': 2}
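To illustrate why the sorting matters, here is a small sketch with an unsorted sample list (made up for this example). groupby only merges adjacent equal items, so a string that reappears later starts a new group and overwrites its earlier number.

```python
from itertools import groupby

l = ["BBBB", "AAAA", "BBBB", "CCCC"]

# Unsorted: "BBBB" forms two separate groups (index 0 and index 2).
unsorted_result = {key: i for i, (key, _group) in enumerate(groupby(l))}
print(unsorted_result)  # {'BBBB': 2, 'AAAA': 1, 'CCCC': 3}

# Sorted first: every string forms exactly one group.
sorted_result = {key: i for i, (key, _group) in enumerate(groupby(sorted(l)))}
print(sorted_result)  # {'AAAA': 0, 'BBBB': 1, 'CCCC': 2}
```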
If you are reading from a file and the strings are not sorted, you can add an entry and increment a counter each time you find something not yet in the dict. The values then follow the order in which each string is first seen. For example:
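from itertools import count, filterfalse
i = count(1)  # numbering starts at 1 here; use count() to start at 0
d = {}
with open('test.txt') as f:
    # filterfalse is lazy, so each line is checked against the current d
    for line in filterfalse(lambda l: l.strip() in d, f):
        d[line.strip()] = next(i)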
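A possible variant without itertools, sketched under the assumption that numbering should start at 0 as in the question: dict.setdefault() only inserts a key that is missing, and len(d) at that moment is the next free number. The function name index_by_first_seen is hypothetical.

```python
def index_by_first_seen(lines):
    """Number each distinct stripped line in order of first appearance."""
    d = {}
    for line in lines:
        # Existing keys are left untouched; new keys get the current size.
        d.setdefault(line.strip(), len(d))
    return d


print(index_by_first_seen(["CCCC\n", "AAAA\n", "CCCC\n", "BBBB\n"]))
# {'CCCC': 0, 'AAAA': 1, 'BBBB': 2}
```

Because an open file object iterates over its lines, it can be passed in directly in place of the sample list.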
Upvotes: 1
Reputation: 36370
I would do it the following way:
data = ['A','A','A','B','B','C','C','D','C']
unique = [i for inx,i in enumerate(data) if data.index(i)==inx]
print(unique) # ['A', 'B', 'C', 'D']
d = {i: inx for inx, i in enumerate(unique)}
print(d) # {'A': 0, 'B': 1, 'C': 2, 'D': 3}
The idea behind this method is to take a value from the list only when it occurs for the first time (that is, the same value did not appear earlier). I use the fact that the .index method of list always returns the lowest matching index. Note that with this method equal values do not have to be neighbors.
Upvotes: 0