Satya

Reputation: 5917

Creating a dictionary from two lists, one as keys and the other as values, in Python

I have a large list of strings like this:

res = ["FAV_VENUE_CITY_NAME == 'Mumbai' & EVENT_GENRE == 'KIDS' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'Mumbai' & EVENT_GENRE == 'FANTASY' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME =='Mumbai' & EVENT_GENRE == 'FESTIVAL' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'New Delhi' & EVENT_GENRE == 'WORKSHOP' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'Mumbai' & EVENT_GENRE == 'EXHIBITION' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'Bangalore' & FAV_GENRE == '|DRAMA|'",
"FAV_VENUE_CITY_NAME == 'Mumbai' &  & FAV_GENRE == '|ACTION|ADVENTURE|SCI-FI|'",
"FAV_VENUE_CITY_NAME == 'Bangalore' & FAV_GENRE == '|COMEDY|'",
"FAV_VENUE_CITY_NAME == 'Bangalore' & FAV_GENRE == 'DRAMA' & FAV_LANGUAGE == 'English'",
"FAV_VENUE_CITY_NAME == 'New Delhi' & FAV_LANGUAGE == 'Hindi' & count_EVENT_LANGUAGE >= 1"]

Now I am extracting the field names with:

 res = [re.split(r'[(==)(>=)]', x)[0].strip() for x in re.split('[&($#$)]', whereFields)]
 res = [x for x in list(set(res)) if x]

Output: ['FAV_GENRE', 'FAV_LANGUAGE', 'FAV_VENUE_CITY_NAME', 'count_EVENT_GENRE', 'EVENT_GENRE','count_EVENT_LANGUAGE']

Then, by following the linked question "filter out some items from a list and store in different arrays in python", I am getting these values:

 FAV_VENUE_CITY_NAME =  ['New Delhi', 'Mumbai', 'Bangalore']
 FAV_GENRE = ['|DRAMA|', '|COMEDY|', '|ACTION|ADVENTURE|SCI-FI|', 'DRAMA']
 EVENT_GENRE = ['FESTIVAL', 'WORKSHOP', 'FANTASY', 'KIDS', 'EXHIBITION']
 FAV_LANGUAGE = ['English', 'Hindi']
 count_on_field = ['EVENT_GENRE', 'EVENT_LANGUAGE']

Now I want to make a dictionary whose keys are the field names in res and whose values are the results obtained from the linked question above.

Or is there a way to turn each item of the list res into a separate list of its own?

Something like:

res = ['FAV_GENRE', 'FAV_LANGUAGE', 'FAV_VENUE_CITY_NAME', 'count_EVENT_GENRE', 'EVENT_GENRE','count_EVENT_LANGUAGE']
for i in range(len(res)):
    res[i] = list(res[i])   # intent: turn each item into an empty list that keeps the item's name

so that they become:

  FAV_VENUE_CITY_NAME = []
  EVENT_GENRE = []
  FAV_GENRE = []
  FAV_LANGUAGE = []

Then fill each individual list named in res with its values, using the method from the linked question above.

Then make a dictionary, similar to the lines below, which build a dict with the index as key:

 a = [51,27,13,56]
 b = dict(enumerate(a))
 # d = dict with key = each list name from the res list, value = the values in that individual list

Or, if possible, suggest how to build a dict directly from the res list at the top, with the field names as keys and the values taken from each line as values.

 Expected output: d = {'FAV_VENUE_CITY_NAME':['Mumbai','New Delhi','Bangalore'], 'EVENT_GENRE':['KIDS','FANTASY','FESTIVAL','WORKSHOP','EXHIBITION'], 'FAV_GENRE':['|DRAMA|','|ACTION|ADVENTURE|SCI-FI|','|COMEDY|','DRAMA'], 'FAV_LANGUAGE':['English','Hindi']}

count_EVENT_GENRE >= 1 and count_EVENT_LANGUAGE >= 1 should not be in that dictionary; instead, their field names should go into a list:

count_on_fields = ['EVENT_GENRE','EVENT_LANGUAGE']

If anybody has a better idea or suggestion, please help.

Upvotes: 0

Views: 284

Answers (3)

gboffi

Reputation: 25093

Here follows an IPython session that shows you how you can build a dictionary from your data:

In [1]: from re import split

In [2]: from itertools import chain

In [3]: data = ["FAV_VENUE_CITY_NAME == 'Mumbai' & EVENT_GENRE == 'KIDS' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'Mumbai' & EVENT_GENRE == 'FANTASY' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'Mumbai' & EVENT_GENRE == 'FESTIVAL' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'New Delhi' & EVENT_GENRE == 'WORKSHOP' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'Mumbai' & EVENT_GENRE == 'EXHIBITION' & count_EVENT_GENRE >= 1",
"FAV_VENUE_CITY_NAME == 'Bangalore' & FAV_GENRE == '|DRAMA|'",
"FAV_VENUE_CITY_NAME == 'Mumbai' &  & FAV_GENRE == '|ACTION|ADVENTURE|SCI-FI|'",
"FAV_VENUE_CITY_NAME == 'Bangalore' & FAV_GENRE == '|COMEDY|'",
"FAV_VENUE_CITY_NAME == 'Bangalore' & FAV_GENRE == 'DRAMA' & FAV_LANGUAGE == 'English'",
"FAV_VENUE_CITY_NAME == 'New Delhi' & FAV_LANGUAGE == 'Hindi' & count_EVENT_LANGUAGE >= 1"]

In [4]: d = {}

In [5]: for elt in chain(*(split(' *& *', rec) for rec in data)):
   ...:     if not elt: continue
   ...:     k, v = split(' *[=>]= *', elt)
   ...:     v = v.strip("'")
   ...:     if k not in d: d[k] = []
   ...:     if v not in d[k]: d[k].append(v)
   ...:

In [6]: d
Out[6]: 
{'EVENT_GENRE': ['KIDS', 'FANTASY', 'FESTIVAL', 'WORKSHOP', 'EXHIBITION'],
 'FAV_GENRE': ['|DRAMA|', '|ACTION|ADVENTURE|SCI-FI|', '|COMEDY|', 'DRAMA'],
 'FAV_LANGUAGE': ['English', 'Hindi'],
 'FAV_VENUE_CITY_NAME': ['Mumbai', 'New Delhi', 'Bangalore'],
 'count_EVENT_GENRE': ['1'],
 'count_EVENT_LANGUAGE': ['1']}

In [7]: 

Addendum

In [7]: count_fields = []

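In [8]: for k in list(d):  # iterate over a copy of the keys, because we delete from d
   ...:     if k[:6] == 'count_':
   ...:         # no need to test for duplicates because dict keys are unique
   ...:         count_fields.append(k[6:])
   ...:         del d[k]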

In [9]: 
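
As an alternative to deleting keys in place, the same separation can be sketched without modifying d at all. This is only an illustration: the small d below is a cut-down stand-in for the dictionary shown in Out[6], and the names prefix and fields are made up for the example.

```python
# Cut-down, hypothetical stand-in for the dictionary `d` built in the session above.
d = {
    'EVENT_GENRE': ['KIDS', 'FANTASY'],
    'FAV_LANGUAGE': ['English', 'Hindi'],
    'count_EVENT_GENRE': ['1'],
    'count_EVENT_LANGUAGE': ['1'],
}

prefix = 'count_'  # illustrative name, not from the original answer

# Names of the fields that carried the prefix, with the prefix stripped off.
count_fields = [k[len(prefix):] for k in d if k.startswith(prefix)]

# Everything else is copied into a fresh dictionary; `d` itself is left untouched.
fields = {k: v for k, v in d.items() if not k.startswith(prefix)}

print(count_fields)
print(fields)
```

Because nothing is removed while iterating, there is no need to copy the keys first, and the original d stays available for later inspection.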

Upvotes: 1

zehnpaard

Reputation: 6243

I think it's going to be difficult for you to use the lists you get from the regex, as there's no way to tie them back to their 'keys'. I think it might be easiest to start from your original list, and work your way down.

from itertools import chain

res1 = [s.split(' & ') for s in res]
res2 = list(chain(*res1))
res3 = [item.replace('==', ' == ').replace('>=', ' >= ') for item in res2]
# split at most twice so that values containing spaces ('New Delhi') stay in one piece
res4 = [item.split(None, 2) for item in res3 if item]
res5 = [(item[0], item[-1]) for item in res4]

temp_dict = dict()
temp_set = set()
for key, value in res5:
    if key.startswith('count'):
        temp_set.add(key.replace('count_',''))
    else:
        clean_value = value.replace("'","")
        temp_dict.setdefault(key, set()).add(clean_value)

output_dict = {key:list(value) for key, value in temp_dict.items()}
output_list = list(temp_set)

print(output_dict)
print(output_list)

You can try printing the intermediate results (res1 through res5) to see what's going on.

For production use, especially if you're dealing with a much larger res, you should probably change each of the list comprehensions to a generator expression, and change res2 = list(chain(*res1)) to res2 = chain.from_iterable(res1).
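
A lazy version along those lines might look roughly like the sketch below. The function name parse_conditions and the two-line sample are made up for illustration, and the sketch assumes that no value ever contains '&', '==' or '>=' itself.

```python
from itertools import chain

def parse_conditions(records):
    """Lazily yield (field, value) pairs from strings like "A == 'x' & B >= 1"."""
    # chain.from_iterable flattens the pieces without building an intermediate list.
    parts = chain.from_iterable(rec.split('&') for rec in records)
    for part in parts:
        part = part.strip()
        if not part:
            continue  # skip the empty piece left behind by a doubled '&'
        # Pad the operators so that "A=='x'" (no spaces) splits the same way.
        padded = part.replace('==', ' == ').replace('>=', ' >= ')
        # At most two splits: field, operator, and the rest as the value.
        key, _op, value = padded.split(None, 2)
        yield key, value.strip("'")

# Hypothetical two-line sample in the same format as res.
sample = [
    "FAV_VENUE_CITY_NAME == 'New Delhi' & FAV_LANGUAGE == 'Hindi' & count_EVENT_LANGUAGE >= 1",
    "FAV_VENUE_CITY_NAME =='Mumbai' &  & FAV_GENRE == '|DRAMA|'",
]

output_dict, output_list = {}, []
for key, value in parse_conditions(sample):
    if key.startswith('count_'):
        name = key[len('count_'):]
        if name not in output_list:
            output_list.append(name)
    else:
        values = output_dict.setdefault(key, [])
        if value not in values:
            values.append(value)

print(output_dict)
print(output_list)
```

Using lists with a membership test instead of sets keeps the values in first-seen order, at the cost of a linear scan per value; for very long value lists a set would be the faster choice.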

Upvotes: 1

ant0nisk

Reputation: 591

Here you go:

Create a list with all the values:

 values=[
    FAV_GENRE,
    FAV_LANGUAGE,
    FAV_VENUE_CITY_NAME,
    EVENT_GENRE,
    count_on_field
]

Then create your dict as proposed in the linked answer:

 d=dict(zip(res, values))

Note that the order of the lists does matter, of course: each entry in values has to line up with the corresponding name in res.

I haven't tested it, because I am running out of battery now. I hope it results in what you need.
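
To make the ordering point concrete, here is a small sketch that uses the sample values from the question. The list called names is hypothetical: it is written by hand in the same order as values, and it leaves out the count_ fields, which the question wants kept in a separate list.

```python
# Sample values copied from the question.
FAV_GENRE = ['|DRAMA|', '|COMEDY|', '|ACTION|ADVENTURE|SCI-FI|', 'DRAMA']
FAV_LANGUAGE = ['English', 'Hindi']
FAV_VENUE_CITY_NAME = ['New Delhi', 'Mumbai', 'Bangalore']
EVENT_GENRE = ['FESTIVAL', 'WORKSHOP', 'FANTASY', 'KIDS', 'EXHIBITION']

# Hypothetical list of names, in exactly the same order as `values` below.
names = ['FAV_GENRE', 'FAV_LANGUAGE', 'FAV_VENUE_CITY_NAME', 'EVENT_GENRE']
values = [FAV_GENRE, FAV_LANGUAGE, FAV_VENUE_CITY_NAME, EVENT_GENRE]

# zip() stops silently at the shorter input, so compare the lengths first.
if len(names) != len(values):
    raise ValueError('names and values must have the same length')

d = dict(zip(names, values))
print(d['FAV_LANGUAGE'])
```

If the names come from a set, as in the question, their order is not predictable, so they would need to be sorted or listed explicitly before zipping.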

Upvotes: 1
