Reputation: 865
I have a list of about 18k dictionaries (only part of it is shown here) in which I need to rename a key and extract elements from the list values. For example, this is my list of dictionaries:
[{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0',
'column_index': 387,
'hypergeometric_p_values': [0.04813691453106806, 0.951863085468932],
'percent_in_group': [0.09896233666410453,
0.10215470469694621,
0.11547714514835605]},
{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 10.0',
'column_index': 387,
'hypergeometric_p_values': [0.00014612920992348574, 0.9998538707900765],
'percent_in_group': [0.08647194465795542,
0.09316385056580376,
0.1210906174819567]},
{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 2.0',
'column_index': 387,
'hypergeometric_p_values': [0.044335711647001765, 0.9556642883529982],
'percent_in_group': [0.09934665641813989,
0.10261974887614324,
0.11627906976744186]},
{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 3.0',
'column_index': 387,
'hypergeometric_p_values': [0.000497701807800938, 0.999502298192199],
'percent_in_group': [0.08724058416602613,
0.09331886529220276,
0.11868484362469928]},
{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 4.0',
'column_index': 387,
'hypergeometric_p_values': [0.07220994726016502, 0.927790052739835],
'percent_in_group': [0.08954650269023828,
0.0922337622074097,
0.10344827586206896]}]
I need to rename hypergeometric_p_values to simply p_values and keep only the first element of its list. I also need to create two new keys, percent_missing_in_group_1 and percent_missing_in_group_2, holding elements 0 and 1 of the percent_in_group list.
So, the data should be something like this (for a single dictionary):
[{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0',
'column_index': 387,
'p_values': 0.04813691453106806,
'percent_missing_in_group_1': 0.09896233666410453,
'percent_missing_in_group_2': 0.10215470469694621
}]
I've been stuck on this for a while, and everything I tried has failed. The following works, but it only renames the key:
data = [{"p_value" if k == 'hypergeometric_p_values' else k:v for k,v in d.items()} for d in data]
Also, when I try to do it the following way:
for item in cat:
    for k,v in item.items():
        if k == 'hypergeometric_p_values':
            item['p_value'] = v[0]
            del item['hypergeometric_p_values']
    print(item)
I get the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-397-5298b96a56bc> in <module>
10
11 for item in cat:
---> 12 for k,v in item.items():
13 if k == 'hypergeometric_p_values':
14 item['p_value'] = v[0]
RuntimeError: dictionary keys changed during iteration
Is there a simpler way to do this, so that I can rename many keys at once?
Upvotes: 0
Views: 55
Reputation: 81
I can give you 2 methods to deal with this question:
Method 1: change the keys and values in place, in the original data
data = [{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0',
'column_index': 387,
'hypergeometric_p_values': [0.04813691453106806, 0.951863085468932],
'percent_in_group': [0.09896233666410453,0.10215470469694621,0.11547714514835605]}]
for e in data:
    # pop() removes the old key and returns its list; keep only the first value
    e['p_values'] = e.pop('hypergeometric_p_values')[0]
    e['percent_missing_in_group_1'] = e['percent_in_group'][0]
    e['percent_missing_in_group_2'] = e['percent_in_group'][1]
    del e['percent_in_group']
print(data)
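Method 1 is safe because it never loops over a dictionary while changing its keys. The RuntimeError in the question comes from adding and deleting keys inside `for k,v in item.items()`. As a minimal sketch, the question's own loop can be kept if it iterates over a snapshot of the items instead; the list `cat` below is shortened sample data from the question, and the key name `p_value` follows the question's loop:

```python
# Shortened sample data taken from the question.
cat = [{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0',
        'column_index': 387,
        'hypergeometric_p_values': [0.04813691453106806, 0.951863085468932],
        'percent_in_group': [0.09896233666410453,
                             0.10215470469694621,
                             0.11547714514835605]}]

for item in cat:
    # list(...) copies the (key, value) pairs first, so the dict itself
    # can be modified while we loop over the copy.
    for k, v in list(item.items()):
        if k == 'hypergeometric_p_values':
            item['p_value'] = v[0]
            del item['hypergeometric_p_values']

print(cat[0])
```

The snapshot costs one small temporary list per dictionary, which should be negligible for 18k short dictionaries.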
Method 2: build a new data list
data = [{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0',
'column_index': 387,
'hypergeometric_p_values': [0.04813691453106806, 0.951863085468932],
'percent_in_group':[0.09896233666410453, 0.10215470469694621,0.11547714514835605]}]
data1 = []
for e in data:
    d = {}
    d['name'] = e['name']
    d['column_index'] = e['column_index']
    d['p_values'] = e['hypergeometric_p_values'][0]
    d['percent_missing_in_group_1'] = e['percent_in_group'][0]
    d['percent_missing_in_group_2'] = e['percent_in_group'][1]
    data1.append(d)
print(data1)
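Since the question also asks about renaming many keys at once, Method 2 could be generalized by describing the changes in a mapping instead of writing one line per key. This is only a sketch: the names `EXTRACT` and `transform` and the layout of the mapping are assumptions made for illustration, not anything from the question.

```python
# Hypothetical mapping: old key -> list of (new key, index into the old list).
EXTRACT = {
    'hypergeometric_p_values': [('p_values', 0)],
    'percent_in_group': [('percent_missing_in_group_1', 0),
                         ('percent_missing_in_group_2', 1)],
}

def transform(record, extract=EXTRACT):
    """Return a new dict; keys not listed in `extract` are copied unchanged."""
    out = {}
    for key, value in record.items():
        if key in extract:
            # Replace the old key with one new key per requested list element.
            for new_key, index in extract[key]:
                out[new_key] = value[index]
        else:
            out[key] = value
    return out

data = [{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0',
         'column_index': 387,
         'hypergeometric_p_values': [0.04813691453106806, 0.951863085468932],
         'percent_in_group': [0.09896233666410453,
                              0.10215470469694621,
                              0.11547714514835605]}]

data1 = [transform(d) for d in data]
print(data1[0])
```

Adding another rename then only means adding an entry to the mapping. Because a new dictionary is built for each record, nothing is modified during iteration and the original `data` is left untouched.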
Upvotes: 0
Reputation: 730
You can write a function that processes each element (dictionary) of your list and returns it. Then either build a new list from the results or iterate over the list and update each element in place. In both cases, delete the keys you no longer need from the dictionary.
my_list = [{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0',
'column_index': 387,
'hypergeometric_p_values': [0.04813691453106806, 0.951863085468932],
'percent_in_group': [0.09896233666410453,
0.10215470469694621,
0.11547714514835605]},...]
def get_elem(elem):
    elem["p_values"] = elem["hypergeometric_p_values"][0]
    elem["percent_missing_in_group_1"] = elem['percent_in_group'][0]
    elem["percent_missing_in_group_2"] = elem['percent_in_group'][1]
    del elem["hypergeometric_p_values"]
    del elem["percent_in_group"]
    return elem

my_list = [get_elem(x) for x in my_list]
Or, if you are worried about memory, you can iterate over the list by index, which avoids building a second list:
for i in range(len(my_list)):
    my_list[i] = get_elem(my_list[i])
>>> my_list[0]
{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0', 'column_index': 387, 'p_values': 0.04813691453106806, 'percent_missing_in_group_1': 0.09896233666410453, 'percent_missing_in_group_2': 0.10215470469694621}
>>>
Note: there may be a faster way, but this should work!
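One thing to keep in mind is that get_elem changes each dictionary in place, so running it a second time on the same list raises a KeyError because the old keys are already gone. If the original dictionaries need to stay intact, a variant that works on a shallow copy is one option. This is a sketch, and the name `get_elem_copy` is made up for illustration:

```python
def get_elem_copy(elem):
    # Hypothetical variant of get_elem: work on a shallow copy so the
    # caller's dictionary keeps its original keys.
    new = dict(elem)
    # pop() removes the old key from the copy and returns its list.
    new["p_values"] = new.pop("hypergeometric_p_values")[0]
    percents = new.pop("percent_in_group")
    new["percent_missing_in_group_1"] = percents[0]
    new["percent_missing_in_group_2"] = percents[1]
    return new

my_list = [{'name': 'Achieving_Results_in_a_Challenging_Business_Context_rank = 1.0',
            'column_index': 387,
            'hypergeometric_p_values': [0.04813691453106806, 0.951863085468932],
            'percent_in_group': [0.09896233666410453,
                                 0.10215470469694621,
                                 0.11547714514835605]}]

new_list = [get_elem_copy(x) for x in my_list]
print(new_list[0])
```

The trade-off is memory: this keeps both the old and the new dictionaries alive until `my_list` is discarded.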
Upvotes: 1