Reputation: 515
I have a dataframe as below.
import pandas as pd

df = pd.DataFrame({
    'code': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Tag': ['A', 'B', 'C', 'D', 'B', 'C', 'D', 'A', 'D', 'C']
})
+------+-----+
| code | Tag |
+------+-----+
| 1 | A |
+------+-----+
| 2 | B |
+------+-----+
| 3 | C |
+------+-----+
| 4 | D |
+------+-----+
| 5 | B |
+------+-----+
| 6 | C |
+------+-----+
| 7 | D |
+------+-----+
| 8 | A |
+------+-----+
| 9 | D |
+------+-----+
| 10 | C |
+------+-----+
My objective is to create code lists based on the common items in the Tag column, as below.
codes_A = [1,8]
codes_B = [2,5]
codes_C = [3,6,10]
codes_D = [4,7,9]
This is how I'm doing it right now:
codes_A = df[df['Tag'] == 'A']['code'].to_list()
codes_B = df[df['Tag'] == 'B']['code'].to_list()
codes_C = df[df['Tag'] == 'C']['code'].to_list()
codes_D = df[df['Tag'] == 'D']['code'].to_list()
This code does the job, but as you can see it is cumbersome and inefficient: I repeat the same expression for every tag, and I have to repeat it again whenever I want to create a new list.
Is there a more efficient and Pythonic way to do this in pandas or NumPy?
Upvotes: 2
Views: 44
Reputation: 862406
Create a dictionary of lists, because creating variable names dynamically is not recommended:
d = df.groupby('Tag')['code'].agg(list).to_dict()
print(d)
{'A': [1, 8], 'B': [2, 5], 'C': [3, 6, 10], 'D': [4, 7, 9]}
Then look up each list by its key in the dict instead of assigning it to a variable name:
print(d['A'])
[1, 8]
In practice, this means that wherever your code uses codes_A, you change it to d['A'], and similarly for all the other variables.
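As a minimal sketch of what that replacement could look like, the snippet below rebuilds the dictionary from the question's data and uses it in place of the separate variables. The tag 'E' and the name codes_E are hypothetical and only illustrate how a missing tag could be handled.

```python
import pandas as pd

df = pd.DataFrame({
    'code': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Tag': ['A', 'B', 'C', 'D', 'B', 'C', 'D', 'A', 'D', 'C'],
})

# One dictionary takes the place of codes_A, codes_B, codes_C, codes_D.
d = df.groupby('Tag')['code'].agg(list).to_dict()

# Loop over every tag without having to name each one in the code.
for tag, codes in d.items():
    print(tag, codes)

# 'E' is a hypothetical tag that is not in the data;
# dict.get falls back to an empty list instead of raising KeyError.
codes_E = d.get('E', [])
print(codes_E)
```

A new tag appearing in the data then shows up as a new key automatically, with no extra line of code per tag.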
But if you really need separate variables:
for k, v in d.items():
    globals()[f'code_{k}'] = v
print(code_A)
[1, 8]
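If attribute-style access is the main goal, one possible middle ground (not part of the answer above, just a sketch) is to unpack the dictionary into a types.SimpleNamespace rather than writing into globals(). This assumes every tag is a string that is also a valid Python identifier; the name codes is an arbitrary choice.

```python
from types import SimpleNamespace

import pandas as pd

df = pd.DataFrame({
    'code': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Tag': ['A', 'B', 'C', 'D', 'B', 'C', 'D', 'A', 'D', 'C'],
})

d = df.groupby('Tag')['code'].agg(list).to_dict()

# Assumption: tags are strings usable as attribute names ('A', 'B', ...).
# The lists live on one object instead of in the module's global namespace.
codes = SimpleNamespace(**d)

print(codes.A)
print(codes.C)
```

This keeps the generated names in one place, so they cannot silently overwrite an existing global variable.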
Upvotes: 3