Reputation: 573
I can successfully create a single new attribute on a dataframe called df
using a single dictionary as follows:
mye2_men
:In [13]: mye2_men = pd.read_csv("~/03_Maps_March_2020/mye2_men.csv",index_col="Code")
...: mye2_men.head()
Out[13]:
Name Geography1 All ages 0 1 2 3 4 5 6 7 8 9 ... 78 79 80 81 82 83 84 85 86 87 88 89 90
Code ...
K02000001 UNITED KINGDOM Country 32790202 382332 395273 408684 408882 412553 421934 434333 427809 419161 414994 ... 192839 186251 175626 160475 146314 132941 116050 103669 93155 81174 68110 55652 183486
K03000001 GREAT BRITAIN Country 31864002 370474 382933 395754 396181 399764 409061 420947 414613 406062 401647 ... 188073 181546 171350 156506 142851 129815 113306 101194 91038 79342 66699 54387 179629
K04000001 ENGLAND AND WALES Country 29215251 343642 355122 366722 366885 370156 379046 389944 382853 375940 370701 ... 172046 166392 157065 143896 131207 119193 104143 93055 83798 73224 61794 50297 167009
E92000001 ENGLAND Country 27667942 327309 338368 349229 349199 352148 360688 370995 363496 356965 351790 ... 161540 156343 147733 135514 123492 112133 98000 87528 79030 69067 58264 47498 157788
E12000001 NORTH EAST Region 1305486 13992 14423 15124 15159 15542 15839 16314 16283 16068 15748 ... 8130 8108 7601 6977 6118 5723 4958 4383 3889 3360 2747 2148 6822
[5 rows x 94 columns]
df
In [14]: df = pd.DataFrame({"A":[num for num in range(0,430)],
...: "B":[num**2 for num in range(0,430)],
...: "Code":mye2_men.index})
...: df.head()
Out[14]:
A B Code
0 0 0 K02000001
1 1 1 K03000001
2 2 4 K04000001
3 3 9 E92000001
4 4 16 E12000001
In [15]: male_counts = mye2_men["All ages"].to_dict()
...: male_counts
Out[15]:
{'K02000001': 32790202,
'K03000001': 31864002,
'K04000001': 29215251,
'E92000001': 27667942,
'E12000001': 1305486,
'E06000047': 259299,
'E06000005': 51919,
'E06000001': 45524 ....}
df
To Create New Column "male_count":In [19]: # CREATE NEW male_count COLUMN IN df
...: df["male_count"] = df["Code"].map(male_counts)
...: df.head()
Out[19]:
A B Code male_count
0 0 0 K02000001 32790202
1 1 1 K03000001 31864002
2 2 4 K04000001 29215251
3 3 9 E92000001 27667942
4 4 16 E12000001 1305486
For the 2nd dictionary:
In [20]: female_counts = (mye2_men["All ages"]+10).to_dict()
...: female_counts
Out[20]:
{'K02000001': 32790212,
'K03000001': 31864012,
'K04000001': 29215261,
'E92000001': 27667952,
'E12000001': 1305496,
'E06000047': 259309,
'E06000005': 51929 ...}
I can successfully produce a second attribute called df["female_count"]
by repeating step 4 above, but this time using the female_counts dictionary.
How can I create multiple new df
columns (ie df["male_count"]
and df["female_count"]
) in a single step?
Many thanks
Note: The mye2_men data is from the "MYE2 - Males" tab of the following excel doc: https://www.ons.gov.uk/file?uri=%2fpeoplepopulationandcommunity%2fpopulationandmigration%2fpopulationestimates%2fdatasets%2fpopulationestimatesforukenglandandwalesscotlandandnorthernireland%2fmid2019april2020localauthoritydistrictcodes/ukmidyearestimates20192020ladcodes.xls
Upvotes: 1
Views: 50
Reputation: 862406
Create DataFrame
from dictionaries and then use DataFrame.join
:
new = pd.DataFrame({'male_count':male_counts, 'female_count':female_count})
df = df.join(new, on='Code')
Upvotes: 1