Reputation: 2606
I have an existing dataframe and I'm trying to concatenate a dictionary to it, where the lists in the dictionary are shorter than the dataframe:
A B C
0 0.46324 0.32425 0.42194
1 0.10596 0.35910 0.21004
2 0.69209 0.12951 0.50186
3 0.04901 0.31203 0.11035
4 0.43104 0.62413 0.20567
5 0.43412 0.13720 0.11052
6 0.14512 0.10532 0.05310
and
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}
I'm trying to add test to df, while changing the keys to "D" and "E". I've had a look at https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html and
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html, which suggest that I should be able to concatenate the dictionary to the dataframe.
I've tried:
pd.concat([df, test], axis=1, ignore_index=True, keys=["D", "E"])
pd.concat([df, test], axis=1, ignore_index=True)
but I'm not having any luck. The result I'm trying to achieve is:
A B C D E
0 0.46324 0.32425 0.42194 0.23413 0.01293
1 0.10596 0.35910 0.21004 0.19235 0.12235
2 0.69209 0.12951 0.50186 0.51221 0.63291
3 0.04901 0.31203 0.11035 NaN NaN
4 0.43104 0.62413 0.20567 NaN NaN
5 0.43412 0.13720 0.11052 NaN NaN
6 0.14512 0.10532 0.05310 NaN NaN
Upvotes: 10
Views: 76796
Reputation: 23071
To add a dictionary as new columns, another method is to convert it into a dataframe and simply assign.
df[['D', 'E']] = pd.DataFrame(test)
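As a minimal sketch of what this assignment does, assuming a shortened stand-in for the question's df (first four rows only): the converted dictionary's columns are paired with the new names by position, and rows are paired by index label, so rows of df with no counterpart end up as NaN.

```python
import pandas as pd

# Shortened stand-in for the question's df (first four rows only).
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901],
    "B": [0.32425, 0.35910, 0.12951, 0.31203],
    "C": [0.42194, 0.21004, 0.50186, 0.11035],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Columns of the converted dict are matched to the new names by position
# ("One" -> "D", "Two" -> "E"); rows are matched by index label.
df[["D", "E"]] = pd.DataFrame(test)

print(df)
```

Row 3 has no matching label in the converted dictionary, so its D and E values are missing.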
To add a dictionary as new rows, another method is to convert the dict into a dataframe using the from_dict method and concatenate.
df = pd.concat([df, pd.DataFrame.from_dict(test, orient='index', columns=df.columns)], ignore_index=True)
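To make the intermediate step visible, here is a small sketch that splits the one-liner in two, assuming a two-row stand-in for df. The names new_rows and result are only illustrative. Note that this only works because each list in test has exactly as many values as df has columns.

```python
import pandas as pd

# Two-row stand-in for the question's df.
df = pd.DataFrame({
    "A": [0.46324, 0.10596],
    "B": [0.32425, 0.35910],
    "C": [0.42194, 0.21004],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# orient="index" turns each key into one row; `columns` labels the values.
new_rows = pd.DataFrame.from_dict(test, orient="index", columns=df.columns)
print(new_rows)  # indexed by "One" and "Two"

# ignore_index=True discards those labels and renumbers the rows from 0.
result = pd.concat([df, new_rows], ignore_index=True)
print(result)
```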
Upvotes: 1
Reputation: 25189
One way to do that is with:
df.join(pd.DataFrame(test).rename(columns={'One':'D','Two':'E'}))
A B C D E
0 0.46324 0.32425 0.42194 0.23413 0.01293
1 0.10596 0.35910 0.21004 0.19235 0.12235
2 0.69209 0.12951 0.50186 0.51221 0.63291
3 0.04901 0.31203 0.11035 NaN NaN
4 0.43104 0.62413 0.20567 NaN NaN
5 0.43412 0.13720 0.11052 NaN NaN
6 0.14512 0.10532 0.05310 NaN NaN
This works because join aligns the two frames on their index. As @Alexander correctly mentioned, the number of rows being concatenated should match; otherwise, as in your case, the missing rows are filled with NaN.
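For comparison, the pd.concat attempts from the question can be made to work the same way. pd.concat only accepts Series and DataFrame objects, so the plain dict has to be converted first. The sketch below assumes a four-row stand-in for df; the names extra, joined and concatenated are only illustrative.

```python
import pandas as pd

# Four-row stand-in for the question's df.
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901],
    "B": [0.32425, 0.35910, 0.12951, 0.31203],
    "C": [0.42194, 0.21004, 0.50186, 0.11035],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Convert the dict to a DataFrame and rename its columns up front.
extra = pd.DataFrame(test).rename(columns={"One": "D", "Two": "E"})

# join: left join on the index of df.
joined = df.join(extra)

# concat along axis=1: also aligns on the index. ignore_index is left out
# here because, with axis=1, it would replace the column names with 0..4.
concatenated = pd.concat([df, extra], axis=1)

print(concatenated)
```

Both calls give the same table here because every index label in extra also exists in df.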
Upvotes: 6
Reputation: 109526
Assuming you want to add them as rows:
>>> pd.concat([df, pd.DataFrame(list(test.values()), columns=df.columns)], ignore_index=True)
A B C
0 0.46324 0.32425 0.42194
1 0.10596 0.35910 0.21004
2 0.69209 0.12951 0.50186
3 0.04901 0.31203 0.11035
4 0.43104 0.62413 0.20567
5 0.43412 0.13720 0.11052
6 0.14512 0.10532 0.05310
7 0.01293 0.12235 0.63291
8 0.23413 0.19235 0.51221
If added as new columns:
df_new = pd.concat([df, pd.DataFrame(list(test.values())).T], ignore_index=True, axis=1)
df_new.columns = df.columns.tolist() + [{'One': 'D', 'Two': 'E'}.get(k) for k in test.keys()]
>>> df_new
A B C E D
0 0.46324 0.32425 0.42194 0.01293 0.23413
1 0.10596 0.35910 0.21004 0.12235 0.19235
2 0.69209 0.12951 0.50186 0.63291 0.51221
3 0.04901 0.31203 0.11035 NaN NaN
4 0.43104 0.62413 0.20567 NaN NaN
5 0.43412 0.13720 0.11052 NaN NaN
6 0.14512 0.10532 0.05310 NaN NaN
Before Python 3.7, order was not guaranteed in dictionaries (e.g. test, which iterated as Two, One in the output above), so the new column names need to be mapped to the keys rather than assigned by position.
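A variation that sidesteps iteration order entirely is to look each list up by its key. The sketch below is an alternative to the code above, not the same method: it assumes a four-row stand-in for df, and the names mapping and new_cols are only illustrative. Each list is wrapped in a Series so that assign aligns it on the index and fills the remaining rows with NaN.

```python
import pandas as pd

# Four-row stand-in for the question's df.
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901],
    "B": [0.32425, 0.35910, 0.12951, 0.31203],
    "C": [0.42194, 0.21004, 0.50186, 0.11035],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Explicit old-key -> new-column mapping, so the result does not depend
# on the order in which `test` happens to iterate.
mapping = {"One": "D", "Two": "E"}

# A Series is aligned on the index when assigned; a bare list of the wrong
# length would raise an error instead.
new_cols = {new: pd.Series(test[old]) for old, new in mapping.items()}
df_new = df.assign(**new_cols)

print(df_new)
```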
Upvotes: 8