Reputation: 666
I am compiling a list of DataFrames from REST endpoints (that is, from JSON results). In some cases, when I select the final set of columns in the last step, I get a KeyError.
images_df = pd.concat(images)
images_df = images_df[list(cvpc.images_columns.keys())]
Is there a way to select the columns so that any that don't exist are simply created with null values?
I've also tried selecting the columns before appending to the list of DataFrames, i.e.:
temp_df = temp_df[list(cvpc.images_columns.keys())]
images.append(temp_df)
If I can get the missing columns created automatically, that would be a huge win, since selecting columns earlier helps keep the final list of images as small as possible.
Here's a simple example:
data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
df_t = pd.DataFrame(data)
final_columns = ['col_1', 'col_2', 'col_3']
df = df_t[final_columns]
Any suggestions would be greatly appreciated.
Upvotes: 2
Views: 2091
Reputation: 23099
You could build a dictionary of the columns that don't exist yet and unpack it into assign, then slice the columns with a list as you've done above.
import numpy as np
df = df_t.assign(
    **{col: np.nan for col in final_columns if col not in df_t.columns}
)[final_columns]
print(df)
   col_1 col_2  col_3
0      3     a    NaN
1      2     b    NaN
2      1     c    NaN
3      0     d    NaN
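As a possible alternative, DataFrame.reindex with the columns argument should give the same result in one call. This is a minimal sketch that reuses the example data from the question, not a drop-in for the real images_df:

```python
import pandas as pd

data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
df_t = pd.DataFrame(data)
final_columns = ['col_1', 'col_2', 'col_3']

# reindex keeps the columns that exist, adds the missing ones filled
# with NaN, and orders the result to match final_columns
df = df_t.reindex(columns=final_columns)
print(df)
```

The same call could presumably be applied to each temp_df before it is appended to the list, which would keep the intermediate frames down to the final set of columns.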
Upvotes: 1
Reputation:
You can do something like this:
import numpy as np
import pandas as pd
data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
df_t = pd.DataFrame(data)
final_columns = ['col_1', 'col_2', 'col_3']
# Add each missing column, filled with NaN
for col in final_columns:
    if col not in df_t.columns:
        df_t[col] = np.nan  # np.NaN was removed in NumPy 2.0
Upvotes: 2
Reputation: 15498
I would add the missing columns, filled with NaN values:
import numpy as np
import pandas as pd
data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
df_t = pd.DataFrame(data)
final_columns = ['col_1', 'col_2', 'col_3']
for x in final_columns:
    if x not in df_t.columns:
        df_t[x] = np.nan
df = df_t[final_columns]
Later you can fill the NaN columns.
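For instance, filling one placeholder column later could look like the sketch below. The default value 0 is only an assumption for illustration; what belongs there depends on your data.

```python
import numpy as np
import pandas as pd

# A frame like the one produced above: col_3 exists but holds only NaN
df = pd.DataFrame({
    'col_1': [3, 2, 1, 0],
    'col_2': ['a', 'b', 'c', 'd'],
    'col_3': np.nan,
})

# Replace the NaN placeholders in col_3 with a default value
# (0 is an arbitrary choice here)
df['col_3'] = df['col_3'].fillna(0)
print(df)
```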
Upvotes: 0