Reputation: 1249
I have a dataframe and within 1 of the columns is a nested dictionary. I want to create a function where you pass each row and a column name and the function json_normalizes the the column into a dataframe. However, I keep getting and error 'function takes 2 positional arguments, 6 were given' There are more than 6 columns in the dataframe and more than 6 columns in the row[col] (see below) so I am confused as how 6 arguments are being provided.
import pandas as pd
from pandas.io.json import json_normalize
def fix_row_(row, col):
if type(row[col]) == list:
df = json_normalize(row[col])
df['id'] = row['id']
else:
df = pd.DataFrame()
return df
new_df = data.apply(lambda x: fix_po_(x, 'Items'), axis=1)
So new_df will be a dataframe of dataframes. In the example below, it would just be a dataframe with A,B,C as columns and 1,2,3 as the values.
Quasi-reproducible example:
my_dict = {'A': 1, 'B': 2, 'C': 3}
ids = pd.Series(['id1','id2','id3'],name='ids')
data= pd.DataFrame(ids)
data['my_column']=''
m = data['ids'].eq('id1')
data.loc[m, 'my_column'] = [my_dict] * m.sum()
Upvotes: 0
Views: 57
Reputation: 155
Just pass your column using axis=1
df.apply(lambda x: fix_row_(x['my_column']), axis=1)
Upvotes: 1