Reputation: 1891
I have a list of dictionaries which has been converted into a dataframe using json_normalize
.
Dataset:
x = [{
"_id" : 71,
"Ids" : [
"10",
"59"
],
"roles" : [
"janitor", "mechanic", "technician"
]
}]
Dataframe:
_id Ids roles
71 [10, 59] [janitor, mechanic, technician]
What I am trying to do is find a way to dynamically explode all the list columns/keys (Ids and roles) without explicitly typing the column names. Is this possible?
Desired Output:
_id Ids roles
71 10 janitor
71 10 mechanic
71 10 technician
71 59 janitor
71 59 mechanic
71 59 technician
Any assistance would be appreciated.
Upvotes: 0
Views: 220
Reputation: 688
I would do it like this:
from itertools import product
list_ = [df.iloc[0,i] if type(df.iloc[0,i]) == list else [df.iloc[0,i]] for i in range(df.shape[1])]
prod = list(product(*list_))
df = pd.DataFrame(prod, columns=df.columns)
Upvotes: 0
Reputation: 1960
A naive solution iterating through all entries checking for lists.
Works for multiple lines of entries and nested lists.
while True:
newdf = pd.DataFrame(columns=df.columns)
for row in df.values:
for index, value in enumerate(row):
if isinstance(value, list):
for listentry in value:
newdf.loc[len(newdf)] = [*row[:index], listentry, *row[index+1:]]
break
else:
newdf.loc[len(newdf)] = row
df = newdf.copy(deep=True)
if not newdf.applymap(lambda value:isinstance(value, list)).values.any():
break
Upvotes: 1
Reputation: 9047
I am not sure about the efficiency of this, but it just iterating over the dataframe column names and check whether it is list
or not. If it is list
just explode
.
df_final = df.copy()
for c in df.columns:
if(isinstance(df[c][0],list)):
df_final = df_final.explode(c)
_id Ids roles
0 71 10 janitor
0 71 10 mechanic
0 71 10 technician
0 71 59 janitor
0 71 59 mechanic
0 71 59 technician
Upvotes: 2