Reputation: 435
I normalized a json object into a dataframe using json_normalize
which looks something like this:
ID Name Email_id ID Name Email_id ID Name Email_id
1 A [email protected] 2 B [email protected] 3 C [email protected]
I wanna convert the column values into rows like this:-
ID Name Email_id
1 A [email protected]
2 B [email protected]
3 C [email protected]
but I'm not able to do that. I tried pd.melt()
but it gives me Data must be 1-dimensional
exception.
Upvotes: 1
Views: 73
Reputation: 4929
If you do know such data structure is consistent, you can just slice the data by index, then concatenate them:
pd.concat([df.iloc[:, i:i+3] for i in range(0, df.shape[1], 3)])
To make sure to get the column IDs:
import numpy as np
# Get the target indexes
idx = np.arange(df.shape[1])[df.columns=='ID']
idx = np.append(idx, df.shape[1])
# Slice and concatenate data
pd.concat([df.iloc[:, idx[i]:idx[i+1]] for i in range(len(idx)-1)])
Upvotes: 1
Reputation: 862691
You can select only one column, but because duplicated columns names are selected all columns with same label, then convert to 1d
numpy array and pass to DataFrame
constructor:
print (df['ID'])
ID ID ID
0 1 2 3
df = pd.DataFrame({'ID': df['ID'].to_numpy().ravel(),
'Name': df['Name'].to_numpy().ravel(),
'Email_id': df['Email_id'].to_numpy().ravel()})
print (df)
ID Name Email_id
0 1 A [email protected]
1 2 B [email protected]
2 3 C [email protected]
Another idea is create MultiIndex
in columns by GroupBy.cumcount
and reshape by DataFrame.stack
:
s = df.columns.to_series()
df.columns = [s, s.groupby(s).cumcount()]
print (df)
ID Name Email_id ID Name Email_id ID Name Email_id
0 0 0 1 1 1 2 2 2
0 1 A [email protected] 2 B [email protected] 3 C [email protected]
df = df.stack().reset_index(drop=True)
print (df)
Email_id ID Name
0 [email protected] 1 A
1 [email protected] 2 B
2 [email protected] 3 C
Upvotes: 1