Reputation: 15
I have the following data frame(df1
):
Value col1 col2 col3
0 a aa ab ac
1 b ba bb bc
2 c ca cb cc
3 d da db dc
4 e ea eb ec
I need to derive the data frame(df2
) from df1
such that column 1 of d2
will have concatenated raw values of Value column with column names of Col 1 to Col 3. Column 2 of d2
will have the raw value corresponding to each concatenated column name, Below is the sample which require to generate. :
Value Col 1
0 a_Col 1 aa
1 a_Col 2 ab
2 a_Col 3 ac
3 b_Col 1 ba
4 b_Col 2 bb
5 b_Col 3 bc
6 c_Col 1 ca
7 c_Col 2 cb
8 c_Col 3 cc
9 d_Col 1 da
10 d_Col 2 db
11 d_Col 3 dc
12 e_Col 1 ea
13 e_Col 2 eb
14 e_Col 3 ec
I have followed the below steps to derive df2 from df1. But this process seems a bit long. Any recommendations on shortening the process?
Below is the code I have used
d = {'Value': ['a','b','c','d','e'],'col1': ['aa','ba','ca','da','ea'], 'col2' : ['ab','bb','cb','db','eb'],'col3': ['ac','bc','cc','dc','ec']}
df1 = pd.DataFrame(data = d)
# Repeat every value is Value column 3 times.
X = df1['Value'].repeat(4).reset_index(drop=True)
# Create separate series with Col 1, Col 2, Col 3 names.
Y = pd.Series(df1.columns[1:])
# Repeated series Y to the length of data df1
YY = pd.Series(np.tile(Y.values, len(df1)))
# Create the first column by concatenating X and YY
first_column_1 = X + "_" + YY
Z = df1.set_index('Value')
ZZ = np.ravel(Z.values)
#Create 2nd column from ZZ
second_column = pd.Series(ZZ)
#Create df2
df2 = pd.DataFrame([first_column, second_column]).T
Upvotes: 1
Views: 170
Reputation: 195553
Try:
x = df.melt("Value", value_name="Col 1")
x.Value += "_" + x.variable
x = x.drop(columns="variable")
print(x)
Prints:
Value Col 1
0 a_col1 aa
1 b_col1 ba
2 c_col1 ca
3 d_col1 da
4 e_col1 ea
5 a_col2 ab
6 b_col2 bb
7 c_col2 cb
8 d_col2 db
9 e_col2 eb
10 a_col3 ac
11 b_col3 bc
12 c_col3 cc
13 d_col3 dc
14 e_col3 ec
Optionally, you can sort values afterwards:
x = x.sort_values(by="Value").reset_index(drop=True)
print(x)
Value Col 1
0 a_col1 aa
1 a_col2 ab
2 a_col3 ac
3 b_col1 ba
4 b_col2 bb
5 b_col3 bc
6 c_col1 ca
7 c_col2 cb
8 c_col3 cc
9 d_col1 da
10 d_col2 db
11 d_col3 dc
12 e_col1 ea
13 e_col2 eb
14 e_col3 ec
Upvotes: 1