Reputation: 33
I am trying to merge multiple columns within a csv into a single column with each original column's header being repeated as shown below.
userA userB
A1 B1
A2 B2
A2 B3
A2 B4
Into this:
userA A1
userA A2
userA A3
userA A4
userB B1
userB B2
userB B3
userB B4
Does anyone have any suggestions on how to do this. I do have some experience in pandas but I'm currently at a loss.
UPDATE: I found how to merge the columns
df = pd.read_csv(filename, sep='\t')
df = df.combine_first(pd.Series(df.values.ravel('F')).to_frame('merged'))
FINAL UPDATE: Solved using melt()
df = pd.melt(df)
Upvotes: 2
Views: 112
Reputation: 33
Solved first using:
With a dataframe of just those two columns, you could do df.stack().reset_index(level=1) – cmaher
Following by a simple sort to order properly:
df.sort_values(by=[0])
See pd.melt(df) above for better answer.
Upvotes: 1
Reputation: 323226
You can using melt
df.melt()
Out[702]:
variable value
0 userA A1
1 userA A2
2 userA A2
3 userA A2
4 userB B1
5 userB B2
6 userB B3
7 userB B4
Upvotes: 2
Reputation: 294258
ravel
and repeat
pd.Series(df.values.ravel(), df.columns.repeat(len(df)))
userA A1
userA B1
userA A2
userA B2
userB A2
userB B3
userB A2
userB B4
dtype: object
Upvotes: 2