Nostradonis
Nostradonis

Reputation: 55

Combine columns and sort in Pandas

For example lets say I have a table with columns 1, 2, and 3. Column 1 contains dog, cat, dog. I want to combine columns 2 and 3 as such

chars = ['column2', 'column3']
csv['combined'] = csv[chars].apply(lambda row: ','.join(row.values.astype(str)), axis=1)

Now lets say column2 contains a in row 1, b in row 2, and c in row 3. Let's say column3 contains b in row 1, c in row 2, and a in row 3. I want to sort and have row 1 show up (a, b) followed by row 3 which should be (a, c) as well. I can't get the a in row 3 to take precedence even when I try sort_values as such:

csv['combined'] = csv['combined'].sort_values()

Ultimately I want to group by column 1 and then aggregate. In the end I should see (dog, a,c), (dog, a,c), and (cat, b, c)

Upvotes: 1

Views: 132

Answers (2)

Quang Hoang
Quang Hoang

Reputation: 150735

You want to sort the rows in increasing order. In that case, you can try np.sort:

cols = ['column2','column3']

df[cols] = np.sort(df[cols], axis=1)

Output:

  column1 column2 column3
0     dog       a       b
1     cat       b       c
2     dog       a       c

Upvotes: 1

Induraj PR
Induraj PR

Reputation: 304

Hope this helps, Happy Coding :)

df = pd.DataFrame(data=[['dog','a','b'],['cat','b','c']['dog','c','a']],columns=(['animal','column1','column2']))
print("Actual DataFrame \n")
print(df)

for i in range(len(df)):
  if df.loc[i]['column1']>df.loc[i]['column2']:
      tmp=df.loc[i]['column1'] 
      df.loc[i]['column1'] =df.loc[i]['column2']
      df.loc[i]['column2'] = tmp

print("\n Transformed DataFrame \n")
print(df.sort_values(by='animal',ascending=False))

enter image description here

Upvotes: 0

Related Questions