Reputation: 53
I have this DataFrame, loaded from a CSV file:
id name A B C gpa
0 1111 Phineas NaN B NaN 3.0
1 1113 Tilly NaN NaN C 2.5
2 1110 Andres A NaN NaN 3.8
3 1112 Jax NaN B NaN 3.2
4 1114 Ray NaN B NaN 3.1
5 1115 Koda NaN NaN C 2.4
6 1120 Bruno A NaN NaN 3.7
7 1134 Davis NaN NaN C 2.6
8 1102 Cassie A NaN NaN 4.0
I want this output, with the A, B and C columns collapsed into a single grade column:
id name grade gpa
0 1111 Phineas B 3.0
1 1113 Tilly C 2.5
2 1110 Andres A 3.8
3 1112 Jax B 3.2
4 1114 Ray B 3.1
5 1115 Koda C 2.4
6 1120 Bruno A 3.7
7 1134 Davis C 2.6
8 1102 Cassie A 4.0
What code would do that?
Upvotes: 1
Views: 402
Reputation: 13255
Use combine_first with drop; you don't need melt in this case:
df['grade'] = df['A'].combine_first(df['B']).combine_first(df['C'])
df.drop(['A','B','C'], axis=1, inplace=True)
Or:
df['grade'] = df[['A','B','C']].values[df[['A','B','C']].notnull()]
df.drop(['A','B','C'], axis=1, inplace=True)
print(df)
id name gpa grade
0 1111 Phineas 3.0 B
1 1113 Tilly 2.5 C
2 1110 Andres 3.8 A
3 1112 Jax 3.2 B
4 1114 Ray 3.1 B
5 1115 Koda 2.4 C
6 1120 Bruno 3.7 A
7 1134 Davis 2.6 C
8 1102 Cassie 4.0 A
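For more than three grade columns, the chained combine_first calls can be folded with functools.reduce. The following is only a sketch: it rebuilds three of the question's rows by hand, and the names grade_cols and result are illustrative, not from the original post. It also reorders the columns so grade comes before gpa, as the question asked.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Three sample rows copied from the question (assumed to load like this from the CSV).
df = pd.DataFrame({
    "id": [1111, 1113, 1110],
    "name": ["Phineas", "Tilly", "Andres"],
    "A": [np.nan, np.nan, "A"],
    "B": ["B", np.nan, np.nan],
    "C": [np.nan, "C", np.nan],
    "gpa": [3.0, 2.5, 3.8],
})

grade_cols = ["A", "B", "C"]  # hypothetical name for the columns to collapse

# Fold combine_first over the grade columns: for each row, the first non-null value wins.
grade = reduce(lambda left, right: left.combine_first(right),
               (df[col] for col in grade_cols))

# Build a new frame instead of mutating df, and put grade before gpa.
result = df.drop(columns=grade_cols).assign(grade=grade)[["id", "name", "grade", "gpa"]]
print(result)
```

If a row ever had more than one grade filled in, this would silently keep the leftmost one, so it relies on the same one-grade-per-student assumption as the answer above.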
Upvotes: 2
Reputation: 654
If you don't have your heart set on using melt, this solution might work for you. Because each student has exactly one of A, B or C, you can first turn all the NaN values in these columns into empty strings, and then concatenate your A, B and C columns together using the + operator.
Import statements and starting DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({'id':[1111,1113],
'name':['Phineas','Tilly'],
'A':[np.nan,np.nan],
'B':['B',np.nan],
'C':[np.nan,'C'],
'gpa':[3.0,2.5]
})
# id name A B C gpa
# 0 1111 Phineas NaN B NaN 3.0
# 1 1113 Tilly NaN NaN C 2.5
Column-wise string concatenation and output:
df.fillna('',inplace=True) # replace every NaN in the DataFrame with an empty string
df['letter_grades'] = df['A'] + df['B'] + df['C'] # concatenate; only one of the three is non-empty per row
df = df[['id','name','letter_grades','gpa']] # keep only the wanted columns, in this order
print(df)
# id name letter_grades gpa
#0 1111 Phineas B 3.0
#1 1113 Tilly C 2.5
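One thing to watch: calling fillna('') on the whole DataFrame would also turn a missing gpa into an empty string. A possible variant, sketched here under the assumption that only A, B and C should be touched, fills just those columns. The missing gpa value and the names grade_cols, letters and result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [1111, 1113],
    "name": ["Phineas", "Tilly"],
    "A": [np.nan, np.nan],
    "B": ["B", np.nan],
    "C": [np.nan, "C"],
    "gpa": [3.0, np.nan],  # hypothetical missing gpa, not in the original data
})

grade_cols = ["A", "B", "C"]

# Work on a copy of the grade columns only; cast to object so an
# all-NaN (float) column can hold empty strings.
letters = df[grade_cols].astype(object).fillna("")

# Concatenate row by row; at most one of the three strings is non-empty.
df["letter_grades"] = letters["A"] + letters["B"] + letters["C"]

result = df[["id", "name", "letter_grades", "gpa"]]
print(result)
```

Here the gpa column stays numeric and its missing value remains NaN.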
Upvotes: 1