art

Reputation: 9

Pandas Dataframe - find max between two columns

I have a csv that looks like this:

Students | Math | Reading
Tom | 80 | 75
Mike | 65 | 90

I want to import this csv and determine which Student has the largest difference between their Math and Reading scores.

In this example, I expect Mike to be the result, as his difference is 25 while Tom's difference is 5.

Upvotes: 0

Views: 1670

Answers (1)

piRSquared

Reputation: 294258

You want Students to be the index of the dataframe. With the sample data you gave, I'd import it like this:

import pandas as pd
df = pd.read_csv('test.csv', sep=r'\s*\|\s*', engine='python', index_col=0)

This splits the columns wherever it sees zero or more spaces, followed by a vertical bar, followed by zero or more spaces. The regex separator is why engine='python' is needed. index_col=0 also sets the index to the Students column.
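If you want to try the import without a file on disk, here is a minimal sketch that feeds the sample rows from the question through io.StringIO. The variable name raw is just for illustration, and this assumes the real file uses the same " | " layout as the sample.

```python
import io

import pandas as pd

# Sample rows from the question, held in memory instead of test.csv
raw = """Students | Math | Reading
Tom | 80 | 75
Mike | 65 | 90
"""

# Split on optional whitespace, a literal vertical bar, optional whitespace;
# a regex separator requires the python parsing engine
df = pd.read_csv(io.StringIO(raw), sep=r"\s*\|\s*", engine="python", index_col=0)

print(df)
```

After this, the student names are the index, and Math and Reading are ordinary integer columns.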

Now you can use this to find the Student with the largest absolute difference between their respective Math and Reading scores.

df.Math.sub(df.Reading).abs().idxmax()

'Mike'
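To see what the one-liner is doing, here is a small sketch that builds the same two-row frame by hand and keeps the intermediate differences. The names diff and best are only illustrative.

```python
import pandas as pd

# Same data as the question, with Students as the index
df = pd.DataFrame(
    {"Math": [80, 65], "Reading": [75, 90]},
    index=pd.Index(["Tom", "Mike"], name="Students"),
)

# Absolute gap between the two scores for each student (Tom: 5, Mike: 25)
diff = df["Math"].sub(df["Reading"]).abs()

# Index label of the row with the largest gap
best = diff.idxmax()

print(diff)
print(best, diff.max())
```

The abs() call matters here: without it, Mike's difference is -25, and idxmax() would return Tom instead.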

Upvotes: 4
