Reputation: 57
I newbie with phython, I have I have the following table TABLE.INFO:
ID_STUDENT|CURRENCY|ID_COURSE|STATUS
A12345|IDR|MATH|ACTIVE
A12346|||INACTIVE
A12347|SGD|MATH|ACTIVE
A12348||ENGLISH|ACTIVE
A12349|IDR||INACTIVE
B12345|IDR|ENGLISH|
B12346|IDR|PHYSICS|ACTIVE
B12347||PHYSICS|
B12348|THB||ACTIVE
B12349|||
I want to achieve the result which selects all field on TABLE.INFO with condition only having any field blank:
ID_STUDENT|CURRENCY|ID_COURSE|STATUS
A12346|||INACTIVE
A12346|||INACTIVE
A12348||ENGLISH|ACTIVE
A12349|IDR||INACTIVE
B12345|IDR|ENGLISH|
B12347||PHYSICS|
B12348|THB||ACTIVE
B12349|||
How can I achieve this using pandas phyton? Can someone help with the script? Thank you before.
Upvotes: 0
Views: 62
Reputation: 111
import io
import pandas as pd
data = R"""ID_STUDENT|CURRENCY|ID_COURSE|STATUS
A12345|IDR|MATH|ACTIVE
A12346|||INACTIVE
A12347|SGD|MATH|ACTIVE
A12348||ENGLISH|ACTIVE
A12349|IDR||INACTIVE
B12345|IDR|ENGLISH|
B12346|IDR|PHYSICS|ACTIVE
B12347||PHYSICS|
B12348|THB||ACTIVE
B12349|||"""
df = pd.read_csv(io.StringIO(data), sep='|')
# if you care about the number on NaNs (> 0)
df.loc[df.isna().sum(axis=1) > 0]
# if not
df.loc[df.isna().any(axis=1)]
Upvotes: 0
Reputation: 7204
Try this, it should work:
df
# ID_STUDENT CURRENCY ID_COURSE STATUS
#0 A12345 IDR MATH ACTIVE
#1 A12346 NaN NaN INACTIVE
#2 A12347 SGD MATH ACTIVE
#3 A12348 NaN ENGLISH ACTIVE
#4 A12349 IDR NaN INACTIVE
#5 B12345 IDR ENGLISH NaN
#6 B12346 IDR PHYSICS ACTIVE
#7 B12347 NaN PHYSICS NaN
#8 B12348 THB NaN ACTIVE
#9 B12349 NaN NaN NaN
df[df.isna().any(axis=1)]
# ID_STUDENT CURRENCY ID_COURSE STATUS
#1 A12346 NaN NaN INACTIVE
#3 A12348 NaN ENGLISH ACTIVE
#4 A12349 IDR NaN INACTIVE
#5 B12345 IDR ENGLISH NaN
#7 B12347 NaN PHYSICS NaN
#8 B12348 THB NaN ACTIVE
#9 B12349 NaN NaN NaN
For the opposite, you can do:
df[~df.isna().any(axis=1)]
# ID_STUDENT CURRENCY ID_COURSE STATUS
#0 A12345 IDR MATH ACTIVE
#2 A12347 SGD MATH ACTIVE
#6 B12346 IDR PHYSICS ACTIVE
Upvotes: 1