Reputation: 65
I'm currently working on a script to check whether a set of CSVs has the right format for another script to process them. I'm having trouble with some of the assertions it has to pass. One is that there are no missing values, for which I tried:
df = pd.read_csv("C:PATH\\test.csv", sep= ',')
def check(self, file):
    try:
        assert df.notna().values.any()
    except AssertionError:
        assert False, " NaN in data"
It does nothing. I tried it on a CSV with NaNs and it didn't raise an error. I also want to check that the file is comma-separated, since I may be given a semicolon-separated one. This is my attempt:
try:
    assert len(df.columns) != 1
except AssertionError:
    "Not comma separated"
It behaves inconsistently: sometimes it raises the flag, sometimes it doesn't.
Is there something about "assert" that I didn't understand, or is something else the issue?
Upvotes: 1
Views: 805
Reputation: 2032
pd.notna() (https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.notna.html) requires the DataFrame to be passed as an argument.
Please try calling notnull() on the DataFrame you pass into the function, as below:
import pandas as pd

df = pd.read_csv("C:PATH\\test.csv", sep=',')

def check(file):
    try:
        # .all() so that a single NaN anywhere fails the check
        assert file.notnull().values.all()
    except AssertionError:
        assert False, " NaN in data"

check(df)
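For what it's worth, here is a quick way to see how .any() and .all() differ on a frame with a missing value. This is only a minimal sketch; the `sample` frame and the variable names are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with exactly one missing value
sample = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]})

# Boolean array, False wherever a value is missing
mask = sample.notna().values

any_ok = bool(mask.any())  # True if at least one value is present
all_ok = bool(mask.all())  # True only if every value is present

print(any_ok, all_ok)  # prints: True False
```

So an assertion built on .any() passes even though the frame contains a NaN, while one built on .all() fails as intended.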
Upvotes: 0
Reputation: 4767
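Pass df as a parameter to check(). Also change .any() to .all(): .any() is True as soon as one value is non-null, so it cannot detect NaNs, while .all() is True only if every value is non-null.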
import pandas as pd

df = pd.read_csv("C:\\PATH\\test.csv", sep=',')

def check(file):
    try:
        assert file.notna().values.all()
    except AssertionError:
        assert False, " NaN in data"

check(df)
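On the separator check from the question: in that snippet the except block contains only a string literal, which Python evaluates and discards, so a failed assertion is swallowed and nothing is ever reported. A possible way to write it is to attach the message to the assert itself. The sketch below is an illustration only; `check_comma_separated` and the in-memory sample files are hypothetical names, not part of the original code.

```python
import io

import pandas as pd


def check_comma_separated(frame):
    # A file that uses another delimiter, read with sep=",",
    # collapses into a single column
    assert len(frame.columns) > 1, "Not comma separated"


# Hypothetical in-memory files standing in for real CSVs
comma_csv = io.StringIO("a,b\n1,2\n")
semicolon_csv = io.StringIO("a;b\n1;2\n")

good = pd.read_csv(comma_csv, sep=",")
bad = pd.read_csv(semicolon_csv, sep=",")

check_comma_separated(good)  # passes silently

try:
    check_comma_separated(bad)
except AssertionError as err:
    message = str(err)
    print(message)
```

Note that this heuristic also rejects a legitimate single-column CSV. As a guess about the inconsistent behaviour (the actual files aren't shown): if the header row of a semicolon-separated file itself contains commas, reading it with sep=',' yields more than one column and the check passes.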
Upvotes: 2