Eastsun

Reputation: 18859

How to make pandas.read_csv raise an exception when some columns are missing

As shown below, the last line of text is missing a column. I want read_csv to raise an exception rather than return a DataFrame with a NaN value. Is this possible?

A row is missing a column if len(row.split(sep)) < len(columns).

In [1]: import pandas as pd

In [2]: from io import StringIO

In [3]: text = """x,y,z
   ...: 1,2,3
   ...: 4,5,6
   ...: 7,8"""

In [4]: df = pd.read_csv(StringIO(text))

In [5]: df
Out[5]: 
   x  y   z
0  1  2   3
1  4  5   6
2  7  8 NaN

Upvotes: 1

Views: 1062

Answers (1)

Matt Messersmith

Reputation: 13747

It doesn't look like there's an easy way to get the read_csv function to do what you're asking, according to the docs.

However, you can use df.isnull().values.any(). That statement will evaluate to True if there exists a NaN in df, and False otherwise, which should accomplish your task. So, immediately after you read in your csv you could write:

if df.isnull().values.any():
    raise ValueError("Found a NaN")
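
One caveat: the NaN check also fires for rows where a field is present but empty (e.g. `4,,6`), which isn't quite the condition in the question. If you need the stricter "fewer fields than the header" test, a rough sketch is to count fields with the standard csv module before handing the text to read_csv. The helper name `read_csv_strict` is made up for illustration, and it assumes the whole CSV is available as a string with the header on the first line:

```python
import csv
from io import StringIO

import pandas as pd


def read_csv_strict(text, sep=","):
    """Hypothetical helper: raise if any data row has fewer fields than the header."""
    reader = csv.reader(StringIO(text), delimiter=sep)
    header = next(reader)  # assumes the first line is the header
    for row in reader:
        # csv.reader yields [] for blank lines; skip those rather than flag them.
        if row and len(row) < len(header):
            raise ValueError(
                f"line {reader.line_num}: expected {len(header)} fields, "
                f"got {len(row)}"
            )
    # Every row has enough fields, so let pandas do the real parsing.
    return pd.read_csv(StringIO(text), sep=sep)
```

With the text from the question this raises a ValueError pointing at line 4, while a row like `4,,6` still parses and just yields a NaN. It does read the data twice, so for large files you may prefer the simple NaN check above.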

HTH.

Upvotes: 3
