Reputation: 2279
Every row of my dataframe contains a record with a unique key combination. The data validation depends on both the column and the key combination. For example, within a single column, cells may have different min/max requirements depending on the key combination.
Several questions:
The library does look cool, and I am interested in pursuing it further.
thanks
Upvotes: 0
Views: 700
Reputation: 169
You can create a check that validates a single value at a time with the element_wise=True
kwarg; you can read more here.
import pandera as pa
check = pa.Check(lambda x: 0 <= x <= 100, element_wise=True)
The function must take an individual value as input and output a boolean.
Can you elaborate on the exact check that you want to perform? If you want a row-wise check across the whole dataframe, you can pass an element-wise check to the schema itself rather than to a single column (a "wide" check), in which case the function receives one row at a time.
Does Pandera have a schema generator capable of this type of flexibility? Perhaps it scans a "golden dataframe" as a starting place to create a schema based on some provided criteria. I realize the schema generator output may need a bit of tweaking.
You can call schema = pandera.infer_schema(golden_dataframe)
to bootstrap a starter schema, then write it out to a file with schema.to_script("path/to/file")
and iterate from there.
Upvotes: 2