Reputation: 93
I have a spreadsheet where the data is laid out in a strange way: chunks of data are separated by a single empty row of NaN cells. I can of course remove the NaN rows with dropna,
but is there a way to drop those NaN rows together with a specified number of rows beneath them?
For example, I'd like to drop each NaN row in the dataframe as well as the 2 non-NaN rows below it.
Upvotes: 1
Views: 98
Reputation: 16561
One way is to create a mask from shifted column values, for example:
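# identify nan rows, true if nan
row_is_na = df['some_column'].isna()
# identify the two rows that follow a nan row; a positive shift moves values down, fill_value keeps the dtype bool
rows_after_nan = row_is_na.shift(1, fill_value=False) | row_is_na.shift(2, fill_value=False)
# apply the mask: drop the nan rows themselves and the two rows after each
df = df[~(row_is_na | rows_after_nan)]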
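If the number of rows to drop varies, the same shifted-mask idea can be wrapped in a small helper. The following is only a sketch: the function name `drop_nan_rows_and_following`, the `n_after` parameter and the sample data are made up for illustration, and it assumes a separator row is one where every cell is NaN.

```python
import numpy as np
import pandas as pd


def drop_nan_rows_and_following(df, n_after=2):
    """Drop all-NaN rows plus the next `n_after` rows after each of them."""
    # True for rows where every cell is NaN (assumed separator rows)
    is_na = df.isna().all(axis=1)
    to_drop = is_na.copy()
    # A positive shift moves the flags down, marking rows that follow a NaN row
    for k in range(1, n_after + 1):
        to_drop = to_drop | is_na.shift(k, fill_value=False)
    return df[~to_drop]


# Hypothetical sample data: two separator rows at positions 2 and 6
df = pd.DataFrame({
    "a": [1, 2, np.nan, 3, 4, 5, np.nan, 6, 7, 8],
    "b": [10, 20, np.nan, 30, 40, 50, np.nan, 60, 70, 80],
})
result = drop_nan_rows_and_following(df, n_after=2)
print(result)
```

With this sample, rows 2 to 4 and 6 to 8 are removed, leaving the rows at index 0, 1, 5 and 9. Call `reset_index(drop=True)` on the result if a contiguous index is needed.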
Upvotes: 1