Reputation: 720
I have no idea why this isn't working... Why am I not able to get rid of these?
I have tried the following:
dfa = dfa[dfa['Date Sold_y'].str.len() < 4] #empty
dfa = dfa[dfa['Date Sold_y'] != ''] #no change
dfa = dfa[dfa['Date Sold_y'] != np.nan] #no change
Dtype is string, and sample values below:
['May-30-2018', nan, nan, 'June-11-2014', 'December-3-2021', nan, 'February-2-2022', nan, nan, 'December-30-2011', nan, nan, nan, nan, nan, nan, nan, nan, 'November-30-2021', nan, 'April-1-2020', nan, 'May-10-2007', nan, nan, nan, nan, nan, nan, 'January-28-2022', nan, nan, nan, 'January-18-2022', nan, nan, nan, 'January-12-2022', nan, 'November-15-2021']
Upvotes: 1
Views: 62
Reputation: 126
By the way, if the values are actually NaN (and not strings), check out the dropna() method of pandas.DataFrame. It drops rows of the dataframe in which one or more NaN values are found (you can choose which via the how parameter), or you can specify a subset of columns to check for NaN values.
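As a rough sketch of the dropna() options mentioned above: the frame below is made-up sample data modeled on the question, and the 'Price' column is purely hypothetical, added only to show the difference between checking one column and checking all of them.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'Price' is an invented column for illustration.
dfa = pd.DataFrame({
    "Date Sold_y": ["May-30-2018", np.nan, "June-11-2014", np.nan],
    "Price": [100.0, 200.0, np.nan, 400.0],
})

# Drop rows only when 'Date Sold_y' is missing; NaN in other columns is ignored.
by_column = dfa.dropna(subset=["Date Sold_y"])

# Drop rows that contain NaN in any column (the default, how="any").
any_missing = dfa.dropna()

print(by_column.index.tolist())
print(any_missing.index.tolist())
```

Note that this only helps if the missing entries are real NaN. If they are the literal string 'nan', dropna() leaves them in place.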
Upvotes: 0
Reputation: 120391
Maybe your nan values are strings with extra whitespace:
>>> dfa[dfa['Date Sold_y'].str.strip() != 'nan']
Date Sold_y
0 May-30-2018
3 June-11-2014
4 December-3-2021
6 February-2-2022
9 December-30-2011
18 November-30-2021
20 April-1-2020
22 May-10-2007
29 January-28-2022
33 January-18-2022
37 January-12-2022
39 November-15-2021
# OR keep only the values that end with a four-digit year
>>> dfa[dfa['Date Sold_y'].str.contains(r'\d{4}$')]
If they are real NaN values, as suggested by @HenryEcker:
>>> dfa[dfa['Date Sold_y'].notna()]
# OR
>>> dfa[~dfa['Date Sold_y'].isna()]
Upvotes: 1