Stacey

Reputation: 5097

Drop rows from a dataframe that contain a specific string

I have a number of CSV files where the head looks something like:

09/07/2014,26268315,,
10/07/2014,6601181,16.3857
11/07/2014,916651,12.5879
14/07/2014,213357,,
15/07/2014,205019,10.8607

I need to read this into a dataframe and remove any rows containing ,, (two consecutive commas). However, when I read the CSV data into a dataframe using:

df = pd.read_csv(raw_directory+'\\'+filename, error_bad_lines=False,header=None)

I get:

            0         1        2   3
0  09/07/2014  26268315      NaN NaN
1  10/07/2014   6601181  16.3857 NaN
2  11/07/2014    916651  12.5879 NaN
3  14/07/2014    213357      NaN NaN
4  15/07/2014    205019  10.8607 NaN

How can I read the CSV data into a dataframe and get:

                             0 
 0       09/07/2014,26268315,,
 1  10/07/2014,6601181,16.3857
 2   11/07/2014,916651,12.5879
 3         14/07/2014,213357,,
 4   15/07/2014,205019,10.8607

I need to remove any rows where ,, is present and then save the adjusted dataframe to a new CSV file. I was going to use:

stringList = [',,']

df = df[~df[0].isin([stringList])]

to remove the rows with ,, present so the resulting .csv head looks like:

10/07/2014,6601181,16.3857
11/07/2014,916651,12.5879
15/07/2014,205019,10.8607

Upvotes: 1

Views: 83

Answers (1)

jezrael

Reputation: 862601

I think one option here is to remove the columns that contain only NaNs and then drop the rows that contain any NaN:

df = df.dropna(axis=1, how='all').dropna()
print (df)
            0        1        2
1  10/07/2014  6601181  16.3857
2  11/07/2014   916651  12.5879
4  15/07/2014   205019  10.8607
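
For anyone who wants to try this without the original files, here is a minimal self-contained sketch of the same idea. It assumes the five sample rows from the question, fed in through `io.StringIO` instead of a real path; the helper name `drop_incomplete_rows` is made up for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real script would pass a file path.
SAMPLE = """09/07/2014,26268315,,
10/07/2014,6601181,16.3857
11/07/2014,916651,12.5879
14/07/2014,213357,,
15/07/2014,205019,10.8607
"""


def drop_incomplete_rows(csv_text: str) -> pd.DataFrame:
    """Hypothetical helper: parse the CSV text and drop rows with missing values."""
    # The first line has four fields, so pandas creates four columns and
    # pads the shorter lines with NaN.
    df = pd.read_csv(io.StringIO(csv_text), header=None)
    # First drop columns that are entirely NaN, then rows with any NaN left.
    return df.dropna(axis=1, how="all").dropna()


cleaned = drop_incomplete_rows(SAMPLE)
print(cleaned)
```

To write the result back out without the index or a header row, `cleaned.to_csv(new_path, index=False, header=False)` should do, where `new_path` stands for whatever output file you choose.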

Another solution is to pass a separator that does not occur in the data, such as |, so that each line is read as a single string, and then filter with endswith:

df = pd.read_csv(raw_directory+'\\'+filename, error_bad_lines=False,header=None, sep='|')
df = df[~df[0].str.endswith(',')]
# alternative solution: use a regex, where $ matches the end of the string
#df = df[~df[0].str.contains(r',$')]
print (df)
                            0
1  10/07/2014,6601181,16.3857
2   11/07/2014,916651,12.5879
4   15/07/2014,205019,10.8607
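
A rough sketch of the full round trip with this second approach, including saving the result, might look like the following. It assumes the sample rows from the question (via `io.StringIO`), filters on the literal `,,` the question asks about instead of the trailing comma, and leaves out `error_bad_lines`, which was removed in pandas 2.0 in favour of `on_bad_lines`. The helper name `filter_raw_lines` is hypothetical.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real script would pass a file path.
SAMPLE = """09/07/2014,26268315,,
10/07/2014,6601181,16.3857
11/07/2014,916651,12.5879
14/07/2014,213357,,
15/07/2014,205019,10.8607
"""


def filter_raw_lines(csv_text: str) -> str:
    """Hypothetical helper: drop every line that contains two consecutive commas."""
    # '|' never appears in the data, so each whole line lands in column 0.
    df = pd.read_csv(io.StringIO(csv_text), header=None, sep="|")
    # regex=False treats ',,' as a plain substring, not a pattern.
    kept = df[~df[0].str.contains(",,", regex=False)]
    # Reusing '|' as the output separator stops to_csv from quoting the
    # lines, which it would do with the default ',' separator.
    return kept.to_csv(sep="|", index=False, header=False)


print(filter_raw_lines(SAMPLE))
```

The detail to watch is the `sep="|"` in `to_csv`: with the default comma separator, each stored line contains the delimiter and would be wrapped in double quotes in the output file. Passing a file path as the first argument to `to_csv` writes the same text to disk.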

Upvotes: 3
