Reputation: 2273
I have the following df, and I would like to apply a filter over the column names and simply remain those that begin with a certain string:
This is my current df:
ruta2:
Current SAN Prev.1m SAN Prev.2m SAN Prev.3m SAN Current TRE \
A 5 6 7 6 3
B 6 5 7 6 6
C 12 11 11 11 8
Basically what I would like is to filter the dataframe and remain the columns that begin with Current
.
Then the desired output would be:
ruta2:
Current SAN Current TRE
A 5 3
B 6 6
C 12 8
In order to do this I tried this filter but outputs a value error :
ruta2=ruta2[~(ruta2.columns.str[:4].str.startswith('Prev'))]
Upvotes: 1
Views: 2663
Reputation: 863226
It seems you only need:
ruta2=ruta2.loc[:, ~(ruta2.columns.str[:4].str.startswith('Prev'))]
#same as
#ruta2=ruta2.loc[:, ~ruta2.columns.str.startswith('Prev')]
print (ruta2)
Current SAN Current TRE
A 5 3
B 6 6
C 12 8
Or:
cols = ruta2.columns[ ~(ruta2.columns.str[:4].str.startswith('Prev'))]
ruta2=ruta2[cols]
print (ruta2)
Current SAN Current TRE
A 5 3
B 6 6
C 12 8
But if need only Current
columns use filter
(^
means start of string in regex):
ruta2=ruta2.filter(regex='^Current')
print (ruta2)
Current SAN Current TRE
A 5 3
B 6 6
C 12 8
Upvotes: 1
Reputation: 19957
#filter the columns names starting with 'Current'
ruta2[[e for e in ruta2.columns if e.startswith('Current')]]
Out[383]:
Current SAN Current TRE
A 5 3
B 6 6
C 12 8
Or you can use a mask array to filter columns:
ruta2.loc[:,ruta2.columns.str.startswith('Current')]
Out[385]:
Current SAN Current TRE
A 5 3
B 6 6
C 12 8
Upvotes: 0