Reputation: 67
I want to extract column labels only which has time format from data frame. My data looks like below
and I want to get only data like 06-Dec-2020,07-Dec-2020,08-Dec-2020,09-Dec-2020,10-Dec-2020,11-Dec-2020,12-Dec-2020
Looking for great support
Upvotes: 2
Views: 2051
Reputation: 38415
Assuming the dates might be across multiple months, years,
cols = ['blah', '06-Dec-2020', '20-Dec-2020','01-Jan-2021']
df = pd.DataFrame(columns = cols)
You can convert the dates to datetime with errors set to coerce
which will convert non-dates to NaT. Then filter the columns using boolean indexing,
pd.to_datetime(df.columns, format = '%d-%b-%Y', errors='coerce').notna()
array([False, True, True, True])
df.columns[pd.to_datetime(df.columns, format = '%d-%b-%Y', errors='coerce').notna()].tolist()
Output:
['06-Dec-2020', '20-Dec-2020', '01-Jan-2021']
Upvotes: 3
Reputation: 71687
You can use filter
with regex
pattern:
df.filter(regex=r'\d{2}-\w{3,}-\d{4}').columns
Result:
Index(['06-Dec-2020', '07-Dec-2020', '08-Dec-2020', '09-Dec-2020',
'10-Dec-2020', '11-Dec-2020', '12-Dec-2020'],
dtype='object')
Upvotes: 5