Tinzyl
Tinzyl

Reputation: 35

Parsing Column names as DateTime

Is there a way of parsing the column names themselves as datetime.? My column names look like this:

Name    SizeRank    1996-06    1996-07     1996-08    ...

I know that I can convert values for a column to datetime values, e.g for a column named datetime, I can do something like this:

temp = pd.read_csv('data.csv', parse_dates=['datetime'])

Is there a way of converting the column names themselves? I have 285 columns i.e my data is from 1996-2019.

Upvotes: 1

Views: 2820

Answers (3)

RightmireM
RightmireM

Reputation: 2492

It seems pandas will accept a datetime object as a column name...

import pandas as pd 
from datetime import datetime
import re

columns = ["Name", "2019-01-01","2019-01-02"]
data = [["Tom", 1,0], ["Dick",1,1], ["Harry",0,0]]

df = pd.DataFrame(data, columns = columns)
print(df)

newcolumns = {}
for col in df.columns:
    if re.search("\d+-\d+-\d+", col):
        newcolumns[col] = pd.to_datetime(col)
    else:
        newcolumns[col] = col
print(newcolumns) 

df.rename(columns = newcolumns, inplace = True)
print("--------------------")
print(df)

print("--------------------")
for col in df.columns:
    print(type(col), col)

OUTPUT:

Name  2019-01-01  2019-01-02
0    Tom           1           0
1   Dick           1           1
2  Harry           0           0
{'Name': 'Name', '2019-01-01': Timestamp('2019-01-01 00:00:00'), '2019-01-02': Timestamp('2019-01-02 00:00:00')}
--------------------
Name  2019-01-01 00:00:00  2019-01-02 00:00:00
0    Tom                    1                    0
1   Dick                    1                    1
2  Harry                    0       
--------------------
<class 'str'> Name
<class 'pandas._libs.tslibs.timestamps.Timestamp'> 2019-01-01 00:00:00
<class 'pandas._libs.tslibs.timestamps.Timestamp'> 2019-01-02 00:00:00

For brevity you can use...

newcolumns  = {col:(pd.to_datetime(col) if re.search("\d+-\d+-\d+", col) else col) for col in df.columns}
df.rename(columns = newcolumns, inplace = True)

Upvotes: 0

Florian Bernard
Florian Bernard

Reputation: 2569

You could do something like this.

df.columns = df.columns[:2] + pd.to_datetime (df.columns[2:])

Upvotes: 1

molybdenum42
molybdenum42

Reputation: 373

There's no way of doing that immediately while reading the data from a file afaik, but you can fairly simply convert the columns to datetime after you've read them in. You just need to watch out that you don't pass columns that don't actually contain a date to the function.

Could look something like this, assuming all columns after the first two are dates (as in your example):

dates = pd.to_datetime(df.columns[2:])

You can then do whatever you need to do with those datetimes.

Upvotes: 1

Related Questions