Coder_Alex
Coder_Alex

Reputation: 43

Date time format using python

I am working on an excel file containing below columns:-

df1

Number     Dateopened      Status          Date closed            Work

 1      2019-10-30 10.10    Open           2019-10-30 12.10       walking
 2      2019-9-30  11.10    Open           2019-9-31 12.10        cycling
 3      2019-5-30 10.10     Closed         2019-5-30 12.10        nothing
 4      2019-4-30 10.10     Open           2019-4-30 12.10        walking

I can change the date format using the below code:-

df1['Dateopened'] = pd.to_datetime(df1['Dateopened']).dt.strftime("%b %d,%Y - %I:%M%p")
df1['Date closed'] = pd.to_datetime(df1['Date closed']).dt.strftime("%b %d,%Y - %I:%M%p")

In my code i need to manually enter the column names.Is there something i can do so that the python code itself search such columns and modify the data format.

Upvotes: 1

Views: 85

Answers (5)

user7548672
user7548672

Reputation:

Using a simple List Comprehension like this does the trick:

df[[n for n in df.columns]]

n - represents our column names.

So, to use your example we can substitute like this:

pd.to_datetime(df1['Dateopened']).dt.strftime("%b %d,%Y - %I:%M%p")

for

pd.to_datetime(df1[[n for n in df1.columns]]).dt.strftime("%b %d,%Y - %I:%M%p")

You can even do a check for ex. "Dateopened" or else perform some function.

pd.to_datetime(df1[[n if n == "Dateopened" else "do something here..." for n in df1.columns]]).dt.strftime("%b %d,%Y - %I:%M%p")

Upvotes: 0

jupiterbjy
jupiterbjy

Reputation: 3503

Try-Except for EAFP style.

If you have lots of date formatting then this is way to go. Will run faster than regex in that case.

from openpyxl import load_workbook
from datetime import datetime


work_book = load_workbook("Z:/github/StackOverFlow/62717532/source.xlsx")
work_sheet = work_book.active

format_ = '%Y-%m-%d'

for index, row in enumerate(work_sheet.rows):
    data = [i.value for i in row]

    # trying format to each rows
    for idx, item in enumerate(data):
        try:
            row[idx] = datetime.strptime(item, format_)
        except (TypeError, ValueError):  # if fail, it's not date.
            pass

    print(data)

source.xlsx

enter image description here


result

['Num', 'Date', 'Stat']  
[1, datetime.datetime(2019, 10, 30, 0, 0), 'Open']  
[2, datetime.datetime(2019, 10, 31, 0, 0), 'Closed']  
[3, datetime.datetime(2019, 11, 1, 0, 0), 'Open']  
[4, datetime.datetime(2019, 11, 2, 0, 0), 'Open']  
[5, datetime.datetime(2019, 11, 3, 0, 0), 'Closed']

Upvotes: 0

Parth Shah
Parth Shah

Reputation: 1475

If all your columns containing dates are in a specific format, you can look into using a regex. The solution would look something like this:

toModify = []
for column in df:
    if(re.match(pattern, column)):
        toModify.append(column)

And then you have all the column names that need to be modified. Iterate over that.

Upvotes: 0

briba
briba

Reputation: 2987

You can also check column names:

for col in df1.columns:
    if col.lower().startswith("date"):
        df1[col] = pd.to_datetime(df1[col]).dt.strftime("%b %d,%Y - %I:%M%p")

Upvotes: 1

pecey
pecey

Reputation: 681

You can use regex to detect the columns where there is a date field.

\d*[-]\d*[-]\d*.* can help you detect the date portion, assuming the columns begin with a date. You can apply the regex on one row of the dataframe and filter the columns where the regex matches.

Upvotes: 0

Related Questions