Reputation: 51
Context:
I would like to transform the "Date" column to float(), as a requirement for using the dataset for training.
Question:
Can Python convert the "Date" column to a datetime type?
The goal:
Transform "Jul 24 2021" ---> "07/24/2021"?
The dataset: BTC Historical Data
Date Close High Low Open Volume (24H) Market Cap
490 Dec 14, 2019 $7,091.76 $7,340.28 $7,040.29 $7,279.04 $17,075,801,948 69,010 BTC $129,002,951,070
491 Dec 13, 2019 $7,279.04 $7,354.13 $7,192.74 $7,213.44 $16,667,772,107 71,176 BTC $131,468,549,582
492 Dec 12, 2019 $7,214.58 $7,352.19 $7,127.09 $7,230.50 $18,895,200,531 102,171 BTC $131,200,636,979
493 Dec 11, 2019 $7,230.50 $7,312.27 $7,169.96 $7,242.22 $16,323,246,786 80,414 BTC $130,567,148,332
494 Dec 10, 2019 $7,242.22 $7,409.36 $7,172.39 $7,362.61 $18,215,577,663 106,404 BTC $131,626,188,206
495 Dec 09, 2019 $7,362.61 $7,656.77 $7,309.09 $7,534.30 $17,847,629,948 122,066 BTC $133,889,762,913
496 Dec 08, 2019 $7,534.30 $7,702.15 $7,394.45 $7,510.99 $15,315,140,388 72,921 BTC $136,960,305,336
497 Dec 07, 2019 $7,510.99 $7,699.64 $7,489.03 $7,549.93 $15,502,310,183 81,337 BTC $136,521,384,515
498 Dec 06, 2019 $7,549.93 $7,615.61 $7,330.45 $7,400.13 $17,845,739,598 124,357 BTC $136,292,864,233
499 Dec 05, 2019 $7,400.13 $7,492.44 $7,175.62 $7,206.09 $18,880,551,089 154,696 BTC $134,769,681,329
Here is the code (more context):
My goal was to clean the data so that it meets the criteria of float().
So, I removed the "$" and "," symbols from the dataset.
df = df.replace({r'\$': ''}, regex=True)
df = df.replace({',': ''}, regex=True)
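Note that replacing across the whole DataFrame also strips the comma from the "Date" strings ("Dec 14, 2019" becomes "Dec 14 2019"). A minimal sketch of a narrower cleanup, assuming only the price columns need it; the two rows are copied from the dataset above, and the `price_cols` list is a hypothetical choice for illustration:

```python
import pandas as pd

# Two rows copied from the question's dataset (a subset of its columns).
df = pd.DataFrame({
    "Date": ["Dec 14, 2019", "Dec 13, 2019"],
    "Open": ["$7,279.04", "$7,213.44"],
    "Close": ["$7,091.76", "$7,279.04"],
})

price_cols = ["Open", "Close"]  # hypothetical list of currency columns
for col in price_cols:
    # Inside a character class, "$" and "," are matched literally.
    df[col] = df[col].str.replace(r"[$,]", "", regex=True).astype(float)
```

The "Date" column is left untouched here, so its original comma is still present when it is parsed later.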
I then tried to convert the "Open" column to float() and the "Date" column to datetime.
df = df.astype({"Open": float})
df["Date"] = pd.to_datetime(df.Date, format="%m/%d/%Y")
df.dtypes
The error! :(
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
449 try:
--> 450 values, tz = conversion.datetime_to_datetime64(arg)
451 dta = DatetimeArray(values, dtype=tz_to_dtype(tz))
pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.datetime_to_datetime64()
TypeError: Unrecognized value type: <class 'str'>
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
416 try:
417 result, timezones = array_strptime(
--> 418 arg, format, exact=exact, errors=errors
419 )
420 if "%Z" in format or "%z" in format:
pandas/_libs/tslibs/strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()
ValueError: time data 'Jul 24 2021' does not match format '%m/%d/%Y' (match)
Upvotes: 0
Views: 2223
Reputation: 247
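You're specifying the wrong format in pd.to_datetime. The format has to describe the strings as they are now, not the output you want:
df['Date'] = pd.to_datetime(df['Date'], format='%b %d, %Y')
If the commas have already been removed from the column (the value 'Jul 24 2021' in your error message suggests so), drop the comma from the format as well: format='%b %d %Y'.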
https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior
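A small sketch of how the format has to match the input text, assuming English month abbreviations; the sample strings follow the dataset's style, and the variable names are only illustrative:

```python
import pandas as pd

raw = pd.Series(["Dec 14, 2019", "Dec 13, 2019"])     # as in the original dataset
stripped = pd.Series(["Dec 14 2019", "Dec 13 2019"])  # after removing every ","

# %b = abbreviated month name, %d = day of month, %Y = four-digit year.
parsed_raw = pd.to_datetime(raw, format="%b %d, %Y")
parsed_stripped = pd.to_datetime(stripped, format="%b %d %Y")
```

Both calls should yield the same datetime64 values; only the format string differs, to mirror the presence or absence of the comma.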
Use dt.strftime to get whatever format you want afterwards. The same placeholders from the link above apply.
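For example, a rough sketch of the reformatting step. The last part is an assumption on my side, not something the question spells out: since the stated goal is a float for training, one possible encoding is the number of days since 1970-01-01.

```python
import pandas as pd

dates = pd.to_datetime(
    pd.Series(["Jul 24 2021", "Dec 14 2019"]), format="%b %d %Y"
)

# Text in MM/DD/YYYY form; note the result holds strings, not datetimes.
as_text = dates.dt.strftime("%m/%d/%Y")

# Assumed numeric encoding: days elapsed since 1970-01-01, as float64.
as_float = (dates - pd.Timestamp("1970-01-01")) / pd.Timedelta(days=1)
```

The strftime output is only useful for display, because "07/24/2021" is still text and cannot be passed to float(); a numeric encoding like the one above has to be derived from the datetime column instead.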
Upvotes: 1