Pranav Marla
Pranav Marla

Reputation: 11

ValueError: unconverted data remains: 00

I am using the pd.to_datetime() method to convert a column of strings to datetime objects.

import pandas as pd import numpy as np

main = pd.read_csv('correctedstats.csv', nrows = 50) main['Date'] =
pd.to_datetime(main['Date'], format = '%d/%m/%y')

Which is giving me the following error-

Traceback (most recent call last): File "/Users/pranavmarla/jama/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 448, in _convert_listlike_datetimes values, tz = conversion.datetime_to_datetime64(arg) File "pandas/_libs/tslibs/conversion.pyx", line 200, in pandas._libs.tslibs.conversion.datetime_to_datetime64 TypeError: Unrecognized value type:

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "mergeDates.py", line 8, in main['Date'] = pd.to_datetime(main['Date'].astype(str), format = '%d/%m/%y') File "/Users/pranavmarla/jama/lib/python3.6/site-packages/pandas/util/_decorators.py", line 208, in wrapper return func(*args, **kwargs) File "/Users/pranavmarla/jama/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 778, in to_datetime values = convert_listlike(arg._values, True, format) File "/Users/pranavmarla/jama/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 451, in _convert_listlike_datetimes raise e File "/Users/pranavmarla/jama/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 416, in _convert_listlike_datetimes arg, format, exact=exact, errors=errors File "pandas/_libs/tslibs/strptime.pyx", line 148, in pandas._libs.tslibs.strptime.array_strptime ValueError: unconverted data remains: 00

The date is in the format of '03/01/00'

The code is running fine if us the date's text into the function in the following way

pd.to_datetime('03/01/00', format = '%d/%m/%y')

What am I doing wrong here?

Edit - companystats.csv -

0          1.0             34.0  Asian Paints        15.90  INE021A01026    03/01/00
1          2.0             34.0  Asian Paints        16.06  INE021A01026    04/01/00
2          3.0             34.0  Asian Paints        15.96  INE021A01026    05/01/00
3          4.0             34.0  Asian Paints        15.73  INE021A01026    06/01/00
4          5.0             34.0  Asian Paints        16.07  INE021A01026    07/01/00
5          6.0             34.0  Asian Paints        17.29  INE021A01026    10/01/00
6          7.0             34.0  Asian Paints        17.30  INE021A01026    11/01/00
7          8.0             34.0  Asian Paints        17.40  INE021A01026    12/01/00
8          9.0             34.0  Asian Paints        17.45  INE021A01026  13/01/2000
9         10.0             34.0  Asian Paints        17.08  INE021A01026  14/01/2000
10        11.0             34.0  Asian Paints        17.14  INE021A01026  17/01/2000
11        12.0             34.0  Asian Paints        17.25  INE021A01026  18/01/2000
12        13.0             34.0  Asian Paints        17.47  INE021A01026  19/01/2000
13        14.0             34.0  Asian Paints        17.52  INE021A01026  20/01/2000
14        15.0             34.0  Asian Paints        17.74  INE021A01026  21/01/2000
15        16.0             34.0  Asian Paints        19.10  INE021A01026  24/01/2000
16        17.0             34.0  Asian Paints        20.61  INE021A01026  25/01/2000
17        18.0             34.0  Asian Paints        19.71  INE021A01026  27/01/2000
18        19.0             34.0  Asian Paints        20.03  INE021A01026  28/01/2000
19        20.0             34.0  Asian Paints        19.19  INE021A01026  31/01/2000
20        21.0             34.0  Asian Paints        19.18  INE021A01026    01/02/00
21        22.0             34.0  Asian Paints        19.17  INE021A01026    02/02/00
22        23.0             34.0  Asian Paints        19.00  INE021A01026    03/02/00
23        24.0             34.0  Asian Paints        19.03  INE021A01026    04/02/00

EDIT - Solved! My date column had two types of dates;

21/01/2000

and

02/02/00

I used the errors attribute of the pd.to_datetime() method to create two seperate columns which i then merged to give me my desired column

pd.to_datetime(main,format = '%d/%m/%y', errors = 'coerce')

and

pd.to_datetime(main,format = '%d/%m/%Y', errors = 'coerce')

A more graceful solution would be welcome

Upvotes: 1

Views: 6909

Answers (3)

Gregory Duke
Gregory Duke

Reputation: 98

You can try passing format='mixed' to the to_datetime method like so:

df['Date'] = pd.to_datetime(df['Date'], format='mixed').dt.date

If format is set to 'mixed', Pandas will infer each date format individually based. Fair warning, though. They recommend using it with 'dayfirst' to be safe.

Source

Upvotes: 0

YOLO
YOLO

Reputation: 21719

Here's a nice way to see where the function is failing:

df = pd.DataFrame({'date': ['03/01/00','13/01/2000','00/01/00']})

def date_format(f):
    try:
        return pd.to_datetime(f)
    except Exception as e:
        return e.args

print(df['date'].apply(date_format))

0          2000-03-01 00:00:00
1          2000-01-13 00:00:00
2    (month must be in 1..12,)

Upvotes: 1

qaiser
qaiser

Reputation: 2868

try to pass infer_datetime_format as true, it might be that in your data, date it is not in the format that you are passing

pd.to_datetime(df['Date'], infer_datetime_format=True) 

OR pass errors = 'ignore'

pd.to_datetime(df['Date'], format = '%d/%m/%y',errors='ignore')

Upvotes: 0

Related Questions