Reputation: 221

Python ValueError: time data 'Date' does not match format '%Y/%m/%d %H:%M:%S.%f'

I am parsing dates from a Dataframe where the data comes from a CSV file. I get the said error. I am sure the format is correct.

My Code:

import pandas as pd
    from datetime import datetime
    import csv
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates
    headers = ['Sensor Value','Date','Time']
    df = pd.read_csv('C:/Users\Lala Rushan\Downloads\DataLog.CSV',names=headers)
    print (df)

    df['Date'] = df['Date'].map(lambda x: datetime.strptime(str(x), '%Y/%m/%d %H:%M:%S.%f'))
    x = df['Date']
    y = df['Sensor Value']

    # plot
    plt.plot(x,y)
    # beautify the x-labels
    plt.gcf().autofmt_xdate()

    plt.show()

My DataFrame:

0    Sensor Value         Date           Time
1               2   2017/02/17   19:06:17.188
2              72   2017/02/17   19:06:22.360
3              72   2017/02/17   19:06:27.348

Console Error:

   new_values = map_f(values, arg)
  File "pandas\src\inference.pyx", line 1207, in pandas.lib.map_infer     (pandas\lib.c:66124)
  File "C:/Users/Lala Rushan/PycharmProjects/newgraph/newgraph.py", line 10, in <lambda>
  df['Date'] = df['Date'].map(lambda x: datetime.strptime(str(x), '%Y/%m/%d %H:%M:%S.%f'))
  File "C:\Users\Lala    Rushan\AppData\Local\Programs\Python\Python35\lib\_strptime.py", line 500, in _strptime_datetime
  tt, fraction = _strptime(data_string, format)
  File "C:\Users\Lala   Rushan\AppData\Local\Programs\Python\Python35\lib\_strptime.py", line 337, in _strptime
  (data_string, format))
 ValueError: time data 'Date' does not match format '%Y/%m/%d %H:%M:%S.%f'

CSV Input:

    Sensor Value    Date    Time
2    2017/02/17  19:06:17.188
72   2017/02/17  19:06:22.360
72   2017/02/17  19:06:27.348
72   2017/02/17  19:06:32.482
74   2017/02/17  19:06:37.515
70   2017/02/17  19:06:42.580

Upvotes: 1

Answers (3)

tmsss

Reputation: 2129

I had a similar problem while parsing dates in a large csv file. In my case, I had a few bad lines in the csv that were triggering the error, so I simply removed them from the dataframe, to later parse the dates.

If you don't mind to lose this information, you can do something like this:

df = df[df['Date'].str.contains(r'^\d{4}-\d{2}-\d{2} \d{2}\:\d{2}:\d{2}.\d{3}')]

timer = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S.%f')       

df['Date'] =  df['Date'].apply(timer)

Upvotes: 0

Louise Davies

Reputation: 15991

Remove the names=headers part as this is confusing pandas. pandas assumes that the first row is a header rowby default. By specifying the header names, it assumes that the first row must be data and so your error is that the word Date does not match your format (which it doesn't)

EDIT: just change the read_csv line to this:

df = pd.read_csv('C:/Users\Lala Rushan\Downloads\DataLog.CSV')

Consequently, you don't have to specify headers anymore so you can remove that line.

EDIT2:

The problem is that your date and time fields are deperate. Create a new column, called DateTime, which combines the two and then apply your striptime function to that.

Replace the line I told you to delete, the df['Date'] = df['Date'].map one with this

df['DateTime'] = df['Date'] + " " + df['Time']
df['DateTime'] = df['DateTime'].map(lambda x: datetime.strptime(str(x), '%Y/%m/%d %H:%M:%S.%f'))

Upvotes: 0

jezrael

Reputation: 863741

I think you need to_datetime with parameter errors='coerce' for parse problematic values to NaN:

df['Date'] = pd.to_datetime(df['Date'], errors='coerce')

For check problematic rows:

print (df[pd.to_datetime(df['Date'], errors='coerce').isnull()])

But if need read columns date and time to datetime use parameter parse_dates in read_csv:

import pandas as pd
from pandas.compat import StringIO

temp=u"""
2,2017/02/17,19:06:17.188
72,2017/02/17,19:06:22.360
72,2017/02/17,19:06:27.348
72,2017/02/17,19:06:32.482
74,2017/02/17,19:06:37.515
70,2017/02/17,19:06:42.580"""
#after testing replace 'StringIO(temp)' to 'C:/Users\Lala Rushan\Downloads\DataLog.CSV'
headers = ['Sensor Value','Date','Time']
df = pd.read_csv(StringIO(temp), names=headers, parse_dates={'Datetime':['Date','Time']})
print (df)
                 Datetime  Sensor Value
0 2017-02-17 19:06:17.188             2
1 2017-02-17 19:06:22.360            72
2 2017-02-17 19:06:27.348            72
3 2017-02-17 19:06:32.482            72
4 2017-02-17 19:06:37.515            74
5 2017-02-17 19:06:42.580            70

print (df.dtypes)
Datetime        datetime64[ns]
Sensor Value             int64
dtype: object

First solution combined with second, where in last row is 30.2.2017 what does not exist:

temp=u"""
2,2017/02/17,19:06:17.188
72,2017/02/17,19:06:22.360
72,2017/02/17,19:06:27.348
72,2017/02/17,19:06:32.482
74,2017/02/17,19:06:37.515
70,2017/02/30,19:06:42.580"""
#after testing replace 'StringIO(temp)' to 'C:/Users\Lala Rushan\Downloads\DataLog.CSV'
headers = ['Sensor Value','Date','Time']
df = pd.read_csv(StringIO(temp), names=headers, parse_dates={'Datetime':['Date','Time']})
print (df)
                  Datetime  Sensor Value
0  2017/02/17 19:06:17.188             2
1  2017/02/17 19:06:22.360            72
2  2017/02/17 19:06:27.348            72
3  2017/02/17 19:06:32.482            72
4  2017/02/17 19:06:37.515            74
5  2017/02/30 19:06:42.580            70

df['Datetime'] = pd.to_datetime(df['Datetime'], errors='coerce')
print (df)
                 Datetime  Sensor Value
0 2017-02-17 19:06:17.188             2
1 2017-02-17 19:06:22.360            72
2 2017-02-17 19:06:27.348            72
3 2017-02-17 19:06:32.482            72
4 2017-02-17 19:06:37.515            74
5                     NaT            70 <- replace 30.2.2017 to NaT (same as NaN for dates)

print (df.dtypes)
Datetime        datetime64[ns]
Sensor Value             int64
dtype: object

Upvotes: 3

Python ValueError: time data &#39;Date&#39; does not match format &#39;%Y/%m/%d %H:%M:%S.%f&#39;

Answers (3)

Related Questions

Python ValueError: time data 'Date' does not match format '%Y/%m/%d %H:%M:%S.%f'