Gandalf44

Reputation: 51

AttributeError: 'Timestamp' object has no attribute 'read'

I am trying to use pandas and groupby to extract the month from a date field for further manipulation. Line 40 is where I apply dateutil.parser.parse to extract the year, month, and day.

My code:

df = pandas.DataFrame.from_records(defects, columns=headers)
df['date'] = pandas.to_datetime(df['date'], format="%Y-%m-%d")
df['date'] = df['date'].apply(dateutil.parser.parse, yearfirst=True)
 ....
print df.groupby(['month']).groups.keys()

And I'm getting:

Traceback (most recent call last):
 File "jira-sandbox.py", line 40, in <module>
 defects_df['created'] =    defects_df['created'].apply(dateutil.parser.parse, yearfirst=True)
  File "/Library/Python/2.7/site-packages/pandas/core/series.py", line 2294, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1207, in pandas.lib.map_infer (pandas/lib.c:66124)
  File "/Library/Python/2.7/site-packages/pandas/core/series.py", line 2282, in <lambda>
    f = lambda x: func(x, *args, **kwds)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 697, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 301, in parse
    res = self._parse(timestr, **kwargs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 349, in _parse
    l = _timelex.split(timestr)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 143, in split
    return list(cls(s))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 137, in next
    token = self.get_token()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 68, in get_token
    nextchar = self.instream.read(1)
AttributeError: 'Timestamp' object has no attribute 'read'

Upvotes: 1

Views: 2339

Answers (1)

Stephen Rauch

Reputation: 49774

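I do not think you need the dateutil operation. The column is already a datetime after the pandas.to_datetime() call, so each value passed to dateutil.parser.parse is a Timestamp rather than a string, which is why the parser fails with the AttributeError. Here is one way to construct a column that can be used by groupby().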

Code:

# build a test dataframe
import datetime as dt
import pandas as pd

df = pd.DataFrame([dt.datetime.now() + dt.timedelta(days=x*15)
                   for x in range(10)],
                  columns=['date'])
print(df)

# add a year/month column to allow grouping
df['month'] = df.date.apply(lambda x: x.year * 100 + x.month)

# show a groupby
print(df.groupby(['month']).groups.keys())

Results:

                     date
0 2017-03-17 14:30:24.344
1 2017-04-01 14:30:24.344
2 2017-04-16 14:30:24.344
3 2017-05-01 14:30:24.344
4 2017-05-16 14:30:24.344
5 2017-05-31 14:30:24.344
6 2017-06-15 14:30:24.344
7 2017-06-30 14:30:24.344
8 2017-07-15 14:30:24.344
9 2017-07-30 14:30:24.344

[201704, 201705, 201706, 201707, 201703]
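
As a possible alternative, the same year/month key can probably be built without apply() by using the .dt accessor on the datetime column. The following is a minimal sketch, not the only way to do it. The sample dates are hypothetical stand-ins for the records in the question, and the names is_datetime, month_keys and counts are introduced here only for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's records
df = pd.DataFrame({"date": ["2017-03-17", "2017-04-01", "2017-04-16", "2017-05-01"]})

# to_datetime already yields a datetime64 column, so no dateutil pass is needed
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
is_datetime = pd.api.types.is_datetime64_any_dtype(df["date"])

# Vectorized year/month key via the .dt accessor
# (same values as the lambda-based column above)
df["month"] = df["date"].dt.year * 100 + df["date"].dt.month

# Group on the new key; convert to plain ints for readable output
month_keys = sorted(int(k) for k in df.groupby("month").groups.keys())
counts = {int(k): int(v) for k, v in df.groupby("month").size().items()}

print(is_datetime)
print(month_keys)
print(counts)
```

Checking the dtype first is a quick way to confirm that the column no longer holds strings before any further date handling is applied.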

Upvotes: 1
