Reputation: 1082
I am trying to create a Pandas Series object with a timestamp field, start:
a = pd.Series(index=['preceding_id', 'file', 'start'], dtype=[np.int, np.str, np.datetime64], )
It fails with:
TypeError: data type not understood
Can someone explain to me what I am doing wrong?
I've been looking for dates and datetime objects in pandas, but the documentation only says how to use them as an index, which is not what I want.
Thank you!
Upvotes: 1
Views: 2713
Reputation: 1166
Are you sure you don't want a dataframe?
If so, it would look something like:
# ids, files and timestamps are your own lists of values
data = {'preceding_id': ids,
        'file': files,
        'start': timestamps}
df = pd.DataFrame(data)
df['start'] = pd.to_datetime(df['start'])
Or, if you are reading the data in from a file, you could pass parse_dates to a reader such as read_csv, for example parse_dates=['start'] (parse_dates=True only tries to parse the index). Pandas is usually good at inferring the correct dtype on its own.
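As a rough, runnable sketch of both routes: the sample ids, file names and timestamps below are made up for illustration, and the CSV text is generated in memory rather than read from a real file.

```python
import io
import pandas as pd

# Hypothetical sample values standing in for the real ids, files and timestamps.
data = {
    "preceding_id": [1, 2],
    "file": ["a.txt", "b.txt"],
    "start": ["2014-01-01 00:00:00", "2014-01-02 12:30:00"],
}
df = pd.DataFrame(data)
df["start"] = pd.to_datetime(df["start"])  # convert strings to a datetime64 column

# The same data round-tripped through CSV text; parse_dates names the column to convert.
csv_text = df.to_csv(index=False)
df_from_csv = pd.read_csv(io.StringIO(csv_text), parse_dates=["start"])

print(df.dtypes)
print(df_from_csv.dtypes)
```

Each column of the DataFrame keeps its own dtype, which is what the dtype list in the question was trying to express.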
Upvotes: 0
Reputation: 35235
A Series can only have one data type. If you want to store multiple types in one Series, the Series' type will be object, the generic Python type.
In [12]: pd.Series([1, 'some string', pd.to_datetime('2014-01-01')])
Out[12]:
0                      1
1            some string
2    2014-01-01 00:00:00
dtype: object
This is no problem. The types of the constituent elements are retained. For example, the Timestamp in the Series above is still a Timestamp, as we can see by accessing it.
In [13]: pd.Series([1, 'some string', pd.to_datetime('2014-01-01')])[2]
Out[13]: Timestamp('2014-01-01 00:00:00', tz=None)
So, in conclusion, don't specify the datatypes. In general, they'll be handled properly without your help.
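Applied to the Series from the question, that could look like the sketch below. The three values are invented for illustration; only the labels come from the question.

```python
import pandas as pd

# Hypothetical values for the three labels used in the question.
a = pd.Series(
    {
        "preceding_id": 7,
        "file": "some_file.txt",
        "start": pd.Timestamp("2014-01-01"),
    }
)
print(a.dtype)           # object: one dtype for the whole Series
print(type(a["start"]))  # the element itself is still a Timestamp

# A Series that holds only timestamps gets a datetime64 dtype instead.
starts = pd.Series(pd.to_datetime(["2014-01-01", "2014-01-02"]))
print(starts.dtype)
```

No dtype argument is passed in either case; pandas picks object for the mixed Series and a datetime64 dtype for the all-timestamp one.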
Upvotes: 4