Reputation: 25
I am preparing data for plotting, but I'm currently having trouble applying functions to DataFrames in pandas.
This is my dataframe:
What I need to do is to get only the date from the timestamp. So in the current dataframe, the expected result should look like this:
    timestamp     action
0  2020-03-03  pagevisit
1  2020-03-03  pagevisit
2  2020-03-03  pagevisit
3  2020-03-03  pagevisit
4  2020-03-03  pagevisit
I have around 100,000 records that I need to clean and get only the date. I tried
df['timestamp'] = df['timestamp'].apply(lambda x: x.split(' ')[0])
It raises this error:
AttributeError: 'Timestamp' object has no attribute 'split'
I also tried
df['timestamp'] = df.apply(lambda x: x['timestamp'].split(' ')[0])
But it raises
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 135, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index_class_helper.pxi", line 109, in pandas._libs.index.Int64Engine._check_type
KeyError: 'timestamp'
I feel that this is a fairly easy task, but I have been at it for the past hour and still can't get it to work. My pandas version is 1.0.1, so I honestly do not know the cause, and I am getting desperate. Please help.
Upvotes: 0
Views: 274
Reputation: 421
Judging by the error, the values in the timestamp column are of type pd.Timestamp
(check documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html)
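One way to confirm that diagnosis is to inspect the column before transforming it. This is a minimal sketch: the two-row frame below is hypothetical sample data standing in for the asker's DataFrame, which is not shown in the question.

```python
import pandas as pd

# Hypothetical sample data; the real frame is not shown in the question.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2020-03-03 12:13:56", "2020-03-03 12:14:10"]),
    "action": ["pagevisit", "pagevisit"],
})

# The column dtype and the type of a single element tell you what you hold.
print(df["timestamp"].dtype)
print(type(df["timestamp"].iloc[0]))

# True when the column holds datetimes rather than plain strings.
is_dt = pd.api.types.is_datetime64_any_dtype(df["timestamp"])
print(is_dt)
```

If the check reports a datetime dtype, string methods such as split are not available on the elements, which matches the AttributeError in the question.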
If you just want the date as a string, you can do the following:
df['timestamp'] = df['timestamp'].apply(lambda x: str(x.date()))
(or you can just use x.date() to get a datetime.date object)
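With around 100,000 rows, a vectorized call through the .dt accessor is usually preferable to a Python-level lambda. The sketch below assumes the column already has a datetime dtype and uses hypothetical sample data; it compares the apply version above with Series.dt.strftime.

```python
import pandas as pd

# Hypothetical sample data; assumes the column is already datetime-typed.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2020-03-03 12:13:56", "2020-03-03 23:59:59"]),
    "action": ["pagevisit", "pagevisit"],
})

# apply-based version from the answer: one Python call per row.
via_apply = df["timestamp"].apply(lambda x: str(x.date()))

# Vectorized alternative: format every timestamp as a 'YYYY-MM-DD' string.
as_str = df["timestamp"].dt.strftime("%Y-%m-%d")

print(as_str.tolist())
```

Both produce the same strings; which one to keep is mostly a matter of readability and speed on the real data.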
Upvotes: 1
Reputation: 82785
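Use .date() on a single Timestamp, or the .dt.date accessor on a whole column (a Series itself has no .date() method).
Ex:
df['timestamp'] = df['timestamp'].dt.date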
Demo:
import pandas as pd

# .date() on a single Timestamp drops the time and timezone parts
print(pd.Timestamp('2020-03-03 12:13:56+09:00').date())
# --> 2020-03-03
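The demo above works on a single Timestamp. For a whole column, a sketch of the column-level equivalents might look like this, again on hypothetical sample data. Series.dt.date gives datetime.date objects, while Series.dt.normalize keeps a datetime dtype with the time set to midnight, which can be convenient for plotting.

```python
import datetime
import pandas as pd

# Hypothetical sample data; assumes the column is already datetime-typed.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2020-03-03 12:13:56", "2020-03-03 18:30:00"]),
    "action": ["pagevisit", "pagevisit"],
})

# Object column of datetime.date values (the time part is dropped).
dates = df["timestamp"].dt.date

# Still a datetime column, with every time reset to 00:00:00.
midnight = df["timestamp"].dt.normalize()

print(dates.iloc[0])
print(midnight.iloc[0])
```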
Upvotes: 1