Reputation: 1
I need to plot some data series against time on the x axis, and I have a CSV file whose "Start time" column is full of dates.
Since I work with DataFrames, I use the pandas library to manipulate the data.
My datetime data is:
Input:
print(paradas["Start time"])
Output:
0 31/12/2020 00:13:30
1 30/12/2020 19:30:00
2 30/12/2020 19:01:45
3 30/12/2020 19:00:10
4 30/12/2020 18:55:35
...
10704 02/01/2020 08:37:33
10705 02/01/2020 08:32:33
10706 02/01/2020 08:28:03
10707 02/01/2020 08:19:03
10708 31/12/2019 02:41:01
Name: Start time, Length: 10709, dtype: object
Since I am working with time data, I convert all the timestamps in the column to datetime64[ns]:
Input:
paradas["Start time"]=pd.to_datetime(paradas["Start time"])
print(paradas["Start time"])
Output:
0 2020-12-31 00:13:30
1 2020-12-30 19:30:00
2 2020-12-30 19:01:45
3 2020-12-30 19:00:10
4 2020-12-30 18:55:35
...
10704 2020-02-01 08:37:33
10705 2020-02-01 08:32:33
10706 2020-02-01 08:28:03
10707 2020-02-01 08:19:03
10708 2019-12-31 02:41:01
Name: Start time, Length: 10709, dtype: datetime64[ns]
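One thing worth checking in the output above: 02/01/2020 came out as 2020-02-01, while 31/12/2020 came out as 2020-12-31, so the two rows seem to have been read with different day/month orders. If the file is day-first throughout (which the 31/12/2020 rows suggest, but that is an assumption about the data), passing an explicit format avoids the ambiguity. A minimal sketch with a made-up two-row sample:

```python
import pandas as pd

# Hypothetical sample using the same day/month/year layout as "Start time"
raw = pd.Series(["31/12/2020 00:13:30", "02/01/2020 08:37:33"], name="Start time")

# An explicit format tells pandas that the first field is always the day
parsed = pd.to_datetime(raw, format="%d/%m/%Y %H:%M:%S")

print(parsed)
```

With the format given, the second row is read as 2 January 2020 rather than 1 February 2020. pd.to_datetime(raw, dayfirst=True) is a looser alternative when the exact layout varies.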
Now, since the dates are in reverse order, I tried to flip them using:
Input:
paradas["Start time"]=paradas["Start time"].sort_values(by=['Date'], ascending=False)
print(paradas["Start time"])
However, the call is rejected because of the 'by' argument:
Output:
TypeError Traceback (most recent call last)
<ipython-input-35-d4f349ab2092> in <module>()
126 #print(paradas["Start time"])
127
--> 128 paradas["Start time"]=paradas["Start time"].sort_values(by=['Date'], ascending=False)
129 print(paradas["Start time"])
130
TypeError: sort_values() got an unexpected keyword argument 'by'
I also tried calling it without the arguments, but that doesn't change anything either.
So I don't know what I'm doing wrong, whether it's the type of the elements or something else.
I read in another post about doing it with str, but since I need the datetime format, and I've already seen other code handling this kind of data with datetime64[ns], I am almost sure that it is possible...
Upvotes: 0
Views: 839
Reputation: 1
OK, I solved it.
For anyone with the same problem: paradas["Start time"]=paradas["Start time"].sort_values(by=['Date'], ascending=False) was wrong because the by argument only exists on DataFrame.sort_values(). My data (stored in paradas) is a pandas.DataFrame, whereas paradas["Start time"] (just one column of paradas) is a pandas.Series, and Series.sort_values() does not accept by.
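To make the Series/DataFrame difference concrete, here is a small sketch with a made-up three-row frame (the values are illustrative, not the real data). It also shows why calling sort_values() with no arguments and assigning the result back to the column appears to do nothing: pandas aligns the assigned Series on its index, so every value goes back to its original row.

```python
import pandas as pd

# Hypothetical three-row stand-in for the real `paradas` frame
paradas = pd.DataFrame({
    "Start time": pd.to_datetime([
        "2020-12-31 00:13:30",
        "2019-12-31 02:41:01",
        "2020-01-02 08:37:33",
    ])
})

# A Series has a single column of values, so sort_values() takes no `by`
series_sorted = paradas["Start time"].sort_values(ascending=True)

# Assigning the sorted Series back aligns on the index labels,
# which puts every value back in its original row
unchanged = paradas.copy()
unchanged["Start time"] = series_sorted

# Sorting the whole frame by a column name is where `by` belongs
frame_sorted = paradas.sort_values(by="Start time", ascending=True)

print(frame_sorted)
```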
To sort by a column name, sort_values() has to be called on the DataFrame, so I needed to apply this command to all my data, which means calling it on paradas:
Input:
paradas=paradas.sort_values(by=['Date'], ascending=False)
Output:
0 31/12/2020 00:13:30
10708 31/12/2019 02:41:01
2026 31/10/2020 05:04:06
2027 31/10/2020 04:59:06
2028 31/10/2020 04:57:46
...
7642 01/04/2020 01:36:15
7643 01/04/2020 01:23:40
7644 01/04/2020 01:11:20
7645 01/04/2020 00:14:20
7646 01/04/2020 00:08:25
Name: Start time, Length: 10709, dtype: object
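The output above still has dtype object, so the rows were compared as text: "31/12/2019" lands right after "31/12/2020" because both strings start with the same day and month. A minimal sketch of the difference, assuming the sort key is the "Start time" column and the strings are day-first (the four rows are copied from the output above as an illustrative sample):

```python
import pandas as pd

# Small sample of day-first strings taken from the output above
paradas = pd.DataFrame({
    "Start time": [
        "31/12/2020 00:13:30",
        "31/12/2019 02:41:01",
        "31/10/2020 05:04:06",
        "01/04/2020 00:08:25",
    ]
})

# Sorting the raw strings compares them character by character
by_text = paradas.sort_values(by="Start time", ascending=False)

# Converting first makes the sort chronological
paradas["Start time"] = pd.to_datetime(
    paradas["Start time"], format="%d/%m/%Y %H:%M:%S"
)
by_time = paradas.sort_values(by="Start time", ascending=False)

print(by_text.index.tolist())
print(by_time.index.tolist())
```

In the text sort the 2019 row stays second, while in the datetime sort it moves to the end.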
Nevertheless, it didn't sort the way I wanted. I then realised that I only needed to reverse the order of the "Start time" column, so I finally did:
Input:
paradas=paradas.reindex(index=paradas["Start time"].index[::-1])
Output:
10708 2019-12-31 02:41:01
10707 2020-02-01 08:19:03
10706 2020-02-01 08:28:03
10705 2020-02-01 08:32:33
10704 2020-02-01 08:37:33
...
4 2020-12-30 18:55:35
3 2020-12-30 19:00:10
2 2020-12-30 19:01:45
1 2020-12-30 19:30:00
0 2020-12-31 00:13:30
Name: Start time, Length: 10709, dtype: datetime64[ns]
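For reference, reversing with reindex and a reversed index should be equivalent to slicing the rows with iloc[::-1]. Both only flip the existing row order, so they rely on the file already being in descending order. A small sketch with made-up rows (reset_index is an optional extra step, not part of the solution above):

```python
import pandas as pd

# Hypothetical frame already stored newest-first, like the CSV
paradas = pd.DataFrame({
    "Start time": pd.to_datetime([
        "2020-12-31 00:13:30",
        "2020-12-30 19:30:00",
        "2019-12-31 02:41:01",
    ])
})

# The approach from the answer: reindex with the reversed index
reversed_by_reindex = paradas.reindex(index=paradas["Start time"].index[::-1])

# Positional slicing flips the rows in the same way
reversed_by_iloc = paradas.iloc[::-1]

# Optionally renumber the rows 0..n-1 after reversing
renumbered = reversed_by_iloc.reset_index(drop=True)

print(renumbered)
```

If the rows are not guaranteed to be in order, paradas.sort_values(by="Start time") on the converted column is the safer choice.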
(By this point I had already converted the column to datetime64[ns].) This is what worked for me. I checked the documentation several times:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html
and compared it with:
https://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.Series.sort_values.html
Upvotes: 0