Reputation: 97
We're working on a python project, where we're using SQLalchemy to retrieve data from our MySQL database. We have succeeded in establishing a connection to the database:
engine = create_engine("mysql://xxx:xxx@localhost/xxx")
conn = engine.connect()
When we are connected, we are retrieving first and last timestamp, so we can set up a range and an interval of 20 minutes. We are doing so by using following line of code:
lastLpnTime = pd.read_sql('SELECT MAX(timestamp) FROM Raw_Data WHERE topic = "lpn1"', conn).astype(str).values
firstLpnTime = pd.read_sql('SELECT MIN(timestamp) FROM Raw_Data WHERE topic = "lpn1"', conn).astype(str).values.tolist()
We then want to round down the timestamp
for firstLpnTime
, so the minutes will be round down to either xx.00, xx.20, xx:40, so for example 15:09:25 will round down to 15:00:00
firstLpnTime = pd.to_datetime(firstLpnTime).round('20T')
And then we set up a range between firstLpnTime
and lastLpnTime
and an interval of 20 min
index = pd.date_range(start=firstLpnTime.min(), end=lastLpnTime.max(), freq='20T')
However... we are receiving an error for this line of code:
firstLpnTime = pd.to_datetime(firstLpnTime).round('20T')
Error message is saying:
TypeError: arg must be a string, datetime, list, tuple, 1-d array, or Series
Variable firstLpnTime
has an astype(str)
, so it should be a string, so we don't get why it's not working. Maybe the datatype is not changing to string?
The output, when I print firstLpnTime
, is: [['2019-07-26 15:09:25']]
Upvotes: 0
Views: 453
Reputation: 111
In order to avoid value nested within array, you can use DataFrame directly as shown in below code:
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('mysql://xxxx:xxxx@localhost:3306/test')
conn = engine.connect()
lastLpnTime = pd.read_sql('SELECT MAX(timestamp) as max_time FROM
test.test_table WHERE topic = "lpn1";', conn).iloc[0]['max_time']
firstLpnTime = pd.read_sql('SELECT MIN(timestamp) as min_time FROM
test.test_table WHERE topic = "lpn1"', conn).iloc[0]['min_time']
firstLpnTime = firstLpnTime.round('20T')
firstLpnTime
This will provide the expected output in Pandas Timestamp class.
Timestamp('2019-07-26 15:00:00')
Hope it helps! :)
Upvotes: 0
Reputation: 111
It looks like your firstLpnTime
is nested within an array.
So, you can try taking first index from this variable and then casting it using pandas.
firstLpnTime = pd.to_datetime(firstLpnTime[0]).round('20T')
Then, firstLpnTime
should give this result:
DatetimeIndex(['2019-07-26 15:00:00'], dtype='datetime64[ns]', freq=None)
I hope it solves your problem.
Upvotes: 1