Reputation: 55
I have an object containing filtered data from a Django database. The data contains a list of time values in the form YYYY-MM-DD HH:MM:SS, and I'm trying to group each second under its corresponding minute and each minute under its corresponding hour. I then need to pass this grouping to the front end of my website, which uses JavaScript.
So far I have code that does the grouping, though the result is not exactly what I want. The code and its output are shown below:
# Makes pd dataframe of database object, sets dateTime as index so we can easily extract hour, min, sec later
queryToDataFrame = pd.DataFrame(filterTable.values()[1:],columns=filterTable.values()[0]).set_index('dateTime')
hours = queryToDataFrame.index.hour # extracts hours from dataframe
minutes = queryToDataFrame.index.minute
seconds = queryToDataFrame.index.second
timepd = pd.DataFrame({'hours':hours, 'minutes':minutes, 'seconds':seconds}) # puts time values into new dataframe for easier processing
groupVar = timepd.groupby([timepd.hours, timepd.minutes]).apply(print) # groups minutes to hours and seconds to mins
Output
| | hours | minutes | seconds |
|---|---|---|---|
| 0 | 20 | 52 | 10 |
| 1 | 20 | 52 | 30 |
| 2 | 20 | 52 | 35 |

| | hours | minutes | seconds |
|---|---|---|---|
| 3 | 20 | 53 | 0 |
| 4 | 20 | 53 | 5 |
| 5 | 20 | 53 | 10 |
...
This is close to the grouping I'm after, but I'm actually hoping for something that looks like this:
| | hours | minutes | seconds |
|---|---|---|---|
| 0 | 20 | 52 | 0 |
| 1 | | | 5 |
| 2 | | | 10 |
| 3 | | 53 | 0 |
| 4 | | | 5 |
| 5 | 21 | 1 | 0 |
Here each minute value appears only once for all of the seconds that fall within that minute, and likewise each hour value appears only once for all of its minutes.
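The grouping described above can also be pictured as a nested hour -> minute -> seconds mapping, which serialises directly to JSON for a JavaScript front end. This is only a minimal sketch: the sample values are read off the desired layout, and the names `per_minute`, `nested` and `payload` are made up for illustration. The remaining loop runs once per (hour, minute) group, not once per row.

```python
import json

import pandas as pd

# Sample values taken from the desired layout (illustrative only)
timepd = pd.DataFrame(
    {
        "hours": [20, 20, 20, 20, 20, 21],
        "minutes": [52, 52, 52, 53, 53, 1],
        "seconds": [0, 5, 10, 0, 5, 0],
    }
)

# Collect the seconds of every (hour, minute) pair into one list
per_minute = timepd.groupby(["hours", "minutes"])["seconds"].agg(list)

# Fold the two-level index into {hour: {minute: [seconds, ...]}}
nested = {}
for (hour, minute), secs in per_minute.items():
    nested.setdefault(int(hour), {})[int(minute)] = [int(s) for s in secs]

# JSON object keys are always strings, so 20 becomes "20" on the JavaScript side
payload = json.dumps(nested)
print(payload)
```

On the front end, `JSON.parse` would turn `payload` back into an object keyed by hour and then by minute.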
I've seen similar examples of this but so far the examples have all differed just enough to make this task extremely confusing for me. Is this possible to accomplish just using Pandas? Or maybe there is a different way to achieve this? The data that this code will handle is expected to be very large so I'm trying to avoid using loops.
If you need some data to test with, this is the query result that "queryToDataFrame" is built from:
<QuerySet [{'id': 10063705, 'valueName': 'Temp', 'value': 3.3, 'units': 'C', 'dateTime': datetime.datetime(2021, 3, 18, 20, 51, 50, tzinfo=<UTC>), 'timestamp': 1616122310.0},
{'id': 10063745,'valueName': 'Temp', 'value': 3.4, 'units': 'C', 'dateTime': datetime.datetime(2021, 3, 18, 20, 52, 10, tzinfo=<UTC>), 'timestamp': 1616122330.0}]
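To reproduce the setup without a Django project, the two records above can be rebuilt by hand. This is a sketch under assumptions: `records` is a hypothetical stand-in for the dictionaries that `filterTable.values()` yields, and `frame` is an illustrative name. Unlike the snippet in the question, it keeps every record as a row.

```python
import datetime as dt

import pandas as pd

# Hypothetical stand-in for the dictionaries yielded by filterTable.values()
records = [
    {"id": 10063705, "valueName": "Temp", "value": 3.3, "units": "C",
     "dateTime": dt.datetime(2021, 3, 18, 20, 51, 50, tzinfo=dt.timezone.utc),
     "timestamp": 1616122310.0},
    {"id": 10063745, "valueName": "Temp", "value": 3.4, "units": "C",
     "dateTime": dt.datetime(2021, 3, 18, 20, 52, 10, tzinfo=dt.timezone.utc),
     "timestamp": 1616122330.0},
]

# A list of dicts becomes one row per record; the dict keys become the columns
frame = pd.DataFrame(records).set_index("dateTime")

# Split the datetime index into its hour, minute and second components
timepd = pd.DataFrame(
    {
        "hours": frame.index.hour,
        "minutes": frame.index.minute,
        "seconds": frame.index.second,
    }
)
print(timepd)
```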
Upvotes: 2
Views: 103
Reputation: 13518
You could try this:
import pandas as pd
df = pd.DataFrame(
    {
        "hours": [20, 20, 20, 20, 20, 20],
        "minutes": [52, 52, 52, 53, 53, 53],
        "seconds": [10, 30, 35, 0, 5, 10],
    }
)
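# Convert values to strings (needed so repeated cells can be cleared with "")
df = df.astype(str)
# Iterate on a copy of the dataframe and modify rows as needed
previous_row = df.iloc[0]
for i, row in df.copy().iterrows():
    if i == 0:
        continue
    # Clear hour and minute only when both repeat the previous row,
    # so the same minute in a different hour is not blanked by mistake
    if (
        row["hours"] == previous_row["hours"]
        and row["minutes"] == previous_row["minutes"]
    ):
        df.loc[i, "minutes"] = ""
        df.loc[i, "hours"] = ""
    previous_row = row
print(df)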
#Outputs
  hours  minutes  seconds
0    20       52       10
1                      30
2                      35
3    20       53        0
4                       5
5                      10
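Since the question asks to avoid loops on large data, the same blanking can probably be done without iterating over rows by comparing each row with the one before it via `shift`. This is a sketch, not a drop-in replacement: it assumes the rows are already sorted by time, and `new_hour`, `new_minute` and `shown` are illustrative names. It also clears a repeated hour when only the minute changes, which is the layout the question asks for.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "hours": [20, 20, 20, 20, 20, 20],
        "minutes": [52, 52, 52, 53, 53, 53],
        "seconds": [10, 30, 35, 0, 5, 10],
    }
)

# True where the hour differs from the previous row (the first row always counts as new)
new_hour = df["hours"].ne(df["hours"].shift())
# A minute is new when its own value changes or when a new hour starts
new_minute = new_hour | df["minutes"].ne(df["minutes"].shift())

# Work on a string copy so repeated cells can be replaced with ""
shown = df.astype(str)
shown["hours"] = shown["hours"].where(new_hour, "")
shown["minutes"] = shown["minutes"].where(new_minute, "")
print(shown)
```

Only the first row of each hour keeps its hour value and only the first row of each minute keeps its minute value; the seconds column is left untouched.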
Upvotes: 1