Reputation: 41
I have these two dataframe: 1) data here are grouped by station_id( from 1 up to 98) and time( a data every hour from 27-01-2020 to 26-05-2020)
My aim is to create a list of list in this format:
latitude longitude flow hour month day
[[53.37947845458979, -1.46990168094635, 278.0, 0.0, 1.0, 27.0],
[53.379791259765604, -1.46999669075012, 122.0, 0.0, 1.0, 27.0],
[53.380035400390604, -1.47001004219055, 58.0, 0.0, 1.0, 27.0], ...]
In order to have a list [latitude, longitude, flow, month, day] for every row in the first dataframe. I tried with the following code:
import pandas as pd
import datetime as dt
df = pd.read_csv("readings_by_hour.csv")
df['time'] = pd.to_datetime(df['time'])
df1 = pd.read_csv("stations_info.csv")
i = 0
a = []
b = []
count = df1['station_id'].count()
while i < count:
if df['station_id'][i] == df1['station_id'][i]:
a = print(df1['latitude'][i] + ", " + df1['longitude'][i] + ", " + df['flow'][i] + ", " + df['time'].dt.hour + ", " + df['time'].dt.month + ", " + df['time'].dt.day)
b += [a]
i += 1
print(b)
but it seems it doesn't work, indeed didn't giving any output though it didn't give any error.
Upvotes: 0
Views: 51
Reputation: 23
In the given code, you are trying to assign return value of print function to a
and then adding it to b
. Here, the value of a
is null
. So, when you try to print the value, you will get empty string.
I have made corrections so that it works. Hope it helps..
while i < count:
if df['station_id'][i] == df1['station_id'][i]:
a = [df1['latitude'][i],df1['longitude'][i], df['flow'][i], df['time'][i].hour,df['time'][i].month,df['time'][i].day]
b.append(a)
i += 1
print(b)
Upvotes: 1
Reputation: 49
You could merge the two dataframes on the station_id
column, then create your list of lists like so:
merged_df = pd.merge(df, df1, left_on = 'station_id', right_on = 'station_id')
list_of_lists =[]
# Iterate over each row
for index, row in merged_df.iterrows():
# Create list for the current row
rowlist =[row.latitude, row.longitude, row.flow, row.hour, row.month, row.day]
# append the list to the final list
list_of_lists.append(rowlist)
You can use the datetime
module to extract the month, day, hour from the Date
column
See the pandas docs on pd.merge
for more info: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
Upvotes: 1