Leonardo Acquaroli

Reputation: 25

How to use a column filled with floats in the function Point(lat=float,lon=float) of meteostat? Python

I am trying to build a dataframe using Daily(Point(lat, lon), start-date, end-date), a function of the meteostat library that returns the daily weather statistics, from start-date to end-date, for the location given by the latitude and longitude in Point(lat, lon).

The issue is that the lat and lon arguments need to be floats, so a Point indicates only one location. I want to address several locations and collect the daily meteorological data for each of them.

import meteostat
from datetime import datetime
from meteostat import Point, Daily
import matplotlib.pyplot as plt
from meteostat import Stations
import pandas as pd
import numpy 

data = pd.read_csv(r'C:\Users\leoac\OneDrive\Desktop\Coding\Python apps\Correlation temp-goals in Serie A\seasons 09-19.csv', ";")
date_not_converted = data['Date']
date_being_converted = datetime.strptime(date_not_converted,'%d,%m,%Y')             #1bis it cannot be a Series... so I try changing the data type
date = date_being_converted.strftime('%Y,%m,%d')

#plot = Daily(Point(data['lat'][15],data['lon'][15]),d1,d2).fetch()
data['temp']  = Daily(Point(data['lat'][1],data['lon'][1]),datetime(date),datetime(date)).fetch() #1 fix the date format
print(data['temp'])                                                                               #2 find a way to insert the date and lat/lon vectors into the df
data['temp'].plot(y=['tavg'])
plt.show()

print(data)

Upvotes: 0

Views: 349

Answers (1)

Pierre

Reputation: 2002

Here is a solution inspired by this GitHub issue. It makes one request per row of the CSV, runs those requests in parallel, and then merges the results into a single pandas dataframe.

from datetime import datetime
from meteostat import Point, Daily
from multiprocessing import cpu_count

from joblib import Parallel, delayed
import pandas as pd

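def get_bulk_data(row):
    """Fetch the daily weather data for one row (one date at one location)."""
    location = Point(row.lat, row.lon)
    # Start and end are the same day, so at most one daily record comes back
    data = Daily(location, row.Date, row.Date).fetch()
    # Keep the coordinates so rows can be told apart after concatenation
    data["latitude"] = row.lat
    data["longitude"] = row.lon
    return data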

if __name__ == "__main__":
    df = pd.read_csv("seasons.csv", sep=";")
    df["Date"] = pd.to_datetime(df["Date"], format="%d,%m,%Y")

    executor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')
    tasks = (
        delayed(get_bulk_data)(row)
        for _, row in df.iterrows()
    )
    list_of_locations_data = executor(tasks)
    data_full = pd.concat(list_of_locations_data)
    print(data_full)

It works with the following CSV and date formats; you can adapt the code if yours are slightly different:

Date;lat;lon
18,02,1997;50.3;-4.7
12,07,1998;41.3;1.5
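
If your file differs, it can help to check the row parsing on its own before sending any request. Below is a minimal sketch using only the standard library, assuming the sample layout above. `parse_rows` and `SAMPLE` are hypothetical names for illustration, not part of meteostat. It also shows why the code in the question fails: `datetime.strptime` takes one string at a time, not a whole column.

```python
import csv
import io
from datetime import datetime

# Sample data in the same layout as the CSV shown above
SAMPLE = """Date;lat;lon
18,02,1997;50.3;-4.7
12,07,1998;41.3;1.5
"""


def parse_rows(text, date_format="%d,%m,%Y"):
    """Return a list of (date, lat, lon) tuples from semicolon-separated text.

    Hypothetical helper: adjust date_format and the column names to your file.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    rows = []
    for record in reader:
        # strptime converts a single string, so it is applied row by row
        date = datetime.strptime(record["Date"], date_format)
        rows.append((date, float(record["lat"]), float(record["lon"])))
    return rows


rows = parse_rows(SAMPLE)
for date, lat, lon in rows:
    print(date.date(), lat, lon)
```

Each tuple holds what get_bulk_data needs: a datetime for the start and end arguments of Daily, and two floats for Point.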

Upvotes: 0
