user3849852
user3849852

Reputation:

Manipulations with Lat-Lon and Time Series Pandas

I am trying to do some file merging with Latitude and Longitude.

Input File1.csv

Name,Lat,Lon,timeseries(n)
London,80.5234,121.0452,523
London,80.5234,121.0452,732
London,80.5234,121.0452,848
Paris,90.4414,130.0252,464
Paris,90.4414,130.0252,829
Paris,90.4414,130.0252,98
New York,110.5324,90.0023,572
New York,110.5324,90.0023,689
New York,110.5324,90.0023,794


File2.csv
Name,lat,lon,timeseries1
London,80.5234,121.0452,500
Paris,90.4414,130.0252,400
New York,110.5324,90.0023,700

Now Expected output is

File2.csv

Name,lat,lon,timeseries1,timeseries(n) #timeseries is 24 hrs format 17:45:00
London,80.5234,121.0452,500,2103 #Addition of all three values 
Paris,90.4414,130.0252,400,1391
New York,110.5324,90.0023,700,2055

With python, numpy and dictionaries it would be straight as key = sum of values but I want to use Pandas

Please suggest me how to start with or may be a point me to some example. I have not see anything like Dictionary types with Pandas with Latitude and Longitude.

Upvotes: 1

Views: 534

Answers (1)

EdChum
EdChum

Reputation: 394021

Perform a groupby aggregation on the first df, call sum and then merge this with the other df:

In [12]:
gp = df.groupby('Name')['timeseries(n)'].sum().reset_index()
df1.merge(gp, on='Name')

Out[14]:
       Name       Lat       Lon  timeseries1  timeseries(n)
0    London   80.5234  121.0452          500           2103
1     Paris   90.4414  130.0252          400           1391
2  New York  110.5324   90.0023          700           2055

the aggregation looks like this:

In [15]:    
gp

Out[15]:
       Name  timeseries(n)
0    London           2103
1  New York           2055
2     Paris           1391

Your csv files can loaded using read_csv so something like:

df = pd.read_csv('File1.csv')
df1 = pd.read_csv('File2.csv')

Upvotes: 1

Related Questions