Reputation: 41
Sorry, I apologise now, just started learning Python and trying to get something working.
Ok dataset is
Buy, typeid, volume, issued, duration, Volume Entered,Minimum Volume, range, price, locationid, locationname
SELL 20 2076541 2015-09-12T06:31:13 90 2076541 1 region 331.21 60008494 Amarr
SELL 20 194642 2015-09-07T19:36:49 90 194642 1 region 300 60008494 Amarr
SELL 20 2320 2015-09-13T07:48:54 3 2320 1 region 211 60008491 Irnin
I would like to filter for a specific location either by name or ID, doesn't bother me, then to pick the minimum price for that location. Preferably to hardcode it in, since I only have a few locations I'm interested. e.g locationid = 60008494.
I see you can do two conditions on one line, but I don't see how to apply it. So I'm trying to nest it. Doesn't have to be pandas, just seems the first thing I found that did one part of what I required.
The code I've gotten so far is, and only does the minimum part of what I'm looking to achieve.
data = pd.read_csv('orders.csv')
length = len(data['typeid'].unique())
res = pd.DataFrame(columns=('Buy', 'typeid', 'volume','duration','volumeE','Minimum','range','price','locationid','locationname'))
for i in range(0,length):
name_filter = data[data['typeid'] == data['typeid'].unique()[i]]
price_min_filter = name_filter[name_filter['price'] == name_filter['price'].min() ]
res = res.append(price_min_filter, ignore_index=True)
i=i+1
res.to_csv('format.csv') # writes output to csv
print "Complete"
UPDATED. Ok so, the latest part, seems like the following code is the direction I should be going in. If I could have s=typeid, locationid and price, thats perfect. So I've written what I want to do, whats the correct syntax to get that in python? Sorry I'm used to Excel and SQL.
import pandas as pd
df = pd.read_csv('orders.csv')
df[df['locationid'] ==60008494]
s= df.groupby(['typeid'])['price'].min()
s.to_csv('format.csv')
Upvotes: 0
Views: 792
Reputation: 91009
If what you really want is -
I would like to filter for a specific location either by name or ID, doesn't bother me, then to pick the minimum price for that location. Preferably to hardcode it in, since I only have a few locations I'm interested. e.g locationid = 60008494.
You can simply filter the df on the locationid
first and then use ['price'].min()
. Example -
In [1]: import pandas as pd
In [2]: s = """Buy,typeid,volume,issued,duration,Volume Entered,Minimum Volume,range,price,locationid,locationname
...: SELL,20,2076541,2015-09-12T06:31:13,90,2076541,1,region,331.21,60008494,Amarr
...: SELL,20,194642,2015-09-07T19:36:49,90,194642,1,region,300,60008494,Amarr
...: SELL,20,2320,2015-09-13T07:48:54,3,2320,1,region,211,60008491,Irnin"""
In [3]: import io
In [4]: df = pd.read_csv(io.StringIO(s))
In [5]: df
Out[5]:
Buy typeid volume issued duration Volume Entered \
0 SELL 20 2076541 2015-09-12T06:31:13 90 2076541
1 SELL 20 194642 2015-09-07T19:36:49 90 194642
2 SELL 20 2320 2015-09-13T07:48:54 3 2320
Minimum Volume range price locationid locationname
0 1 region 331.21 60008494 Amarr
1 1 region 300.00 60008494 Amarr
2 1 region 211.00 60008491 Irnin
In [8]: df[df['locationid']==60008494]['price'].min()
Out[8]: 300.0
If you want to do it for all the locationids', then as said in the other answer you can use DataFrame.groupby
for that and then take the ['price']
column for the group you want and use .min()
. Example -
data = pd.read_csv('orders.csv')
data.groupby(['locationid'])['price'].min()
Demo -
In [9]: df.groupby(['locationid'])['price'].min()
Out[9]:
locationid
60008491 211
60008494 300
Name: price, dtype: float64
For getting the complete row which has minimum values in the corresponding groups, you can use idxmin()
to get the index for the minimum value and then pass it to df.loc
to get those rows. Example -
In [9]: df.loc[df.groupby(['locationid'])['price'].idxmin()]
Out[9]:
Buy typeid volume issued duration Volume Entered \
2 SELL 20 2320 2015-09-13T07:48:54 3 2320
1 SELL 20 194642 2015-09-07T19:36:49 90 194642
Minimum Volume range price locationid locationname
2 1 region 211 60008491 Irnin
1 1 region 300 60008494 Amarr
Upvotes: 1
Reputation: 2569
If I understand your question correctly, you really won't need to do much more than a DataFrame.Groupby()
. As an example, you can group the dataframe by the locationname
, then select the price
column from the resulting groupby object, then use the min()
method to output the minimum value for each group:
data.groupby('locationname')['price'].min()
which will give you the minimum value of price
for each group. So it will look something like:
locationname
Amarr 300
Irnin 211
Name: price, dtype: float64
Upvotes: 0