Reputation: 129
I have a pandas DataFrame whose row index is just the row number, because the DataFrame comes from read_sql_table. It looks like this:
scalars
name value
0 p_EXPORT_TEELECE -1.187000e+04
1 MaxCO2Emiss 1.510000e+02
2 ModelType 2.000000e+00
3 CO2EmissCostInObjFunct 0.000000e+00
4 IncludeAdequacyConstr 1.000000e+00
5 IncludeReservesConstr 1.000000e+00
6 ESVMAllowed 1.000000e+00
7 LSESSTAllowed 1.000000e+00
So I'm trying to get the value for MaxCO2Emiss for example. After searching for quite a long time I found a solution to get the value of 151, but I don't think this is the correct way to do it:
maxco2emiss = df.ix[df.index[df['name'] == 'MaxCO2Emiss'].tolist(),1][1]
Is there a more understandable way to get this value?
Thanks
Upvotes: 3
Views: 1787
Reputation: 863281
The simplest way is to create a Series indexed by name and use it for lookup:
s = df.set_index('name')['value']
print (s['MaxCO2Emiss'])
151.0
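A self-contained sketch of the lookup above. The DataFrame is rebuilt by hand here as a stand-in for the result of read_sql_table, and the name 'NoSuchName' is made up to show what happens for a missing key:

```python
import pandas as pd

# Hand-built stand-in for the table in the question (assumption:
# the real frame comes from read_sql_table).
df = pd.DataFrame({
    "name": ["p_EXPORT_TEELECE", "MaxCO2Emiss", "ModelType"],
    "value": [-11870.0, 151.0, 2.0],
})

# 'name' becomes the index, 'value' the data.
s = df.set_index("name")["value"]

maxco2emiss = s["MaxCO2Emiss"]

# Series.get returns a default instead of raising KeyError
# when the name is absent.
missing = s.get("NoSuchName", default=None)
```

Building s once pays off when several names are looked up from the same table.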
But if the same name occurs more than once, the lookup returns a Series rather than a scalar, so you have to select only the first value, e.g. with iat[0], iloc[0] or values[0]:
print (df)
name value
0 p_EXPORT_TEELECE -11870.0
1 MaxCO2Emiss 151.0
2 ModelType 2.0
3 CO2EmissCostInObjFunct 0.0
4 CO2EmissCostInObjFunct 1.0
5 IncludeReservesConstr 1.0
6 ESVMAllowed 1.0
7 LSESSTAllowed 1.0
s = df.set_index('name')['value']
print (s['CO2EmissCostInObjFunct'])
CO2EmissCostInObjFunct 0.0
CO2EmissCostInObjFunct 1.0
Name: value, dtype: float64
print (s['CO2EmissCostInObjFunct'].iat[0])
0.0
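If only the first occurrence of each name matters, one option is to drop the duplicates before building the Series, so that every lookup returns a scalar. This is a sketch on a shortened, hand-built version of the table above, not part of the original answer:

```python
import pandas as pd

# Shortened stand-in for the table with a duplicated name.
df = pd.DataFrame({
    "name": ["MaxCO2Emiss", "CO2EmissCostInObjFunct", "CO2EmissCostInObjFunct"],
    "value": [151.0, 0.0, 1.0],
})

# keep="first" retains the first row for each name.
s = df.drop_duplicates(subset="name", keep="first").set_index("name")["value"]

first = s["CO2EmissCostInObjFunct"]
```

Checking s.index.is_unique beforehand tells you whether this step is needed at all.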
Another general solution for the first value is to compare the column with the name, get the index label of the first True with idxmax, and then select with loc:
s = df.loc[(df['name'] == 'CO2EmissCostInObjFunct').idxmax(), 'value']
print (s)
0.0
s = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax(), 'value']
print (s)
151.0
Detail:
print (df['name'] == 'CO2EmissCostInObjFunct')
0 False
1 False
2 False
3 True
4 True
5 False
6 False
7 False
Name: name, dtype: bool
print ((df['name'] == 'CO2EmissCostInObjFunct').idxmax())
3
print (df['name'] == 'MaxCO2Emiss')
0 False
1 True
2 False
3 False
4 False
5 False
6 False
7 False
Name: name, dtype: bool
print ((df['name'] == 'MaxCO2Emiss').idxmax())
1
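One caveat with idxmax: if no row matches, the mask is all False and idxmax still returns the first index label, so the lookup silently returns the wrong row. A minimal sketch of a guard, where first_value is a hypothetical helper name and the two-row frame is hand-built:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["p_EXPORT_TEELECE", "MaxCO2Emiss"],
    "value": [-11870.0, 151.0],
})

def first_value(frame, name):
    """Return the first 'value' for `name`, or None if there is no match."""
    mask = frame["name"] == name
    if not mask.any():
        # Without this check, idxmax would point at the first row.
        return None
    return frame.loc[mask.idxmax(), "value"]

found = first_value(df, "MaxCO2Emiss")
not_found = first_value(df, "NoSuchName")
```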
EDIT: If you want a one-row DataFrame returned, wrap the row label in an extra []:
For multiple columns:
df1 = df.loc[[(df['name'] == 'MaxCO2Emiss').idxmax()], ['value1','value2']]
print (df1)
value1 value2
1 151.0 7
For all columns:
df2 = df.loc[[(df['name'] == 'MaxCO2Emiss').idxmax()]]
print (df2)
name value1 value2 a
1 MaxCO2Emiss 151.0 7 5.0
If you want a Series returned instead, omit the extra []:
s1 = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax(), ['value1','value2']]
print (s1)
value1 151
value2 7
Name: 1, dtype: object
s2 = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax()]
print (s2)
name MaxCO2Emiss
value1 151
value2 7
a 5
Name: 1, dtype: object
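The difference between the two forms can be checked on a small frame. The MaxCO2Emiss row below uses the values from the output above; the other row is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["ModelType", "MaxCO2Emiss"],   # first row is hypothetical
    "value1": [2.0, 151.0],
    "value2": [3, 7],
    "a": [1.0, 5.0],
})

label = (df["name"] == "MaxCO2Emiss").idxmax()

# List of labels -> one-row DataFrame.
row_as_frame = df.loc[[label], ["value1", "value2"]]

# Scalar label -> Series whose index is the selected column names.
row_as_series = df.loc[label, ["value1", "value2"]]
```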
Upvotes: 5
Reputation: 164773
Generator
Possibly the fastest method is to bypass pandas for this:
next(j for i, j in zip(df.name, df.value) if i == 'MaxCO2Emiss')
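Note that next raises StopIteration when nothing matches. A sketch of the same generator with a default value, using a hand-built frame and a made-up missing name:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["p_EXPORT_TEELECE", "MaxCO2Emiss"],
    "value": [-11870.0, 151.0],
})

# The second argument of next() is returned when the generator is empty.
value = next(
    (v for n, v in zip(df["name"], df["value"]) if n == "MaxCO2Emiss"),
    None,
)
missing = next(
    (v for n, v in zip(df["name"], df["value"]) if n == "NoSuchName"),
    None,
)
```

The generator stops at the first match, so it never scans the rest of the column.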
Pandas
pd.DataFrame.loc is designed for label-based indexing and also accepts a Boolean mask. The following returns a Series, so it also works for multiple matches:
df.loc[df['name'] == 'MaxCO2Emiss', 'value']
For example, to get the first value from that Series you can use either .iloc or .values:
df.loc[df['name'] == 'MaxCO2Emiss', 'value'].iloc[0]
df.loc[df['name'] == 'MaxCO2Emiss', 'value'].values[0]
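If a name is expected to occur exactly once, Series.item() can be used instead of .iloc[0]: it returns the scalar and raises ValueError when the selection has zero or several rows. A sketch on a hand-built frame (not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["MaxCO2Emiss", "CO2EmissCostInObjFunct", "CO2EmissCostInObjFunct"],
    "value": [151.0, 0.0, 1.0],
})

# Exactly one match: item() returns the scalar.
unique_value = df.loc[df["name"] == "MaxCO2Emiss", "value"].item()

# Two matches: item() refuses to pick one silently.
try:
    df.loc[df["name"] == "CO2EmissCostInObjFunct", "value"].item()
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```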
Upvotes: 1