Pandas get value next to a string in a dataframe

Question

Ok, so I have a pandas dataframe, but my row indexes are not correct because the dataframe comes from a read_sql_table with as row indexes the number of the row. Like this :

scalars
                                name         value
0                       p_EXPORT_TEELECE -1.187000e+04
1                            MaxCO2Emiss  1.510000e+02
2                              ModelType  2.000000e+00
3                 CO2EmissCostInObjFunct  0.000000e+00
4                  IncludeAdequacyConstr  1.000000e+00
5                  IncludeReservesConstr  1.000000e+00
6                            ESVMAllowed  1.000000e+00
7                          LSESSTAllowed  1.000000e+00

So I'm trying to get the value for MaxCO2Emiss for example. After searching for quite a long time I found a solution to get the value of 151, but I don't think this is the correct way to do it:

maxco2emiss = df.ix[df.index[df['name'] == 'MaxCO2Emiss'].tolist(),1][1]

Is there a more understandable way to get this value?

Thanks

jezrael · Accepted Answer

Simpliest is create Series and use it for lookup:

s = df.set_index('name')['value']

print (s['MaxCO2Emiss'])
151.0

But if there is multiple same names is necessary for scalar select only first value, e.g. by iat[0], iloc[0], values[0]:

print (df)
                     name    value
0        p_EXPORT_TEELECE -11870.0
1             MaxCO2Emiss    151.0
2               ModelType      2.0
3  CO2EmissCostInObjFunct      0.0
4  CO2EmissCostInObjFunct      1.0
5   IncludeReservesConstr      1.0
6             ESVMAllowed      1.0
7           LSESSTAllowed      1.0

s = df.set_index('name')['value']

print (s['CO2EmissCostInObjFunct'])
CO2EmissCostInObjFunct    0.0
CO2EmissCostInObjFunct    1.0
Name: value, dtype: float64

print (s['CO2EmissCostInObjFunct'].iat[0])
0.0

Another general solution for first value is compare and get first index of first True and then select by loc:

s = df.loc[(df['name'] == 'CO2EmissCostInObjFunct').idxmax(), 'value']
print (s)
0.0

s = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax(), 'value']
print (s)
151.0

Detail:

print (df['name'] == 'CO2EmissCostInObjFunct')
0    False
1    False
2    False
3     True
4     True
5    False
6    False
7    False
Name: name, dtype: bool

print ((df['name'] == 'CO2EmissCostInObjFunct').idxmax())
3

print (df['name'] == 'MaxCO2Emiss')
0    False
1     True
2    False
3    False
4    False
5    False
6    False
7    False
Name: name, dtype: bool

print ((df['name'] == 'MaxCO2Emiss').idxmax())
1

EDIT: If want return one row DataFrame add []:

For multiple columns:

df1 = df.loc[[(df['name'] == 'MaxCO2Emiss').idxmax()], ['value1','value2']]
print (df1)
   value1  value2
1   151.0       7

For all columns:

df2 = df.loc[[(df['name'] == 'MaxCO2Emiss').idxmax()]]
print (df2)
          name  value1  value2    a
1  MaxCO2Emiss   151.0       7  5.0

If want return Series:

s1 = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax(),  ['value1','value2']]
print (s1)
value1    151
value2      7
Name: 1, dtype: object

s2 = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax()]
print (s2)
name      MaxCO2Emiss
value1            151
value2              7
a                   5
Name: 1, dtype: object

Pandas get value next to a string in a dataframe

Answers (2)

Related Questions