David

Reputation: 129

Pandas get value next to a string in a dataframe

OK, so I have a pandas DataFrame, but its row index is not meaningful: the DataFrame comes from read_sql_table, so the index is just the row number. Like this:

scalars
                                name         value
0                       p_EXPORT_TEELECE -1.187000e+04
1                            MaxCO2Emiss  1.510000e+02
2                              ModelType  2.000000e+00
3                 CO2EmissCostInObjFunct  0.000000e+00
4                  IncludeAdequacyConstr  1.000000e+00
5                  IncludeReservesConstr  1.000000e+00
6                            ESVMAllowed  1.000000e+00
7                          LSESSTAllowed  1.000000e+00

So I'm trying to get the value for MaxCO2Emiss, for example. After searching for quite a long time I found a way to get the value 151, but I don't think it is the right way to do it:

maxco2emiss = df.ix[df.index[df['name'] == 'MaxCO2Emiss'].tolist(),1][1]

Is there a more understandable way to get this value?

Thanks

Upvotes: 3

Views: 1787

Answers (2)

jezrael

Reputation: 863281

The simplest approach is to create a Series indexed by name and use it for lookup:

s = df.set_index('name')['value']

print (s['MaxCO2Emiss'])
151.0

But if the same name occurs more than once, the lookup returns a Series, so to get a scalar you need to select only the first value, e.g. with iat[0], iloc[0], or values[0]:

print (df)
                     name    value
0        p_EXPORT_TEELECE -11870.0
1             MaxCO2Emiss    151.0
2               ModelType      2.0
3  CO2EmissCostInObjFunct      0.0
4  CO2EmissCostInObjFunct      1.0
5   IncludeReservesConstr      1.0
6             ESVMAllowed      1.0
7           LSESSTAllowed      1.0

s = df.set_index('name')['value']

print (s['CO2EmissCostInObjFunct'])
CO2EmissCostInObjFunct    0.0
CO2EmissCostInObjFunct    1.0
Name: value, dtype: float64

print (s['CO2EmissCostInObjFunct'].iat[0])
0.0
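As a self-contained sketch of the lookup above: the frame below is rebuilt by hand from the printed output (shortened to five rows), and the drop_duplicates step is an assumption about what you may want, namely keeping only the first row per name so that every lookup returns a scalar.

```python
import pandas as pd

# Sample data copied from the printed frame above (shortened).
df = pd.DataFrame({
    "name": ["p_EXPORT_TEELECE", "MaxCO2Emiss", "ModelType",
             "CO2EmissCostInObjFunct", "CO2EmissCostInObjFunct"],
    "value": [-11870.0, 151.0, 2.0, 0.0, 1.0],
})

# Series indexed by name; a repeated name returns a Series, not a scalar.
s = df.set_index("name")["value"]
dup_hit = s["CO2EmissCostInObjFunct"]
first_of_dups = dup_hit.iat[0]

# Assumed variant: keep only the first row per name before indexing,
# so each label is unique and every lookup yields a scalar.
first_only = df.drop_duplicates("name").set_index("name")["value"]
max_co2 = first_only["MaxCO2Emiss"]
cost_flag = first_only["CO2EmissCostInObjFunct"]
```

If you do many lookups, building first_only once and reusing it avoids repeating the comparison for every name.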

Another general solution for the first value is to compare, get the index label of the first True with idxmax, and then select with loc:

s = df.loc[(df['name'] == 'CO2EmissCostInObjFunct').idxmax(), 'value']
print (s)
0.0

s = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax(), 'value']
print (s)
151.0

Detail:

print (df['name'] == 'CO2EmissCostInObjFunct')
0    False
1    False
2    False
3     True
4     True
5    False
6    False
7    False
Name: name, dtype: bool

print ((df['name'] == 'CO2EmissCostInObjFunct').idxmax())
3

print (df['name'] == 'MaxCO2Emiss')
0    False
1     True
2    False
3    False
4    False
5    False
6    False
7    False
Name: name, dtype: bool

print ((df['name'] == 'MaxCO2Emiss').idxmax())
1
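One caveat with the idxmax approach: if no row matches, the mask is all False and idxmax still returns the first index label, so loc silently returns the value of an unrelated row. A minimal sketch of a guard follows; first_value is a hypothetical helper name, and the three-row frame is a shortened copy of the sample data.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["p_EXPORT_TEELECE", "MaxCO2Emiss", "ModelType"],
    "value": [-11870.0, 151.0, 2.0],
})

def first_value(frame, key):
    """Return the first 'value' whose 'name' equals key, or None if absent."""
    mask = frame["name"] == key
    if not mask.any():
        # No match: do not fall through to idxmax, which would return label 0.
        return None
    return frame.loc[mask.idxmax(), "value"]

found = first_value(df, "MaxCO2Emiss")
missing = first_value(df, "NoSuchName")

# Without the guard, a missing name yields the first row's value.
unguarded = df.loc[(df["name"] == "NoSuchName").idxmax(), "value"]
```

Here unguarded ends up holding the value of p_EXPORT_TEELECE, which is why the mask.any() check is worth having.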

EDIT: If you want to return a one-row DataFrame, wrap the row label in an extra []:

For multiple columns:

df1 = df.loc[[(df['name'] == 'MaxCO2Emiss').idxmax()], ['value1','value2']]
print (df1)
   value1  value2
1   151.0       7

For all columns:

df2 = df.loc[[(df['name'] == 'MaxCO2Emiss').idxmax()]]
print (df2)
          name  value1  value2    a
1  MaxCO2Emiss   151.0       7  5.0

If you want to return a Series:

s1 = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax(),  ['value1','value2']]
print (s1)
value1    151
value2      7
Name: 1, dtype: object

s2 = df.loc[(df['name'] == 'MaxCO2Emiss').idxmax()]
print (s2)
name      MaxCO2Emiss
value1            151
value2              7
a                   5
Name: 1, dtype: object
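The frame with value1, value2 and a is not shown above, so here is a sketch with a hypothetical two-row frame. Only the MaxCO2Emiss row follows the printed output; the other row is invented. It shows that the only difference between the two forms is whether the row label is passed as a list.

```python
import pandas as pd

# Hypothetical frame: the ModelType row is made up for illustration.
df = pd.DataFrame({
    "name": ["ModelType", "MaxCO2Emiss"],
    "value1": [2.0, 151.0],
    "value2": [3, 7],
    "a": [1.0, 5.0],
})

idx = (df["name"] == "MaxCO2Emiss").idxmax()

# Row label inside a list -> one-row DataFrame.
one_row_df = df.loc[[idx], ["value1", "value2"]]

# Bare row label -> Series whose index is the selected column names.
row_series = df.loc[idx, ["value1", "value2"]]
```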

Upvotes: 5

jpp

Reputation: 164773

Generator

Possibly the fastest method is to bypass pandas for this:

next(j for i, j in zip(df.name, df.value) if i == 'MaxCO2Emiss')
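Note that next raises StopIteration if nothing matches. A small sketch of the same generator with a default value follows; using None as the fallback is an assumption, pick whatever sentinel suits your data.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["p_EXPORT_TEELECE", "MaxCO2Emiss", "ModelType"],
    "value": [-11870.0, 151.0, 2.0],
})

# The second argument to next() is returned when the generator is exhausted.
value = next((v for n, v in zip(df["name"], df["value"]) if n == "MaxCO2Emiss"), None)
missing = next((v for n, v in zip(df["name"], df["value"]) if n == "NoSuchName"), None)
```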

Pandas

pd.DataFrame.loc is designed for label-based indexing and also accepts a Boolean mask. With a mask it returns a series, so it also works for multiple matches:

df.loc[df['name'] == 'MaxCO2Emiss', 'value']

For example, to get the first value you can index the resulting series with either .iloc or .values:

df.loc[df['name'] == 'MaxCO2Emiss', 'value'].iloc[0]
df.loc[df['name'] == 'MaxCO2Emiss', 'value'].values[0]
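If each name is supposed to occur exactly once, one option (a sketch, not the only way) is Series.item(), which returns the scalar and raises ValueError when the selection does not contain exactly one element. The frame below is a shortened copy of the sample data with one duplicated name.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["MaxCO2Emiss", "CO2EmissCostInObjFunct", "CO2EmissCostInObjFunct"],
    "value": [151.0, 0.0, 1.0],
})

# Exactly one match: item() returns the scalar.
only = df.loc[df["name"] == "MaxCO2Emiss", "value"].item()

# Two matches: item() refuses to pick one and raises ValueError.
try:
    df.loc[df["name"] == "CO2EmissCostInObjFunct", "value"].item()
    dup_error = False
except ValueError:
    dup_error = True
```

This makes duplicates or missing names fail loudly, whereas .iloc[0] quietly takes the first match.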

Upvotes: 1
