Nils

Reputation: 409

Pandas: new column with list comprehension and an if statement referencing existing columns

I am trying to add a new column to a DataFrame using a list comprehension with an if statement, like this:

SD['Ln(aT) ANALYTIC'] = [x + 1 for x in SD['T'] if SD['T'] >= SD['TG']]

and I get this error:

 The truth value of a Series is ambiguous. Use a.empty, 
 a.bool(), a.item(), a.any() or a.all().

I don't know how to handle this problem.

Any suggestions?

EDIT: The DataFrame looks like this:

[image: screenshot of the DataFrame]

Upvotes: 4

Views: 10547

Answers (1)

jezrael

Reputation: 863226

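The comparison SD['T'] >= SD['TG'] returns a whole boolean Series, which a plain if cannot evaluate as a single True or False. Use numpy.where with a boolean mask instead:

import numpy as np

mask = SD['T'] >= SD['TG']
SD['Ln(aT) ANALYTIC'] = np.where(mask, SD['T'] + 1, SD['T'])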

Or, to get NaN in the rows where the condition is False:

SD['Ln(aT) ANALYTIC'] = np.where(mask, SD['T'] + 1, np.nan)
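
As a rough, self-contained sketch of the two np.where variants above: it reuses the three-row sample frame from the timing setup in this answer, since the real data from the question is not shown. The column names keep_T and nan_else are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Small sample frame (same values as the timing setup in this answer).
SD = pd.DataFrame({'T': [1, 2, 3],
                   'TG': [2, 5, 1]})

# Element-wise comparison: one boolean per row.
mask = SD['T'] >= SD['TG']

# Illustrative column names, not from the question.
# Where the mask is True take T + 1, otherwise keep T unchanged.
SD['keep_T'] = np.where(mask, SD['T'] + 1, SD['T'])

# Where the mask is True take T + 1, otherwise mark the row as missing.
SD['nan_else'] = np.where(mask, SD['T'] + 1, np.nan)

print(SD)
```

Only the last row satisfies T >= TG here, so only that row is incremented. The NaN variant turns the column into floats, because NaN has no integer representation.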

A list comprehension is possible, but slower:

SD['Ln(aT) ANALYTIC1'] = [i + 1 if i >= j else i for i, j in zip(SD['T'], SD['TG'])]
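
To illustrate why the condition from the question fails while the zipped version works, here is a minimal sketch on the same three-row sample frame. The variable names raised and result are only for the demonstration.

```python
import pandas as pd

SD = pd.DataFrame({'T': [1, 2, 3],
                   'TG': [2, 5, 1]})

# Comparing two columns gives a whole boolean Series, not a single bool,
# so using it directly as an `if` condition raises ValueError.
try:
    if SD['T'] >= SD['TG']:
        pass
    raised = False
except ValueError:
    raised = True

# zip pairs the columns row by row, so each comparison is between scalars.
result = [i + 1 if i >= j else i for i, j in zip(SD['T'], SD['TG'])]

print(raised, result)
```

Note that the conditional expression sits before the for, so every row produces a value. An if placed after the for, as in the question, would filter rows out instead, and the resulting list would no longer match the length of the DataFrame.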

Timings on a 3000-row sample:

SD = pd.DataFrame({'T': [1, 2, 3],
                   'TG': [2, 5, 1]})

# [3000 rows x 2 columns]
SD = pd.concat([SD] * 1000, ignore_index=True)


In [294]: %timeit SD['Ln(aT) ANALYTIC1'] = [i + 1 if i >= j else i for i, j in zip(SD['T'], SD['TG'])]
1.18 ms ± 82.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [295]: %timeit SD['Ln(aT) ANALYTIC2'] = np.where(SD['T'] >= SD['TG'], SD['T'] + 1, SD['T'])
511 µs ± 16.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
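
The %timeit lines above only work in IPython. As a sketch of how the same comparison could be run as a plain script, the standard-library timeit module can be used. The helper names with_comprehension and with_where are made up here, and the absolute numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Same 3000-row sample as in the timings above.
SD = pd.DataFrame({'T': [1, 2, 3],
                   'TG': [2, 5, 1]})
SD = pd.concat([SD] * 1000, ignore_index=True)


def with_comprehension():
    # Row-by-row Python loop over paired values.
    return [i + 1 if i >= j else i for i, j in zip(SD['T'], SD['TG'])]


def with_where():
    # Vectorized selection on whole columns.
    return np.where(SD['T'] >= SD['TG'], SD['T'] + 1, SD['T'])


# Both approaches should produce the same values.
same = with_where().tolist() == with_comprehension()

# Total seconds for 100 calls of each function.
t_list = timeit.timeit(with_comprehension, number=100)
t_where = timeit.timeit(with_where, number=100)

print(f"same result: {same}")
print(f"list comprehension: {t_list:.4f} s, np.where: {t_where:.4f} s")
```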

Upvotes: 5
