Starbucks

Reputation: 1568

IF Statement - Creating New Column Based on Two Columns

I am trying to create a new column, df2['past_future_mix'], that contains the past values from df2['past_value'] and the future values from df2['future_value'].

If window = 0, the row holds a past value; if window = 1, it holds a future value.

I am trying to use an if/elif statement, but I get this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

import pandas as pd
import numpy as np
from datetime import timedelta

# Set up input data (taken from original post)
df1 = pd.DataFrame({
    'past_value': [100, 200, 200, 300, 350, 400, 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN'],
    'future_value': ['NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 800, 900, 900, 950, 975, 1000],
    'window': [0,0,0,0,0,0,1,1,1,1,1,1],
    'category': ['Category A']*4 + ['Category B']*4 + ['Category C']*4})

cat_list = df1.category.unique()

bigdf = pd.DataFrame()

for cat in cat_list:
    df2 = df1[(df1.category == cat)]
    if df2['window'] == 0:
        df2['past_future_mix'] = df2['past_value']
    elif df2['window'] == 1:
        df2['past_future_mix'] = df2['future_value']
    bigdf = bigdf.append(df2)

This is the full error message that I am getting:

ValueError                                Traceback (most recent call last)
<ipython-input-56-cfdd200ce9ff> in <module>
     16 for cat in cat_list:
     17     df2 = df1[(df1.category == cat)]
---> 18     if df2['window'] == 0:
     19         df2['past_future_mix'] = df2['past_value']
     20     elif df2['window'] == 1:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1553             "The truth value of a {0} is ambiguous. "
   1554             "Use a.empty, a.bool(), a.item(), a.any() or a.all().".format(
-> 1555                 self.__class__.__name__
   1556             )
   1557         )

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I changed if df2['window'] == 0 to if 0 in df2['window'], but now it does not print a DataFrame.

for cat in cat_list:
    df2 = df1[(df1.category == cat)]
    if 0 in df2['window']:
        df2['past_future_mix'] = df2['past_value']
    elif 1 in df2['window']:
        df2['past_future_mix'] = df2['future_value']
    bigdf = bigdf.append(df2)

print(bigdf)
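
A note on the membership test used above: the in operator on a pandas Series checks the index labels, not the stored values, which may explain why neither branch runs for some categories. A minimal sketch, using a hypothetical three-row Series named window (not the data from this post):

```python
import pandas as pd

# Hypothetical Series with a non-default index, similar to one category
# slice of df1: index labels 6, 7, 8 and every value equal to 1.
window = pd.Series([1, 1, 1], index=[6, 7, 8])

in_index = 1 in window           # False: 1 is not an index label
in_values = 1 in window.values   # True: 1 is one of the stored values
label_hit = 7 in window          # True: 7 is an index label

print(in_index, in_values, label_hit)
```

To test the values rather than the labels, (window == 1).any() or 1 in window.values are the usual options.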

Upvotes: 0

Views: 37

Answers (1)

mohsinali

Reputation: 296

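You can use .loc with a boolean mask instead of the if/elif. Comparing a Series to a value returns a Series of booleans, which if cannot reduce to a single True or False; .loc assigns only to the rows where the mask is True.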

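for cat in cat_list:
    # .copy() avoids SettingWithCopyWarning when assigning to the slice
    df2 = df1[df1.category == cat].copy()
    # Assign only to the rows where the mask is True
    df2.loc[df2['window'] == 0, 'past_future_mix'] = df2['past_value']
    df2.loc[df2['window'] == 1, 'past_future_mix'] = df2['future_value']
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
    bigdf = pd.concat([bigdf, df2])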
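
If the per-category loop is not needed for anything else, the same column can probably be built in one step on the whole frame with numpy.where. This is only a sketch under two assumptions that differ from the question: the missing entries are real np.nan values rather than the string 'NaN', and every row whose window is not 0 should take future_value.

```python
import numpy as np
import pandas as pd

# Same data as in the question, but with np.nan instead of the string 'NaN'
# (an assumption made here so the columns are numeric).
df1 = pd.DataFrame({
    'past_value': [100, 200, 200, 300, 350, 400] + [np.nan] * 6,
    'future_value': [np.nan] * 6 + [800, 900, 900, 950, 975, 1000],
    'window': [0] * 6 + [1] * 6,
    'category': ['Category A'] * 4 + ['Category B'] * 4 + ['Category C'] * 4,
})

# Row-wise choice: past_value where window == 0, otherwise future_value.
df1['past_future_mix'] = np.where(
    df1['window'] == 0, df1['past_value'], df1['future_value']
)

print(df1[['window', 'category', 'past_future_mix']])
```

Because the selection is row-wise, no grouping by category is required and the result stays in the original row order.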

Upvotes: 1
