Wolverine
Wolverine

Reputation: 51

Looping over one pandas column to match values with index of another dataframe

Exp is one DataFrame with datetime object

           Exp
0   1989-06-01
1   1989-07-01
2   1989-08-01
3   1989-09-01
4   1989-10-01

CL is the Dataframe with Index as DateTime Object

                    CL
1989-06-01   68.800026
1989-06-04   68.620026
1989-06-05   68.930023
1989-06-06   68.990021
1989-06-09   69.110023

This what my desired output should look like

                   CL          R

1989-06-01   68.800026   1989-06-01
1989-06-04   68.620026
1989-06-05   68.930023
1989-06-06   68.990021
1989-06-09   69.110023

This is what I tried doing:

for m in Exp.iloc[:,0]:
if m == CL.index:
    CL['R'] = m

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Can someone please help me ? I keep getting this ValueError a lot of times

Upvotes: 2

Views: 1867

Answers (2)

mkos
mkos

Reputation: 428

Edit: updated with commenters suggestion.

You need to do LEFT JOIN:

Exp = pd.DataFrame(
    pd.to_datetime(['1989-06-01', '1989-07-01', '1989-08-01', '1989-09-01', '1989-10-01']),
    columns=['Exp'])

gives:

          Exp
0  1989-06-01
1  1989-07-01
2  1989-08-01
3  1989-09-01
4  1989-10-01

and

CL = pd.DataFrame(
[68.800026, 68.620026, 68.930023, 68.990021, 69.110023],
index = pd.to_datetime(['1989-06-01', '1989-06-04', '1989-06-05', '1989-06-06', '1989-06-09']),
columns = ['CL'])

gives

                   CL
1989-06-01  68.800026
1989-06-04  68.620026
1989-06-05  68.930023
1989-06-06  68.990021
1989-06-09  69.110023

then:

(CL
 .reset_index()
 .merge(Exp, how='left', right_on='Exp', left_on='index')
 .set_index('index')
 .rename(columns={'Exp': 'R'}))

returns what you are looking for

                   CL           R
index                            
1989-06-01  68.800026  1989-06-01
1989-06-04  68.620026         NaN
1989-06-05  68.930023         NaN
1989-06-06  68.990021         NaN
1989-06-09  69.110023         NaN

Because looping over dataframe is not Pandas way of doing things.

Upvotes: 2

piRSquared
piRSquared

Reputation: 294218

pd.DataFrame.join
join focuses on combining dataframes/series via the index.
Use set_index on Exp with drop=False to keep the same information in the dataframe proper and the index. We put it in the index to make join convenient.

CL.join(Exp.set_index('Exp', drop=False)).rename(columns=dict(Exp='R'))

                   CL          R
1989-06-01  68.800026 1989-06-01
1989-06-04  68.620026        NaT
1989-06-05  68.930023        NaT
1989-06-06  68.990021        NaT
1989-06-09  69.110023        NaT

Setup

Exp = pd.DataFrame(dict(
        Exp=pd.to_datetime(
            ['1989-06-01', '1989-07-01', '1989-08-01', '1989-09-01', '1989-10-01'])
    ))

CL = pd.DataFrame(dict(
        CL=[68.800026, 68.620026, 68.930023, 68.990021, 69.110023],
    ), pd.to_datetime(
        ['1989-06-01', '1989-06-04', '1989-06-05', '1989-06-06', '1989-06-09']))

Upvotes: 0

Related Questions