Red Sparrow
Red Sparrow

Reputation: 397

Conditionally populate dataframe row from tuple values when dataframe and tuple values match

I am trying to update my dataframe by conditionally populating columns. I want to compare a value in dataframe row with that of a tuple and then populate a different column of the same dataframe row with another value from the tuple.

For example:

foo = pd.DataFrame({"TIME":([1,1,2,2,3,3,4,4,5,5,6,6]),
                 "PLACE": (["place1","place2","place1","place2","place1","place2","place1","place2","place1","place2","place1","place2"]),
                 "Xcords" :(["","","","","","","","","","","",""]),
                 "Ycords" :(["","","","","","","","","","","",""])})

and a tuple with the place and its x and y coordinates:

bar = [('place1','1','11'),('place3','3','33'),('place2','2','22')]

In the end I would like to have the following:

     PLACE  TIME Xcords Ycords
0   place1     1      1     11
1   place2     1      2     22
2   place1     2      1     11
3   place2     2      2     22
4   place1     3      1     11
5   place2     3      2     22
6   place1     4      1     11
7   place2     4      2     22
8   place1     5      1     11
9   place2     5      2     22
10  place1     6      1     11
11  place2     6      2     22

so where the dataframe "PLACE" column value matches the tuple first value in that same dataframe row the Xcords and Ycords should be populated with the tuple's 2nd and 3rd value. This should happen for all instances since they might be appearing more than one time.

What would be the correct syntax for something like this? I have tried this but then all my X_cords and Y_cords rows end up with the same value:

for i in bar:
    if any(foo["PLACE"]==i[0]):
        foo.X_cords = i[1]
        foo.Y_cords = i[2]

Also can a for loop be avoided since both the datasets are very large?

Upvotes: 1

Views: 706

Answers (1)

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210932

is that what you want?

In [191]: bar_df = pd.DataFrame(bar, columns=['PLACE','Xcords','Ycords'])

In [192]: bar_df
Out[192]:
    PLACE Xcords Ycords
0  place1      1     10
1  place3      3     30
2  place2      2     20

In [193]: pd.merge(foo[['PLACE','TIME']], bar_df, on='PLACE', how='left')
Out[193]:
     PLACE  TIME Xcords Ycords
0   place1     1      1     10
1   place2     1      2     20
2   place1     2      1     10
3   place2     2      2     20
4   place1     3      1     10
5   place2     3      2     20
6   place1     4      1     10
7   place2     4      2     20
8   place1     5      1     10
9   place2     5      2     20
10  place1     6      1     10
11  place2     6      2     20

or as @Alexander mentioned:

In [235]: foo[['PLACE','TIME']].merge(bar_df, on='PLACE', how='left')
Out[235]:
     PLACE  TIME Xcords Ycords
0   place1     1      1     10
1   place2     1      2     20
2   place1     2      1     10
3   place2     2      2     20
4   place1     3      1     10
5   place2     3      2     20
6   place1     4      1     10
7   place2     4      2     20
8   place1     5      1     10
9   place2     5      2     20
10  place1     6      1     10
11  place2     6      2     20

Upvotes: 4

Related Questions