Python copy values from one row into other based on other column

Question

I've been trying this the whole day, so I really hope someone can help me.

I have a dataframe that looks like this (sorry for the format, it didn't let me format it as table, because it then remarked there would be an error in my code...):

| ColumnA  | ColumnB |otherColumn    |
| -------- | ------- |-------------- |
| 1        | n.a.    |row x          |
| 1        | n.a.    |row b          |
| 1        | n.a.    |row x          |
| 2        | 23467   |row x          |
| 2        | n.a.    |row y          |
| 3        | n.a.    |row x          |
| 3        | 768345  |row  y         |
| 3        | n.a.    |row  y         |
| 3        | 768345  |row   x        |
| 4        | 95634511|row  x         |
| 4        | n.a.    |row    r       |
| 5        | n.a.    |row    d       |

I now need to fill the same values of ColumnB in those rows that have the same number in ColumnA. (otherColumn is not needed in this case, I've just added it to show that there are multiple other columns too). If there is no value in any of the rows in ColumnB belonging to the same number in ColumnA, it should remain n.a. (as e.g. for "1" and "5") So the desired output should be

| ColumnA  | ColumnB |otherColumn    |
| -------- | --------|-------------- |
| 1        | n.a.    |row x          |
| 1        | n.a.    |row b          |
| 1        | n.a.    |row x          |
| 2        | 23467   |row x          |
| 2        | 23467   |row y          |
| 3        | 768345  |row x          |
| 3        | 768345  |row  y         |
| 3        | 768345  |row  y         |
| 3        | 768345  |row   x        |
| 4        | 95634511|row  x         |
| 4        | 95634511|row    r       |
| 5        | n.a.    |row    d       |

I've tried it with converting the two columns to a dictionary, but was then not able to use the keys and values separately (to build an if-statement); I've tried it with producing a second dataframe consisting of ColumnA and ColumnB and trying to merge it (but then it added more rows); I've tried it with update() and combine_first(), which wasn't successful either.

Every advice is very much appreciated!!

Henry Yik · Accepted Answer

You can find the valid values with groupby and transform first:

print (df.replace('n.a.', np.NaN).groupby("ColumnA")["ColumnB"].transform("first"))

0         None
1         None
2         None
3        23467
4        23467
5       768345
6       768345
7       768345
8       768345
9     95634511
10    95634511
11        None

Python copy values from one row into other based on other column

Answers (2)

Related Questions