MarkAlanFrank
MarkAlanFrank

Reputation: 737

Filling cell based on existing cells

I have data in following format:

8A564   nan         json
8A928   nan         json
8A563   nan         json
8A564   10616280    json
8A563   10616222    json
8A564   nan         json
8B1BB   10982483    json
8A564   10616280    json

I would like to fill data in second column to match row that has same first column and non null value in second. So I would get following:

8A564   10616280    json
8A928   nan         json
8A563   10616222    json
8A564   10616280    json
8A563   10616222    json
8A564   10616280    json
8B1BB   10982483    json
8A564   10616280    json

How can it be achieved?

Upvotes: 7

Views: 105

Answers (2)

Rajat Jain
Rajat Jain

Reputation: 2032

here's a python one-liner to help you.

df.iloc[:, [0,1]].groupby(0).apply(lambda x: x[1].fillna( np.nan if len(x[1].dropna()) == 0 else x[1].dropna().mode()[0])).reset_index()[[0,1]]

It uses group-by and also fills only if certain condition is met.

Upvotes: 0

piRSquared
piRSquared

Reputation: 294228

groupby and bfill

Keep in mind the the 0 in groupby(0) refers to the column named 0. If your column has a different name, use that.

df.groupby(0).bfill()

       0         1     2
0  8A564  10616280  json
1  8A928       NaN  json
2  8A563  10616222  json
3  8A564  10616280  json
4  8A563  10616222  json
5  8A564  10616280  json
6  8B1BB  10982483  json
7  8A564  10616280  json

If the ordering of what is null doesn't lend itself to back filling, you can get the first non-null value.

df[1] = df.groupby(0)[1].transform('first')
df

       0         1     2
0  8A564  10616280  json
1  8A928       NaN  json
2  8A563  10616222  json
3  8A564  10616280  json
4  8A563  10616222  json
5  8A564  10616280  json
6  8B1BB  10982483  json
7  8A564  10616280  json

Upvotes: 5

Related Questions