Reputation: 435
I am trying to replicate a "right fill" excel-like function which fills the values right till the next value is not null/nan/empty. This "right-fill" exercise is only to be done if the value in the immediate following row in not empty or "nan". Furthermore, this has to be done for every group. I have the following pandas dataframe dataset. My current input table is "have". My output table is "want".
I am just a beginner in python. So any help would be appreciated. Also for those who would like this operation to be undertaken on a by group operation, data as follows: Table "have" as follows with grouping field "groups":
import pandas as pd
have = pd.DataFrame({ \
"groups": pd.Series(["group1","group1","group1","group2","group2","group2"]) \
,"0": pd.Series(["abc","1","something here","abc2","1","something here"]) \
,"1": pd.Series(["","2","something here","","","something here"]) \
,"2": pd.Series(["","3","something here","","3","something here"]) \
,"3": pd.Series(["something","1","something here","something","1","something here"]) \
,"4": pd.Series(["","2","something here","","2","something here"]) \
,"5": pd.Series(["","","something here","","","something here"]) \
,"6": pd.Series(["","","something here","","","something here"]) \
,"7": pd.Series(["cdf","5","something here","mnop","5","something here"]) \
,"8": pd.Series(["","6","something here","","6","something here"]) \
,"9": pd.Series(["xyz","1","something here","xyz","1","something here"]) \
})
Table "want" with grouping fields "groups":
import pandas as pd
want = pd.DataFrame({ \
"groups": pd.Series(["group1","group1","group1","group2","group2","group2"]) \
,"0": pd.Series(["abc","1","something here","anything","1","something here"]) \
,"1": pd.Series(["abc","2","something here"," anything ","2","something here"]) \
,"2": pd.Series(["abc","3","something here"," anything ","3","something here"]) \
,"3": pd.Series(["something","1","something here","","","something here"]) \
,"4": pd.Series(["something ","2","something here","","","something here"]) \
,"5": pd.Series(["","","something here","","","something here"]) \
,"6": pd.Series(["","","something here","","","something here"]) \
,"7": pd.Series(["cdf","5","something here","mnop","5","something here"]) \
,"8": pd.Series(["cdf ","6","something here"," mnop ","6","something here"]) \
,"9": pd.Series(["xyz","1","something here","xyz","1","something here"]) \
})
I tried to use this code, but I am still trying to familiar myself with groupby
and apply
statements:
grouped=have.groupby('groups')
have.groupby('groups').apply(lambda g: have.loc[g].isnull() )
#cond = have.loc[1].isnull() | have.loc[1].ne('')
want.loc[0, cond] = want.loc[0, cond].str.strip().replace('', None)
want
Upvotes: 1
Views: 194
Reputation: 294228
def fill(df):
df = df.copy()
i0, i1 = df.index[0], df.index[1]
cond = have.loc[i1].isnull() | have.loc[i1].ne('')
df.loc[i0, cond] = df.loc[i0, cond].str.strip().replace('', None)
return df
have.groupby('groups', group_keys=False).apply(fill)
Upvotes: 1