Reputation: 415
Need to filter the data in dataframe and create a single row out of it. I tried to group by but want different values to filter so am failed to do so. Here is the example
#1 #2 #3 #4
1 Bob 10 Eng
2 Jack 21 Tel
3 Rian 11 Tam
4 Vick 10 Ram
5 Jick 20 Mar
6 Rams 10 Mal
7 Venk 21 Mar
First need to filter the rows on column #3 as 10 and followed by 21 means
#1 #2 #3 #4
1 Bob 10 Eng
2 Jack 21 Tel
3 Rams 10 Mal
4 Venk 21 Mal
Then convert this 2 row data into a single row like below
#1 #2 #3 #4 #5
1 Bob Jack Eng Tel
2 Jick Venk Mal Mar
Being a beginner i tried grouping in several ways but the values need to be picked is in order first row should contain 10 and the immediate row should be 21.
Can anybody suggest which API in pandas will help me achieving this?
Upvotes: 2
Views: 117
Reputation: 32
Lists are ordered iterables, and the filter
generator reads each item in, one at a time in that order. Therefore, it will yield output in order.
The filter
function takes boolean value as input.
list(filter(lambda a: a is 10 or a is 21, col_3))
Upvotes: 0
Reputation: 632
If 21
always comes just after 10
, you may like:
df_shifted = df.shift(-1, axis=0) # df is the original DataFrame you have.
is_ok = (df["#3"] == 10) & (df_shifted["#3"] == 21)
ans = pd.concat([df[is_ok], df_shifted[is_ok]], axis=1).drop(columns="#3")
This will return:
#2 #4 #2 #4
1 Bob Eng Jack Tel
6 Rams Mal Venk Mar
Upvotes: 0
Reputation: 71580
Do you mean by something like?
>>> x = df.loc[(df['#3'].eq(10) & df['#3'].shift(-1).eq(21)) | (df['#3'].shift().eq(10) & df['#3'].eq(21))]
>>> x.groupby(np.arange(len(x)) // 2).apply(lambda x: pd.DataFrame(x.drop(["#1", "#3"], axis=1).to_numpy().flatten()).T).reset_index(drop=True).rename(lambda x: f'#{x + 1}', axis=1)
#1 #2 #3 #4
0 Bob Eng Jack Tel
1 Rams Mal Venk Mar
>>>
Upvotes: 1
Reputation: 195438
Try:
mask = df["#3"].eq(21) & df["#3"].shift().eq(10)
mask = mask.shift(-1).fillna(False) | mask
x = (
df[mask]
.groupby(np.arange(mask.sum()) // 2)
.apply(lambda x: list(x["#2"]) + list(x["#4"]))
).apply(pd.Series)
x.columns = [f"#{x+2}" for x in x.columns]
x = x.reset_index().rename(columns={"index": "#1"})
x["#1"] += 1
print(x)
Prints:
#1 #2 #3 #4 #5
0 1 Bob Jack Eng Tel
1 2 Rams Venk Mal Mar
Upvotes: 1