SMO
SMO

Reputation: 149

Looking for specific value with if statement

Hi I have a large dataset in excel (and it will keep growing) of products' orders.

The dataset look like this

Product   Date         Lsat24               Next24                  Summary                   Buyer 
*day1*
AX1  |  2/1/2019 |Checking inventory  |Invoicing             |The product request started today |  HTN
X5L  |  2/1/2019 |Inventory counting  |Future request planing|No operation update               | null
NO6  |  2/1/2019 |Request in-process  |Closing request       |Operation product request ended   |  KLW
ZS5  |  2/1/2019 |Order placed        |Processing request    |Request started for the product   |  WHM
*day2* 
AX1  |  3/1/2019 |Invoicing the order |Logistic planning     |Product requested still pending   |  HTN
X5L  |  3/1/2019 |New request planing |processing order      |A new product request started     |  LKJ
NO6  |  3/1/2019 |Closing request     |No new request        |No update for new operation       |  null
ZS5  |  3/1/2019 |Request ended       |New order placed      |Product inventory counting        |  WHM
*day3*
AX1  |  4/1/2019 |Logistic planning   |Closing the request   |Product request ended             |  HTN
X5L  |  4/1/2019 |Request processing  |Planning logistic     |The product request under-process |  LKJ
NO6  |  4/1/2019 |No update           |New Request Planing   |No update for new operation       |  null
ZS5  |  4/1/2019 |New order started   |Request invoicing     |Inventory and logistic planing    |  KLW

My ultimate goal is to get the percentage of each buyer orders. And that would mean I need to look for the world in 3 columns (Last24, Next 24 and Summary) So to get that I need to make sure that if Product == Product and Buyer == Buyer then search Last24 or Next24 or Summary for string 'ended'

The result would look like this table

Product   Date         Lsat24               Next24                      Summary                   Buyer 
*day1*
NO6  |  2/1/2019 |Request in-process  |Closing request       |Operation product request ended   |  KLW
*day2* 
ZS5  |  3/1/2019 |Request ended       |New order placed      |Product inventory counting        |  WHM
*day3*
AX1  |  4/1/2019 |Logistic planning   |Closing the request   |Product request ended             |  HTN

So the real data table (in excel) has over 200 products. These products will be repeated everyday. The Last24, Next24 and summary for each product is different every day. Also the buyer can change(Not always).

I am using Python3.8.

Edit"

To make my question clear I will use examples.

  Product   Date         Lsat24               Next24                  Summary                   Buyer 
*day1*
0  AX1  |  2/1/2019  |Checking inventory  |Invoicing            |The product request started today |  HTN
1  X5L  |  2/1/2019  |Inventory counting |Future request planing|No operation update               | null
2  NO6  |  2/1/2019 |Request in-process  |Closing request       |Operation product request ended   |  KLW
3  ZS5  |  2/1/2019 |Order placed        |Processing request    |Request started for the product   |  WHM
*day2* 
4  AX1  |  3/1/2019 |Invoicing the order |Logistic planning     |Product requested still pending   |  HTN
5  X5L  |  3/1/2019 |New request planing |processing order      |A new product request started     |  LKJ
6  NO6  |  3/1/2019 |Closing request     |No new request        |No update for new operation       |  null
7  ZS5  |  3/1/2019 |Request ended       |New order placed      |Product inventory counting        |  WHM
*day3*
8  AX1  |  4/1/2019 |Logistic planning   |Closing the request   |Product request ended             |  HTN
9  X5L  |  4/1/2019 |Request processing  |Planning logistic     |The product request under-process |  LKJ
10 NO6  |  4/1/2019 |No update           |New Request Planing   |No update for new operation       |  null
11 ZS5  |  4/1/2019 |New order started   |Request invoicing     |Inventory and logistic planing    |  KLW

One case

*if** Product == Product ([1] X5L == [5] X5L) then go for Buyer == Buyer ([1] null != [5] LKJ(since they do not match) search in row [1] for "ended" in Last24 or Next24 or Summary, if "ended" is found in at least one of the 3 columns return the row[1] else remove it)

The other case

*if** Product == Product ([0] AX1 == [4] AX1), then go for Buyer == Buyer ([0]HTN == [4]HTN, since they match go for the next row with the same product name. Product == Product ([4]AX1 ==[8] AX1), then Buyer == Buyer ([4]HTN == [8]HTN), since they match and no new row with the same product name, search in row [8] for "ended" in Last24 or Next24 or Summary).

It is a bit complicated, Iknow.

Upvotes: 1

Views: 76

Answers (1)

jezrael
jezrael

Reputation: 863741

I think you need Series.str.endswith for looking for string in 3 columns with DataFrame.any for test at least one True - one match per row and filter by boolean indexing:

cols = ['Lsat24','Next24','Summary']
df = df[df[cols].apply(lambda x : x.str.endswith('ended')).any(axis=1)]

print (df)
  Product      Date              Lsat24               Next24  \
2     NO6  2/1/2019  Request in-process      Closing request   
7     ZS5  3/1/2019       Request ended     New order placed   
8     AX1  4/1/2019   Logistic planning  Closing the request   

                           Summary Buyer  
2  Operation product request ended   KLW  
7       Product inventory counting   WHM  
8            Product request ended   HTN  

Upvotes: 4

Related Questions