Reputation: 183
This is the seed DataSet:
In[1]: import pandas as pd
my_data =
[{'client':'A','product_s_n':'1','status':'in_store','month':'Jan'},
{'client':'A','product_s_n':'1','status':'sending', 'month':'Feb'},
{'client':'A','product_s_n':'2','status':'in_store','month':'Jan'},
{'client':'A','product_s_n':'2','status':'in_store','month':'Feb'},
{'client':'B','product_s_n':'3','status':'in_store','month':'Jan'},
{'client':'B','product_s_n':'3','status':'sending', 'month':'Feb'},
{'client':'B','product_s_n':'4','status':'in_store','month':'Jan'},
{'client':'B','product_s_n':'4','status':'in_store','month':'Feb'},
{'client':'C','product_s_n':'5','status':'in_store','month':'Jan'},
{'client':'C','product_s_n':'5','status':'sending', 'month':'Feb'}]
df = pd.DataFrame(my_data)
df
Out[1]:
  client month product_s_n    status
0      A   Jan           1  in_store
1      A   Feb           1   sending
2      A   Jan           2  in_store
3      A   Feb           2  in_store
4      B   Jan           3  in_store
5      B   Feb           3   sending
6      B   Jan           4  in_store
7      B   Feb           4  in_store
8      C   Jan           5  in_store
9      C   Feb           5   sending
The question I want to ask of this data is: which client does each product serial number (product_s_n) belong to? For the data in this example, this is what the resulting DataFrame should look like (I need a new DataFrame as the result):
product_s_n client
0 1 A
1 2 A
2 3 B
3 4 B
4 5 C
As you may have noticed, the 'status' and 'month' fields are only there to give the sample dataset some context and structure. I tried using groupby, with no success. Any ideas?
Thanks!
Upvotes: 0
Views: 46
Reputation: 879869
After calling df.groupby(['product_s_n']), you can restrict attention to a particular column by indexing with ['client']. You can then select the first value of client from each group by calling first().
>>> df.groupby(['product_s_n'])['client'].first()
product_s_n
1 A
2 A
3 B
4 B
5 C
Name: client, dtype: object
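The result above is a Series indexed by product_s_n, whereas the question asks for a new DataFrame. A minimal sketch of one way to get there, assuming the same sample data as in the question: calling reset_index() on the Series turns the index back into an ordinary column. The names result, alt, and clients_per_product are only illustrative. The drop_duplicates variant is an alternative, not part of the original answer, and it gives the same rows only when each serial number maps to a single client.

```python
import pandas as pd

# Same sample data as in the question.
my_data = [
    {'client': 'A', 'product_s_n': '1', 'status': 'in_store', 'month': 'Jan'},
    {'client': 'A', 'product_s_n': '1', 'status': 'sending', 'month': 'Feb'},
    {'client': 'A', 'product_s_n': '2', 'status': 'in_store', 'month': 'Jan'},
    {'client': 'A', 'product_s_n': '2', 'status': 'in_store', 'month': 'Feb'},
    {'client': 'B', 'product_s_n': '3', 'status': 'in_store', 'month': 'Jan'},
    {'client': 'B', 'product_s_n': '3', 'status': 'sending', 'month': 'Feb'},
    {'client': 'B', 'product_s_n': '4', 'status': 'in_store', 'month': 'Jan'},
    {'client': 'B', 'product_s_n': '4', 'status': 'in_store', 'month': 'Feb'},
    {'client': 'C', 'product_s_n': '5', 'status': 'in_store', 'month': 'Jan'},
    {'client': 'C', 'product_s_n': '5', 'status': 'sending', 'month': 'Feb'},
]
df = pd.DataFrame(my_data)

# First client seen for each serial number; reset_index() converts the
# resulting Series (indexed by product_s_n) into a two-column DataFrame.
result = df.groupby('product_s_n')['client'].first().reset_index()

# Alternative: keep only the two columns of interest and drop repeated pairs.
alt = (df[['product_s_n', 'client']]
       .drop_duplicates()
       .reset_index(drop=True))

# Sanity check: number of distinct clients per serial number.
# first() silently picks one value, so a count above 1 would mean
# the mapping is ambiguous.
clients_per_product = df.groupby('product_s_n')['client'].nunique()
print(result)
```

If clients_per_product contains any value greater than 1, first() and drop_duplicates() would no longer agree, and you would need to decide which client should win.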
Upvotes: 2