How to transform values in a pandas DataFrame?

Question

I have this data:

data = [
  (11, 'Shirt', 2),
  (11, 'Pants', 3),
  (24, 'Top', 2),
  (24, 'Shirt', 1),
  (24, 'T-Shirt', 4),
  (36, 'Shirt', 3),
  (36, 'T-Shirt', 2),
  (48, 'Top', 3),
  (48, 'Pants', 3),
  (48, 'T-Shirt', 3),
]

and I load it into a pandas DataFrame:

import pandas as pd

df = pd.DataFrame(data, columns=['unique_id', 'category_product', 'count'])

The resulting DataFrame is:

    unique_id category_product  count
0          11            Shirt      2
1          11            Pants      3
2          24              Top      2
3          24            Shirt      1
4          24          T-Shirt      4
5          36            Shirt      3
6          36          T-Shirt      2
7          48              Top      3
8          48            Pants      3
9          48          T-Shirt      3

but I need to change unique_id so that it starts from 0 and increases by 1 each time a new id is seen, in order of appearance. The result should look like this:

   unique_id category_product  count
0          0            Shirt      2
1          0            Pants      3
2          1              Top      2
3          1            Shirt      1
4          1          T-Shirt      4
5          2            Shirt      3
6          2          T-Shirt      2
7          3              Top      3
8          3            Pants      3
9          3          T-Shirt      3

How can I do that?

Joachim Isaksson · Accepted Answer

There may be simpler ways, but here's one:

df.unique_id = (df.unique_id.diff() != 0).cumsum() - 1

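Basically, it compares each row's unique_id to the previous row's. Wherever the diff is != 0, the id has changed, and cumsum() increases the output value by 1 at that row. The -1 at the end compensates for the leading NaN: the first row has nothing to diff against, and NaN != 0 evaluates to True, so without the -1 the numbering would start at 1 instead of 0. Note that this relies on rows with the same unique_id being adjacent, as they are in your data.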
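To make the steps visible, here is a minimal sketch that splits the one-liner into its intermediate results, using the id values from the DataFrame printed in the question. The variable names (step, changed, diff_ids, factorized_ids) are only for illustration. It also shows pd.factorize as a possible alternative, which is not part of the answer above; it numbers values by order of first appearance.

```python
import pandas as pd

data = [
    (11, 'Shirt', 2), (11, 'Pants', 3),
    (24, 'Top', 2), (24, 'Shirt', 1), (24, 'T-Shirt', 4),
    (36, 'Shirt', 3), (36, 'T-Shirt', 2),
    (48, 'Top', 3), (48, 'Pants', 3), (48, 'T-Shirt', 3),
]
df = pd.DataFrame(data, columns=['unique_id', 'category_product', 'count'])

# Step 1: difference to the previous row. Row 0 has no predecessor, so it is NaN.
step = df['unique_id'].diff()

# Step 2: True wherever the id changed. NaN != 0 is also True, so row 0 is flagged.
changed = step != 0

# Step 3: running count of the changes, shifted by 1 so numbering starts at 0.
diff_ids = changed.cumsum() - 1

# Alternative (an assumption, not from the answer): factorize assigns
# 0, 1, 2, ... to each distinct value in order of first appearance.
factorized_ids, _ = pd.factorize(df['unique_id'])

df['unique_id'] = diff_ids
print(df)
```

Both approaches give the same numbering here because equal ids sit next to each other. If an id could reappear later in the frame, the diff version would give it a fresh number, while factorize would reuse the number from its first appearance.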
