Luiz Matias

Reputation: 49

Create a column based on diff values from another column

I have a dataframe like this:

|    | Vowel   |   Number |
|---:|:--------|---------:|
|  0 | a       |        2 |
|  1 | b       |        3 |
|  2 | c       |        4 |
|  3 | a       |        4 |
|  4 | a       |        8 |
|  5 | b       |        2 |
|  6 | c       |        5 |
|  7 | c       |        9 |

I want to create a column holding the difference between consecutive Number values within each Vowel value. This is the output I want:

|    | Vowel   |   Number |   Diff |
|---:|:--------|---------:|-------:|
|  0 | a       |        2 |    nan |
|  1 | b       |        3 |    nan |
|  2 | c       |        4 |    nan |
|  3 | a       |        4 |      2 |
|  4 | a       |        8 |      4 |
|  5 | b       |        2 |     -1 |
|  6 | c       |        5 |      1 |
|  7 | c       |        9 |      4 |

So, looking at the value 'a' in the Vowel column: the first 'a' gets NaN because there is no earlier value in the Number column for 'a'. The second 'a' gets 2 because 4 - 2 = 2 (Number column).

I'm doing something like this:

for i in list(set(df['Vowel'])):
    one_vowel = df[df['Vowel'] == i]
    for n in one_vowel['Number'].diff():
        print(f'{i} and {n}')

result:

b and nan
b and -1.0
a and nan
a and 2.0
a and 4.0
c and nan
c and 1.0
c and 4.0

But I want the results in the original row order of the dataframe.

Can somebody please help me?

Upvotes: 0

Views: 26

Answers (1)

sushanth

Reputation: 8302

Try this:

df['Diff'] = df.groupby('Vowel')['Number'].diff()

Output:

0    NaN
1    NaN
2    NaN
3    2.0
4    4.0
5   -1.0
6    1.0
7    4.0
Name: Diff, dtype: float64
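
For anyone who wants to run this end to end, here is a minimal sketch, assuming the dataframe from the question is rebuilt by hand. The `alt` variable is only an illustrative name, not part of the original answer; it shows what should be an equivalent formulation using `shift()`. The grouped `diff()` result keeps the original index, which is why the rows stay in the order of the dataframe instead of being regrouped by vowel.

```python
import pandas as pd

# Rebuild the example dataframe from the question (assumed values).
df = pd.DataFrame({
    "Vowel": ["a", "b", "c", "a", "a", "b", "c", "c"],
    "Number": [2, 3, 4, 4, 8, 2, 5, 9],
})

# Difference to the previous row of the same Vowel group.
# The result is aligned to df's index, so the row order is preserved.
df["Diff"] = df.groupby("Vowel")["Number"].diff()

# Illustrative alternative: subtract the previous value within each group.
alt = df["Number"] - df.groupby("Vowel")["Number"].shift()

print(df)
```

The first row of each group has no previous value, so it gets NaN, and the column becomes float64 as a result.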

Upvotes: 1
