user866364

Generate sequence of negative numbers

I have the following dataframe:

     userId     firstName      lastName        gender level
61       -1  Not Provided  Not Provided  Not Provided  paid
100      -1  Not Provided  Not Provided  Not Provided  free

Both userId values are -1 because I ran:

user_df['userId'] = user_df['userId'].replace(r'^\s*$', '-1', regex=True)

Is it possible to assign sequential negative numbers such as -1, -2, ... instead?

Upvotes: 2

Views: 344

Answers (5)

jezrael

Reputation: 862611

To replace only empty (or whitespace-only) strings, build a boolean mask with Series.str.contains, then assign an array whose length equals the number of True values in the mask:

import numpy as np
import pandas as pd

user_df = pd.DataFrame({'userId': ['', '', 'qq', '']})

# True where userId is empty or whitespace-only
m = user_df['userId'].str.contains(r'^\s*$')

# One new negative ID per True value in the mask
user_df.loc[m, 'userId'] = -np.arange(1, m.sum() + 1)
print(user_df)
  userId
0     -1
1     -2
2     qq
3     -3

Detail:

print(m)
0     True
1     True
2    False
3     True
Name: userId, dtype: bool

print(m.sum())
3

print(-np.arange(1, m.sum() + 1))
[-1 -2 -3]

Import NumPy directly rather than relying on the pd.np alias, which was deprecated in pandas 1.0 and removed in later versions:

import numpy as np

m = user_df['userId'].str.contains(r'^\s*$')

user_df.loc[m, 'userId'] = -np.arange(1, m.sum() + 1)
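
The mask approach can also be wrapped in a small helper. This is only a sketch: the function name fill_blank_ids and the sample data are made up for illustration, and it assumes blank IDs are empty or whitespace-only strings. The column is cast to object first so that integers and strings can sit side by side.

```python
import numpy as np
import pandas as pd


def fill_blank_ids(df, column="userId"):
    """Return a copy where blank IDs become -1, -2, ... in row order.

    Hypothetical helper, not part of pandas.
    """
    out = df.copy()
    # Cast to str so the regex test also works on non-string values
    blank = out[column].astype(str).str.contains(r"^\s*$")
    # Object dtype lets the column hold both strings and integers
    out[column] = out[column].astype(object)
    out.loc[blank, column] = -np.arange(1, blank.sum() + 1)
    return out


user_df = pd.DataFrame(
    {"userId": ["", "17", "  ", "42"]}, index=[61, 100, 105, 230]
)
result = fill_blank_ids(user_df)
print(result)
```

Because the helper works on a copy, the original user_df is left unchanged.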

Upvotes: 3

René

Reputation: 4827

You can set negative sequential index numbers with the range function.

import pandas as pd

df = pd.DataFrame({'userId': [-1, -1]}, index=[61, 100])
# Counts down from -1 to -len(df)
df.index = range(-1, -df.shape[0] - 1, -1)

Result:

    userId
-1      -1
-2      -1
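
If the target is the userId column rather than the index, the same range could be assigned to the column instead. A minimal sketch, assuming every row should receive a new ID (the sample frame is made up):

```python
import pandas as pd

df = pd.DataFrame(
    {"userId": [-1, -1], "level": ["paid", "free"]}, index=[61, 100]
)

# range(-1, -len(df) - 1, -1) yields -1, -2, ..., -len(df)
df["userId"] = range(-1, -len(df) - 1, -1)
print(df)
```

The original index (61, 100) is kept; only the column values change.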

Upvotes: 3

Umar.H

Reputation: 23099

You could also use groupby and subtract a cumulative count. I'm assuming your userId is already set to the numeric value -1 (not the string '-1'):

df['userId'] = df['userId'].sub(df.groupby(['userId']).cumcount())
print(df)
     userId     firstName      lastName        gender level
61       -1  Not Provided  Not Provided  Not Provided  paid
100      -2  Not Provided  Not Provided  Not Provided  free
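
One caveat: if real IDs can repeat, subtracting the cumulative count would shift those duplicates too. A hedged variant that only touches the placeholder rows might look like this (the sample data and the is_placeholder name are invented for illustration, and -1 is assumed to be the placeholder):

```python
import pandas as pd

df = pd.DataFrame(
    {"userId": [-1, 7, -1, -1], "level": ["paid", "free", "free", "paid"]},
    index=[61, 100, 105, 230],
)

# cumcount numbers the rows inside each userId group 0, 1, 2, ...
offset = df.groupby("userId").cumcount()

# Only rows holding the placeholder -1 are shifted; real IDs stay as they are
is_placeholder = df["userId"].eq(-1)
df["userId"] = df["userId"].where(~is_placeholder, df["userId"] - offset)
print(df)
```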

Upvotes: 2

Chris Adams

Reputation: 18647

Another solution using groupby.cumsum:

user_df['userId'] = (user_df['userId'].replace(r'^\s*$', -1, regex=True)
                     .groupby(user_df['userId']).cumsum())

Upvotes: 2

manuhortet

Reputation: 509

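Try the following. Note that it overwrites userId in every row and only produces -1, -2, ... when the frame has the default 0-based integer index:

user_df['userId'] = (user_df.index + 1) * -1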

Upvotes: 2
