Paul Corcoran

Reputation: 101

Replace pandas column values with incrementing numbers

import pandas as pd
data = {'Account':['Paul','Jenn']}


df = pd.DataFrame(data=data)


The desired output is 1 for Paul and 2 for Jenn. The solution would form the basis of a for loop over a much bigger dataset, replacing account names with numeric values.

Upvotes: 0

Views: 412

Answers (3)

Ranudar

Reputation: 115

I think the question can be solved quite elegantly with the pd.Series.rank method:

import pandas as pd
data = {'Account':['Paul','Jenn','Paul','Jamie','Woo']}
df = pd.DataFrame(data=data)
df['Account_new'] = df.Account.rank(method='dense').astype(int)

Output:

  Account  Account_new
0    Paul            3
1    Jenn            2
2    Paul            3
3   Jamie            1
4     Woo            4
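
Note that the dense rank numbers the names alphabetically, so Paul gets 3 rather than the 1 asked for in the question. If the numbers should instead follow the order of first appearance, a minimal sketch using pd.factorize might look like this (the column name Account_new is carried over from the snippet above; the + 1 is only there because factorize codes start at 0):

```python
import pandas as pd

df = pd.DataFrame({'Account': ['Paul', 'Jenn', 'Paul', 'Jamie', 'Woo']})

# factorize returns integer codes (starting at 0) in order of first appearance,
# plus the array of unique values the codes refer to
codes, uniques = pd.factorize(df['Account'])

# shift by one so the first name seen becomes 1
df['Account_new'] = codes + 1

print(df)
```

With this data, both Paul rows get 1, Jenn gets 2, Jamie 3 and Woo 4.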

Upvotes: 0

Richard Kraus

Reputation: 126

Not entirely sure what you're trying to do, so please elaborate further if I misunderstood the question :) If you want to replace the 'name' column with the index incremented by a value:

df['name'] = df.index + value
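
Applied to the two-row frame from the question, that line could look like the sketch below. It assumes the default RangeIndex (0, 1, ...) and picks value = 1 purely for illustration, since the snippet above leaves value undefined:

```python
import pandas as pd

df = pd.DataFrame({'Account': ['Paul', 'Jenn']})

value = 1  # assumed offset so that numbering starts at 1

# the default index is 0, 1, ..., so adding the offset gives 1, 2, ...
df['Account'] = df.index + value

print(df)
```

Keep in mind that every row gets its own number this way, so a name that appears twice would not share one value.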

Increase the values of a column (or apply any other arithmetic operation to a column):

df['column name'] += value
# apply the operation to one column and add the result to another column
df['result'] += df['other column'] + value

Count the number of occurrences of each value in the name column:

df = pd.DataFrame({"name": ["Paul", "Jenn", "Paul"]})
# count the number of occurrences of each name
df["count"] = df.groupby("name")['name'].transform('count')
# in case you don't want duplicate rows
df.drop_duplicates(inplace=True)

Upvotes: 0

Zach Flanders

Reputation: 1304

You could do something like this: first create a dictionary that maps each unique name in Account to a number, in order of first appearance. Then use .replace() to swap the values in the series for those numbers. This ensures that Paul is always replaced by 1 and Jenn by 2, no matter how many times each name appears.

import pandas
import json


data = {'Account':['Paul','Jenn']}
df = pandas.DataFrame(data=data)

name_mapping = json.loads(pandas.Series(
    index=df.Account.unique(),
    data=range(1, len(df.Account.unique()) + 1)
).to_json())

df.Account = df.Account.replace(name_mapping)

Output:

>>> df
   Account
0        1
1        2
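
The same mapping can probably be built without the JSON round trip. The following is a sketch of that idea rather than the code from this answer: it uses a plain dict comprehension with enumerate and applies it with Series.map, and it adds a repeated Paul row to the sample data to show that duplicates share a number:

```python
import pandas as pd

df = pd.DataFrame({'Account': ['Paul', 'Jenn', 'Paul']})

# unique() keeps the order of first appearance; enumerate numbers from 1
name_mapping = {name: number
                for number, name in enumerate(df['Account'].unique(), start=1)}

# map looks each name up in the dictionary
df['Account'] = df['Account'].map(name_mapping)

print(df)
```

Unlike .replace(), .map() turns any name missing from the dictionary into NaN, which does not happen here because the dictionary is built from the column itself.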

Upvotes: 1
