Maryam Jalili

Reputation: 83

Is there any fast way to convert all of the values in a dataframe - python

I have a data frame like the one below:

A B
1 12
84 15
51 42
2 10

Each value is the position of a string in a list. For example, with list A = ['Cat', 'Dog', 'Cow', ...], the first value in column A (1) should become 'Dog'. How can I replace these values in the data frame quickly? The data frame has more than 1 million rows. I wrote the code below, but it takes ages to run:

for i in range(len(df)):
    a = df.iloc[i, 0]
    df.iloc[i, 0] = A[a]
    b = df.iloc[i, 1]
    df.iloc[i, 1] = B[b]

Upvotes: 1

Views: 354

Answers (3)

IoaTzimas

Reputation: 10624

You can use NumPy, whose vectorized indexing is much faster than a row-by-row loop over the DataFrame. Try the following:

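import numpy as np

# Example lookup lists, converted to arrays so they support vectorized indexing
valsA = np.array(['Cat', 'Dog', 'Cow'] * 100)
valsB = np.array(['Dog', 'Cat', 'Cow'] * 100)

# Replace every index in a column with the string at that position
df['A'] = valsA.take(df['A'])
df['B'] = valsB.take(df['B'])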

>>> print(df)

     A    B
0  Dog  Dog
1  Cat  Dog
2  Cat  Dog
3  Cow  Cat
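
For a self-contained run, the same idea can be sketched against the sample frame from the question. The lookup lists here are made up for illustration (the question only shows the first few entries of A), and they are padded by repetition so that every index in the frame is valid.

```python
import numpy as np
import pandas as pd

# Sample frame from the question: each value is a position in a lookup list.
df = pd.DataFrame({"A": [1, 84, 51, 2], "B": [12, 15, 42, 10]})

# Hypothetical lookup lists, repeated so they cover every index in df.
vals_a = np.array(["Cat", "Dog", "Cow"] * 100)
vals_b = np.array(["Dog", "Cat", "Cow"] * 100)

# Look up the whole column at once instead of row by row.
df["A"] = vals_a.take(df["A"].to_numpy())
df["B"] = vals_b.take(df["B"].to_numpy())

print(df)
```

Note that take raises an IndexError by default if any index falls outside the lookup array, so the real lists need to cover every value in the column.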

Upvotes: 1

eroot163pi

Reputation: 1815

You can use apply.

Normally the output of apply is a pd.Series, but with result_type='expand' the list-like result of each call is unpacked into columns, so apply returns a pd.DataFrame.

The example below illustrates this:

>>> A = ['Cat', 'Dog', 'Cow']
>>> B = ['Catb', 'Dogb', 'Cowb']
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2]] * 3, columns=['A', 'B'])
>>> df.apply(lambda x: [A[x['A']], B[x['B']]], axis=1, result_type='expand')
     0    1
0  Dog  Cowb
1  Dog  Cowb
2  Dog  Cowb

Another method uses map without a lambda (see also: List comprehension vs map):

>>> df['A'] = df['A'].map(A.__getitem__)
>>> df['B'] = df['B'].map(B.__getitem__)
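
As a runnable sketch of the map approach, the snippet below uses a small made-up frame (the index values are assumptions chosen for illustration). It also shows one possible variation, mapping through a pd.Series built from the list, which yields NaN for positions that are not present instead of raising an IndexError.

```python
import pandas as pd

A = ["Cat", "Dog", "Cow"]
B = ["Catb", "Dogb", "Cowb"]

# Hypothetical frame of positions into A and B.
df = pd.DataFrame({"A": [1, 0, 2], "B": [2, 2, 0]})

# list.__getitem__ is the bound method behind A[i], so no lambda is needed.
mapped = pd.DataFrame({
    "A": df["A"].map(A.__getitem__),
    "B": df["B"].map(B.__getitem__),
})

# Variation: map through a Series whose index is the position in the list.
# Positions missing from that Series become NaN rather than raising.
via_series = df["A"].map(pd.Series(A))

print(mapped)
print(via_series.tolist())
```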

Upvotes: 0

Daniel Wyatt

Reputation: 1151

I don't believe your code is particularly bad from an efficiency point of view. It is likely to take a while simply because the dataframe is so large.

I would suggest, though, that the code below is a more elegant way to apply a function to a column of a dataframe:

df['A'] = df['A'].map(lambda x: A[x])
df['B'] = df['B'].map(lambda x: B[x])
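
To see how the approaches in this thread compare on your own machine, a small timing harness along these lines may help. It is only a sketch: the frame is randomly generated, the row count and the function names (with_loop, with_map, with_take) are made up for illustration, and the loop first copies the frame to object dtype so that strings can be written into it.

```python
import time

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
A = ["Cat", "Dog", "Cow"]
B = ["Catb", "Dogb", "Cowb"]
n_rows = 2_000  # kept small so the row loop finishes quickly

# Hypothetical test frame: random valid positions into A and B.
df = pd.DataFrame({
    "A": rng.integers(0, len(A), n_rows),
    "B": rng.integers(0, len(B), n_rows),
})

def with_loop(frame):
    """Row-by-row replacement, as in the question."""
    out = frame.astype(object)  # object dtype so strings can be stored
    for i in range(len(out)):
        out.iloc[i, 0] = A[out.iloc[i, 0]]
        out.iloc[i, 1] = B[out.iloc[i, 1]]
    return out

def with_map(frame):
    """Column-wise Series.map with a lambda."""
    return pd.DataFrame({
        "A": frame["A"].map(lambda x: A[x]),
        "B": frame["B"].map(lambda x: B[x]),
    })

def with_take(frame):
    """Vectorized lookup with numpy.ndarray.take."""
    return pd.DataFrame({
        "A": np.array(A).take(frame["A"].to_numpy()),
        "B": np.array(B).take(frame["B"].to_numpy()),
    })

results = {}
for name, func in [("loop", with_loop), ("map", with_map), ("take", with_take)]:
    start = time.perf_counter()
    results[name] = func(df)
    print(f"{name}: {time.perf_counter() - start:.4f} s")
```

All three functions should produce the same strings, so the printed timings can be compared directly; increase n_rows gradually to see how each approach scales.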

Upvotes: 1
