IanHacker

Reputation: 581

How to subtract a number from all elements in a DataFrame with pandas?

I'm trying to subtract a number from every element of a pandas DataFrame. However, only the first element is subtracted and the others become NaN.

Here's the data: DataFrame_3x5.csv

A   B   C
0.1 0.3 0.5
0.2 0.4 0.6
0.3 0.5 0.7
0.4 0.6 0.8
0.5 0.7 0.9

Here's my code:

import pandas as pd
data = pd.read_csv(r"DataFrame_3x5.csv")
df = pd.DataFrame(data)
medianList = pd.DataFrame()

for i in range(0, data.shape[1]):
  medianList = medianList.append([df.iloc[:,i].median()], ignore_index=True)

for i in range(0, data.shape[1]):
  print(data.iloc[:,i])
  print(medianList.iloc[i])
  print(data.iloc[:,i] - medianList.iloc[i])
  # print(data.iloc[:,i].sub([medianList.iloc[i]], axis='columns')) # doesn't work

Here's the result:

0    0.1
1    0.2
2    0.3
3    0.4
4    0.5
Name: A, dtype: float64
0    0.3
Name: 0, dtype: float64
0   -0.2
1    NaN
2    NaN
3    NaN
4    NaN
dtype: float64
0    0.3
1    0.4
2    0.5
3    0.6
4    0.7
Name: B, dtype: float64
0    0.5
Name: 1, dtype: float64
0   -0.2
1    NaN
2    NaN
3    NaN
4    NaN
dtype: float64
0    0.5
1    0.6
2    0.7
3    0.8
4    0.9
Name: C, dtype: float64
0    0.7
Name: 2, dtype: float64
0   -0.2
1    NaN
2    NaN
3    NaN
4    NaN
dtype: float64

My expectation:

0   -0.2
1   -0.1
2    0.0
3    0.1
4    0.2

According to this site,

print(data.iloc[:,i].sub([medianList.iloc[i]], axis='columns'))

... should work, but actually it produces an error:

ValueError: No axis named columns for object type <class 'pandas.core.series.Series'>

I don't know what to do any more. Please help me. Thank you.

Upvotes: 0

Views: 3481

Answers (4)

Vivs

Reputation: 485

import pandas as pd

data = pd.read_csv(r"DataFrame_3x5.csv")

# Collect each column's median as a plain float.
# (DataFrame.append was removed in pandas 2.0, so a list is used instead.)
medianList = [data.iloc[:, i].median() for i in range(data.shape[1])]

j = 0  # position of column 'A' in medianList
for i in range(data.shape[0]):
    print(data['A'].iloc[i])                  # one element of column A
    print(medianList[j])                      # median of column A (a scalar)
    print(data['A'].iloc[i] - medianList[j])  # element minus median

Upvotes: 0

Ran Cohen

Reputation: 751

A simple solution:

import pandas as pd
df = pd.read_csv(r"DataFrame_3x5.csv")

df['A'] - df['A'].median()
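
To illustrate why this works while the loop in the question produces NaN, here is a minimal sketch. It assumes the CSV holds the values shown in the question and builds them inline instead of reading the file. The names median_as_series, aligned and broadcast are only illustrative. In the question, medianList.iloc[i] is a one-element Series with index label 0, so the subtraction aligns on index labels, whereas a scalar is applied to every element.

```python
import pandas as pd

# Same values as in the question, built inline so the sketch is self-contained
df = pd.DataFrame({
    "A": [0.1, 0.2, 0.3, 0.4, 0.5],
    "B": [0.3, 0.4, 0.5, 0.6, 0.7],
    "C": [0.5, 0.6, 0.7, 0.8, 0.9],
})

# A one-element Series with index [0], like medianList.iloc[i] in the question
median_as_series = pd.Series([df["A"].median()])

# Series - Series aligns on index labels: only label 0 matches, the rest are NaN
aligned = df["A"] - median_as_series

# Series - scalar applies the scalar to every element
broadcast = df["A"] - df["A"].median()

print(aligned)
print(broadcast)
```

If a one-element Series or DataFrame is all you have, pulling the scalar out first (for example with .iloc[0]) should give the same result as the second subtraction.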


Upvotes: 1

Ayoub ZAROU

Reputation: 2417

You could do:

df - df.median(axis=0)

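df.median(axis=0) returns a Series of per-column medians indexed by the column labels, and pandas aligns that Series with the DataFrame's columns, so each column has its own median subtracted.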
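
As a rough sketch of this, assuming the CSV holds the values shown in the question (built inline here rather than read from the file), the whole frame can be centered in one step. The names medians, centered and centered_sub are only illustrative. The explicit DataFrame.sub form is included because axis='columns' is accepted on a DataFrame, while a Series has only one axis, which seems to be the source of the ValueError in the question.

```python
import pandas as pd

# Same values as in the question, built inline so the sketch is self-contained
df = pd.DataFrame({
    "A": [0.1, 0.2, 0.3, 0.4, 0.5],
    "B": [0.3, 0.4, 0.5, 0.6, 0.7],
    "C": [0.5, 0.6, 0.7, 0.8, 0.9],
})

# Per-column medians: a Series indexed by the column labels A, B, C
medians = df.median(axis=0)

# DataFrame - Series matches the Series index against the column labels
centered = df - medians

# Equivalent explicit form; axis="columns" is valid for DataFrame.sub
centered_sub = df.sub(medians, axis="columns")

print(centered)
```

Each column of centered should match the expected output in the question (roughly -0.2, -0.1, 0.0, 0.1, 0.2, up to floating-point rounding).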

Upvotes: 2

Arpit

Reputation: 394

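I think it can work if you first drop the rows that contain missing values with dropna and then simply subtract the columns. Note that int() cannot be applied to a whole column, so subtract the columns directly:

df = df.dropna(how='any')
df['Sub'] = df['A'] - df['B'] - df['C']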

Upvotes: 0
