Reputation: 126
I have a few DataFrames to which I am trying to add a new column, with values calculated from the previous row and the current row.
The problem is that I get the same result in every row, when each row should be different.
I wrote a function that reads a CSV into a DataFrame and makes the changes:
def index_csv_update(file_path):
    df = pd.read_csv(file_path)
    df.drop(["Adj Close"], axis=1, inplace=True)
    df.drop(["Volume"], axis=1, inplace=True)
    print(df.head())
    start = 0
    for i, row in df.iterrows():
        first_row = df.iloc[start, 4]
        try:
            second_row = df.iloc[start + 1, 4]
        except IndexError:
            second_row = 'null'
        if not second_row == 'null':
            df['Daily Change %'] = np.float((second_row/first_row)-1)
        start += 1
    df.to_csv(file_path, index=False)
Printed result:
         Date   Open   High    Low  Close  Daily Change %
0  2018-07-09  13.02  13.22  12.60  12.69        0.011575
1  2018-07-10  12.52  13.21  11.93  12.64        0.011575
2  2018-07-11  14.05  14.15  13.09  13.63        0.011575
3  2018-07-12  13.07  13.33  12.42  12.58        0.011575
4  2018-07-13  12.39  12.97  11.62  12.18        0.011575
The Daily Change % column should contain different numbers.
I can't find the problem. Please help, thanks.
Upvotes: 0
Views: 202
Reputation: 1306
When you write
df['Daily Change %'] = np.float((second_row/first_row)-1)
you create (or overwrite) the whole column with that one value, so every row ends up the same. What you need is loc or iloc, which lets you assign to a single cell.
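To see the difference in isolation, here is a minimal sketch. The column names `whole_column` and `single_cell` are made up for illustration, and the three prices are copied from the question.

```python
import pandas as pd

df = pd.DataFrame({"Close": [12.69, 12.64, 13.63]})

# Assigning a scalar to a whole column broadcasts it to every row.
df["whole_column"] = 0.5

# Assigning through .loc with a row label and a column name sets one cell only.
df["single_cell"] = float("nan")
df.loc[1, "single_cell"] = 0.5

print(df)
```

The first assignment fills all three rows with 0.5, the second changes only row 1. In the question's loop every iteration did the first kind of assignment, so only the value from the last iteration survived. The same fix applied to a loop: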
import pandas as pd

df = pd.DataFrame({'foo': [1,2,3,3], 'bar': [1.2,2.3,3.5,1.3], 'foo2':[None, None, None, None]})
print(df)
start = 0
for i, row in df.iterrows():
    first_row = df.iloc[start, 1]
    try:
        second_row = df.iloc[start + 1, 1]
    except IndexError:
        # The last row has no next row
        second_row = 'null'
    if not second_row == 'null':
        # Assign to a single cell (row start, column 2), not the whole column.
        # np.float was removed in NumPy 1.24; the built-in float does the same job.
        df.iloc[start, 2] = float((second_row/first_row)-1)
    start += 1
print(df)
Outputs:
   foo  bar  foo2
0    1  1.2  None
1    2  2.3  None
2    3  3.5  None
3    3  1.3  None
   foo  bar      foo2
0    1  1.2  0.916667
1    2  2.3  0.521739
2    3  3.5 -0.628571
3    3  1.3      None
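If you don't need an explicit loop, the same column can probably be computed in one step with `shift`. This is only a sketch, assuming the Close prices from the question. The column name `Next Change %` is mine and is there only to show both alignments side by side.

```python
import pandas as pd

# Closing prices copied from the question's printed result
df = pd.DataFrame({
    "Date": ["2018-07-09", "2018-07-10", "2018-07-11", "2018-07-12", "2018-07-13"],
    "Close": [12.69, 12.64, 13.63, 12.58, 12.18],
})

# Change relative to the previous row: (current / previous) - 1.
# The first row has no previous row, so it becomes NaN.
df["Daily Change %"] = df["Close"] / df["Close"].shift(1) - 1

# Same alignment as the loop version: (next / current) - 1, stored on the current row.
# The last row has no next row, so it becomes NaN.
df["Next Change %"] = df["Close"].shift(-1) / df["Close"] - 1

print(df)
```

`Series.pct_change()` should give the same numbers as the first version, if you prefer a built-in.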
Upvotes: 1