Reputation: 1190
I have the following df:
Date
2015-11-27 105.449997
2015-11-30 104.239998
2015-12-01 107.120003
2015-12-02 106.070000
2015-12-03 104.379997
...
2020-11-18 271.970001
2020-11-19 272.940002
2020-11-20 269.700012
2020-11-23 268.429993
2020-11-24 276.920013
Name: Close, Length: 1258, dtype: float64
What I am trying to do is find for how many days in a row has the Close
closed lower than the previous day. Could you please advise how I can go about this?
At this point it would say Close
was lower for the last 3 days on Date
Date Close Days
10/30/2020 263.109985 0.0
11/2/2020 261.359985 1.0
11/3/2020 265.299988 0.0
11/4/2020 287.380005 1.0
11/5/2020 294.679993 0.0
11/6/2020 293.410004 1.0
11/9/2020 278.769989 0.0
11/10/2020 272.429993 1.0
11/11/2020 276.480011 0.0
11/12/2020 275.079987 1.0
11/13/2020 276.950012 0.0
11/16/2020 278.959991 0.0
11/17/2020 275.000000 1.0
11/18/2020 271.970001 2.0
11/19/2020 272.940002 0.0
11/20/2020 278.000000 1.0
11/23/2020 277.000000 2.0
11/24/2020 276.920013 3.0
How can achieve this?
Upvotes: 1
Views: 774
Reputation: 793
I just made this and it should work for you.
import pandas as pd
columns = ["Date", "Close"]
data = [
["10/30/2020", 263.109985],
["11/02/2020", 261.359985],
["11/03/2020", 265.299988],
["11/04/2020", 287.380005],
["11/05/2020", 294.679993],
["11/06/2020", 293.410004],
["11/09/2020", 278.769989],
["11/10/2020", 272.429993],
["11/11/2020", 276.480011],
["11/12/2020", 275.079987],
["11/13/2020", 276.950012],
["11/16/2020", 278.959991],
["11/17/2020", 275.000000],
["11/18/2020", 271.970001],
["11/19/2020", 272.940002],
["11/20/2020", 278.000000],
["11/23/2020", 277.000000],
["11/24/2020", 276.920013],
]
df = pd.DataFrame(data, columns=columns)
NewList = []
counter = 0
for i in range(len(df["Close"])):
print(i)
if i == 0:
NewList.append(0)
continue
previous = int(i)-1
if df["Close"][i] > df["Close"][previous]:
counter += 1
NewList.append(counter)
else:
counter = 0
NewList.append(counter)
df["Days"] = NewList
Upvotes: 0
Reputation: 30971
Use the following code:
# Is the current "Close" increasing or same (compared with previous row)
df['incr'] = df.Close >= df.Close.shift(fill_value=0)
# Generate the result column
df['DaysDecr'] = df.groupby(df.incr.cumsum()).apply(
lambda grp: (~grp.incr).cumsum()).reset_index(level=0, drop=True)
# Drop "incr" column
df.drop(columns='incr', inplace=True)
For your data sample the result is:
Date Close Days DaysDecr
0 2020-10-30 263.109985 0.0 0
1 2020-11-02 261.359985 1.0 1
2 2020-11-03 265.299988 0.0 0
3 2020-11-04 287.380005 1.0 0
4 2020-11-05 294.679993 0.0 0
5 2020-11-06 293.410004 1.0 1
6 2020-11-09 278.769989 0.0 2
7 2020-11-10 272.429993 1.0 3
8 2020-11-11 276.480011 0.0 0
9 2020-11-12 275.079987 1.0 1
10 2020-11-13 276.950012 0.0 0
11 2020-11-16 278.959991 0.0 0
12 2020-11-17 275.000000 1.0 1
13 2020-11-18 271.970001 2.0 2
14 2020-11-19 272.940002 0.0 0
15 2020-11-20 278.000000 1.0 0
16 2020-11-23 277.000000 2.0 1
17 2020-11-24 276.920013 3.0 2
To trace how this code works, generate the grouping:
df['incr'] = df.Close >= df.Close.shift(fill_value=0)
gr = df.groupby(df.incr.cumsum())
and print each group:
for key, grp in gr:
print(f'\nGroup: {key}\n{grp}')
Note that incr column in each group starts with a single True (the date when Close either increased or stayed the same) and then there is a sequence of False (days when Close was lower that on the previous date).
So the result can be generated by negation of incr and cumsum() of it.
And the only thing to do is to drop incr column (not needed any more).
Upvotes: 5
Reputation: 1174
Here a solution by using the function apply
to calculate this value for a given row.
import pandas as pd
columns = ["Date", "Close", "Days"]
data = [
["10/30/2020", 263.109985, 0.0],
["11/02/2020", 261.359985, 1.0],
["11/03/2020", 265.299988, 0.0],
["11/04/2020", 287.380005, 1.0],
["11/05/2020", 294.679993, 0.0],
["11/06/2020", 293.410004, 1.0],
["11/09/2020", 278.769989, 0.0],
["11/10/2020", 272.429993, 1.0],
["11/11/2020", 276.480011, 0.0],
["11/12/2020", 275.079987, 1.0],
["11/13/2020", 276.950012, 0.0],
["11/16/2020", 278.959991, 0.0],
["11/17/2020", 275.000000, 1.0],
["11/18/2020", 271.970001, 2.0],
["11/19/2020", 272.940002, 0.0],
["11/20/2020", 278.000000, 1.0],
["11/23/2020", 277.000000, 2.0],
["11/24/2020", 276.920013, 3.0]
]
# Load from data
df = pd.DataFrame(data, columns=columns)
# Convert to datetime
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")
# Order by date
df.sort_values(["Date"], ascending=True, inplace=True)
def find_last(df, my_date, my_close, count):
previous_index = df[df["Date"]<my_date]["Date"].max()
previous_close = df[df["Date"]==previous_index]["Close"].max()
if df[df["Date"] < my_date]["Date"].count()>0 and my_close < previous_close:
return find_last(df[df["Date"]<my_date], previous_index, previous_close, count + 1)
else:
return count
df["days_calc"] = df.apply(
lambda row: find_last(df.copy(), row["Date"], row["Close"], 0), axis=1)
Output:
Date Close Days days_calc
0 2020-10-30 263.109985 0.0 0
1 2020-11-02 261.359985 1.0 1
2 2020-11-03 265.299988 0.0 0
3 2020-11-04 287.380005 1.0 0
4 2020-11-05 294.679993 0.0 0
5 2020-11-06 293.410004 1.0 1
6 2020-11-09 278.769989 0.0 2
7 2020-11-10 272.429993 1.0 3
8 2020-11-11 276.480011 0.0 0
9 2020-11-12 275.079987 1.0 1
10 2020-11-13 276.950012 0.0 0
11 2020-11-16 278.959991 0.0 0
12 2020-11-17 275.000000 1.0 1
13 2020-11-18 271.970001 2.0 2
14 2020-11-19 272.940002 0.0 0
15 2020-11-20 278.000000 1.0 0
16 2020-11-23 277.000000 2.0 1
17 2020-11-24 276.920013 3.0 2
Upvotes: 0
Reputation: 186
Here a simple quick and dirty solution for your problem. I put your sample data in a csv file and imported it using pandas
. The rest should be described in the comments of the code:
import pandas as pd
#import your data from a csv that contains two columns with the headers "Date" and "Close"
df = pd.read_csv("/your_path/sample_data.txt")
#choose the column "Close" and save it into the variable "close"
close = df["Close"]
#go through dataframe and count the days where they increase
counter = 0
for i in range(1, len(close)):
if close[i] > close[i-1]:
counter = counter + 1
else:
counter = 0
#print the amount of days that the closes increased
print("The closes have increased over the last " + str(counter) + " days")
Upvotes: 0