benito.cano

Reputation: 857

How to find the lows of a stock for each trading day?

So I have a CSV with minute stock data for Microsoft. I am trying to find the low of each trading day. The code looks like:

ticker='MSFT'
df = pd.read_csv('/Volumes/Seagate Portable/S&P 500 List/{}.txt'.format(ticker))
df.columns = ['Extra', 'Dates', 'Open', 'High', 'Low', 'Close', 'Volume']
df.Dates = pd.to_datetime(df.Dates)
df.set_index(df.Dates, inplace=True)
df.drop(['Extra', 'High', 'Volume', 'Dates', 'Open'], axis=1, inplace=True)
df = df.between_time('9:30', '16:00')
df['Low'] = df.Low.groupby(by=[df.index.day]).min()
df

The output is:

                     Low    Close
Dates       
2020-01-02 09:30:00 NaN 158.610
2020-01-02 09:31:00 NaN 158.380
2020-01-02 09:32:00 NaN 158.620
2020-01-02 09:33:00 NaN 158.692
2020-01-02 09:34:00 NaN 158.910
... ... ...
2020-12-18 15:56:00 NaN 218.700
2020-12-18 15:57:00 NaN 218.540
2020-12-18 15:58:00 NaN 218.710
2020-12-18 15:59:00 NaN 218.150
2020-12-18 16:00:00 NaN 218.500

So the issue is that the lows are filled with NaN values, which I am assuming is because I am misusing groupby. I have also tried:

ticker='MSFT'
df = pd.read_csv('/Volumes/Seagate Portable/S&P 500 List/{}.txt'.format(ticker))
df.columns = ['Extra', 'Dates', 'Open', 'High', 'Low', 'Close', 'Volume']
df.Dates = pd.to_datetime(df.Dates)
df.set_index(df.Dates, inplace=True)
df.drop(['Extra', 'High', 'Volume', 'Dates', 'Open'], axis=1, inplace=True)
df = df.between_time('9:30', '16:00')
df = df.groupby(by=[df.index.day]).min()
df

The output to this is:

         Low    Close
Dates       
1   150.8200    150.9800
2   150.3600    150.8400
3   152.1900    152.2800
4   165.6200    165.7000
5   165.6900    165.8200
6   156.0000    156.0700
7   157.3200    157.3500
8   157.9491    158.0000
9   150.0000    150.2700
10  152.5800    152.7950
11  151.1500    151.1930
12  138.5800    138.7600
13  140.7300    140.8700
14  161.7200    161.7500
15  162.5700    162.6300
16  135.0000    135.3300
17  135.0000    135.3400
18  135.0200    135.2600
19  139.0000    139.1300
20  135.8600    136.5900
21  166.1102    166.2100
22  165.6800    165.6900
23  132.5200    132.7100
24  141.2700    141.6481
25  144.4400    144.8102
26  148.3700    149.7000
27  149.2000    149.2700
28  152.0000    153.8152
29  165.6900    165.7952
30  150.0100    152.7200
31  156.5600    157.0450

The issue with this is that it is taking the minimum of both the Low and Close columns. Also, there are only 31 total rows, though there should be more given this is a dataset covering all of 2020. I am assuming I am grouping wrong, because I looked at the close prices of the first 31 days, and there is no way these are the lows of each of those days. So the question is: how can I find the low of each day without affecting the Close column, while avoiding the issues mentioned above?
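For reference, I think `df.index.day` only gives the day of the month (1–31), so every "2nd of the month" collapses into one group. Here is what I have been experimenting with: grouping by the full calendar date instead, and broadcasting the per-day minimum back onto the minute rows with `transform` (a sketch on made-up minute bars, not my real CSV):

```python
import pandas as pd

# Made-up minute bars standing in for the real CSV
idx = pd.to_datetime([
    '2020-01-02 09:30', '2020-01-02 09:31',
    '2020-01-03 09:30', '2020-01-03 09:31',
])
df = pd.DataFrame({'Low':   [158.5, 158.1, 152.3, 152.9],
                   'Close': [158.6, 158.4, 152.4, 153.0]}, index=idx)

# Group by the full calendar date (not just day-of-month) and broadcast
# each day's minimum back onto every minute row of that day:
df['DayLow'] = df.groupby(df.index.date)['Low'].transform('min')

# Or collapse to one row per trading day, leaving Close untouched:
daily_lows = df.groupby(df.index.date)['Low'].min()
```

The `transform` version keeps the original minute-level index (so no NaNs from misaligned assignment), while the `.min()` version gives one row per calendar date rather than per day-of-month.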

Upvotes: 1

Views: 223

Answers (1)

ljuk

Reputation: 784

Try this:

unique_dates = sorted(set(df.index.date))  # one entry per calendar date

min_values_daily = [df.loc[df.index.date == date, 'Low'].min() for date in unique_dates]

And finally, create a new dataframe:

low_data = pd.DataFrame({
     'date': unique_dates,
     'low': min_values_daily
})
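Alternatively, pandas can do the whole thing in one groupby, which avoids the per-date loop (a sketch on made-up minute bars, since I don't have your CSV):

```python
import pandas as pd

# Made-up minute bars standing in for the question's data
idx = pd.to_datetime(['2020-01-02 09:30', '2020-01-02 09:31',
                      '2020-01-03 09:30'])
df = pd.DataFrame({'Low':   [158.5, 158.1, 152.3],
                   'Close': [158.6, 158.4, 152.4]}, index=idx)

# One low per calendar date; the Close column is left untouched
low_data = (df.groupby(df.index.date)['Low']
              .min()
              .rename_axis('date')
              .reset_index(name='low'))
```

This scans the frame once instead of once per date, and the result has the same `date`/`low` shape as the DataFrame built above.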

Upvotes: 1
