Reputation: 59
I have a csv file which contains 30 years of stocks data. A portion of it looks like this:
How can I extract the last trading date of a specific month of a specific stock, say the last trading month of June 2018 of stock s_0003?
This is my code, but it doesn't seem to be working for me because my output was 6/9/18. Is there something wrong with my code? Or is there another method I could use to extract the last trading date?
df = pd.read_csv(stocks)
new = df[['date', 's_0003']]
jun = new[pd.to_datetime(new['date']).dt.month == 6]
jun = jun[pd.to_datetime(jun['date']).dt.year == 2018]
lastday = (max(jun['date']))
stock0003 = df.loc[df["date"].isin([lastday]), "s_0003"]
Upvotes: 0
Views: 74
Reputation: 120409
Use Period
to extract the year-month
from date
, filter by the query string ('2018-06') and keep the last index.
QS = '2018-06'
df = pd.read_csv(stocks, parse_dates=['date']).sort_values('date')
stock0003 = df.loc[df['date'].dt.to_period('M').eq(QS).tail(1).index, 's_0003']
sort_values
is not mandatory if your dates are already sorted in your csv file.
A better idea is to set date
column as index of your dataframe using pd.read_csv(..., index_col = 'date')
:
Upvotes: 0
Reputation: 10624
Try by replacing the last 2 lines with these:
lastday = (max(pd.to_datetime(jun['date']).dt.day))
stock0003 = jun.loc[pd.to_datetime(jun['date']).dt.day == lastday, "s_0003"].iloc[0]
Upvotes: 1
Reputation: 26
We can just use tail/head depending on the sort order.
df = pd.read_csv(stocks)
df = df.sort_values('date')
print(df.tail(1))
This way we can be sure that the dataframe is sorted is ascending order before we take the last element.
Ref:
Upvotes: 0