Reputation: 1
I'm kind of a rookie in python so sorry in advance if I'll raise trivial questions.
I need to create a DataFrame from existing one. my new df will take input from the user and should return a specific new data frame. In my raw data, I have 5 columns of stock info : 'open', 'close', 'high', 'low' and 'volume', and a 'timestamp' column which I already converted to datetime properly.
my user is given the option to pick one or more of the 5 data columns (i.e 'open', 'close', 'high', 'low' and 'volume'), A ticker name and a range of preferable dates. ( I succeeded with the ticker name and the time) this is my output so far now i need to create a df that has another column - 'Data Type', that gets the request of the user, so my output shoud look, for instance, something like this:
timestamp ticker_name data type
2018-02-26 HP 67.35
2018-02-23 HP 701271
if the user asked to see only the 'open' and the 'volume' values of HP between
2018-02-23 to 2018-02-26 .
I've been struggling with this for Hours. any help will be highy appreciated! Thanks in advance!!
Upvotes: 0
Views: 2547
Reputation: 9008
It's a simple slice of the dataframe:
s = "2018-01-17"
e = "2018-02-24"
_type = "open"
ticker = "HP"
df[(df.index >= s) & (df.index <= e) & (df.ticker_name == ticker)][["ticker_name",_type]]
Upvotes: 0
Reputation: 294218
You're going to want a combination of things. IMO the simplest way is to set the Timestamp
column to be the index so you can use the timestamp slicing. Then use loc
to do the rest.
df.set_index('Timestamp')[start:end].loc[df.Ticker == 'HP', ['open', 'volume']]
Upvotes: 0
Reputation: 1210
First off, be careful because "open" has its own meaning in python. Make sure you are treating it like a string by using quotes.
If you want only columns ['open','close'], you would do return_df = df[['open','close']]
.
If you want only rows with timestamps between 2018-02-23 to 2018-02-26 you can do return_df = df[(df.timestamp >= '2018-02-03') & (df.timestamp <= '2018-02-26')]
.
Hope this helps.
Upvotes: 1