Reputation: 1598
Newbie here.
I'm trying to learn Python and work with datasets; I've been thrown into the deep end at work. The language is clearly very powerful, but very different from anything I've used before.
I need some clarity / help / explanation on the following please.
Partial Algo code
history = data.history(context.stock_list, fields="price", bar_count=300, frequency="1d")
hs = history.iloc[-20:]
p = history.iloc[-1]
What is the difference between the 3 variables shown?
hs1 = history.iloc[:20]
hs2 = history.iloc[20:]
hs3 = history.iloc[-20]
history creates a dataset of 4 asset prices, as can be seen from the image under "Additional Info".
I've researched and learned that iloc is a pandas indexing and referencing function.
However, what I do not understand is the [:20], [20:], and [-20] indexes(?) attached to iloc in the 3 example variables shown above.
Questions
hs1 = history.iloc[:20]: According to my research, following a Python programming tutorial on pandas dataframes, hs1 = history.iloc[:20] singles out / deletes the first 20 columns within the dataframe. Is this correct?
hs2 = history.iloc[20:]: What is the difference from the above variable?
hs3 = history.iloc[-20]: Why a minus - and no : inside the index?
Additional Info
History variable creates dataset of 3 assets
Hope this makes sense. Please comment if you need any additional info; any help and advice is much appreciated.
Upvotes: 3
Views: 1633
Reputation: 402333
Before beginning anything else, I recommend reading Understanding Python's slice notation to get a first-class insight into how Python's slicing notation works. In particular, look at the different slice modes available to you:
a[start:end]  # items start through end-1
a[start:]     # items start through the rest of the array
a[:end]       # items from the beginning through end-1
a[:]          # a copy of the whole array
a[start:end] returns a sub-slice from index start (inclusive) to end - 1.
>>> lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> lst[2:5]
[3, 4, 5]
a[start:] returns a sub-slice from start till the end of the list.
>>> lst[5:]
[6, 7, 8, 9, 10]
a[:end] returns a sub-slice from the beginning of the list till end - 1.
>>> lst[:5]
[1, 2, 3, 4, 5]
a[:] just returns a new copy of the same list.
>>> lst[:]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Understand this, and you've understood dataframe indexing.
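Since the question also asks why one index has a minus sign and no colon, here is a small sketch on the same list. It only illustrates plain Python behaviour; the variable names are made up for the example.

```python
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

last_three = lst[-3:]          # slice with a colon: the last three items, still a list
third_from_end = lst[-3]       # single index, no colon: one item, not a list
same_item = lst[len(lst) - 3]  # a negative index counts back from the end

print(last_three)      # [8, 9, 10]
print(third_from_end)  # 8
print(same_item)       # 8
```

The colon is what makes it a slice; without it you get a single element back.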
As I've already mentioned, iloc is used to select dataframe subslices by their integer position, and the same rules apply. Here's the documentation:
DataFrame.iloc
Purely integer-location based indexing for selection by position.
.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
It's a bit much to take in, but the pandas cookbook makes it simple. The basic syntax is:
df.iloc[x, y]
Where x is the row index/slice and y is the column index/slice. If the second argument is omitted, row slicing is assumed. In your case, you have:
history.iloc[:20], which returns the first 20 rows.
history.iloc[20:], which returns everything after the first 20 rows.
history.iloc[-20], which is interpreted as history.iloc[len(history) - 20], the 20th row from the end (negative indices specify indexing from the end).
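To make the three cases concrete, here is a rough sketch using a stand-in for history. I don't have your data, so the frame below is an assumption: 300 daily rows (matching bar_count=300) for four made-up tickers filled with dummy numbers. Note that none of these calls delete anything; they select rows, not columns, and history itself is left unchanged.

```python
import numpy as np
import pandas as pd

# Stand-in for `history`: 300 daily rows for four made-up tickers (dummy values, not real prices).
dates = pd.date_range("2020-01-01", periods=300, freq="D")
history = pd.DataFrame(
    np.arange(1200, dtype=float).reshape(300, 4),
    index=dates,
    columns=["AAA", "BBB", "CCC", "DDD"],
)

hs1 = history.iloc[:20]  # first 20 rows, all columns
hs2 = history.iloc[20:]  # the remaining 280 rows, all columns
hs3 = history.iloc[-20]  # a single row: the 20th from the end

print(hs1.shape)      # (20, 4)
print(hs2.shape)      # (280, 4)
print(hs3.shape)      # (4,)
print(hs3.equals(history.iloc[len(history) - 20]))  # True
print(history.shape)  # (300, 4) -- the original is untouched
```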
Consider a dataframe:
df
A
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
Here are the different slice modes in action.
df.iloc[:5]
A
0 0
1 1
2 2
3 3
4 4
df.iloc[5:]
A
5 5
6 6
7 7
8 8
9 9
df.iloc[-5]
A 5
Name: 5, dtype: int64
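A few variations that may help, sketched under the assumption of an extra column B (not in the frame above, added only so there is a column to pick). It shows that a single integer gives back a Series, a slice gives back a DataFrame, and the df.iloc[x, y] form selects rows and columns together.

```python
import pandas as pd

# Same A column as above, plus a hypothetical B column so there is something to slice by column.
df = pd.DataFrame({"A": range(10), "B": range(10, 20)})

one_row = df.iloc[-5]          # single integer -> Series holding that one row
last_five = df.iloc[-5:]       # slice -> DataFrame with the last five rows
last_five_b = df.iloc[-5:, 1]  # rows from -5 onward, column at position 1 ("B") -> Series

print(type(one_row).__name__)  # Series
print(last_five.shape)         # (5, 2)
print(last_five_b.tolist())    # [15, 16, 17, 18, 19]
```

This is the same difference as between history.iloc[-20] and history.iloc[-20:] in your code: the first is one row, the second is the last 20 rows.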
Upvotes: 3