Python Difference between iloc indexes

Question

Newbie here.

I'm trying to learn Python and work with datasets; I've kind of been thrown into the deep end at work. The language is clearly very powerful, but very different from anything else I've experienced before.

I need some clarity / help / explanation on the following please.

Partial Algo code

history = data.history(context.stock_list, fields="price", bar_count=300, frequency="1d")
hs = history.iloc[-20:]
p = history.iloc[-1]

What is the difference between the 3 variables shown?

hs1 = history.iloc[:20]   
hs2 = history.iloc[20:]
hs3 = history.iloc[-20]

history creates a data set of 4 asset prices, as can be seen from the image under "Additional Info".

I've researched and learned that iloc is a pandas indexing and referencing function.

However, what I do not understand is the [:20], [20:], [-20] indexes(?) attached to iloc in the 3 example variables shown above.

Questions

hs1 = history.iloc[:20]: According to my research, following a Python programming tutorial on pandas dataframes, hs1 = history.iloc[:20] singles out the first 20 columns within the dataframe. Is this correct?
hs2 = history.iloc[20:]: What is the difference from the above variable?
hs3 = history.iloc[-20]: Why a minus (-) and no : inside the index?

Additional Info

The history variable creates a dataset of 3 assets.

Hope this makes sense. Please comment if you need any additional info; any help and advice is much appreciated.

cs95 · Accepted Answer

Before beginning anything else, I recommend reading "Understanding Python's slice notation" to get first-class insight into how Python's slice notation works. In particular, look at the different slice modes available to you:

a[start:end] # items start through end-1
a[start:]    # items start through the rest of the array
a[:end]      # items from the beginning through end-1
a[:]         # a copy of the whole array

a[start:end] returns a sub-slice from index start (inclusive) to end - 1
```
>>> lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> lst[2:5]
[3, 4, 5]
```
a[start:] returns a sub-slice from start till the end of the list.
```
>>> lst[5:]
[6, 7, 8, 9, 10]
```
a[:end] returns a sub-slice from the beginning of the list till end - 1.
```
>>> lst[:5]
[1, 2, 3, 4, 5]
```
a[:] just returns a new copy of the same list.
```
>>> lst[:]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```
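
Your snippet also uses negative numbers (-20:, -1), which the modes above don't show explicitly. A small sketch on the same list, assuming nothing beyond plain Python: a negative index counts from the end, and it can be used either on its own (one item) or inside a slice (a sub-list).

```python
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

last_item = lst[-1]             # single item: the last one
third_from_end = lst[-3]        # single item: third from the end
last_three = lst[-3:]           # slice: the last three items, as a list
all_but_last_three = lst[:-3]   # slice: everything except the last three

print(last_item)            # 10
print(third_from_end)       # 8
print(last_three)           # [8, 9, 10]
print(all_but_last_three)   # [1, 2, 3, 4, 5, 6, 7]
```

Note the difference the colon makes: lst[-3] is one element, while lst[-3:] is a list.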

Understand this, and you've understood dataframe indexing.

iloc is used to select dataframe sub-slices by integer position, and the same slicing rules apply. Here's the documentation:

DataFrame.iloc

Purely integer-location based indexing for selection by position.

.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
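
As an aside on the "boolean array" part of that quote, here is a minimal sketch. The frame and the mask are made up for illustration; the mask is converted to a NumPy array with one True/False entry per row.

```python
import pandas as pd

# Hypothetical frame: one column "A" holding 0..9.
df = pd.DataFrame({"A": range(10)})

# Boolean NumPy array, one entry per row (True = keep that row).
mask = (df["A"] % 2 == 0).to_numpy()

evens = df.iloc[mask]
print(evens["A"].tolist())  # [0, 2, 4, 6, 8]
```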

It's a bit much to take in, but the pandas cookbook makes it simple. The basic syntax is:

df.iloc[x, y]

Where x is the row index/slice and y is the column index/slice. If the second argument is omitted, row slicing is assumed. In your case, you have:

history.iloc[:20] which returns the first 20 rows.
history.iloc[20:] which returns everything after the first 20 rows.
history.iloc[-20], which is interpreted as history.iloc[len(history) - 20], i.e. the 20th row from the end (negative indices count from the end). Because this is a single index rather than a slice, it returns that one row, not a sub-dataframe.
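
To check this on something shaped like your data, here is a sketch with a made-up stand-in for history. The 300 daily rows and 4 asset columns follow your description; the dates, column names and values are invented, since the real data.history call can't be reproduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `history`: 300 daily rows, 4 asset columns.
dates = pd.date_range("2020-01-01", periods=300, freq="D")
history = pd.DataFrame(
    np.arange(300 * 4).reshape(300, 4),
    index=dates,
    columns=["asset_a", "asset_b", "asset_c", "asset_d"],  # invented names
)

hs1 = history.iloc[:20]   # first 20 rows (all columns)
hs2 = history.iloc[20:]   # everything after the first 20 rows
hs3 = history.iloc[-20]   # one row: the 20th from the end
hs = history.iloc[-20:]   # last 20 rows
p = history.iloc[-1]      # one row: the last one

print(hs1.shape)  # (20, 4)
print(hs2.shape)  # (280, 4)
print(hs3.shape)  # (4,)
print(hs.shape)   # (20, 4)
print(p.shape)    # (4,)

# Columns are only touched when a second argument is given.
first_two_assets = history.iloc[:20, :2]
print(first_two_assets.shape)  # (20, 2)
```

So [:20] selects rows, not columns, and nothing is deleted: history itself is left unchanged.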

Consider a dataframe with a single column A holding the integers 0 through 9:
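
The definition of df isn't shown here. Judging from the outputs that follow, a frame like the one below would reproduce them; treat the exact constructor call as an assumption.

```python
import pandas as pd

# Assumed reconstruction: one column "A" with values 0..9
# and the default integer index 0..9.
df = pd.DataFrame({"A": range(10)})

print(df.shape)  # (10, 1)
```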

Here are the different slice modes in action.

df.iloc[:5]

   A
0  0
1  1
2  2
3  3
4  4

df.iloc[5:]

   A
5  5
6  6
7  7
8  8
9  9

df.iloc[-5]

A    5
Name: 5, dtype: int64
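
The last output looks different because a single position returns a Series (one row), while a slice returns a DataFrame. A short sketch of that distinction, assuming the same one-column frame as above (rebuilt here so the snippet runs on its own):

```python
import pandas as pd

df = pd.DataFrame({"A": range(10)})  # assumed stand-in for the frame above

row = df.iloc[-5]              # single position -> Series (one row)
tail = df.iloc[-5:]            # slice -> DataFrame (last five rows)
one_row_frame = df.iloc[[-5]]  # list with one position -> one-row DataFrame

print(type(row).__name__)    # Series
print(type(tail).__name__)   # DataFrame
print(tail["A"].tolist())    # [5, 6, 7, 8, 9]
print(one_row_frame.shape)   # (1, 1)
```

This is the same distinction as between history.iloc[-20] and history.iloc[-20:] in your code.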

References


`DataFrame.iloc`
