Pravin Wagh

Reputation: 39

Python Pandas Data Formatting: Difference in passing index and iloc

This is the sample data:

dict_country_gdp = pd.Series([52056.01781,40258.80862,40034.85063,39578.07441],
    index = ['Luxembourg','Norway', 'Japan', 'Switzerland'])

What is the difference between dict_country_gdp[0] and dict_country_gdp.iloc[0]?

Both return the same result here, so when should I use which?

Upvotes: 2

Views: 89

Answers (1)

Sayali Sonawane

Reputation: 12599

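[] looks a value up by its index label, while .iloc always looks it up by integer position. When the index is the default integer index (0, 1, 2, ...), labels and positions coincide, so both give the same result. With a string index like yours, pandas has historically fallen back to positions when [] receives an integer, which is why you also see the same value.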

SERIES WITH THE DEFAULT INTEGER INDEX:

import pandas as pd

dict_country_gdp = pd.Series([52056.01781, 40258.80862,40034.85063,39578.07441])

dict_country_gdp

Out[]: 
0    52056.01781
1    40258.80862
2    40034.85063
3    39578.07441
dtype: float64

dict_country_gdp[0]
Out[]: 52056.017809999998

dict_country_gdp.iloc[0]
Out[]: 52056.017809999998   
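For the labelled series from the question, here is a small sketch of the same distinction. The variable names (gdp, sorted_gdp, and so on) are mine, not from the question. It avoids passing an integer to [] on a string index, since that relies on a positional fallback that newer pandas versions discourage.

```python
import pandas as pd

# Same data as in the question, with country names as the index labels.
gdp = pd.Series(
    [52056.01781, 40258.80862, 40034.85063, 39578.07441],
    index=['Luxembourg', 'Norway', 'Japan', 'Switzerland'],
)

by_label = gdp['Luxembourg']       # [] with a label
by_loc = gdp.loc['Luxembourg']     # explicit label-based lookup
by_position = gdp.iloc[0]          # explicit position-based lookup

# After reordering, positions change but labels stay attached to their values.
sorted_gdp = gdp.sort_values()                    # ascending, smallest GDP first
first_after_sort = sorted_gdp.iloc[0]             # now Switzerland's value
label_after_sort = sorted_gdp.loc['Luxembourg']   # still Luxembourg's value
```

So use .iloc when you mean "the n-th element" and a label lookup when you mean "the element called X"; the two only agree as long as the order does not change.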

SERIES WITH A NON-INTEGER (FLOAT) INDEX:

dict_country_gdp = pd.Series([52056.01781, 40258.80862, 40034.85063, 39578.07441],
    index=[52056.01781, 40258.80862, 40034.85063, 39578.07441])

dict_country_gdp 
Out[]: 
52056.01781    52056.01781
40258.80862    40258.80862
40034.85063    40034.85063
39578.07441    39578.07441
dtype: float64

In this scenario, [] treats 0 as a label. No such label exists in the index, so the lookup fails, while .iloc still works by position:

dict_country_gdp[0]
Out[]: KeyError: 0.0

dict_country_gdp.iloc[0]
Out[]: 52056.017809999998 

The same applies to slices: [] slices this index by label and finds nothing between 0 and 2, while .iloc slices by position:

dict_country_gdp[0:2]
Out[]: Series([], dtype: float64)

dict_country_gdp.iloc[0:2]
Out[]: 
52056.01781    52056.01781
40258.80862    40258.80862
dtype: float64
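If you do want a label-based slice on such an index, .loc makes that explicit. A minimal sketch, using the same float-indexed series (variable names are mine); note that .loc includes both endpoints, while .iloc excludes the stop position like a normal Python slice:

```python
import pandas as pd

values = [52056.01781, 40258.80862, 40034.85063, 39578.07441]
gdp = pd.Series(values, index=values)   # float labels, as in the example above

# Position-based: rows 0 and 1, stop position excluded.
positional = gdp.iloc[0:2]

# Label-based: from the first label through the third label, both included.
by_label = gdp.loc[52056.01781:40034.85063]
```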

Documentation states:

.iloc is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array. .iloc will raise IndexError if a requested indexer is out-of-bounds, except slice indexers which allow out-of-bounds indexing. (this conforms with python/numpy slice semantics). Allowed inputs are:

  • An integer e.g. 5
  • A list or array of integers [4, 3, 0]
  • A slice object with ints 1:7
  • A boolean array
  • A callable function with one argument (the calling Series, DataFrame or Panel) and that returns valid output for indexing (one of the above)
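The allowed inputs listed above can be tried out on the series from the question. This is only an illustrative sketch; the variable names are mine:

```python
import numpy as np
import pandas as pd

gdp = pd.Series(
    [52056.01781, 40258.80862, 40034.85063, 39578.07441],
    index=['Luxembourg', 'Norway', 'Japan', 'Switzerland'],
)

single = gdp.iloc[3]                                      # an integer
picked = gdp.iloc[[3, 0]]                                 # a list of integers
window = gdp.iloc[1:3]                                    # a slice with ints
masked = gdp.iloc[np.array([True, False, False, True])]   # a boolean array
via_callable = gdp.iloc[lambda s: [0, 1]]                 # a callable returning a valid indexer

# Slices may run past the end; they are simply clipped.
clipped = gdp.iloc[2:10]

# A single out-of-bounds integer raises IndexError.
try:
    gdp.iloc[10]
    out_of_bounds_raised = False
except IndexError:
    out_of_bounds_raised = True
```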

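On a DataFrame, [] selects columns by label (or rows, when given a slice or boolean mask), so it cannot be used to fetch a row by integer position. Use .iloc whenever you mean a position, and .loc or [] whenever you mean a label; this holds for both Series and DataFrames.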
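A short sketch of how this looks on a DataFrame. The frame and its column names ('country', 'gdp') are made up for illustration from the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    'country': ['Luxembourg', 'Norway', 'Japan', 'Switzerland'],
    'gdp': [52056.01781, 40258.80862, 40034.85063, 39578.07441],
})

gdp_column = df['gdp']     # [] with a column label returns that column as a Series
first_row = df.iloc[0]     # .iloc with an integer returns the first row
cell = df.iloc[0, 1]       # row position 0, column position 1

# [] with an integer looks for a COLUMN labelled 0, which does not exist here.
try:
    df[0]
    key_error_raised = False
except KeyError:
    key_error_raised = True
```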

Upvotes: 1
