CristiArde

Reputation: 31

Python - numpy array syntax confusion?

I'm following this SVM classifier example code from a book that I'm reading.

I'm new to Python and have a hard time understanding and visualizing array syntax such as [:,1] and [:,:-1]. Could someone please explain what the last 3 lines of code are supposed to do? I would greatly appreciate it.

# Convert string data to numerical data
label_encoder = []
X_encoded = np.empty(X.shape)
for i, item in enumerate(X[0]):
    if item.isdigit():
        X_encoded[:, i] = X[:, i]
    else:
        label_encoder.append(preprocessing.LabelEncoder())
        X_encoded[:, i] = label_encoder[-1].fit_transform(X[:, i])

X = X_encoded[:, :-1].astype(int)
y = X_encoded[:, -1].astype(int)

Upvotes: 0

Views: 189

Answers (1)

Arun Karunagath

Reputation: 1578

NumPy arrays offer capabilities well beyond what a Python list can do.

Also, in NumPy an index can hold one slice per dimension of the array, separated by commas.

Consider a 3x3 matrix, which has 2 dimensions. Let us see how some operations look with Python lists and with NumPy arrays:

>>> import numpy as np
>>> py = [[1,2,3],[3,4,5],[4,5,6]]
>>> npa = np.array(py)
>>> py[1:3] # [[3, 4, 5], [4, 5, 6]]
>>> npa[1:3] # array([[3, 4, 5], [4, 5, 6]])
>>> # get columns 2 and 3 from rows 2 and 3
>>> npa[1:3, 1:3] # row, col -> array([[4, 5], [5, 6]])

In case you are not familiar with list indexing and slicing:

py[:] # means every element in the list; also a shorthand for creating a (shallow) copy
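A few more plain-list slices may help before moving on to the two-dimensional case. This is only a small sketch using the same py list as above; the variable names are made up for illustration.

```python
py = [[1, 2, 3], [3, 4, 5], [4, 5, 6]]

first_two = py[:2]       # elements 0 and 1 (the stop index is excluded)
all_but_last = py[:-1]   # everything except the last element
last_row = py[-1]        # a single element: the last inner list
copy_of_py = py[:]       # a new outer list (shallow copy) with the same inner lists

print(first_two)
print(all_but_last)
print(last_row)
print(copy_of_py is py)  # the copy is a different list object
```

Note that a plain list only understands one slice at a time; something like py[:, 1] raises a TypeError, which is where NumPy's comma-separated indexing comes in.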

Taking this further, npa[:,1] gives you an array with the second column (,1]) of every row ([:,), i.e. array([2, 4, 5]).

Similarly, npa[:,:-1] gives an array with every column except the last one (,:-1]) for every row ([:,), i.e. array([[1, 2], [3, 4], [4, 5]]).
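Applied to the last two lines in the question, the same rules would read roughly as follows. This is a sketch with a small made-up X_encoded (the real one comes from the book's data), assuming the last column holds the class label:

```python
import numpy as np

# Hypothetical stand-in for X_encoded: 3 rows, last column is the label.
X_encoded = np.array([[1.0, 2.0, 0.0],
                      [3.0, 4.0, 1.0],
                      [4.0, 5.0, 1.0]])

# All rows, every column except the last: the feature matrix.
X = X_encoded[:, :-1].astype(int)

# All rows, only the last column: the 1-D label vector.
y = X_encoded[:, -1].astype(int)

print(X.shape)  # (3, 2)
print(y.shape)  # (3,)
```

So X ends up with the features and y with the labels, both converted from float to int by astype(int).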

Reference is here: https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

Upvotes: 1
