Reputation: 21
data = np.loadtxt('In_file', dtype=float, delimiter=',')
x_test, y_test = np.split(data, (-1, ), axis=1)
What I know is that this line of code divides the data into two parts, but what does the parameter (-1,) mean?
Upvotes: 2
Views: 245
Reputation: 231530
Often when indexing, -1 means "from the end". (-1,) is a 1-element tuple.
Its meaning in this context is a little harder to imagine, but a simple test makes it clearer:
In [304]: x=np.arange(10)
In [305]: np.split(x, (-1,))
Out[305]: [array([0, 1, 2, 3, 4, 5, 6, 7, 8]), array([9])]
It splits the array so that the last part is 1 element long. Don't get confused by the tuple notation; any 1-D sequence of indices works, and a list such as [-1] gives the same result:
In [307]: np.split(x, [-1])
Out[307]: [array([0, 1, 2, 3, 4, 5, 6, 7, 8]), array([9])]
We can split with 3 items in the last array, or 3 items in the first:
In [308]: np.split(x, [-3])
Out[308]: [array([0, 1, 2, 3, 4, 5, 6]), array([7, 8, 9])]
In [309]: np.split(x, [3])
Out[309]: [array([0, 1, 2]), array([3, 4, 5, 6, 7, 8, 9])]
Or a 3 way split, with 3 items in the first, 2 in the last:
In [311]: np.split(x, [3,-2])
Out[311]: [array([0, 1, 2]), array([3, 4, 5, 6, 7]), array([8, 9])]
This last split is equivalent to taking 3 slices:
In [313]: x[0:3],x[3:-2],x[-2:]
Out[313]: (array([0, 1, 2]), array([3, 4, 5, 6, 7]), array([8, 9]))
Your case is a 2d array, and it's doing the split on columns. So in effect y_test is the last column, and x_test is the rest.
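To make the 2d case concrete, here is a minimal sketch. The array `data` is a made-up stand-in for whatever `In_file` contains (the real file's shape is not given in the question), and `x_slice` / `y_slice` are names introduced only for the comparison:

```python
import numpy as np

# Hypothetical stand-in for the loaded file: 4 rows, 4 columns
# (3 feature columns plus 1 label column). The real In_file is unknown.
data = np.arange(16, dtype=float).reshape(4, 4)

# Split at column index -1: all columns before the last, then the last column.
x_test, y_test = np.split(data, (-1,), axis=1)

# The same result written as two plain slices.
x_slice, y_slice = data[:, :-1], data[:, -1:]

print(x_test.shape, y_test.shape)  # (4, 3) (4, 1)
```

Note that y_test stays 2d with a single column; if a flat vector of labels is wanted, y_test.ravel() should do it.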
Upvotes: 2
Reputation: 17074
np.split splits the array along the column axis (axis=1) at the indices given in the tuple. (-1, ) will split it into 2 parts: everything before the last column, and the last column.
import numpy as np
x = np.arange(9.0).reshape(3, 3)
print(x, '\n')
# split at column index -1: all but the last column, then the last column
a = np.split(x, (-1,), axis=1)
print(a, '\n')
print(a[0], '\n')
print(a[1], '\n')
Output:
[[ 0.  1.  2.]
 [ 3.  4.  5.]
 [ 6.  7.  8.]]
[array([[ 0.,  1.],
       [ 3.,  4.],
       [ 6.,  7.]]), array([[ 2.],
       [ 5.],
       [ 8.]])]
[[ 0.  1.]
 [ 3.  4.]
 [ 6.  7.]]
[[ 2.]
 [ 5.]
 [ 8.]]
Upvotes: 1
Reputation: 5714
numpy.split(ary, indices_or_sections, axis=0)
indices_or_sections : int or 1-D array
If indices_or_sections is an integer, N, the array will be divided into N equal arrays along axis. If such a split is not possible, an error is raised.
If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along axis the array is split. If an index exceeds the dimension of the array along axis, an empty sub-array is returned correspondingly.
(Source: NumPy documentation)
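The three behaviours described in that docstring can be checked with a short sketch. The variable names (`halves`, `uneven_error`, `parts`) are just illustrative:

```python
import numpy as np

x = np.arange(10)

# Integer argument: divide into N equal sections.
halves = np.split(x, 2)  # two arrays of 5 elements each

# An integer that does not divide the length evenly raises ValueError.
try:
    np.split(x, 3)
    uneven_error = None
except ValueError as exc:
    uneven_error = exc

# Sequence argument: entries are split points.
# An index past the end produces an empty sub-array.
parts = np.split(x, [8, 12])  # x[:8], x[8:12], x[12:]

print([len(h) for h in halves])  # [5, 5]
print([len(p) for p in parts])   # [8, 2, 0]
```

So in the question, (-1,) falls under the second case: a sequence with a single split point, counted from the end.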
Upvotes: 0