Reputation: 123
Currently I am trying to one hot encode a list of lists that contain single elements. What is a clean Pythonic way to go from representation 2 to representation 1? Additionally I would like to know a clean approach to go from representation 1 to representation 2.
Representation 1
[[1. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0.]
[0. 0. 0. 1. 0. 0.]
...
[0. 0. 1. 0. 0. 0.]
[0. 0. 0. 1. 0. 0.]
[0. 0. 1. 0. 0. 0.]]
(256, 6)
Representation 2
[[0.]
[3.]
[3.]
...
[2.]
[3.]
[2.]]
(256, 1)
Upvotes: 2
Views: 505
Reputation: 3023
Using a plain conditional list comprehension, for representation 1 to 2:
r1 = [[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1., 0.]]
len_r1l = len(r1[0]) # length of each sublist, here 6
r2 = [[0], [3], [4]]
r1_r2 = [[i] for l in r1 for i in range(len_r1l) if l[i]==1]
>>> [[0], [3], [4]]
and for representation 2 to 1:
r2_r1 = [[1. if i==idx[0] else 0 for i in range(len_r1l)] for idx in r2]
>>> [[1.0, 0, 0, 0, 0, 0],
[0, 0, 0, 1.0, 0, 0],
[0, 0, 0, 0, 1.0, 0]]
Equivalently by using numpy, with np.nonzero:
import numpy as np
# convert to arrays
r1_np = np.asarray(r1)
r2_np = np.asarray(r2)
r1_r2 = np.nonzero(r1_np)[1]
>>> array([0, 3, 4])
r2_r1 = np.zeros_like(r1_np)
r2_r1[np.arange(r1_r2.shape[0]),r1_r2] = 1.
>>> array([[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1., 0.]])
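Note that the assignment above reuses r1_r2 and the shape of r1_np, both of which come from representation 1. If only representation 2 is available, something along these lines might work. This is a minimal sketch that assumes the number of classes is known in advance; n_classes and indices are names introduced here for illustration.

```python
import numpy as np

# Representation 2 as in the question: shape (n, 1), float labels.
r2_np = np.array([[0.], [3.], [4.]])
n_classes = 6  # assumption: the class count is known ahead of time

# Flatten to 1-D integer indices, then pick the matching rows of the identity matrix.
indices = r2_np.ravel().astype(int)
r2_r1 = np.eye(n_classes)[indices]

print(r2_r1)
```

Indexing np.eye this way returns a new float array, so no preallocated buffer is needed.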
Then, if you really want to end up with lists, use the np.ndarray.tolist method:
r1_r2.tolist()
>>> [0, 3, 4]
r2_r1.tolist()
>>> [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]]
Benchmarking these answers on the intended input size of 256 clearly shows numpy's efficiency (roughly 15x and 26x faster in the timings below):
# representation 1 to 2
%timeit [[i] for l in r1 for i in range(len_r1l) if l[i]==1]
>>> 199 µs ± 431 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit np.nonzero(r1_np)[1]
>>> 13 µs ± 32.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
# representation 2 to 1
%timeit [[1. if i==idx[0] else 0 for i in range(len_r1l)] for idx in r2]
>>> 243 µs ± 820 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit r2_r1 = np.zeros_like(r1_np); r2_r1[np.arange(r1_r2.shape[0]),r1_r2] = 1.
>>> 9.42 µs ± 15.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
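The %timeit lines only run inside IPython. For a plain script, a rough equivalent could use the standard timeit module. This is only a sketch: the random one-hot data, the seed, and the function names are assumptions made here, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

# Assumption: random one-hot data with the question's shape (256, 6).
rng = np.random.default_rng(0)
labels = rng.integers(0, 6, size=256)
r1_np = np.eye(6)[labels]
r1 = r1_np.tolist()
width = len(r1[0])

def listcomp_1_to_2():
    return [[i] for row in r1 for i in range(width) if row[i] == 1]

def numpy_1_to_2():
    return np.nonzero(r1_np)[1]

# Both approaches should agree before their timings are compared.
assert [i for [i] in listcomp_1_to_2()] == numpy_1_to_2().tolist()

for fn in (listcomp_1_to_2, numpy_1_to_2):
    seconds = timeit.timeit(fn, number=1000)
    print(f"{fn.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```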
Hope this helps.
Upvotes: 3
Reputation: 1744
Using numpy,
rep_2 = np.where(condition)[1].reshape(rep1.shape[0], 1)
where condition could be stated in several ways, depending upon your requirement. Convert rep_2 to a list if you so wish.
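The concrete conditions are not shown in this answer, so the two below are illustrative guesses rather than the author's own list. A minimal sketch, assuming rep1 is a NumPy array with exactly one hot entry per row (otherwise the reshape fails):

```python
import numpy as np

rep1 = np.array([[1., 0., 0., 0., 0., 0.],
                 [0., 0., 0., 1., 0., 0.],
                 [0., 0., 0., 0., 1., 0.]])

# Two hypothetical ways to write the condition; both mark the hot entries.
condition_eq = rep1 == 1
condition_gt = rep1 > 0

# np.where returns (row_indices, column_indices); the columns are the labels.
rep_2 = np.where(condition_eq)[1].reshape(rep1.shape[0], 1)
rep_2_alt = np.where(condition_gt)[1].reshape(rep1.shape[0], 1)

print(rep_2.tolist())  # [[0], [3], [4]]
```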
Upvotes: 0
Reputation: 1012
Representation 1 --> 2:
If you know that every list will have one and only one 1, you can use list.index in a list comprehension:
list_of_lists = [ # Your initial list
[1, 0, 0],
[0, 1, 0],
[0, 0, 1]
]
list_of_ones_indices = [[lst.index(1)] for lst in list_of_lists]
# [[0], [1], [2]]
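If you are not sure that every row holds exactly one 1, list.index raises ValueError on a row without a 1 and silently returns the first match on a row with several. A small guard like the one below could catch both cases. The helper name one_hot_index and the None placeholder are choices made for this sketch only.

```python
list_of_lists = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],  # malformed on purpose: no 1 at all
]

def one_hot_index(row):
    """Return the position of the single 1 in row, or raise ValueError."""
    if row.count(1) != 1:
        raise ValueError(f"expected exactly one 1, got {row!r}")
    return row.index(1)

indices = []
for row in list_of_lists:
    try:
        indices.append([one_hot_index(row)])
    except ValueError:
        indices.append([None])  # hypothetical placeholder for invalid rows

print(indices)  # [[0], [1], [None]]
```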
Representation 2 --> 1:
This numpy solution might be closer to what you're looking for. If you want a pure-Python solution, here you go:
index_list = [[1], [2], [3]]  # representation 2: one index per sublist
LENGTH = 6
one_hot_list = []
# This can also be achieved with a list comprehension and range()
for index in index_list:
    one_hot = [0] * LENGTH
    one_hot[index[0]] = 1
    one_hot_list.append(one_hot)
print(one_hot_list)
# [
# [0, 1, 0, 0, 0, 0],
# [0, 0, 1, 0, 0, 0],
# [0, 0, 0, 1, 0, 0]
# ]
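The list-comprehension variant mentioned in the comment could look roughly like this, assuming each index is wrapped in its own sublist as in representation 2:

```python
index_list = [[1], [2], [3]]  # representation 2: one index per sublist
LENGTH = 6

# Unpack each single-element sublist, then emit 1 only at the matching position.
one_hot_list = [
    [1 if position == index else 0 for position in range(LENGTH)]
    for [index] in index_list
]
print(one_hot_list)
```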
Upvotes: 1
Reputation: 153500
IIUC,
np.argmax(a, axis=1)[:, None]
Using @Yacola's setup:
r1 = [[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1., 0.]]
a = np.array(r1)
np.argmax(a, axis=1)[:, None]
Output:
array([[0],
[3],
[4]])
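One caveat: np.argmax returns the first maximum, so a row of all zeros comes back as index 0 and looks the same as a genuine class 0. If the input might contain such rows, a validity mask along these lines could help. This is a sketch; is_one_hot is a name introduced here.

```python
import numpy as np

a = np.array([[0., 0., 1.],
              [0., 0., 0.],   # not a valid one-hot row
              [1., 0., 0.]])

labels = np.argmax(a, axis=1)[:, None]
print(labels.tolist())  # [[2], [0], [0]]

# A valid one-hot row contains only 0s and 1s and sums to exactly 1.
is_one_hot = (a.sum(axis=1) == 1) & np.isin(a, (0, 1)).all(axis=1)
print(is_one_hot.tolist())  # [True, False, True]
```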
Upvotes: 0
Reputation: 11496
For converting from representation 2 to representation 1, you can use something like keras.utils.to_categorical (exposed as keras.utils.np_utils.to_categorical in older Keras releases):
>>> y = [0, 1, 2]
>>> np_utils.to_categorical(y)
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
Upvotes: 0