python - Generate a sequence by indices / one-hot encoding
I have a sequence s = [4,3,1,0,5] and num_classes = 6, and I want to generate a NumPy matrix m of shape (len(s), num_classes) where m[i,j] = 1 if s[i] == j else 0. Is there a function in NumPy to which I can pass s and num_classes?
This is called 1-of-k or one-hot encoding.
timeit results:

    def b():
        m = np.zeros((len(s), num_classes))
        m[np.arange(len(s)), s] = 1
        return m

    In [57]: timeit.timeit(lambda: b(), number=1000)
    Out[57]: 0.012787103652954102

    In [61]: timeit.timeit(lambda: (np.array(s)[:,None]==np.arange(num_classes))+0, number=1000)
    Out[61]: 0.018411874771118164
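The two timed variants can also be checked against each other. The sketch below assumes the same s and num_classes as in the question; the names via_indexing and via_broadcast are made up for illustration and do not come from the original post.

```python
import numpy as np

s = [4, 3, 1, 0, 5]
num_classes = 6

# Variant 1: start from zeros and set one entry per row
# with integer-array ("fancy") indexing.
via_indexing = np.zeros((len(s), num_classes))
via_indexing[np.arange(len(s)), s] = 1

# Variant 2: compare a column of labels against a row of class ids.
# Broadcasting a (5, 1) array against a (6,) array gives a (5, 6)
# boolean array; adding 0 converts it to integers.
via_broadcast = (np.array(s)[:, None] == np.arange(num_classes)) + 0

# Same values, although the dtypes differ (float vs. integer).
print(np.array_equal(via_indexing, via_broadcast))
```

Note that the first variant yields a float array and the second an integer array, so the two are equal in value but not in dtype.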
Since you want a single 1 per row, you can fancy-index using arange(len(s)) along the first axis and s along the second:

    import numpy as np

    s = [4,3,1,0,5]
    n = len(s)
    k = 6
    m = np.zeros((n, k))
    m[np.arange(n), s] = 1

    m
    => array([[ 0.,  0.,  0.,  0.,  1.,  0.],
              [ 0.,  0.,  0.,  1.,  0.,  0.],
              [ 0.,  1.,  0.,  0.,  0.,  0.],
              [ 1.,  0.,  0.,  0.,  0.,  0.],
              [ 0.,  0.,  0.,  0.,  0.,  1.]])

    m.nonzero()
    => (array([0, 1, 2, 3, 4]), array([4, 3, 1, 0, 5]))

This can be thought of as assigning at the index pairs (0,4), (1,3), (2,1), (3,0), (4,5).
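If this is needed in more than one place, the indexing trick can be wrapped in a small helper. This is only a sketch: the function name one_hot and its dtype parameter are hypothetical, not an existing NumPy function, and it assumes s holds integers in the range 0 to num_classes - 1.

```python
import numpy as np

def one_hot(s, num_classes, dtype=float):
    """Hypothetical helper: return an array of shape (len(s), num_classes)
    with a single 1 in each row, at the column given by s[i]."""
    s = np.asarray(s)
    m = np.zeros((s.shape[0], num_classes), dtype=dtype)
    # Pair row i with column s[i] and set that entry to 1.
    m[np.arange(s.shape[0]), s] = 1
    return m

m = one_hot([4, 3, 1, 0, 5], 6)

# Recover the (row, column) pairs that were set.
rows, cols = m.nonzero()
print(list(zip(rows.tolist(), cols.tolist())))
# → [(0, 4), (1, 3), (2, 1), (3, 0), (4, 5)]
```

An alternative that gives the same values is np.eye(num_classes)[s], which selects row s[i] of the identity matrix for each i.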