Tfovid

Reputation: 843

What does the `order` argument mean in `tf.keras.utils.normalize()`?

Consider the following code:

import numpy as np
import tensorflow as tf

A = np.array([[.8, .6], [.1, 0]])
B1 = tf.keras.utils.normalize(A, axis=0, order=1)
B2 = tf.keras.utils.normalize(A, axis=0, order=2)

print('A:')
print(A)
print('B1:')
print(B1)
print('B2:')
print(B2)

which returns

A:
[[0.8 0.6]
 [0.1 0. ]]
B1:
[[0.88888889 1.        ]
 [0.11111111 0.        ]]
B2:
[[0.99227788 1.        ]
 [0.12403473 0.        ]]

I understand how B1 is computed via order=1: each entry in A is divided by the sum of the absolute values of the elements in its column. For example, 0.8 becomes 0.8/(0.8+0.1) ≈ 0.889. However, I just can't figure out how order=2 produces B2, nor can I find any documentation about it.
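For reference, this manual computation reproduces B1, so I'm confident about the order=1 case; I'm looking for the analogous rule for order=2:

import numpy as np

A = np.array([[.8, .6], [.1, 0]])
# order=1: divide each entry by the sum of absolute values down its column
B1_manual = A / np.abs(A).sum(axis=0)
print(B1_manual)
# [[0.88888889 1.        ]
#  [0.11111111 0.        ]]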

Upvotes: 2

Views: 796

Answers (3)

Manoj Mohan

Reputation: 6044

However, I just can't figure out how order=2 produces B2 nor can I find any documentation about it.

order=1 means the L1 norm, while order=2 means the L2 norm. For the L2 norm, you take the square root after summing the individual squares. Which elements are squared together depends on the axis.

Keras

import numpy as np
import tensorflow as tf

A = np.array([[.8, .6], [.1, 0]])
B2 = tf.keras.utils.normalize(A, axis=0, order=2)
print(B2)

array([[0.99227788, 1.        ],
       [0.12403473, 0.        ]])

Manual

import numpy as np

# With axis=0, each column is divided by that column's own L2 norm
B2_manual = np.zeros((2, 2))
B2_manual[0][0] = 0.8 / np.sqrt(0.8 ** 2 + 0.1 ** 2)  # first column's norm
B2_manual[1][0] = 0.1 / np.sqrt(0.8 ** 2 + 0.1 ** 2)
B2_manual[0][1] = 0.6 / np.sqrt(0.6 ** 2 + 0 ** 2)    # second column's norm
B2_manual[1][1] = 0   / np.sqrt(0.6 ** 2 + 0 ** 2)
print(B2_manual)

array([[0.99227788, 1.        ],
       [0.12403473, 0.        ]])

You can look up the different types of norms here: https://en.wikipedia.org/wiki/Norm_(mathematics). Worked examples: https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html
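As a cross-check (a minimal sketch assuming only numpy), np.linalg.norm with ord=2 along axis=0 reproduces the column norms that tf.keras.utils.normalize divides by:

import numpy as np

A = np.array([[.8, .6], [.1, 0]])
# Column-wise L2 norms: sqrt(0.8**2 + 0.1**2) and sqrt(0.6**2 + 0**2)
norms = np.linalg.norm(A, ord=2, axis=0)
print(A / norms)
# [[0.99227788 1.        ]
#  [0.12403473 0.        ]]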

Upvotes: 1

Celius Stingher

Reputation: 18377

Passing 2 in the order parameter means you will be applying Tikhonov regularization, commonly known as L2 or Ridge. L1 and L2 are different regularization techniques, both with pros and cons that you can read about in detail here in wikipedia and here in kaggle. The approach for L2 is to solve the standard regression equation but, when calculating the residual sum of squares, add an extra term λβᵀβ, the squared L2 norm of the coefficient vector β (that's why it is called L2, because of the square).
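A minimal numpy sketch of the objective described above (the function name and signature are illustrative, not from any library):

import numpy as np

def ridge_loss(X, y, beta, lam):
    # Residual sum of squares plus the L2 penalty lam * beta^T beta
    residuals = y - X @ beta
    return residuals @ residuals + lam * beta @ beta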

Upvotes: 0

valeriored83

Reputation: 141

order=1 normalizes the input so that the sum of the absolute values of all elements along the chosen axis is 1 (the L1 norm of the input equals 1). order=2 normalizes the input so that the sum of the squared values of all elements is 1 (the L2 norm of the input equals 1).
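A quick check of both properties, using the same A as in the question:

import numpy as np
import tensorflow as tf

A = np.array([[.8, .6], [.1, 0]])
B1 = tf.keras.utils.normalize(A, axis=0, order=1)
B2 = tf.keras.utils.normalize(A, axis=0, order=2)
print(np.abs(B1).sum(axis=0))  # [1. 1.] -> each column's L1 norm is 1
print((B2 ** 2).sum(axis=0))   # [1. 1.] -> each column's L2 norm is 1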

Upvotes: 0
