## Question or problem about Python programming:

I would like to normalize a NumPy array to unit length. More specifically, I am looking for an equivalent of this function:

    import numpy as np

    def normalize(v):
        norm = np.linalg.norm(v)
        if norm == 0:
            return v
        return v / norm

Is there something like that in sklearn or NumPy?

This function also handles the case where `v` is the zero vector.

## How to solve the problem:

### Solution 1:

If you’re using scikit-learn, you can use `sklearn.preprocessing.normalize`:

    import numpy as np
    from sklearn.preprocessing import normalize

    x = np.random.rand(1000) * 10
    norm1 = x / np.linalg.norm(x)
    # normalize() expects a 2-D array, so add an axis and flatten the result
    norm2 = normalize(x[:, np.newaxis], axis=0).ravel()
    print(np.allclose(norm1, norm2))  # True

### Solution 2:

I agree that it would be nice if such a function were part of the included batteries. But it isn’t, as far as I know. Here is a version that works along an arbitrary axis and stays fully vectorized.

    import numpy as np

    def normalized(a, axis=-1, order=2):
        # Norm along the chosen axis; atleast_1d lets the mask below work for 1-D input
        l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
        l2[l2 == 0] = 1  # divide zero vectors by 1 instead of 0
        return a / np.expand_dims(l2, axis)

    A = np.random.randn(3, 3, 3)
    print(normalized(A, 0))
    print(normalized(A, 1))
    print(normalized(A, 2))

    print(normalized(np.arange(3)[:, None]))
    print(normalized(np.arange(3)))
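As a possible alternative, `np.linalg.norm` accepts `keepdims=True` in reasonably recent NumPy versions, which keeps the reduced axis and removes the need for `np.expand_dims`. The sketch below is not the answer's code: `normalized_keepdims` is a hypothetical name, and the cast to float is an assumption made so that integer input divides cleanly.

```python
import numpy as np

def normalized_keepdims(a, axis=-1, order=2):
    """Hypothetical variant: normalize along `axis`, leaving zero vectors as zeros."""
    a = np.asarray(a, dtype=float)  # assumption: float output is wanted
    # keepdims=True keeps the reduced axis with size 1, so broadcasting just works
    norms = np.linalg.norm(a, ord=order, axis=axis, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing zero vectors by zero
    return a / norms

A = np.array([[3.0, 4.0],
              [0.0, 0.0]])
print(normalized_keepdims(A))          # rows scaled to unit L2 norm, zero row untouched
print(normalized_keepdims(A, axis=0))  # columns scaled instead
```

With this variant the output has the same shape as the input, including for 1-D arrays.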

### Solution 3:

You can specify `ord` to get the L1 norm.

To avoid division by zero I use eps, though that may not be ideal.

    import numpy as np

    def normalize(v):
        norm = np.linalg.norm(v, ord=1)
        if norm == 0:
            # fall back to machine epsilon for v's (floating-point) dtype
            norm = np.finfo(v.dtype).eps
        return v / norm
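One reason the eps approach can be fragile: `np.finfo` only accepts inexact (floating-point or complex) dtypes, so it raises for an integer array. A minimal sketch that sidesteps this, assuming float output is acceptable, casts the input first and returns a zero vector unchanged. The name `normalize_l1` is made up for illustration.

```python
import numpy as np

def normalize_l1(v):
    """Hypothetical helper: scale v so its absolute values sum to 1."""
    v = np.asarray(v, dtype=float)  # assumption: integer input should become float
    norm = np.linalg.norm(v, ord=1)  # L1 norm: sum of absolute values
    if norm == 0:
        return v  # nothing to scale for a zero vector
    return v / norm

print(normalize_l1([1, 2, 1]))   # [0.25 0.5  0.25]
print(normalize_l1([0, 0, 0]))   # [0. 0. 0.]
```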

### Solution 4:

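This might also work for you:

    import numpy as np

    # v is assumed to be an existing NumPy array
    normalized_v = v / np.sqrt(np.sum(v**2))

but it fails when `v` has length 0 (that is, zero norm), because the division is then by zero.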
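To make the failure mode concrete, here is a small sketch of what happens for a zero vector, followed by one possible guard. `normalize_guarded` is a hypothetical name, and returning the zero vector unchanged is an assumption that mirrors the function in the question.

```python
import numpy as np

v = np.zeros(3)

# 0 / 0 yields NaN (NumPy would also emit a RuntimeWarning, silenced here)
with np.errstate(divide="ignore", invalid="ignore"):
    unguarded = v / np.sqrt(np.sum(v ** 2))
print(np.isnan(unguarded).all())  # True

def normalize_guarded(v):
    """Hypothetical guard: skip the division when the norm is zero."""
    v = np.asarray(v, dtype=float)
    norm = np.sqrt(np.sum(v ** 2))
    if norm == 0:
        return v
    return v / norm

print(normalize_guarded(v))                     # [0. 0. 0.]
print(normalize_guarded(np.array([3.0, 4.0])))  # [0.6 0.8]
```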

### Solution 5:

If you have multidimensional data and want each dimension (column) normalized to its max or its sum:

    import numpy as np

    def normalize(_d, to_sum=True, copy=True):
        # _d is an (n x dimension) NumPy array of floats
        d = _d if not copy else np.copy(_d)
        d -= np.min(d, axis=0)  # shift each column so its minimum is 0
        d /= (np.sum(d, axis=0) if to_sum else np.ptp(d, axis=0))
        return d

This uses NumPy’s peak-to-peak function, `np.ptp`.

    a = np.random.random((5, 3))

    # copy=False modifies `a` in place
    b = normalize(a, copy=False)
    b.sum(axis=0)  # array([1., 1., 1.]), each column sums to 1

    c = normalize(a, to_sum=False, copy=False)
    c.max(axis=0)  # array([1., 1., 1.]), the max of each column is 1