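One-hot encoding is a common method for converting a categorical variable into a numerical one. Put simply, if a variable takes n distinct categories in total, one-hot encoding turns each value into an n-dimensional vector in which exactly one component is 1 and all the others are 0.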
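As a small illustration of this definition, the sketch below encodes a variable with three categories. The category names, and the use of `np.unique` and `np.eye` to build the vectors, are assumptions made for this example and are not taken from the original text.

```python
import numpy as np

# Hypothetical categorical variable with n = 3 distinct categories
colors = np.array(["red", "green", "blue", "green"])

# Map each category to an integer index in 0..n-1
# (np.unique returns the categories in sorted order)
categories, indices = np.unique(colors, return_inverse=True)

# Row i of the n x n identity matrix is the one-hot vector for index i
one_hot = np.eye(len(categories))[indices]

print(categories)  # ['blue' 'green' 'red']
print(one_hot)
```

Each row of `one_hot` has length 3 and contains a single 1, located at the index of that sample's category.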
A typical implementation looks like this:
import numpy as np

def to_categorical(x, n_col=None):
    # One-hot encoding of nominal values.
    # x is expected to be a 1-D array of non-negative integer class indices.
    x = np.asarray(x, dtype=int)
    # If n_col is not provided, determine the number of columns from the input array
    if n_col is None:
        n_col = np.amax(x) + 1
    # Initialize a matrix of zeros with shape (number of samples, n_col)
    one_hot = np.zeros((x.shape[0], n_col))
    # Set the appropriate elements to 1: row i gets a 1 in column x[i]
    one_hot[np.arange(x.shape[0]), x] = 1
    return one_hot
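A short usage sketch may help show what the function returns. It repeats `to_categorical` so that it runs on its own. The sample labels and the `from_categorical` helper are hypothetical additions for illustration and are not part of the original code.

```python
import numpy as np

def to_categorical(x, n_col=None):
    # Same logic as the function above, repeated so this sketch is self-contained
    x = np.asarray(x, dtype=int)
    if n_col is None:
        n_col = np.amax(x) + 1
    one_hot = np.zeros((x.shape[0], n_col))
    one_hot[np.arange(x.shape[0]), x] = 1
    return one_hot

def from_categorical(one_hot):
    # Hypothetical inverse helper: the column holding the 1 is the class index
    return np.argmax(one_hot, axis=1)

labels = np.array([0, 2, 1, 2])

# Number of columns inferred from the largest label (2 + 1 = 3)
encoded = to_categorical(labels)
print(encoded)

# Number of columns fixed explicitly, e.g. when some classes are absent from this batch
encoded_wide = to_categorical(labels, n_col=5)
print(encoded_wide.shape)

# Decoding recovers the original integer labels
decoded = from_categorical(encoded)
print(decoded)
```

Passing `n_col` explicitly is useful when the data at hand may not contain every class, since the inferred width depends only on the largest label present.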