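One-Hot encoding is a common encoding method for converting categorical variables into numerical ones. Put simply, if a variable takes N distinct categories, One-Hot encoding turns each value into an N-dimensional vector in which exactly one component is 1 and all the others are 0.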
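To make the definition concrete, here is a minimal sketch with N = 3. The label values are made up for illustration, and indexing into `np.eye` is just one convenient way to show the mapping, not necessarily how any particular library does it.

```python
import numpy as np

# Hypothetical integer labels for a variable with N = 3 categories (0, 1, 2)
labels = np.array([0, 2, 1])
n_classes = 3

# Row i of the identity matrix is the one-hot vector for category i,
# so indexing the identity matrix by the labels encodes all samples at once
one_hot = np.eye(n_classes)[labels]
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Each row contains a single 1, located at the position of that sample's category.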

A standard implementation is as follows:

import numpy as np

def to_categorical(x, n_col=None):
    # One-hot encode a 1-D sequence of non-negative integer class labels
    x = np.asarray(x)
    # If n_col is not provided, infer it from the largest label in the input
    if n_col is None:
        n_col = np.amax(x) + 1
    # Initialize a matrix of zeros with shape (number of samples, n_col)
    one_hot = np.zeros((x.shape[0], n_col))
    # For each sample (row), set the column matching its label to 1
    one_hot[np.arange(x.shape[0]), x] = 1
    return one_hot
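The function above expects integer labels, so string categories need to be mapped to indices first. The following is a rough usage sketch under that assumption: the `colors` data is hypothetical, `np.unique` is one possible way to build the mapping, and the function is repeated only so the snippet runs on its own.

```python
import numpy as np

def to_categorical(x, n_col=None):
    # Same logic as the function above, repeated so this sketch is self-contained
    x = np.asarray(x)
    if n_col is None:
        n_col = np.amax(x) + 1
    one_hot = np.zeros((x.shape[0], n_col))
    one_hot[np.arange(x.shape[0]), x] = 1
    return one_hot

# Hypothetical string-valued categorical variable
colors = np.array(["red", "green", "blue", "green"])

# np.unique returns the sorted categories and, with return_inverse=True,
# the integer index of each sample within that sorted list
classes, indices = np.unique(colors, return_inverse=True)

# Encode: one row per sample, one column per category
encoded = to_categorical(indices, n_col=len(classes))

# Decode: the position of the 1 in each row recovers the category
decoded = classes[np.argmax(encoded, axis=1)]
```

Passing `n_col` explicitly keeps the output width fixed even when a batch happens not to contain the highest-numbered category.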