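Introduction to K-means
K-means clustering is an unsupervised learning algorithm. Its goal is to partition a set of data points into k groups, called k clusters, and to compute the center of each of the k clusters.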
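The K-means algorithm
- Randomly pick k points from the dataset as the initial cluster centers.
- Compute the distance from every point to each cluster center and assign the point to the nearest cluster (the one whose center is closest).
- For each cluster produced by the assignment step, recompute its center (roughly, the mean of the x and y coordinates of its points becomes the new center).
- Repeat the assignment and update steps until the newly computed cluster centers no longer change.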
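The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used later in this post: the function name `kmeans`, the `seed` parameter, and the choice to leave an empty cluster's center unchanged are all assumptions made for the example.

```python
import numpy as np


def kmeans(points: np.ndarray, k: int, max_iters: int = 100, seed: int = 0):
    """Plain K-means with Euclidean distance on an (n, d) array (illustrative)."""
    rng = np.random.default_rng(seed)
    # Initialization: pick k distinct points as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iters):
        # Assignment: each point goes to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update: each center moves to the mean of its points
        # (assumption: an empty cluster keeps its previous center).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
centers, labels = kmeans(pts, k=2)
print(centers)
print(labels)
```

On this toy data the two tight groups end up in separate clusters, and each center is the mean of its three points.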
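Applying K-means to anchor box computation
Anchor boxes are used to predict bounding boxes, and the closer the anchor boxes are to the true box widths and heights, the better the model performs. K-means is applied here to compute k (width, height) pairs that lie close to the ground-truth box sizes. Unlike the K-means described above, Euclidean distance cannot be used to assign boxes to clusters; instead, IoU (intersection over union) decides which cluster each (width, height) pair belongs to. IoU expresses well how close two (width, height) pairs are: it takes values in [0, 1], and the larger the IoU, the closer the two pairs. In this K-means, the distance is therefore 1 - IoU.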
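As a rough sketch of the 1 - IoU distance, the snippet below computes IoU between (width, height) pairs under the common assumption that both boxes share the same top-left corner, so only their sizes matter. The names `wh_iou` and `assign_by_iou` are hypothetical helpers for this example and are not taken from the script shown in this post.

```python
import numpy as np


def wh_iou(boxes: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """IoU between (n, 2) and (k, 2) width/height pairs.

    Assumption: boxes are aligned at a shared corner, so the
    intersection is min(w) * min(h).
    """
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    area_boxes = boxes[:, 0] * boxes[:, 1]
    area_cents = centroids[:, 0] * centroids[:, 1]
    union = area_boxes[:, None] + area_cents[None, :] - inter
    return inter / union


def assign_by_iou(boxes: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Assign each box to the centroid with the smallest 1 - IoU distance."""
    dist = 1.0 - wh_iou(boxes, centroids)
    return dist.argmin(axis=1)


boxes = np.array([[0.2, 0.4]])
cents = np.array([[0.2, 0.4], [0.4, 0.2]])
print(wh_iou(boxes, cents))         # IoU with each centroid
print(assign_by_iou(boxes, cents))  # index of the nearest centroid
```

Here the box matches the first centroid exactly (IoU 1), while against the second, transposed centroid the IoU is only 1/3 even though both cover the same area, which is why IoU separates shapes that Euclidean distance on areas would not.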
    anchors:
    	python ./make_anchor_list.py \
    		${DATASET} \
    		--max_iters 10 \
    		--is_random True \
    		--in_hw ${IMGSIZE} \
    		--out_hw ${OUTSIZE} \
    		--anchor_num ${ANCNUM} \
    		--low ${LOW} \
    		--high ${HIGH}

    make anchors DATASET=voc ANCNUM=2 LOW="0.0 0.0" HIGH="1.0 1.0"

    import numpy as np

    # NOTE and ERROR (log prefixes) and runkMeans are defined elsewhere in the project.


    def main(train_set: str, max_iters: int, in_hw: tuple, out_hw: tuple,
             anchor_num: int, is_random: str, is_plot: str, low: list, high: list):
        X = np.load(f'data/{train_set}_img_ann.npy', allow_pickle=True)
        in_wh = np.array(in_hw[::-1])
        low = np.array(low)
        high = np.array(high)

        # NOTE correct boxes for the aspect-preserving resize to the network input
        for i in range(len(X)):
            # X[i, 1] holds the boxes, X[i, 2] the image size as (h, w)
            img_wh = X[i, 2][::-1]

            # calculate the affine transform factor
            scale = in_wh / img_wh  # NOTE affine transform scale is [w, h]
            scale[:] = np.min(scale)  # use the same factor on both axes
            # NOTE translation is [w offset, h offset]
            translation = ((in_wh - img_wh * scale) / 2).astype(int)

            # calculate the box transform matrix
            X[i, 1][:, 1:3] = (X[i, 1][:, 1:3] * img_wh * scale + translation) / in_wh
            X[i, 1][:, 3:5] = (X[i, 1][:, 3:5] * img_wh * scale) / in_wh

        x = np.vstack(X[:, 1])
        x = x[:, 3:]  # keep only the box width and height columns

        layers = len(out_hw) // 2
        # is_random arrives from argparse as the string 'True' or 'False'
        if is_random == 'True':
            initial_centroids = np.hstack((
                np.random.uniform(low[0], high[0], (layers * anchor_num, 1)),
                np.random.uniform(low[1], high[1], (layers * anchor_num, 1))))
        else:
            initial_centroids = np.vstack((
                np.linspace(0.05, 0.3, num=layers * anchor_num),
                np.linspace(0.05, 0.5, num=layers * anchor_num)))
            initial_centroids = initial_centroids.T

        # pass max_iters through instead of the hard-coded 10
        centroids, idx = runkMeans(x, initial_centroids, max_iters, is_plot)

        # NOTE : sort by descending , bigger value for layer 0 .
        centroids = np.array(sorted(centroids, key=lambda x: (-x[0])))
        centroids = np.reshape(centroids, (layers, anchor_num, 2))
        for l in range(layers):
            centroids[l] = centroids[l]  # grid_wh[l]
        # NOTE centroids are relative to the whole image, in the range 0-1

        if np.any(np.isnan(centroids)):
            print(ERROR, 'Result have NaN value please Rerun!')
        else:
            print(NOTE, f'Now anchors are :\n{centroids}')
            np.save(f'data/{train_set}_anchor.npy', centroids)

    import argparse


    def parse_arguments(argv):
        parser = argparse.ArgumentParser()
        # NOTE is a log-prefix string defined elsewhere in the project
        parser.add_argument('train_set', type=str,
                            help=NOTE + 'this is train dataset name , the output *.npy file '
                                        'will be data/{train_set}_anchor.npy')
        parser.add_argument('--max_iters', type=int,
                            help='kmeans max iters', default=10)
        # boolean-like flags are kept as the strings 'True' / 'False'
        parser.add_argument('--is_random', type=str,
                            help='whether to randomly generate the centers',
                            choices=['True', 'False'], default='True')
        parser.add_argument('--is_plot', type=str,
                            help='whether to show the figure',
                            choices=['True', 'False'], default='True')
        parser.add_argument('--in_hw', type=int,
                            help='network input image size',
                            default=(224, 320), nargs='+')
        parser.add_argument('--out_hw', type=int,
                            help='network output image size',
                            default=(7, 10, 14, 20), nargs='+')
        parser.add_argument('--low', type=float,
                            help='Lower bound of random anchor, (x,y)',
                            default=(0.0, 0.0), nargs='+')
        parser.add_argument('--high', type=float,
                            help='Upper bound of random anchor, (x,y)',
                            default=(1.0, 1.0), nargs='+')
        parser.add_argument('--anchor_num', type=int,
                            help='single layer anchor nums', default=3)
        return parser.parse_args(argv)
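
To make the box correction loop in main easier to follow, here is a small worked example for a single box. It is a sketch under assumptions: the helper name `letterbox_box` is hypothetical, and the box is assumed to be stored as normalized (center x, center y, width, height), which is what the column slicing in main suggests. The image is scaled by one factor so that it fits inside the network input, centered with an integer offset, and the box is re-normalized against the input size.

```python
import numpy as np


def letterbox_box(xywh, img_hw, in_hw):
    """Map one normalized (x, y, w, h) box from image space to input space.

    Hypothetical helper mirroring the per-image arithmetic in main.
    """
    in_wh = np.array(in_hw[::-1], dtype=float)
    img_wh = np.array(img_hw[::-1], dtype=float)
    # One scale factor for both axes keeps the aspect ratio.
    scale = np.min(in_wh / img_wh)
    # Integer padding offset that centers the resized image.
    translation = ((in_wh - img_wh * scale) / 2).astype(int)
    xywh = np.asarray(xywh, dtype=float)
    xy = (xywh[:2] * img_wh * scale + translation) / in_wh
    wh = (xywh[2:] * img_wh * scale) / in_wh
    return np.concatenate([xy, wh])


# A 375 x 500 (h x w) image fed to a 224 x 320 (h x w) network input.
result = letterbox_box([0.5, 0.5, 0.4, 0.6], img_hw=(375, 500), in_hw=(224, 320))
print(result)
```

In this example the height is the limiting side, so the box height stays at 0.6 of the input while the width shrinks from 0.4 to about 0.373, because the resized image no longer spans the full input width. These corrected widths and heights are what the clustering sees.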