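Introduction to K-means
K-means clustering is an unsupervised learning algorithm. Its goal is to partition a dataset into k groups, called clusters, and to compute the center of each cluster.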
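The K-means algorithm
- Randomly select k points from the dataset as the initial cluster centers.
- Compute the distance from every point to each cluster center, and assign the point to the cluster whose center is nearest.
- For each cluster produced by the assignment step (step 2), recompute its center as the mean of its members' coordinates (for 2-D points, the mean of the x values and the mean of the y values).
- Repeat steps 2 and 3 until the recomputed cluster centers no longer change.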
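The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the project: the function name `kmeans`, the `seed` parameter, and the choice to leave an empty cluster's center unchanged are all assumptions made for the example.

```python
import numpy as np


def kmeans(points, k, max_iters=100, seed=0):
    """Plain k-means with Euclidean distance on an (n, d) float array."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iters):
        # Step 2: assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each center to the mean of its cluster
        # (assumption: an empty cluster keeps its previous center).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Step 4: stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
centers, labels = kmeans(points, k=2)
```

With two well-separated groups like these, each returned center ends up at the mean of one group.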
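Applying K-means to anchor box computation
Anchor boxes are used to predict bounding boxes, and the closer the anchor boxes are to the real box widths and heights, the better the model performs. K-means is applied here to find k width-height pairs that match the ground-truth sizes as closely as possible. Unlike the standard K-means above, Euclidean distance is not used for the assignment; instead, IoU (intersection over union) decides which cluster each width-height pair belongs to. IoU is a good measure of how close two width-height pairs are: it takes values in [0, 1], and the larger the IoU, the closer the two pairs. In K-means, the distance is therefore defined as 1 - IoU.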
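To make the 1 - IoU distance concrete, here is a small sketch. It assumes boxes and anchors are compared as if they shared the same corner, so only their sizes matter; the helper name `wh_iou` and the sample values are made up for illustration.

```python
import numpy as np


def wh_iou(boxes, anchors):
    """IoU between (n, 2) box sizes and (k, 2) anchor sizes, both as (w, h).

    Boxes and anchors are aligned at one corner, so the intersection is
    min(w) * min(h). Returns an (n, k) matrix.
    """
    boxes = np.asarray(boxes, dtype=float)[:, None, :]      # (n, 1, 2)
    anchors = np.asarray(anchors, dtype=float)[None, :, :]  # (1, k, 2)
    inter = np.minimum(boxes, anchors).prod(axis=2)
    union = boxes.prod(axis=2) + anchors.prod(axis=2) - inter
    return inter / union


boxes = [[0.2, 0.4], [0.5, 0.5]]
anchors = [[0.2, 0.4], [0.4, 0.2]]
distance = 1.0 - wh_iou(boxes, anchors)  # (2, 2) matrix of K-means distances
```

The first box has distance 0 to the first anchor (identical size) and distance 2/3 to the second, which has the same area but a different shape. Euclidean distance on the raw sizes would not separate shape from scale in this way.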
anchors:
	python ./make_anchor_list.py \
			${DATASET} \
			--max_iters 10 \
			--is_random True \
			--in_hw ${IMGSIZE} \
			--out_hw ${OUTSIZE} \
			--anchor_num ${ANCNUM} \
			--low ${LOW} \
			--high ${HIGH}
make anchors DATASET=voc ANCNUM=2 LOW="0.0 0.0" HIGH="1.0 1.0"
import argparse

import numpy as np

# NOTE: runkMeans and the log prefixes ERROR / NOTE are defined elsewhere in
# the project and are not shown in this post.


def main(train_set: str, max_iters: int, in_hw: tuple, out_hw: tuple,
         anchor_num: int, is_random: str, is_plot: str, low: list, high: list):
    X = np.load(f'data/{train_set}_img_ann.npy', allow_pickle=True)
    in_wh = np.array(in_hw[::-1])
    low = np.array(low)
    high = np.array(high)

    # NOTE correct boxes for the resize of each image to the network input size
    for i in range(len(X)):
        # Judging by the indexing below, X[i, 1] holds the boxes with
        # normalized [x, y] in columns 1:3 and [w, h] in columns 3:5,
        # and X[i, 2] holds the image size as (h, w).
        img_wh = X[i, 2][::-1]
        # calculate the affine transform factor
        scale = in_wh / img_wh  # NOTE affine transform scale is [w, h]
        scale[:] = np.min(scale)  # one factor for both axes keeps the aspect ratio
        # NOTE translation is [w offset, h offset]
        translation = ((in_wh - img_wh * scale) / 2).astype(int)
        # calculate the box transform matrix
        X[i, 1][:, 1:3] = (X[i, 1][:, 1:3] * img_wh * scale + translation) / in_wh
        X[i, 1][:, 3:5] = (X[i, 1][:, 3:5] * img_wh * scale) / in_wh

    # Stack the boxes of all images and keep only the [w, h] columns.
    x = np.vstack(X[:, 1])
    x = x[:, 3:]
    layers = len(out_hw) // 2
    # is_random and is_plot arrive as the strings 'True' / 'False' from argparse.
    if is_random == 'True':
        initial_centroids = np.hstack((
            np.random.uniform(low[0], high[0], (layers * anchor_num, 1)),
            np.random.uniform(low[1], high[1], (layers * anchor_num, 1))))
    else:
        initial_centroids = np.vstack((
            np.linspace(0.05, 0.3, num=layers * anchor_num),
            np.linspace(0.05, 0.5, num=layers * anchor_num)))
        initial_centroids = initial_centroids.T
    # The iteration limit comes from --max_iters.
    centroids, idx = runkMeans(x, initial_centroids, max_iters, is_plot)
    # NOTE : sort by descending width, bigger value for layer 0 .
    centroids = np.array(sorted(centroids, key=lambda x: (-x[0])))
    centroids = np.reshape(centroids, (layers, anchor_num, 2))
    # NOTE centroids stay relative to the whole image (0-1); the per-layer
    # grid_wh[l] scaling is disabled.
    if np.any(np.isnan(centroids)):
        print(ERROR, 'Result has NaN values, please rerun!')
    else:
        print(NOTE, f'Now anchors are :\n{centroids}')
        np.save(f'data/{train_set}_anchor.npy', centroids)
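The box-correction loop in main is easier to follow on a single box. The sketch below applies the same formulas to one made-up image size and one made-up box; the numbers are hypothetical and chosen only to show the effect of the resize and centering.

```python
import numpy as np

in_wh = np.array([320, 224])   # network input as (w, h), i.e. in_hw = (224, 320)
img_wh = np.array([500, 375])  # hypothetical source image as (w, h)

# One box in normalized [x_center, y_center, w, h] form (made-up values).
box = np.array([0.5, 0.5, 0.4, 0.6])

# Use the smaller of the two ratios so the whole image fits inside the input.
scale = np.min(in_wh / img_wh)
# Padding added on the left and top to center the resized image.
translation = ((in_wh - img_wh * scale) / 2).astype(int)

# Same formulas as in main: centers are scaled and shifted, sizes only scaled.
xy = (box[:2] * img_wh * scale + translation) / in_wh
wh = (box[2:] * img_wh * scale) / in_wh
```

Here the image is limited by its height, so padding is added only horizontally: the normalized box height stays 0.6 while the width shrinks from 0.4 to about 0.373. Clustering on the corrected sizes means the anchors match what the network actually sees.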
def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('train_set', type=str,
                        help=NOTE + 'this is the train dataset name, the output file will be data/{train_set}_anchor.npy')
    parser.add_argument('--max_iters', type=int, help='kmeans max iters', default=10)
    # Boolean-like flags are parsed as the strings 'True' / 'False'.
    parser.add_argument('--is_random', type=str, help='whether to randomly generate the centers',
                        choices=['True', 'False'], default='True')
    parser.add_argument('--is_plot', type=str, help='whether to show the figure',
                        choices=['True', 'False'], default='True')
    parser.add_argument('--in_hw', type=int, help='network input image size (h, w)',
                        default=(224, 320), nargs='+')
    parser.add_argument('--out_hw', type=int, help='network output sizes, (h, w) per layer',
                        default=(7, 10, 14, 20), nargs='+')
    parser.add_argument('--low', type=float, help='Lower bound of random anchor, (w, h)',
                        default=(0.0, 0.0), nargs='+')
    parser.add_argument('--high', type=float, help='Upper bound of random anchor, (w, h)',
                        default=(1.0, 1.0), nargs='+')
    parser.add_argument('--anchor_num', type=int, help='single layer anchor nums', default=3)
    return parser.parse_args(argv)
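The runkMeans function called by main is not shown in this post. As a rough idea of what such a routine might look like, here is a hedged sketch of K-means on width-height pairs with 1 - IoU as the distance. The names `iou_kmeans` and `wh_iou` are hypothetical, and keeping the previous centroid for an empty cluster is an assumption; the NaN check in main suggests the project's own version may behave differently in that case.

```python
import numpy as np


def wh_iou(boxes, anchors):
    """IoU of (n, 2) and (k, 2) width-height pairs aligned at one corner."""
    inter = np.minimum(boxes[:, None, :], anchors[None, :, :]).prod(axis=2)
    union = boxes.prod(axis=1)[:, None] + anchors.prod(axis=1)[None, :] - inter
    return inter / union


def iou_kmeans(boxes, initial_centroids, max_iters=10):
    """Cluster (n, 2) box sizes; returns (centroids, cluster index per box)."""
    centroids = np.array(initial_centroids, dtype=float)
    for _ in range(max_iters):
        # Assign each box to the centroid with the smallest 1 - IoU distance.
        idx = (1.0 - wh_iou(boxes, centroids)).argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = boxes[idx == j]
            if len(members):  # assumption: empty clusters keep their centroid
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the returned centroids.
    idx = (1.0 - wh_iou(boxes, centroids)).argmin(axis=1)
    return centroids, idx


# Made-up normalized box sizes: three small boxes and three large ones.
boxes = np.array([[0.10, 0.10], [0.12, 0.10], [0.10, 0.12],
                  [0.50, 0.60], [0.52, 0.60], [0.50, 0.62]])
centroids, idx = iou_kmeans(boxes, [[0.05, 0.05], [0.3, 0.5]])
```

The two starting centroids match the first and last points of the non-random initialization in main when layers * anchor_num is 2. The returned centroids play the role of the anchor width-height pairs that main sorts and saves.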
