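For the coordinate points distributed as shown below, C, D and E are redundant points, and two of them need to be removed.
First, build the distance matrix D (not to be confused with point D), for example using only the upper-triangular part shown in the figure below. Then traverse D(i,j) in order for j > i, and for a threshold T remove the coordinate point belonging to any row that contains a "small distance":
D(i,j) < T
In other words, points C and D are removed and point E is kept.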
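To make the rule concrete, here is a minimal sketch of the upper-triangular test on five points. The coordinates, the labels list and the threshold value are made up for illustration (they are not taken from the figure); only the rule D(i,j) < T with j > i comes from the description above.

```python
import numpy as np

# Hypothetical coordinates: A and B are isolated, C, D and E form a tight cluster.
labels = ["A", "B", "C", "D", "E"]
pts = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 5.0], [5.5, 5.0], [5.0, 5.5]])
T = 1.0  # assumed threshold

# Full pairwise Euclidean distance matrix via broadcasting.
diff = pts[:, None, :] - pts[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Keep only the entries with j > i; all others become infinity so they never pass the test.
mask = np.triu(np.ones_like(dist, dtype=bool), k=1)
upper = np.where(mask, dist, np.inf)

# Row i is dropped if any D(i, j) with j > i is below T.
drop = (upper < T).any(axis=1)
removed = [name for name, d in zip(labels, drop) if d]
kept = [name for name, d in zip(labels, drop) if not d]
print("removed:", removed, "kept:", kept)
```

With these made-up coordinates, the rows for C and D each contain a distance below T, while the row for E has no later point left to compare against, so E is the one that survives.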
Test code:
import matplotlib.pyplot as plt
import numpy as np


def unique(data, thres):
    """Remove every point that is closer than thres to a later point in data."""
    # Work on a reversed copy so that deleting the entry for point i
    # does not shift the indices of the points still to be visited.
    result = data[::-1].copy()
    length = len(data)
    for i in range(length):
        for j in range(i + 1, length):
            # Euclidean distance between points i and j
            distance = np.sqrt(np.sum(np.square(data[i] - data[j])))
            if distance < thres:
                result = np.delete(result, length - i - 1, axis=0)
                break
    return result[::-1]


def main():
    # Generate data
    pts = np.random.randint(0, 100, [10, 2])
    # Add redundant points
    pts = np.append(pts, pts[:5] * 1.1, axis=0)
    # Remove redundant points
    ret = unique(pts, 20)
    # Visualize
    plt.figure()
    plt.subplot(121)
    plt.scatter(pts[:, 0], pts[:, 1], marker="o")
    plt.title("Initial Data")
    plt.subplot(122)
    plt.scatter(ret[:, 0], ret[:, 1], marker="o")
    plt.title("Unique Data")
    plt.show()


if __name__ == '__main__':
    main()
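For larger point sets the double loop can be replaced by array operations. The following is a sketch of one possible vectorized equivalent, not part of the original test code; the name unique_vectorized and the sample points are hypothetical. It assumes the full n-by-n distance matrix fits in memory.

```python
import numpy as np


def unique_vectorized(data, thres):
    """Drop every point that lies closer than thres to a later point (hypothetical helper)."""
    data = np.asarray(data)
    # Pairwise Euclidean distances, shape (n, n).
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # later[i, j] is True only for j > i (strict upper triangle).
    later = np.triu(np.ones(dist.shape, dtype=bool), k=1)
    # Point i is dropped if any later point is within the threshold.
    drop = ((dist < thres) & later).any(axis=1)
    return data[~drop]


# Made-up sample: points 0 and 2 are near each other, as are points 1 and 3.
pts = np.array([[0, 0], [50, 50], [1, 1], [51, 50], [90, 10]])
result = unique_vectorized(pts, 5)
print(result)
```

Like the loop version, this compares each point against all later points of the original data, including ones that are themselves removed. In a chain where A is near B and B is near C, both A and B are dropped even if A is far from C.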
Experimental results: