A First Look at Image Binarization

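Binarization converts a non-binary image into a binary image through computation. It is the simplest form of image segmentation and underpins many later image-processing steps, because a binary image is simpler and faster to process. Most often it turns a grayscale image into a binary one in order to separate the objects of interest from the background, for example splitting a face image into skin and non-skin regions, or turning the text in an image into black-and-white text for a PDF. In that sense, image binarization can be viewed as a clustering or classification problem.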

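A binary image is a digital image in which every pixel has only two possible values. Binary images commonly show up as image masks and as the output of image segmentation, binarization, and dithering.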

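Binarization works by setting every pixel whose gray level is above some critical value (the threshold) to the maximum gray level, and every other pixel to the minimum gray level.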
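To make the rule concrete, here is a minimal NumPy sketch. The helper name `binarize` and the tiny sample array are made up for illustration; they are not part of OpenCV or of the code later in this post.

```python
import numpy as np

def binarize(gray, threshold, low=0, high=255):
    """Map pixels above `threshold` to `high` and all other pixels to `low`."""
    gray = np.asarray(gray)
    return np.where(gray > threshold, high, low).astype(np.uint8)

# A hypothetical 2x3 "image" used only to show the rule.
gray = np.array([[ 12, 200,  90],
                 [127, 128, 255]], dtype=np.uint8)
binary = binarize(gray, 127)
print(binary)
```

Note that a pixel exactly equal to the threshold (127 here) lands on the "low" side, which matches the convention used in the functions below.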

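Depending on how the threshold is chosen, binarization is divided into fixed (global) thresholding and adaptive thresholding. Commonly used methods include the bimodal (two-peak) method, the P-tile method, the iterative method, and Otsu's method.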
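The iterative method is not implemented later in this post, so here is a rough sketch of the usual textbook version: start from the mean, split the pixels into two groups, move the threshold to the midpoint of the two group means, and repeat until it stops changing. The function name `iterative_threshold` and the parameters `eps` and `max_iter` are my own choices for illustration.

```python
import numpy as np

def iterative_threshold(gray, eps=0.5, max_iter=100):
    """Iteratively refine a global threshold (a sketch, not a tuned implementation)."""
    gray = np.asarray(gray, dtype=np.float64)
    t = gray.mean()                      # initial guess: the global mean
    for _ in range(max_iter):
        low = gray[gray <= t]
        high = gray[gray > t]
        if low.size == 0 or high.size == 0:
            break                        # constant image: nothing to split
        new_t = 0.5 * (low.mean() + high.mean())
        if abs(new_t - t) < eps:         # converged
            t = new_t
            break
        t = new_t
    return t

# Hypothetical sample data: three dark pixels and three bright pixels.
gray = np.array([10, 12, 14, 200, 210, 220], dtype=np.uint8)
t = iterative_threshold(gray)
print(t)
```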

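Let's start with a fairly simple method. After converting the picture to grayscale, we pick 127 (the midpoint of the gray range) as the threshold: every pixel greater than 127 is set to 255, and every pixel less than or equal to 127 is set to 0.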

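# Imports shared by all the snippets in this post
import collections

import cv2
import matplotlib.pyplot as plt
import numpy as np

def easy_binarization(img):
    # img is a BGR image as returned by cv2.imread
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray[img_gray>127] = 255
    img_gray[img_gray<=127] = 0
    plt.imshow(img_gray, cmap='gray')
    plt.show()
    return img_gray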

This method gives us an image like this:

[Figure: result of the fixed-threshold method (easy)]

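To be a little more rigorous: overall gray levels differ a lot from one image to the next, so we take the image's own mean gray value as the threshold instead.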

def mean_binarization(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    threshold = np.mean(img_gray)
    img_gray[img_gray>threshold] = 255
    img_gray[img_gray<=threshold] = 0
    plt.imshow(img_gray, cmap='gray')
    plt.show()
    return img_gray

[Figure: result of the mean-threshold method (mean)]

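The image obtained this way is slightly different. The histogram is an important characteristic of an image, and it helps us analyze how the gray levels are distributed. If the object and the background have clearly contrasting gray levels, the histogram will have two peaks (a bimodal histogram), one for the foreground and one for the background. The valley between them corresponds to the relatively small number of pixels near the edges. Generally speaking, the minimum of that valley is a good place to split: thresholding there separates the foreground from the background well.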

def hist_binarization(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    hist = img_gray.flatten()
    plt.subplot(121)
    plt.hist(hist, 256)

    # Use the two most frequent gray levels as the two peaks (a rough heuristic),
    # ordered so that begin <= end.
    cnt_hist = collections.Counter(hist.tolist())
    begin, end = sorted(level for level, _ in cnt_hist.most_common(2))

    # The threshold is the least frequent gray level between the two peaks.
    cnt = hist.size + 1  # larger than any possible count
    threshold = begin
    for i in range(begin, end+1):
        if cnt_hist[i] < cnt:
            cnt = cnt_hist[i]
            threshold = i
    print(f'{threshold}: {cnt}')
    img_gray[img_gray>threshold] = 255
    img_gray[img_gray<=threshold] = 0

    plt.subplot(122)
    plt.imshow(img_gray, cmap='gray')
    plt.show()
    return img_gray

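[Figure: histogram and result of the histogram-valley method (hist)]

The threshold found this way is 148, which lies between the two peaks (143, 153), and the resulting image is very different from those of the previous two methods. Clearly, choosing the threshold is the key to binarization.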
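Taking the two most frequent gray levels as "the two peaks" can be fragile, because both may sit on the same peak. The sketch below shows one possible variation, not the method used for the result above: it smooths the histogram, takes the highest bin as the first peak, picks the second peak by weighting each bin by its squared distance from the first, and then looks for the valley in between. The name `valley_threshold`, the `smooth` parameter, and the synthetic test data are all assumptions made for illustration.

```python
import numpy as np

def valley_threshold(gray, smooth=5):
    """Find the valley between the two main histogram peaks (illustrative sketch)."""
    hist = np.bincount(np.asarray(gray, dtype=np.uint8).ravel(), minlength=256)
    hist = hist.astype(np.float64)
    # Moving-average smoothing to suppress bin-to-bin noise.
    hist = np.convolve(hist, np.ones(smooth) / smooth, mode="same")

    levels = np.arange(256)
    peak1 = int(np.argmax(hist))
    # Favor bins that are both tall and far away from the first peak.
    peak2 = int(np.argmax(hist * (levels - peak1) ** 2))
    lo, hi = sorted((peak1, peak2))
    return lo + int(np.argmin(hist[lo:hi + 1]))

# Synthetic bimodal data: a dark cluster around 60 and a bright one around 190.
rng = np.random.default_rng(0)
dark = rng.normal(60, 10, 5000)
bright = rng.normal(190, 10, 5000)
gray = np.clip(np.concatenate([dark, bright]), 0, 255).astype(np.uint8)
t = valley_threshold(gray)
print(t)
```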

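In 1979, the Japanese researcher Nobuyuki Otsu proposed the algorithm now known as Otsu's method (OTSU), also called the maximum between-class variance method. It picks the threshold that maximizes the between-class variance of the two classes, foreground and background. By the way, MATLAB's graythresh is based on this algorithm. The method can also be extended to multi-level thresholding, which is known as the multi-Otsu method (Multi OTSU Method).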

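Let the threshold be t. After converting the original image to grayscale, we store its height and width in h and w. With a bit of NumPy magic (boolean indexing), we put the gray values below the threshold into the foreground array front and the values greater than or equal to the threshold into the background array back. That is,

# threshold: t
h, w = img.shape[:2]
front = img[img < t]
back = img[img >= t]

Obviously, the lengths of the foreground and background arrays must add up to the product of h and w, i.e. len(front) + len(back) == h * w.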

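Now we compute the fraction of all pixels that belong to the foreground (front_p) and the fraction that belong to the background (back_p). To compute the variance we also need the mean gray levels of the foreground and background (front_mean, back_mean) and the overall mean gray level (m). Using the relationships between these quantities, the overall mean simplifies to

$m = \frac{\operatorname{sum}(\text{front}) + \operatorname{sum}(\text{back})}{h \cdot w} = \frac{\operatorname{sum}(\text{front}) + \operatorname{sum}(\text{back})}{\operatorname{len}(\text{front}) + \operatorname{len}(\text{back})} = \text{front\_p} \cdot \text{front\_mean} + \text{back\_p} \cdot \text{back\_mean}$

With this we can compute the between-class variance (v):

$v = \text{front\_p} \cdot (\text{front\_mean} - m)^2 + \text{back\_p} \cdot (\text{back\_mean} - m)^2$

Substituting the expression for m, this simplifies to

$v = \text{front\_p} \cdot \text{back\_p} \cdot (\text{front\_mean} - \text{back\_mean})^2$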
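For readers who want the intermediate steps, the simplification only needs the expression for m together with the fact that the two fractions add up to one, front_p + back_p = 1. The deviations of the two class means from m are

$\text{front\_mean} - m = (1 - \text{front\_p}) \cdot \text{front\_mean} - \text{back\_p} \cdot \text{back\_mean} = \text{back\_p} \cdot (\text{front\_mean} - \text{back\_mean})$

$\text{back\_mean} - m = (1 - \text{back\_p}) \cdot \text{back\_mean} - \text{front\_p} \cdot \text{front\_mean} = -\text{front\_p} \cdot (\text{front\_mean} - \text{back\_mean})$

Plugging both into the definition of v gives

$v = \text{front\_p} \cdot \text{back\_p}^2 \cdot (\text{front\_mean} - \text{back\_mean})^2 + \text{back\_p} \cdot \text{front\_p}^2 \cdot (\text{front\_mean} - \text{back\_mean})^2 = \text{front\_p} \cdot \text{back\_p} \cdot (\text{front\_p} + \text{back\_p}) \cdot (\text{front\_mean} - \text{back\_mean})^2 = \text{front\_p} \cdot \text{back\_p} \cdot (\text{front\_mean} - \text{back\_mean})^2$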

So we can compute the between-class variance without ever computing the overall mean gray level (m)!
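As a quick sanity check, the two forms of the variance can be compared numerically. The random test image and the threshold of 100 below are arbitrary assumptions chosen just for this check.

```python
import numpy as np

# Arbitrary random "grayscale image" and threshold, used only for the check.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 48), dtype=np.uint8)
h, w = img.shape[:2]
t = 100

front = img[img < t].astype(np.float64)
back = img[img >= t].astype(np.float64)
assert len(front) + len(back) == h * w

front_p, back_p = len(front) / (h * w), len(back) / (h * w)
front_mean, back_mean = front.mean(), back.mean()
m = front_p * front_mean + back_p * back_mean

# Long form (uses m) versus short form (does not need m).
v_long = front_p * (front_mean - m) ** 2 + back_p * (back_mean - m) ** 2
v_short = front_p * back_p * (front_mean - back_mean) ** 2
print(np.isclose(m, img.mean()), np.isclose(v_long, v_short))
```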

def otsu(img):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    h, w = img.shape[:2]
    threshold_t = 0
    max_g = 0

    # Try every candidate threshold (0..255) and keep the one with the
    # largest between-class variance g.
    for t in range(256):
        front = img[img < t]
        back = img[img >= t]
        front_p = len(front) / (h * w)
        back_p = len(back) / (h * w)
        front_mean = np.mean(front) if len(front) > 0 else 0.
        back_mean = np.mean(back) if len(back) > 0 else 0.

        g = front_p * back_p * ((front_mean - back_mean)**2)
        if g > max_g:
            max_g = g
            threshold_t = t
    print(f"threshold = {threshold_t}")
    img[img < threshold_t] = 0
    img[img >= threshold_t] = 255
    return img

This gives:

[Figure: result of Otsu's method (otsu)]

I have to say, this algorithm really is clever 😄. It produced a better result than any of the previous three. 👍
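The loop above rescans the whole image for every candidate threshold. As a sketch of one possible speed-up (not how OpenCV or MATLAB necessarily implement it), the same quantities can be read off the 256-bin histogram with cumulative sums. The function name `otsu_threshold` is hypothetical; it keeps the same convention as above, where pixels below t are foreground.

```python
import numpy as np

def otsu_threshold(gray):
    """Histogram-based Otsu threshold; front = pixels < t, back = pixels >= t."""
    hist = np.bincount(np.asarray(gray, dtype=np.uint8).ravel(), minlength=256)
    prob = hist.astype(np.float64) / hist.sum()
    levels = np.arange(256)

    cum_p = np.cumsum(prob)             # P(pixel <= k)
    cum_s = np.cumsum(prob * levels)    # sum of i * p_i for i <= k
    # Shift by one bin so that index t describes the pixels strictly below t.
    front_p = np.concatenate(([0.0], cum_p[:-1]))
    front_s = np.concatenate(([0.0], cum_s[:-1]))
    back_p = 1.0 - front_p
    back_s = cum_s[-1] - front_s

    with np.errstate(divide="ignore", invalid="ignore"):
        front_mean = np.where(front_p > 0, front_s / front_p, 0.0)
        back_mean = np.where(back_p > 0, back_s / back_p, 0.0)

    g = front_p * back_p * (front_mean - back_mean) ** 2
    return int(np.argmax(g))            # first threshold reaching the maximum

# Hypothetical sample data: three dark pixels and three bright pixels.
gray = np.array([10, 12, 14, 200, 210, 220], dtype=np.uint8)
t = otsu_threshold(gray)
binary = np.where(gray >= t, 255, 0).astype(np.uint8)
print(t, binary)
```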

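That is all for now on implementing binarization algorithms from simple to more advanced. Next, let's look at how to do the same thing with OpenCV!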

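OpenCV provides simple thresholding (Simple Thresholding, cv2.threshold) and adaptive thresholding (Adaptive Thresholding, cv2.adaptiveThreshold).

First, let's look at Simple Thresholding:

retval, dst = cv2.threshold(src, thresh, maxval, type[, dst])

The first parameter, src, is the source image. Note that the input should be a grayscale image!

The second parameter, thresh, is the threshold used to classify the pixel values (usually set to 127).

The third parameter, maxval, is the maximum value, i.e. the value assigned to pixels that exceed the threshold (255).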

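The fourth parameter, type, selects the thresholding mode. Simple Thresholding offers five different modes:

cv2.THRESH_BINARY: pixels above thresh become maxval, the rest become 0.
cv2.THRESH_BINARY_INV: pixels above thresh become 0, the rest become maxval.
cv2.THRESH_TRUNC: pixels above thresh are clipped to thresh, the rest are unchanged.
cv2.THRESH_TOZERO: pixels above thresh are unchanged, the rest become 0.
cv2.THRESH_TOZERO_INV: pixels above thresh become 0, the rest are unchanged.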
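To see what the five modes do to individual pixel values, here is a NumPy-only sketch that mimics their documented behavior on a tiny array. It is an illustration, not OpenCV's implementation; the function name `threshold_modes` and the sample values are made up.

```python
import numpy as np

def threshold_modes(src, thresh=127, maxval=255):
    """Emulate the five simple thresholding modes on a uint8 array (sketch)."""
    src = np.asarray(src, dtype=np.uint8)
    above = src > thresh
    return {
        "BINARY":     np.where(above, maxval, 0).astype(np.uint8),
        "BINARY_INV": np.where(above, 0, maxval).astype(np.uint8),
        "TRUNC":      np.where(above, thresh, src).astype(np.uint8),
        "TOZERO":     np.where(above, src, 0).astype(np.uint8),
        "TOZERO_INV": np.where(above, 0, src).astype(np.uint8),
    }

src = np.array([0, 100, 127, 128, 255], dtype=np.uint8)
modes = threshold_modes(src)
for name, out in modes.items():
    print(f"{name:10s} {out}")
```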

Next, let's see what Lenna looks like after each of these five modes:

img = cv2.imread('lenna.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

ret,thresh1 = cv2.threshold(img_gray,127,255,cv2.THRESH_BINARY)
ret,thresh2 = cv2.threshold(img_gray,127,255,cv2.THRESH_BINARY_INV)
ret,thresh3 = cv2.threshold(img_gray,127,255,cv2.THRESH_TRUNC)
ret,thresh4 = cv2.threshold(img_gray,127,255,cv2.THRESH_TOZERO)
ret,thresh5 = cv2.threshold(img_gray,127,255,cv2.THRESH_TOZERO_INV)

titles = ['Original Image','BINARY','BINARY_INV','TRUNC','TOZERO','TOZERO_INV']
images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]

# Create the figure before adding subplots so that figsize takes effect.
plt.figure(figsize=(11,11))
for i in range(6):
    plt.subplot(2,3,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()

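[Figure: Lenna under the five simple thresholding modes (simple threshold)]

The five modes clearly behave differently, and the BINARY mode is essentially what we implemented by hand earlier.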

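Next, let's look at Adaptive Thresholding:

dst = cv2.adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C[, dst])

Here src, maxValue and thresholdType play the same roles as in Simple Thresholding, except that src must be an 8-bit single-channel image and thresholdType must be cv2.THRESH_BINARY or cv2.THRESH_BINARY_INV.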

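In adaptive thresholding there are two choices for the adaptive method, i.e. the way the threshold is computed:

cv2.ADAPTIVE_THRESH_MEAN_C: the threshold is the mean of the neighborhood (blockSize * blockSize), minus C.
cv2.ADAPTIVE_THRESH_GAUSSIAN_C: the threshold is the Gaussian-weighted sum of the neighborhood, minus C.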
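To show what "local mean minus C" means in practice, here is a NumPy-only sketch of the mean variant in BINARY mode. It is an approximation for illustration: the function name `adaptive_mean_threshold` is hypothetical, the image border is handled by replicating edge pixels, and the output is not guaranteed to match cv2.adaptiveThreshold bit for bit. It needs NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_mean_threshold(gray, max_value=255, block_size=11, c=2):
    """Per-pixel threshold = mean of the block_size x block_size window, minus c."""
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    gray = np.asarray(gray, dtype=np.float64)
    pad = block_size // 2
    padded = np.pad(gray, pad, mode="edge")           # replicate border pixels
    windows = sliding_window_view(padded, (block_size, block_size))
    local_mean = windows.mean(axis=(-2, -1))          # same shape as gray
    return np.where(gray > local_mean - c, max_value, 0).astype(np.uint8)

# Synthetic "shadowed page": brightness ramps from 50 to 250, with one dark
# "ink" pixel in the dark half and one in the bright half.
gray = np.tile(np.linspace(50, 250, 40), (40, 1))
gray[20, 10] -= 40
gray[20, 30] -= 40

adaptive = adaptive_mean_threshold(gray)
global_binary = np.where(gray > 127, 255, 0).astype(np.uint8)
print(adaptive[20, 10], adaptive[20, 30], global_binary[20, 30])
```

With a single global threshold of 127, the ink pixel in the bright half stays white, whereas the local threshold marks both ink pixels as black while keeping the surrounding ramp white.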

Similarly, let's see what Lenna looks like after each of them (for thresholdType we choose cv2.THRESH_BINARY here):

ret,th1 = cv2.threshold(img_gray,127,255,cv2.THRESH_BINARY)
th2 = cv2.adaptiveThreshold(img_gray,255,cv2.ADAPTIVE_THRESH_MEAN_C,\
            cv2.THRESH_BINARY,11,2)
th3 = cv2.adaptiveThreshold(img_gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
            cv2.THRESH_BINARY,11,2)

titles = ['Original Image', 'Global Thresholding (v = 127)',
            'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]
for i in range(4):
    plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()

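[Figure: global vs. adaptive mean vs. adaptive Gaussian thresholding (adapt thresh)]

This is not the result we were hoping for. The example in the OpenCV documentation applies a median blur before adaptive thresholding, so let's do the same:

img_gray = cv2.medianBlur(img_gray, 9)
# ...then rerun the same thresholding and plotting code as above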

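[Figure: adaptive thresholding after a median blur (adapt thresh mb)]

Although this does not produce the kind of picture that Otsu's method gave us earlier, it does trace out the outlines in the image. This approach is better suited to images of text, as in the comparison given in the official documentation:

[Figure: comparison from the OpenCV documentation (opencv at mb)]

Clearly, this approach copes well with the shadows in a photographed page.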
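As a side note on the median-blur step used above, the sketch below shows why it helps: a median filter replaces each pixel with the median of its neighborhood, which removes isolated noisy pixels that would otherwise show up as specks after local thresholding. This is a NumPy-only illustration with a hypothetical helper name `median_blur`, not a replacement for cv2.medianBlur, and it requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_blur(gray, ksize=3):
    """Replace each pixel by the median of its ksize x ksize neighborhood (sketch)."""
    if ksize < 3 or ksize % 2 == 0:
        raise ValueError("ksize must be an odd number >= 3")
    gray = np.asarray(gray)
    pad = ksize // 2
    padded = np.pad(gray, pad, mode="edge")           # replicate border pixels
    windows = sliding_window_view(padded, (ksize, ksize))
    return np.median(windows, axis=(-2, -1)).astype(gray.dtype)

# A flat gray patch with a single "salt" pixel.
noisy = np.full((7, 7), 100, dtype=np.uint8)
noisy[3, 3] = 255
clean = median_blur(noisy, 3)
print(clean[3, 3])
```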

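OK, that wraps up image binarization for now. If I get the chance, I will keep exploring and explaining other binarization algorithms in my own way. I hope this article is helpful to you. If you enjoyed it, don't forget to give it a like!

Thanks for reading!!!

If you have any questions about the article, feel free to leave a comment and let's discuss!!!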

References:

Understanding the Otsu binarization algorithm (大津二值化算法OTSU的理解)

Image Thresholding