Reading images
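With its default settings, OpenCV cannot correctly read the mask images used for semantic segmentation: when it reads such a PNG image, cv.imread automatically converts the single channel to three channels, and the pixel values change as well.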
import cv2 as cv
import numpy as np
import PIL.Image as Image


def countColors(img):
    """Print every distinct pixel value in img, in order of first appearance."""
    m, n = img.shape[:2]
    colors = []
    for i in range(m):
        for j in range(n):
            # tolist() gives a list of ints for a multi-channel pixel
            # and a plain int for a single-channel pixel.
            px = img[i, j].tolist()
            if px not in colors:
                colors.append(px)
    print(colors)

path = "examples/imgs/Person/2008_002176.png"
# Pixel values after OpenCV's default conversion
img = cv.imread(path)
print(img.shape)  # (294, 500, 3)
countColors(img)  # [[0, 0, 0], [0, 0, 128], [128, 0, 0],
                  #  [0, 128, 0], [0, 128, 128], [128, 0, 128]]
# True pixel values, read with Pillow
img = np.array(Image.open(path))
print(img.shape)  # (294, 500)
img = cv.merge([img, img, img])
countColors(img)  # [[0, 0, 0], [1, 1, 1], [4, 4, 4],
                  #  [2, 2, 2], [3, 3, 3], [5, 5, 5]]
# Single-channel read with OpenCV
img = cv.imread(path, 0)  # flag 0 is cv.IMREAD_GRAYSCALE
print(img.shape)  # (294, 500)
img = cv.merge([img, img, img])
countColors(img)  # [[0, 0, 0], [38, 38, 38], [14, 14, 14],
                  #  [75, 75, 75], [113, 113, 113], [52, 52, 52]]
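As the output shows, the original mask image is single-channel, and each pixel value is the label that the pixel belongs to. After OpenCV's conversion that meaning is gone: the values are colors rather than labels, so data read this way cannot be used for deep learning training. Reading the file as a single channel does not help either, because the values come back as gray levels (38, 14, 75, ...) instead of the labels 0 to 5, and mapping them back is a hassle.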
How to avoid the pitfall: read mask images with the Pillow library.
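The Pillow-based read can be wrapped in a small helper. What follows is a minimal sketch rather than a definitive implementation: `load_mask` and `make_demo_mask` are hypothetical names, and the 2 x 2 demo mask with its three-color palette is made-up data that only exists to make the example runnable. It assumes the mask is stored as a palette ("P") or 8-bit grayscale ("L") image.

```python
import os
import tempfile

import numpy as np
from PIL import Image


def load_mask(path):
    """Read a segmentation mask and return its stored labels as a 2-D uint8 array."""
    with Image.open(path) as im:
        # "P" (palette) and "L" (8-bit grayscale) both store one value per pixel.
        # np.array() returns those stored values; the palette is not applied.
        if im.mode not in ("P", "L"):
            raise ValueError(f"expected a single-channel mask, got mode {im.mode!r}")
        return np.array(im, dtype=np.uint8)


def make_demo_mask(path):
    """Write a tiny palette PNG (made-up data) whose labels are 0, 1 and 2."""
    labels = np.array([[0, 1], [2, 1]], dtype=np.uint8)
    im = Image.fromarray(labels)  # 2-D uint8 array -> mode "L"
    # Attaching a palette turns the image into mode "P" and keeps the stored values.
    palette = [0, 0, 0, 128, 0, 0, 0, 128, 0]
    im.putpalette(palette + [0] * (768 - len(palette)))
    im.save(path)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "mask.png")
    make_demo_mask(demo_path)
    demo = load_mask(demo_path)

print(demo.shape)     # (2, 2)
print(demo.tolist())  # [[0, 1], [2, 1]]
```

The mode check is a design choice for this sketch: it fails loudly if a color image is passed in by mistake, instead of silently returning color values as labels.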
Resizing images
OpenCV's cv::Size is defined differently from the shape attribute in the Python version.
Size is W x H (width first):
size = cv::Size(imageWidth, imageHeight)
img = cv.resize(img, (imageWidth, imageHeight))
shape is m x n (rows first, that is, H x W):
rows, cols = img.shape[:2]
imageHeight, imageWidth = img.shape[:2]
How to avoid the pitfall: unpack shape as (height, width), and pass (width, height) to cv.resize.
imageHeight, imageWidth = img.shape[:2]
...  # processing
dst = cv.resize(dst, (imageWidth, imageHeight), interpolation=cv.INTER_CUBIC)