Extracting Object Masks and Edges Against a Plain Background with SSIM

1. Introduction:

  In deep-learning image work, obtaining a precise ROI (Region of Interest) is an important step. In current image-annotation workflows, for efficiency, people usually capture the ROI with standard geometric primitives such as points and rectangles and extract target features from it. But since most recognition targets have irregular shapes, the ROI inevitably contains some background, so when target features are extracted, part of the model's capacity is spent on the background.

  Considering that in common data-collection setups the background is uniform, we can obtain the target's edge or mask by background subtraction. The approach in this post uses SSIM to locate the regions where two images differ most, extracts contours from those regions, and analyzes the contours to obtain a precise edge and mask for the target.

2. SSIM:

  The SSIM index (structural similarity index) is a metric for measuring how similar two digital images are. As an implementation of structural-similarity theory, it defines structural information, from the perspective of image composition, as attributes that reflect the structure of objects in the scene independently of luminance and contrast, and it models distortion as a combination of three factors: luminance, contrast, and structure. The mean is used as the estimate of luminance, the standard deviation as the estimate of contrast, and the covariance as the measure of structural similarity.
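Written out explicitly, the three components described above and their combination take the standard form (this is the usual decomposition from the SSIM literature, shown here for reference):

```latex
l(x, y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \qquad
c(x, y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \qquad
s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}

\mathrm{SSIM}(x, y) = l(x, y)^{\alpha} \cdot c(x, y)^{\beta} \cdot s(x, y)^{\gamma}
```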

  In practice, for simplicity, the parameters are usually set to $\alpha = \beta = \gamma = 1$ and $C_3 = C_2 / 2$, which gives:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where $\mu_x$ and $\mu_y$ are the means of the pixel values of the compared images, $\sigma_x$ and $\sigma_y$ are the standard deviations of the pixel values, and $\sigma_{xy}$ is the covariance of x and y. $C_1$ and $C_2$ are constants. According to the authors' paper, a good choice is $C_1 = (k_1 L)^2$ and $C_2 = (k_2 L)^2$, where $L$ is the dynamic range of the input data and $k_1 = 0.01$, $k_2 = 0.03$.
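For example, for 8-bit images the dynamic range is 255, so the two constants work out to:

```python
k1, k2 = 0.01, 0.03
data_range = 255  # dynamic range of 8-bit images

C1 = (k1 * data_range) ** 2
C2 = (k2 * data_range) ** 2

print(round(C1, 4), round(C2, 4))  # 6.5025 58.5225
```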

3. Implementation:

  • Since an RGB image is a 3-channel matrix, and since the camera lens and sensor cause images taken in the same environment to differ slightly, we compare the grayscale images directly to reduce the influence of the hardware, and smooth them with a mean filter (or another filter):
import cv2
from scipy.ndimage import uniform_filter

img1 = cv2.imread(img1_path)  # BGR; use [..., ::-1] if RGB is needed
img2 = cv2.imread(img2_path)

gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# uniform_filter or gaussian_filter (both from scipy.ndimage)
filter_func, filter_args = uniform_filter, {'size': 7}
ux = filter_func(gray1, **filter_args)
uy = filter_func(gray2, **filter_args)
  • Then build the formula above:
import numpy as np

# Work in float64: products of uint8 images would overflow, and the
# windowed means are recomputed here in float for the same reason.
x = gray1.astype(np.float64)
y = gray2.astype(np.float64)

ux = filter_func(x, **filter_args)
uy = filter_func(y, **filter_args)
uxx = filter_func(x * x, **filter_args)
uyy = filter_func(y * y, **filter_args)
uxy = filter_func(x * y, **filter_args)

vx = uxx - ux * ux
vy = uyy - uy * uy
vxy = uxy - ux * uy

# Refer to Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P.
# (2004). Image quality assessment: From error visibility to
# structural similarity. IEEE Transactions on Image Processing,
# 13, 600-612.

k1, k2 = 0.01, 0.03
dmin = np.iinfo(gray1.dtype.type).min
dmax = np.iinfo(gray1.dtype.type).max
data_range = dmax - dmin  # avoid shadowing the builtins min/max/range
C1, C2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2

# Simplified as above: SSIM = (A * B) / (C * D)

A = 2 * ux * uy + C1
B = 2 * vxy + C2
C = ux ** 2 + uy ** 2 + C1
D = vx + vy + C2

SSIM = (A * B) / (C * D)
# If we need the scalar SSIM index, take the mean of the map:
ssim_index = np.mean(SSIM)
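As a sanity check, the whole computation can be wrapped into a function and verified on an image compared with itself, where every term of the formula collapses and the index must be 1. This is a minimal self-contained sketch assuming scipy's `uniform_filter` as the window filter:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ssim_map(a, b, win_size=7, data_range=255):
    # Windowed means, variances, and covariance, computed in float64.
    x, y = a.astype(np.float64), b.astype(np.float64)
    filt = lambda m: uniform_filter(m, size=win_size)
    ux, uy = filt(x), filt(y)
    vx = filt(x * x) - ux * ux
    vy = filt(y * y) - uy * uy
    vxy = filt(x * y) - ux * uy
    C1, C2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    return ((2 * ux * uy + C1) * (2 * vxy + C2)) / \
           ((ux ** 2 + uy ** 2 + C1) * (vx + vy + C2))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64)).astype(np.uint8)
print(np.allclose(ssim_map(img, img).mean(), 1.0))  # True
```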
  • The function above returns both the similarity score and the difference map; we use the difference map for analysis and extract its contours:

# skimage ships an implementation of the computation above;
# full=True returns the per-pixel SSIM map alongside the score.
from skimage.metrics import structural_similarity as ssim

score, diff = ssim(gray1, gray2, full=True)
diff = (diff * 255).astype('uint8')

# Extract contours: dissimilar regions have low SSIM, hence THRESH_BINARY_INV
thresh = cv2.threshold(diff, 0, 255,
                       cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
res = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                       cv2.CHAIN_APPROX_NONE)
cnts = res[0] if len(res) == 2 else res[1]  # OpenCV 4 vs. 3 return signature
  • In some scenes, small fluctuations and changes also produce contours. If these are unwanted, they can be excluded using the approximate fraction of the image that the target occupies:

cnts_need = []

for cnt in cnts:
    area = cv2.contourArea(cnt)
    # For a single-channel image, gray1.size == width * height,
    # so this keeps only contours covering more than 5% of the image.
    if area > gray1.size * 0.05:
        cnts_need.append(cnt)
  • Draw the contours:
# Box: draw on the color images so the red box is actually visible
# (a BGR color collapses to a single value on a grayscale image).
for cnt in cnts_need:
    (x, y, w, h) = cv2.boundingRect(cnt)
    cv2.rectangle(img1, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.rectangle(img2, (x, y), (x + w, y + h), (0, 0, 255), 2)

# Edge:
point_size = 1
point_color = (255, 255, 255)  # BGR
thickness = -1  # negative thickness draws a filled circle

for cnt in cnts_need:
    for point in cnt:
        cv2.circle(gray2, tuple(point[0]), point_size, point_color, thickness)
  • Obtain the mask:
# Get the masks one by one (a fresh canvas per contour, otherwise
# fillPoly accumulates earlier contours on the shared background):
for cnt in cnts_need:
    bg = np.zeros((gray2.shape[0], gray2.shape[1]), np.uint8)
    mask = cv2.fillPoly(bg, [cnt], 255)
    res = cv2.bitwise_and(img2, img2, mask=mask)
    cv2.imshow('Result', res)
    k = cv2.waitKey(0)
    if k == 27 or k == ord('q'):
        cv2.destroyAllWindows()

# Get all masks at once:
bg = np.zeros((gray2.shape[0], gray2.shape[1]), np.uint8)
masks = cv2.fillPoly(bg, cnts_need, 255)
res = cv2.bitwise_and(img2, img2, mask=masks)
cv2.imshow('Result', res)
k = cv2.waitKey(0)
if k == 27 or k == ord('q'):
    cv2.destroyAllWindows()

4. Test Results:

  • Input images for comparison:

Input_original_image

  • Difference map:

Diff

  • Contour bounding boxes:

Box

  • Contour edges:

Edge

  • Extracted masks:

Mask
