Image Semantic Segmentation Based on Residual Attention and Pyramid Upsampling

Abstract: To address the loss of detail and the blurred, coarse class boundaries common in image semantic segmentation, a residual attention module is designed on top of an encoder-decoder structure by combining residual modules with an attention mechanism. The attention mechanism strengthens the relationships among feature-map channels, which improves the fineness of the segmentation. To improve recognition of multi-scale objects, a pyramid upsampling module is designed based on the pyramid model; it extracts semantic information at several scales from the feature maps of different scales produced during encoding, which strengthens the model's scene recognition ability. Finally, the proposed method is verified experimentally. Compared with methods such as FCN-8s, SegNet, Deeplab-v2, and PSPNet, the mean Intersection over Union (mIoU) and mean Pixel Accuracy (mPA) improve by up to 15.9% and 3.57%, respectively, on VOC 2012, and by 17.8% and 13.3%, respectively, on the Cityscapes dataset, a clear improvement in segmentation quality.
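The abstract reports results as mIoU and mPA without spelling out the formulas. The sketch below shows how these two metrics are commonly computed from a confusion matrix; it is an illustration under the usual definitions, not the authors' evaluation code. The function names `confusion_matrix` and `mean_iou_and_mpa` are hypothetical, and the sketch assumes every label is an integer in `[0, num_classes)`, so dataset-specific "ignore" labels would need to be masked out beforehand.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the ground-truth class, columns the predicted class."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (true, predicted) pair as a single index, then count.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou_and_mpa(cm):
    """Return (mIoU, mPA) averaged over classes present in the ground truth."""
    tp = np.diag(cm).astype(float)   # correctly labelled pixels per class
    gt = cm.sum(axis=1)              # ground-truth pixels per class
    pred = cm.sum(axis=0)            # predicted pixels per class
    union = gt + pred - tp
    present = gt > 0                 # assumption: skip classes absent from the labels
    iou = tp[present] / union[present]
    pixel_acc = tp[present] / gt[present]
    return iou.mean(), pixel_acc.mean()

# Toy example: six pixels, three classes.
cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
miou, mpa = mean_iou_and_mpa(cm)
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, and the per-class pixel accuracies are 1/2, 1, and 1/2, so a prediction can score a noticeably higher mPA than mIoU because IoU also penalizes false positives.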

     
