Voxel-based 3D Object Detection Network Based on Multi-level Feature Fusion

ZHANG Wu-ran; HU Chun-yan; CHEN Ze-lai; LI Fei-fei

Article Abstract

ZHANG Wu-ran, HU Chun-yan, CHEN Ze-lai, LI Fei-fei. Voxel-based 3D Object Detection Network Based on Multi-level Feature Fusion[J]. Packaging Engineering, 2022, 43(15): 42-53.

DOI: 10.19554/j.cnki.1001-3563.2022.15.005

Keywords: 3D object detection; residual fusion; adaptive fusion; feature enhancement; triple feature fusion

Funding: Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (ES2015XX)

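Author	Affiliation
ZHANG Wu-ran	School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093
HU Chun-yan	School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093
CHEN Ze-lai	School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093
LI Fei-fei	School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093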

Abstract:

Objective: To accurately determine the location and category of target objects in point cloud scenes, a voxel-based 3D object detection network based on multi-level feature fusion was proposed. Methods: The two-stage detector Voxel-RCNN was used as the baseline model. In the first stage, a Sparse Feature Residual Dense Fusion Module (SFRDFM) was added to propagate and reuse level-by-level features from shallow to deep, so that the 3D features were fully fused through interaction. A Residual Light-weight and Efficient Channel Attention (RL-ECA) mechanism was added to the 2D backbone to explicitly enhance channel features. A multi-level feature and multi-scale kernel adaptive fusion module was proposed to adaptively extract relation weights for the features at each level and to fuse them by weighting. In the second stage, a Triple Feature Fusion Strategy (TFFS) was designed: neighborhood features were aggregated with a search algorithm based on Manhattan distance, and a Deep Fusion Module (DFM) and a Coarse to Fine Fusion Module (CTFFM) were embedded to improve the quality of grid-point features. Results: The method was evaluated on the KITTI autonomous driving dataset. Compared with the baseline network at the three difficulty levels, the 3D average precision for pedestrians of the first-stage detection model improved by 3.97%, and the 3D average precision for cyclists of the second-stage detection model improved by 3.37%. Conclusion: The results show that the proposed method significantly improves object detection performance and that each module ports well, so it can be flexibly embedded into voxel-based 3D detection models to bring corresponding improvements.
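
The abstract states that the second stage aggregates neighborhood features using a search based on Manhattan distance. The sketch below illustrates that general idea only and is not the authors' implementation. It assumes voxels are addressed by integer `(x, y, z)` indices, that a grid point has already been quantized to such an index, and that aggregation is a channel-wise max; the function names `manhattan_voxel_query` and `aggregate_max` and the parameters `max_dist` and `max_neighbors` are hypothetical.

```python
from itertools import product


def manhattan_voxel_query(query, occupied, max_dist, max_neighbors=None):
    """Return occupied voxel indices whose Manhattan distance to `query`
    is at most `max_dist`, nearest first.

    query:         (x, y, z) integer voxel index of the grid point (assumed)
    occupied:      iterable of (x, y, z) indices of non-empty voxels
    max_dist:      Manhattan distance threshold, in voxels (hypothetical name)
    max_neighbors: optional cap on the number of neighbors returned
    """
    occupied_set = set(occupied)
    qx, qy, qz = query
    hits = []
    # Enumerate integer offsets in the enclosing cube, keep those inside
    # the Manhattan ball |dx| + |dy| + |dz| <= max_dist.
    for dx, dy, dz in product(range(-max_dist, max_dist + 1), repeat=3):
        dist = abs(dx) + abs(dy) + abs(dz)
        if dist > max_dist:
            continue
        candidate = (qx + dx, qy + dy, qz + dz)
        if candidate in occupied_set:
            hits.append((dist, candidate))
    hits.sort()  # nearest first; ties broken by index order
    if max_neighbors is not None:
        hits = hits[:max_neighbors]
    return [voxel for _, voxel in hits]


def aggregate_max(neighbors, features):
    """Channel-wise max over the neighbors' feature vectors.

    Max pooling is an assumption made for this sketch; the abstract does
    not specify the aggregation function.
    """
    if not neighbors:
        return None
    return [max(channel) for channel in zip(*(features[v] for v in neighbors))]


# Toy example: four non-empty voxels with 2-channel features.
features = {
    (0, 0, 0): [1.0, 0.0],
    (1, 0, 0): [0.5, 2.0],
    (1, 1, 0): [3.0, 1.0],
    (2, 2, 2): [9.0, 9.0],  # Manhattan distance 6 from the origin
}
neighbors = manhattan_voxel_query((0, 0, 0), features.keys(), max_dist=2)
pooled = aggregate_max(neighbors, features)
```

In this toy case the voxel at `(2, 2, 2)` falls outside the threshold, so only the three nearer voxels contribute to the pooled feature. Because voxel indices are integers, membership can be checked with a hash lookup per offset rather than by computing distances to every point, which is one reason a Manhattan-distance query suits voxel representations.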

Copyright: Editorial Office of Packaging Engineering, 2014. All Rights Reserved

Postal code: 400039 Tel: 023-68795652 Email: designartj@126.com
