面向复杂点云数据的3D目标识别技术研究
Alternative Title: Research on 3D Object Recognition Technology for Complex Point Cloud Data
刘洪森
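Department: Robotics Laboratory (机器人学研究室)
Thesis Advisor: 唐延东; 丛杨
Keywords: 3D object recognition; complex point cloud scenes; 6-DOF pose estimation
Pages: 114
Degree Discipline: Pattern Recognition and Intelligent Systems
Degree Name: Doctorate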
2020-05-27
Degree Grantor: Shenyang Institute of Automation, Chinese Academy of Sciences (中国科学院沈阳自动化研究所)
Place of Conferral: Shenyang
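Abstract: 3D object recognition is an important technology for category detection and 6-DOF pose estimation of specific targets in a scene. With the rapid development of robotics, machine vision, and artificial intelligence, automation has moved beyond traditional teach-and-repeat scenarios and is gradually entering more complex flexible-operation scenarios. Whereas teach-and-repeat operation strictly controls the scene and the targets, complex flexible operation emphasizes arbitrariness in the position, pose, category, and number of targets, and requires robustness to interference such as overlap, clutter, and weak texture. Point cloud data provide richer 3D geometric information and are unaffected by scale and even illumination changes, so only translation and rotation need to be considered. In addition, 6-DOF pose can be estimated more accurately in point cloud space than in 2D images. However, the high dimensionality and irregular structure of point cloud data make effective feature representation difficult, and the occlusion and clutter found in complex point cloud scenes pose further challenges. In view of these characteristics and the problems of current processing techniques, this dissertation studies 3D object recognition for complex point cloud data. It aims to design and implement 3D object recognition algorithms that address the problems facing existing techniques and improve object recognition in complex scenes. The main innovative contributions are as follows:
(1) Recognition based on hand-crafted statistical features of point clouds. The local surfaces of geometrically featureless objects lack rich geometric descriptive information, so local point cloud descriptors struggle to extract robust keypoint features, which degrades recognition performance. To address this problem, the dissertation proposes the View-Specific Local Projection Statistics (VSLPS) descriptor, which is based on a viewpoint direction constraint. The constraint removes the ambiguity of keypoint features on locally symmetric surfaces. Combined with a Hough voting strategy, the method replaces the less robust process of generating pose hypotheses from matched keypoint pairs with a process that selects the optimal sampled model view by pose voting. All keypoints therefore take part effectively in pose voting, which markedly alleviates the low robustness caused by missing geometric features for such objects.
(2) Recognition based on voxelized point cloud encoding features. 2D/2.5D image patch features lack spatial geometric information and therefore have difficulty coping with severe occlusion and clutter in a scene. To address this problem, the dissertation proposes a point cloud voxelization encoding method based on the Truncated Signed Distance Function (TSDF) and uses a sparse autoencoder feature learning model to extract point cloud 3DPatch features, realizing for the first time an object recognition framework based on point cloud 3DPatch features. The experimental results verify that, compared with 2D/2.5D image patch features, point cloud 3DPatch features cope effectively with severe occlusion and clutter.
(3) Recognition based on deep feature encoding of point clouds. Because point cloud data are irregular and unordered, the deep unsupervised feature learning models designed for regular data are hard to apply to local point cloud feature extraction. To address this problem, the dissertation proposes for the first time the Stacked Point Feature Autoencoder (SPAE), a deep autoencoder model that operates directly on irregular point cloud data. Through deep fusion of the local multimodal attributes of the point cloud, the model extracts local features robustly. A series of comparative experiments verifies that SPAE preserves local texture and geometric information, and that it has significant advantages over classical algorithms in descriptor discriminability, robustness, and average algorithm performance.
(4) Recognition based on supervised point cloud models. The strategy of building a model feature library, generating candidate pose hypotheses, and verifying hypotheses for each object model in turn is unsuited to online recognition of multiple classes and multiple objects. To address this problem, the dissertation proposes an end-to-end multi-task prediction model for point cloud data (One Point, One Object, OPOO), which performs semantic segmentation of the input point cloud scene and 6-DOF pose estimation of multiple classes and objects simultaneously and online. To counter overfitting during training, it also proposes a point cloud dataset augmentation method based on Augmented Reality (AR), which quickly generates training datasets for point cloud data with fixed working scenes and saves a large amount of manual annotation time. A series of comparative experiments verifies that the multi-task prediction model not only has robust recognition performance but also removes the dependence on hypothesis verification, achieving efficient and stable multi-class, multi-object 6-DOF pose estimation.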
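The TSDF voxel encoding named in contribution (2) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `tsdf_encode_patch`, the cubic patch size, the grid resolution, the truncation distance, and the use of the nearest point's surface normal to decide the sign are all assumptions made here for illustration, and the nearest-neighbour search is brute force for brevity.

```python
import numpy as np

def tsdf_encode_patch(points, normals, center,
                      half_extent=0.05, resolution=8, truncation=0.02):
    """Encode a local patch as a (resolution^3) truncated signed distance grid.

    All parameters are hypothetical defaults, not values from the thesis.
    """
    points = np.asarray(points, dtype=float)
    normals = np.asarray(normals, dtype=float)
    center = np.asarray(center, dtype=float)

    # Voxel centres of a cube of side 2 * half_extent around `center`.
    axis = (np.arange(resolution) + 0.5) / resolution * 2.0 - 1.0
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    voxels = center + half_extent * np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)

    # Brute-force nearest surface point for every voxel centre.
    diff = voxels[:, None, :] - points[None, :, :]      # shape (V, N, 3)
    dist = np.linalg.norm(diff, axis=2)                 # shape (V, N)
    nearest = np.argmin(dist, axis=1)
    rows = np.arange(len(voxels))

    # Assumed sign convention: positive on the side the normal points to.
    side = np.einsum("ij,ij->i", diff[rows, nearest], normals[nearest])
    sign = np.where(side < 0, -1.0, 1.0)

    # Scale by the truncation distance and clamp to [-1, 1].
    tsdf = np.clip(sign * dist[rows, nearest] / truncation, -1.0, 1.0)
    return tsdf.reshape(resolution, resolution, resolution)

# Toy input: a flat patch on the plane z = 0 with normals pointing along +z.
xs = np.arange(-10, 11) * 0.01
px, py = np.meshgrid(xs, xs)
plane = np.stack([px.ravel(), py.ravel(), np.zeros(px.size)], axis=1)
plane_normals = np.tile([0.0, 0.0, 1.0], (len(plane), 1))
grid = tsdf_encode_patch(plane, plane_normals, center=np.zeros(3))
```

The resulting regular grid is the kind of fixed-size input that a learning model for regular data, such as the sparse autoencoder mentioned above, could consume; how the dissertation actually samples patches and sets the truncation is not stated in this record.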
Other Abstract: 3D object recognition is an important technology for detecting specific targets in a scene and estimating their 6-DOF pose. With the rapid development of robotics, machine vision, and artificial intelligence, automation technology has overcome some limitations of traditional teach-and-repeat scenarios and is gradually becoming suitable for more complex flexible working scenarios. Compared with the strict specification of scene and target in teach-and-repeat operations, complex flexible operation emphasizes the arbitrariness of scene and target in position, pose, category, quantity, and other aspects, and must be robust to interference from overlap, clutter, and weak texture. Point cloud data provide more spatial geometric information and are not affected by scale or illumination changes, so only translation and rotation need to be considered. In addition, 6-DOF pose can be estimated more accurately in point cloud space than in a 2D image. However, processing point cloud data is complicated by high data dimensionality and irregular structure. In view of the characteristics of point cloud data and the problems of current processing techniques, this dissertation studies 3D object recognition methods for complex point cloud data, aiming to improve object recognition in complex scenes. The main innovative achievements of this dissertation are as follows:
(1) Research on hand-crafted statistical features of point clouds. The local surface of a geometrically featureless object lacks rich feature description information, which reduces the robustness of local point cloud feature descriptors. To address this problem, the dissertation proposes the View-Specific Local Projection Statistics (VSLPS) descriptor. By introducing a viewpoint direction constraint, VSLPS eliminates the ambiguity of features on locally symmetric surfaces, and by combining a Hough voting strategy it transforms the keypoint pair matching problem into an optimal model-view matching problem. All keypoints thus participate effectively in the voting process, which avoids the loss of robustness caused by a large number of repeated keypoints.
(2) Research on voxelized autoencoding of point clouds. Patch features of 2D/2.5D images have difficulty coping with occlusion and clutter. To address this problem, a voxelization encoding method based on the Truncated Signed Distance Function (TSDF) is proposed to realize an object recognition framework based on point cloud 3DPatch features. To improve the performance of the 3DPatch, a feature extraction method for point cloud 3DPatch data based on an autoencoder learning model is proposed. The experimental results show that, compared with 2D/2.5D patch features, the autoencoded 3DPatch features of the point cloud cope effectively with occlusion and clutter in the scene.
(3) Research on deep autoencoded features of point clouds. Because point cloud data are irregular and unordered, it is difficult to use common deep unsupervised feature learning models to extract local point cloud features. To solve this problem, a Stacked Point Feature Autoencoder (SPAE) that operates directly on irregular point cloud data is proposed. Deep fusion of the multimodal attributes of the point cloud significantly improves the performance of the model. A series of comparative experiments verifies that the SPAE model effectively preserves the local texture and geometric information of the point cloud, and that it significantly improves feature discrimination, robustness, and overall algorithm performance compared with state-of-the-art methods.
(4) Research on supervised training models for point clouds. Existing 3D object recognition algorithms can be divided into three main steps: building a model feature library, generating candidate object pose hypotheses, and verifying the hypotheses. With multiple objects, this strategy must recognize each object in turn, so it cannot be applied effectively to online multi-object recognition. The dissertation therefore proposes an end-to-end multi-task supervised training model (One Point, One Object, OPOO) based on point cloud data. To avoid overfitting during network training, an Augmented Reality (AR) based augmentation method is proposed. It quickly generates augmented training sets for point cloud data with fixed operation scenarios and saves a great deal of manual annotation time. A series of comparative experiments verifies that the multi-task prediction model based on point cloud data not only has robust recognition performance but also eliminates the dependence on hypothesis verification, achieving efficient and stable multi-class, multi-object 6-DOF pose estimation.
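The 6-DOF pose that recurs throughout the abstract is a rigid transform: three rotational and three translational degrees of freedom, with no scale term. The sketch below is a generic NumPy illustration of that idea, not code from the dissertation. The function names and the roll/pitch/yaw parameterization are assumptions chosen for illustration.

```python
import numpy as np

def rotation_from_euler(roll, pitch, yaw):
    """Build a 3x3 rotation matrix as Rz(yaw) @ Ry(pitch) @ Rx(roll)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx

def apply_pose(points, rotation, translation):
    """Map an (N, 3) array of model points into the scene frame."""
    return points @ rotation.T + translation

def invert_pose(rotation, translation):
    """Return the inverse rigid transform (scene frame back to model frame)."""
    return rotation.T, -rotation.T @ translation

# Toy example: place a random "model" in a scene, then undo the pose.
rng = np.random.default_rng(0)
model = rng.normal(size=(100, 3))
R = rotation_from_euler(0.1, -0.4, 1.2)   # three rotational DOF
t = np.array([0.5, -0.2, 1.0])            # three translational DOF
scene = apply_pose(model, R, t)
recovered = apply_pose(scene, *invert_pose(R, t))
```

Because the transform is rigid, distances between points are the same in `model` and `scene`, which is why a recognition method working on metric point clouds only has to search over rotation and translation.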
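Language: Chinese
Contribution Rank: 1
Document Type: Thesis
Identifier: http://ir.sia.cn/handle/173321/27156
Collection: Robotics Laboratory (机器人学研究室)
Affiliation: Shenyang Institute of Automation, Chinese Academy of Sciences (中国科学院沈阳自动化研究所)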
Recommended Citation
GB/T 7714
刘洪森. 面向复杂点云数据的3D目标识别技术研究[D]. 沈阳. 中国科学院沈阳自动化研究所,2020.
Files in This Item:
File Name/Size | DocType | Version | Access | License
面向复杂点云数据的3D目标识别技术研究 (8928 KB) | Thesis | | Open Access | CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.