Enhancing Point Cloud Generation From Various Information Sources by Applying Geometry-aware Folding Operation

dc.contributor.advisor: Khan, Latifur
dc.contributor.advisor: Fumagalli, Andrea
dc.contributor.committeeMember: Ruozzi, Nicholas
dc.contributor.committeeMember: D'Orazio, Vito
dc.contributor.committeeMember: Du, Ding-Zhu
dc.creator: Lin, Yu
dc.date.accessioned: 2022-11-29T22:33:03Z
dc.date.available: 2022-11-29T22:33:03Z
dc.date.created: 2022-05
dc.date.issued: 2022-05-01T05:00:00.000Z
dc.date.submitted: May 2022
dc.date.updated: 2022-11-29T22:33:04Z
dc.description.abstract: A plethora of cutting-edge computer vision and graphics applications, such as Augmented Reality (AR), Virtual Reality (VR), autonomous vehicles, and robotics, require rapid creation of and access to abundant 3D data. Among the various 3D data representations, e.g., RGB images, depth images, or voxel grids, the point cloud attracts considerable attention from the research community because it offers additional geometric, shape, and scale information compared with 2D images and demands fewer computational resources to process than other 3D representations, e.g., voxel grids, octrees, or triangle meshes. Unfortunately, even with the increasing availability of 3D sensors, the size and variety of 3D point cloud datasets pale in comparison with the vast datasets available for other representations. Therefore, many applications would benefit if point clouds could be generated from other information sources. Point cloud generation is a sub-field of 3D reconstruction, which aims to generate a complete 3D object from other information sources. Conventional methods generally focus on 2D images and rely heavily on knowledge of multi-view geometry, while multiple 2D views of a target 3D object are usually inaccessible in many real-world scenarios. In contrast, recent deep learning approaches either commit to 3D representations with regular structures, such as voxel grids and octrees, and thus suffer from resolution and scalability issues, or ignore crucial 3D prior knowledge and arrive at sub-optimal solutions. To address these drawbacks, this dissertation explores ways to improve point cloud generation by developing advanced folding operations and geometry-aware (3D-prior-aware) reconstruction networks. Specifically, we start with a novel point cloud generation framework, TDPNet, which reconstructs complete point clouds using a hierarchical manifold decoder and a collection of latent 3D prototypes.
Later, we find that the vanilla folding operation is insufficient for realistic reconstruction, and that using KMeans centroids as prototype features is unstable and lacks interpretability. Motivated by these observations, we introduce a novel framework equipped with a collection of Learnable Shape Primitives (L-SHAP), which encode crucial 3D prior knowledge from the training data through an additional folding operation. In addition, many applications would benefit if point clouds could be generated in a few-shot scenario. We tackle this problem with a novel few-shot generation framework, FSPG, which simultaneously considers class-agnostic and class-specific 3D priors during generation. Finally, we observe that conventional folding operations are implemented with a simple shared MLP, which increases training difficulty and limits the network's modeling capability. To solve this problem, we incorporate the popular Transformer architecture into a novel attentional folding decoder, AttnFold, and introduce a Local Semantic Consistency (LSC) regularizer to further boost the model's capability. Based on our research, we demonstrate that learning flexible, data-driven 3D priors and adopting advanced folding operations are effective for point cloud generation under different problem settings.
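The abstract's notion of a "vanilla folding operation implemented by a simple shared MLP" can be illustrated with a minimal sketch in the FoldingNet style, which the dissertation builds on: a fixed 2D grid is concatenated with a shape codeword and passed point-wise through a shared MLP that "folds" the grid into a 3D point cloud. This is an assumption-laden toy (random untrained weights, made-up sizes), not the dissertation's actual TDPNet/AttnFold architecture.

```python
import numpy as np

def make_grid(n_side):
    """Regular 2D grid in [-1, 1]^2 used as the folding source surface."""
    xs = np.linspace(-1.0, 1.0, n_side)
    u, v = np.meshgrid(xs, xs)
    return np.stack([u.ravel(), v.ravel()], axis=1)  # shape (n_side^2, 2)

def shared_mlp(x, weights):
    """Point-wise MLP: the same weights are applied to every point row."""
    h = x
    for i, (W, b) in enumerate(weights):
        h = h @ W + b
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers only
    return h

def vanilla_fold(codeword, n_side=8, hidden=32, seed=0):
    """Fold a 2D grid into 3D points, conditioned on a shape codeword.

    The random weights below are a stand-in for trained parameters;
    a real folding decoder would learn them from reconstruction loss.
    """
    rng = np.random.default_rng(seed)
    grid = make_grid(n_side)                       # (N, 2)
    code = np.tile(codeword, (grid.shape[0], 1))   # (N, d) — codeword repeated per point
    inp = np.concatenate([grid, code], axis=1)     # (N, 2 + d)
    d_in = inp.shape[1]
    weights = [
        (rng.standard_normal((d_in, hidden)) * 0.1, np.zeros(hidden)),
        (rng.standard_normal((hidden, 3)) * 0.1, np.zeros(3)),
    ]
    return shared_mlp(inp, weights)                # (N, 3) point cloud

points = vanilla_fold(np.ones(16), n_side=8)       # 64 folded 3D points
```

Because one weight set is shared across all grid points, the decoder's capacity is bounded by that single MLP — the limitation the abstract cites as motivation for the attentional AttnFold decoder.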
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/10735.1/9540
dc.language.iso: en
dc.subject: Computer Science
dc.title: Enhancing Point Cloud Generation From Various Information Sources by Applying Geometry-aware Folding Operation
dc.type: Thesis
dc.type.material: text
thesis.degree.college: School of Engineering and Computer Science
thesis.degree.department: Computer Science
thesis.degree.grantor: The University of Texas at Dallas
thesis.degree.name: PHD

Files

Original bundle
- LIN-PRIMARY-2022-1.pdf — 14.64 MB, Adobe Portable Document Format

License bundle
- PROQUEST_LICENSE.txt — 5.84 KB, Plain Text
- LICENSE.txt — 1.83 KB, Plain Text