Finding Features Via Hull Approximation

Date
2021-05-03
Abstract

With the recent explosion in data collection, big datasets are becoming increasingly difficult to handle. Their size can make algorithms intractable in time or space, and noise and outliers can cause machine learning models to overfit the data, yielding poor generalization performance. Applications that require users or experts to interpret or inspect data also suffer, as the sheer size of the data makes it impossible for a human to grasp. These considerations motivate reducing the size and complexity of a dataset by selecting key features that represent the data compactly. This dissertation approaches the basic task of simplifying a dataset from a geometric angle. First, we view datasets from several types of applications as being naturally modeled by three geometric hulls: the staircase hull, the convex hull, and the conic hull. The goal of this dissertation is then to study geometric properties and algorithms for simplifying each of these hulls, which in turn yields a simplified dataset. Finally, the dissertation provides experimental evidence that these methods are useful in practice.

Keywords
Big data, Geometrical models, Computer science