Robust Analysis of Non-Parametric Space-Time Clustering

dc.contributor.advisor: Gel, Yulia R.
dc.creator: Huang, Xin
dc.date.accessioned: 2019-04-24T20:20:03Z
dc.date.available: 2019-04-24T20:20:03Z
dc.date.created: 2018-08
dc.date.issued: 2018-08
dc.date.submitted: August 2018
dc.date.updated: 2019-04-24T20:20:04Z
dc.description.abstract: Recently, the rapid growth of remote sensing technologies has spurred interest in space-time data mining, particularly in clustering environmental time series and spatio-temporal processes. Dynamic data-driven clustering procedures for space-time data, which allow the number, shape, and distributional properties of clusters to vary, have received particular attention in recent years. The price of this flexibility, however, is usually a set of user-specified parameters that control clustering performance: for instance, the value similarity threshold in TRUST; the maximum neighborhood radius Eps in DBSCAN; the steepness parameter ξ in OPTICS; and the kernel smoothing parameter h in DENCLUE. The choice of these parameters can noticeably affect the number and shape of detected clusters and ideally should be made objectively. The goal of this dissertation is to address these challenges by developing new nonparametric data-driven approaches to space-time clustering. First, we propose a new data-driven procedure (DR) for optimal selection of these tuning parameters in dynamic clustering algorithms, using the notion of a stability probe. We study the finite sample performance of DR in conjunction with DBSCAN and TRUST, applied to clustering synthetic time series and yearly temperature records in Central Germany. We also apply DR to study ecological trends and water quality in Chesapeake Bay and legislative rhetoric data from the U.S. Senate. Second, optimal selection of tuning parameters in density-based clustering procedures such as DBSCAN, OPTICS, and DENCLUE raises additional problems, such as clusters with varied densities and the presence of outliers. We therefore develop a new density-based clustering algorithm, CRAD, which is based on a new neighbor searching function with a robust data depth as the dissimilarity measure. Our experiments show that CRAD is highly competitive in detecting clusters with varying densities compared with existing algorithms such as DBSCAN, OPTICS, and DBCA.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/10735.1/6365
dc.language.iso: en
dc.subject: Mathematical statistics
dc.subject: Space and time
dc.subject: Remote sensing--Mathematics
dc.subject: Data mining
dc.subject: Nonparametric statistics--Data processing
dc.subject: Geology--Statistical methods
dc.title: Robust Analysis of Non-Parametric Space-Time Clustering
dc.type: Dissertation
dc.type.material: text
thesis.degree.department: Statistics
thesis.degree.grantor: The University of Texas at Dallas
thesis.degree.level: Doctoral
thesis.degree.name: PHD
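
Illustrative note (not part of the deposited record): the abstract observes that user-specified parameters such as Eps in DBSCAN can noticeably change the number of detected clusters. The sketch below is a minimal, textbook-style DBSCAN in plain Python run on a hand-made toy data set. It is not the dissertation's DR or CRAD procedure, and the names `dbscan`, `region_query`, `count_clusters`, and `POINTS`, as well as the toy coordinates and the `min_pts` value, are assumptions made here for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical toy data: two tight squares of points plus one far-away outlier.
POINTS = [
    (0, 0), (0, 1), (1, 0), (1, 1),   # group A
    (5, 5), (5, 6), (6, 5), (6, 6),   # group B
    (20, 20),                         # isolated point
]


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch: one label per point, with -1 meaning noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = noise          # may later be relabeled as a border point
            continue
        cluster += 1                   # points[i] is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster    # noise reachable from a core point joins as border
            if labels[j] is not unvisited:
                continue               # already assigned to some cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


def count_clusters(labels):
    """Number of distinct clusters, ignoring the noise label -1."""
    return len(set(labels) - {-1})


# The same data yields a different number of clusters as Eps changes.
for eps in (0.5, 1.5, 6.0):
    print(eps, count_clusters(dbscan(POINTS, eps, min_pts=3)))
```

On this toy set, a very small Eps leaves every point as noise, a moderate Eps separates the two groups, and a large Eps merges them into one cluster. This is the kind of sensitivity that motivates choosing such parameters in a data-driven way.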

Files

Original bundle

Name: ETD-5608-030-HUANG-9433.63.pdf
Size: 22.07 MB
Format: Adobe Portable Document Format

License bundle

Name: LICENSE.txt
Size: 1.84 KB
Format: Plain Text

Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text