Gel, Yulia R.

Yulia R. Gel is a Professor in the Department of Mathematical Sciences. She is also a Fellow of the American Statistical Association. Her research interests include:

  • Statistical foundation of data science; machine learning; nonparametrics; high-dimensional data inference
  • Graph mining; inference for random graphs and complex networks: uncertainty quantification in network analysis, bootstrap on networks, network motif and tensor analysis, data depth on networks
  • Time series analysis; spatio-temporal processes; time series of graphs
  • Climate informatics; healthcare, finance, and business predictive analytics


Recent Submissions

  • Item
    Political Rhetoric Through the Lens of Non-Parametric Statistics: Are Our Legislators that Different?
    (Wiley, 2018-11-18) Iliev, Iliyan R.; Huang, Xin; Gel, Yulia R.
    We present a novel statistical analysis of legislative rhetoric in the US Senate that sheds light on hidden patterns in the behaviour of Senators as a function of their time in office. Using natural language processing, we create a novel comprehensive data set based on the speeches of all Senators who served on the US Senate Committee on Energy and Natural Resources in 2001-2011. We develop a new measure of congressional speech, based on Senators' attitudes towards the dominant energy interests. To evaluate the intrinsically dynamic formation of groups among Senators, we adopt a model-free unsupervised space-time data mining algorithm that has been proposed in the context of tracking dynamic clusters in environmental georeferenced data streams. Our approach, based on a two-stage hybrid supervised-unsupervised learning methodology, is innovative and data driven and transcends conventional disciplinary borders. We discover that legislators become much more alike after the first few years of their term, regardless of their partisanship and campaign promises.
  • Item
    Complementing the Power of Deep Learning with Statistical Model Fusion: Probabilistic Forecasting of Influenza in Dallas County, Texas, USA
    (Elsevier B.V., 2019-06-08) Soliman, Marwah; Lyubchich, V.; Gel, Yulia R.
    Influenza is one of the main causes of death, not only in the USA but worldwide. Its significant economic and public health impacts necessitate the development of accurate and efficient algorithms for forecasting upcoming influenza outbreaks. Most currently available methods for influenza prediction are based on parametric time series and regression models that impose restrictive and often unverifiable assumptions on the data. In turn, more flexible machine learning models and, particularly, deep learning tools, whose utility is proven in a wide range of disciplines, remain largely under-explored in epidemiological forecasting. We study seasonal influenza in Dallas County by evaluating the forecasting ability of deep learning with feedforward neural networks as well as the performance of more conventional statistical models, such as beta regression, autoregressive integrated moving average (ARIMA), least absolute shrinkage and selection operator (LASSO), and non-parametric multivariate adaptive regression splines (MARS) models, for one-week-ahead and two-week-ahead forecasting. Furthermore, we assess the forecasting utility of Google search queries and meteorological data as exogenous predictors of influenza activity. Finally, we develop a probabilistic forecast of influenza in Dallas County by fusing all the considered models using Bayesian model averaging. ©2019 The Authors
  • Item
    Forecasting Bitcoin Price with Graph Chainlets
    (Springer Verlag) Akcora, Cuneyt G.; Dey, Asim Kumer; Gel, Yulia R.; Kantarcioglu, Murat
    Over the last couple of years, the Bitcoin cryptocurrency and the Blockchain technology that forms the basis of Bitcoin have witnessed a flood of attention. In contrast to fiat currencies used worldwide, the Bitcoin distributed ledger is publicly available by design. This facilitates observing all financial interactions on the network and analyzing how the network evolves in time. We introduce a novel concept of chainlets, or Bitcoin subgraphs, which allows us to evaluate the local topological structure of the Bitcoin graph over time. Furthermore, we assess the role of chainlets in Bitcoin price formation and dynamics. We investigate the predictive Granger causality of chainlets and identify certain types of chainlets that exhibit the highest predictive influence on Bitcoin price and investment risk.
  • Item
    Deep Ensemble Classifiers and Peer Effects Analysis for Churn Forecasting in Retail Banking
    (Springer Verlag) Chen, Y.; Gel, Yulia R.; Lyubchich, V.; Winship, T.
    Modern customer analytics offers retailers a variety of unprecedented opportunities to enhance customer intelligence solutions by tracking individual clients and their peers and studying clientele behavioral patterns. While telecommunication providers have been actively utilizing peer network data to improve their customer analytics for a number of years, knowledge of peer effects in retail banking remains very limited. We introduce modern deep learning concepts to quantify the impact of social network variables on bank customer attrition. Furthermore, we propose a novel deep ensemble classifier that systematically integrates the predictive capabilities of individual classifiers in a meta-level model by efficiently stacking multiple predictions using convolutional neural networks. We evaluate our methodology in application to customer retention in a retail financial institution in Canada.
  • Item
    Bootstrap Quantification of Estimation Uncertainties in Network Degree Distributions
    (Springer Nature, 2018-08-20) Gel, Yulia R.; Lyubchich, Vyacheslav; Ramirez Ramirez, L. Leticia
    We propose a new nonparametric bootstrap method to quantify estimation uncertainties in functions of the network degree distribution in large ultra sparse networks. Both the network degree distribution and the network order are assumed to be unknown. The key idea is based on adapting the "blocking" argument, developed for bootstrapping of time series and re-tiling of spatial data, to random networks. We first sample a set of multiple ego networks of varying orders that form a patch, or a network block analogue, and then resample the data within patches. To select an optimal patch size, we develop a new computationally efficient and data-driven cross-validation algorithm. The proposed fast patchwork bootstrap (FPB) methodology extends ideas developed for the case of network mean degree to inference on a degree distribution. In addition, the FPB is substantially less computationally expensive, requires less information on a graph, and is free from nuisance parameters. In our simulation study, we show that the new bootstrap method outperforms competing approaches by providing sharper and better-calibrated confidence intervals for functions of a network degree distribution, including in the ultra sparse regime. We illustrate the FPB in application to collaboration networks in statistics and computer science and to Wikipedia networks.

Works in Treasures @ UT Dallas are made available exclusively for educational purposes such as research or instruction. Literary rights, including copyright for published works held by the creator(s) or their heirs, or other third parties may apply. All rights are reserved unless otherwise indicated by the copyright owner(s).