A Semantics-Centric Data Processing Framework for Fault Diagnosis in Internet of Things
Fault diagnosis is a challenging task in the Internet of Things (IoT), where heterogeneous data and proprietary environments can cause serious interoperability problems when a monitoring job is deployed across different domains. Throughout the development of IoT, semantics has remained an important research direction and is believed to be one of the key enablers of the IoT paradigm, addressing interoperability, data integration, data abstraction, knowledge extraction, service discovery, and related problems. Many fault diagnosis studies have adopted semantic-based methodologies, but these were often developed in competition with one another, and it is very hard to reach consensus at the semantic level. Hence, best practices in semantic modeling need to be explored and applied to IoT cases. Furthermore, fault diagnosis requires sufficient context information to provide a holistic view of a general IoT framework. This dissertation focuses on semantics-enabled generic data modeling, data management, and system behavior modeling for context-aware fault diagnosis. Time series data is a common form of system monitoring data. For a general representation of monitored data, we propose a semantic model for time series specification that covers measurement description, entity entangling, stream property, data provenance, etc. To support better data management, we implement a generic interface for mainstream time-series databases (TSDBs) and fit our proposed data model into TSDB data semantics for efficient access to semantically annotated data. With the help of a sophisticated sharding schema and data compression, we introduce event embedding to handle queries on sparse data spread across the data cluster. On top of that, we design a semantic-similarity-based query procedure to support infrastructure-wide data stream retrieval and provenance-based data discovery.
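To make the idea of a semantically annotated time series more concrete, the following is a minimal Python sketch, not the dissertation's actual model. The class `TimeSeriesDescriptor`, its field names, the helper `find_similar`, and the sample catalog are all hypothetical, and plain Jaccard overlap of annotation terms is used only as a simple stand-in for whatever semantic-similarity measure the framework applies.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeriesDescriptor:
    """Hypothetical descriptor covering the four aspects named in the abstract."""
    measurement: str                      # what is measured, e.g. "temperature"
    entity: str                           # the monitored entity the stream belongs to
    stream_properties: dict = field(default_factory=dict)  # e.g. unit, sampling period
    provenance: list = field(default_factory=list)         # ordered chain of data sources
    annotations: frozenset = frozenset()  # semantic terms used for retrieval


def jaccard(a, b):
    """Jaccard overlap of two term sets (assumed stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_similar(query_terms, catalog, threshold=0.5):
    """Return descriptors whose annotations are similar enough, best match first."""
    scored = [(jaccard(query_terms, d.annotations), d) for d in catalog]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored if score >= threshold]


# Illustrative catalog; all names and values are made up.
catalog = [
    TimeSeriesDescriptor(
        measurement="temperature",
        entity="hvac-unit-1",
        stream_properties={"unit": "celsius", "period_s": 60},
        provenance=["sensor-17", "gateway-a"],
        annotations=frozenset({"temperature", "celsius", "hvac"}),
    ),
    TimeSeriesDescriptor(
        measurement="vibration",
        entity="motor-3",
        stream_properties={"unit": "hz", "period_s": 1},
        provenance=["sensor-42", "gateway-b"],
        annotations=frozenset({"vibration", "motor", "hz"}),
    ),
]

matches = find_similar(frozenset({"temperature", "hvac"}), catalog)
```

Here the query shares two of three terms with the first descriptor and none with the second, so only the first stream passes the 0.5 threshold. A real deployment would presumably derive similarity from an ontology rather than from raw term overlap.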
As part of the context modeling, the system components and behaviors are formally represented using a hybrid bond graph (HBG) to handle both continuous and discrete state changes of the system. During graph construction, quantitative relations are mined with lightweight algorithms based on regression analysis to make better use of the HBG for fault diagnosis. With the development of a quality of service (QoS) ontology, the semantic context information can be managed to link fault indicators with performance metrics from the corresponding sensors, so that, based on reasoning techniques, diagnosis can be achieved at the fundamental component level within the system.
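As an illustration of mining a quantitative relation with a lightweight regression, the sketch below fits an ordinary least-squares line between two monitored variables and uses the residual of a new observation as a simple fault indicator. This is an assumed, simplified example rather than the dissertation's algorithm: the function names, the linear form of the relation, and the sample values are all hypothetical.

```python
from statistics import mean


def fit_linear_relation(xs, ys):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are constant; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residual(slope, intercept, x, y):
    """Deviation of an observation from the learned relation."""
    return y - (slope * x + intercept)


# Made-up readings from two sensors attached to the same component.
flow = [1.0, 2.0, 3.0, 4.0]
pressure_drop = [2.1, 4.0, 6.1, 8.0]

slope, intercept = fit_linear_relation(flow, pressure_drop)

# A new observation far from the fitted line yields a large residual,
# which could be raised as a fault indicator for that component.
suspicious = residual(slope, intercept, 5.0, 14.0)
```

With these sample values the fit gives a slope of about 1.98 and an intercept of about 0.10, so an observation of 14.0 at a flow of 5.0 deviates by roughly 4.0 from the expected value. In a bond graph setting, such fitted coefficients would play the role of the quantitative parameters attached to graph elements.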