Computational models and tools for analysis, prediction, and control of infectious diseases

Abstract

Infectious disease modeling is used to examine pathogen transmission retrospectively and to forecast outbreaks before they occur. Model results help public health authorities optimize disease control measures, preventing catastrophic loss of human and animal life. Yet several fundamental challenges arise in infectious disease modeling. A critical problem is modeling new and evolving pathogens well enough to produce realistic simulations and reliable predictions of outcomes. Another concern is the lack of data related to infectious diseases: epidemic modelers often face inadequate data on host networks and disease incidence. This dissertation proposes remedies to challenges associated with infectious disease modeling, outbreak prediction, and host movement data.

In response to vector-borne disease modeling challenges, this dissertation first takes a mechanistic approach. To realistically model the infection process, a novel interconnected network model that links homogeneous vector populations with heterogeneous human contact networks is designed for the mosquito-vectored Zika virus. The model incorporates seasonal variations in mosquito abundance and characterizes hosts by age group and gender. The aim is a model detailed enough to represent pathogen dynamics accurately while remaining computationally tractable. An event-based simulation tool is developed based on the non-Markovian Gillespie algorithm. This work investigates the effects of seasonal variations on outbreak size, the role of sexual transmission in sustaining the pathogen, and the relative contributions of key model parameters using a sensitivity analysis.

In a data-driven approach, a framework is developed to improve machine learning performance for predicting dengue fever cases. The goal is to supplement temporally limited human case data for a location with data from spatially adjacent populations.
The method ranks and sorts time-series data from peripheral locations around a target location as predictor variables, commonly referred to as features. To rank the feature variables, metrics are computed from windowed time-shifted cross-correlation of incidence data, spatial distance, and historical prevalence. A window detection method presented in this work analyzes incidence data to identify time intervals with significant outbreaks. The framework achieves improved prediction performance and works well with recurrent neural network (RNN) architectures. Performance gains are compared using different time window allocation methods for three distinct prediction models: linear, long short-term memory (LSTM), and gated recurrent units (GRU).

Availability of data also affects the applicability of mechanistic models. In the United States, farm animal movements are not tracked by a central authority. The lack of animal movement data is a significant obstacle to using network models to analyze infectious outbreaks in meat-producing industries. As an immediate solution, a novel method is presented to generate movement networks from the limited data available in the public domain. A custom configuration model is developed for network generation that uses aggregate data from farm animal movement-related surveys and the U.S. agricultural census. A hypothetical spread of the African swine fever virus (ASFV) is simulated in a generated network to analyze how network structure affects pathogen dispersal. A node centrality-based analysis is performed to identify important farm operation types and to evaluate how targeted control measures affect outbreaks.

Working with infectious disease models for the U.S. meat-producing industry revealed fundamental problems linked to trust and business data sharing. The U.S. beef cattle industry lacks adequate traceability, as most farm owners consider such data confidential and fear that its exposure could harm their businesses.
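As an illustration of the windowed, time-shifted cross-correlation used to rank features for dengue prediction, the following is a minimal Python sketch and not the dissertation's implementation. It covers only the cross-correlation metric (spatial distance and historical prevalence are omitted), and the function names `lagged_correlation` and `rank_features`, the `(start, stop)` window format, and the `max_lag` parameter are all assumptions made for this example.

```python
import numpy as np


def lagged_correlation(target, candidate, max_lag):
    """Return (best_corr, best_lag): the highest Pearson correlation between
    target[t] and candidate[t - lag] for lag = 0..max_lag.

    Both series are assumed to have equal length. If no lag yields a
    defined correlation, best_corr stays at -inf.
    """
    target = np.asarray(target, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    best_corr, best_lag = -np.inf, 0
    for lag in range(max_lag + 1):
        # Align the series so the candidate leads the target by `lag` steps.
        x = candidate[: len(candidate) - lag]
        y = target[lag:]
        if x.std() == 0 or y.std() == 0:
            continue  # correlation is undefined for a constant series
        corr = np.corrcoef(x, y)[0, 1]
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return best_corr, best_lag


def rank_features(target, neighbors, window, max_lag=4):
    """Sort neighbor names by their best lagged correlation with the target,
    computed only inside the (start, stop) outbreak window (assumed format)."""
    start, stop = window
    scores = {
        name: lagged_correlation(target[start:stop], series[start:stop], max_lag)[0]
        for name, series in neighbors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, `rank_features(target, {"district_a": a, "district_b": b}, window=(0, 52))` would return the neighbor names ordered from most to least correlated with the target within the first 52 time steps; the district names here are placeholders.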
Blockchains, also known as distributed ledgers, have gained popularity in industrial supply chains because of their data immutability and transparency. A smart contract-based supply chain framework is designed using a private blockchain network. This system supports anonymity for users to protect their identities and lets every participant store data locally while ensuring that the blockchain records any change in the data with cryptographic proofs. The framework provides functionality to perform business transactions, transfer animal data, conduct anonymous surveys, and trace animals.

This work makes original contributions to network epidemic models, data-driven prediction tools, network generation algorithms, and data management frameworks. It combines knowledge from social network analysis, graph theory, epidemiology, machine learning, statistics, cryptography, computer networks, and computational science to improve infectious disease modeling, analysis, and control. The knowledge gained here is generalizable to applications beyond the specific cases presented in this dissertation.
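The idea of keeping data local while a ledger records cryptographic proof of any change can be illustrated with a hash commitment. The Python sketch below is only a conceptual example under stated assumptions: the `Ledger` class is a hypothetical in-memory stand-in for an on-chain store, the record fields are invented for illustration, and SHA-256 over canonical JSON is one possible choice of digest, not necessarily what the dissertation's smart contracts use.

```python
import hashlib
import json


def record_digest(record):
    """SHA-256 digest of a record serialized as canonical JSON
    (sorted keys, no extra whitespace), so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Ledger:
    """Hypothetical in-memory stand-in for an append-only on-chain store.

    Only digests are stored; the records themselves stay with their owner.
    """

    def __init__(self):
        self._entries = []  # list of (record_id, digest), append-only

    def commit(self, record_id, record):
        """Append the digest of the current record state and return it."""
        digest = record_digest(record)
        self._entries.append((record_id, digest))
        return digest

    def verify(self, record_id, record):
        """True if the record matches the most recently committed digest."""
        for rid, digest in reversed(self._entries):
            if rid == record_id:
                return digest == record_digest(record)
        return False  # nothing was ever committed under this id
```

In this sketch, an owner would call `commit` whenever a locally stored record changes, and any party later shown the record could call `verify` to check that it matches the latest committed digest without the ledger ever holding the data itself.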

Keywords

Computational biology, Epidemic models, Machine learning, Predictive models, Numerical simulation, Blockchain

Graduation Month

May

Degree

Doctor of Philosophy

Department

Department of Electrical and Computer Engineering

Major Professor

Caterina M. Scoglio

Date

2021

Type

Dissertation
