Biomedical
Accounting for contact network uncertainty in epidemic inferences
A simulation-based-inference method for epidemic models when the underlying contact network is only partially observed, applied to a real Shark Bay dolphin tattoo-skin-disease dataset. The method is genuinely interesting; the manuscript currently has model-definition contradictions and a missing reproducibility package that would draw a desk-reject from PLOS Computational Biology.
Abstract
Inferring the parameters of an epidemic process when the underlying contact network is only partially observed is a fundamental open problem in network epidemiology. We present a simulation-based-inference method that combines a Mixture Density Network compressor with Approximate Bayesian Computation (MDN-ABC) to jointly infer epidemic parameters and the latent contact structure from periodic observations of node disease status and noisy network samples. We validate the approach on simulated epidemics over Erdős–Rényi and log-normal-degree networks, and apply it to tattoo skin disease (TSD) surveillance data on Shark Bay bottlenose dolphins. The method recovers transmission parameters with calibrated uncertainty under both correctly specified and mildly misspecified network priors, and produces age-resolved susceptibility estimates for the dolphin application.
1. Introduction
A central challenge in network epidemiology is that the contact structure A on which an outbreak unfolds is rarely directly observed. Self-reported contact diaries, automated proximity sensors, and genomic-based reconstruction each give noisy partial views. Standard approaches either ignore network uncertainty (treating A as known) or marginalize over a prior class (e.g. assuming Erdős–Rényi structure). Both can substantially mis-state uncertainty in the inferred transmission parameters when the true contact distribution differs from the assumed prior.
In this work we develop a simulation-based-inference framework that jointly accounts for epidemic parameter uncertainty and contact network uncertainty. The core idea is to parameterize the network observation process explicitly, then use Approximate Bayesian Computation with a Mixture Density Network compression of the joint epidemic-network summary statistics. The result is a calibrated posterior over both the latent network A and the disease parameters θ=(β,γ), requiring only forward simulation of the joint process.
2. Methods: MDN-ABC for epidemic-network joint inference
Let $X=\{X_t\}_{t=0}^{T}$ denote the periodic node-status observations, $Y=\{Y_w\}_{w=1}^{W}$ the noisy yearly network samples, and $A$ the latent true contact network. We target the joint posterior
$$P(\theta, A, \phi \mid X, Y) \propto P(X \mid \theta, A)\, P(Y \mid A, \phi)\, P(A)\, P(\theta)\, P(\phi),$$
where ϕ parameterizes the network observation process. The likelihood is intractable, so we approximate the posterior via simulation: sample (θ′,A′,ϕ′) from the prior, simulate (X′,Y′), compute a learned summary-statistic distance d(⋅,⋅) between the simulated and observed data, and accept the proposed parameters when that distance falls within the ABC tolerance. The MDN compressor is trained offline to minimize the Bayes-optimal summary loss for θ.
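The sample-simulate-compare-accept loop described above can be sketched as a generic ABC rejection sampler. This is a minimal illustration under stated assumptions, not the manuscript's implementation: the toy simulator, the uniform prior, the hand-picked mean summary, and the quantile-based tolerance are all hypothetical stand-ins, and the manuscript's trained MDN compressor would take the place of `toy_summary`.

```python
import numpy as np

def abc_rejection(simulate, sample_prior, summarize, observed, n_draws, quantile, rng):
    """Generic ABC rejection: keep the prior draws whose simulated
    summaries lie closest to the observed summary."""
    s_obs = summarize(observed)
    draws = [sample_prior(rng) for _ in range(n_draws)]
    dists = np.array([
        np.linalg.norm(summarize(simulate(theta, rng)) - s_obs) for theta in draws
    ])
    # Assumption: the tolerance is chosen as a quantile of the simulated distances.
    eps = np.quantile(dists, quantile)
    accepted = np.array([d for d, dist in zip(draws, dists) if dist <= eps])
    return accepted, eps

# Toy stand-in for the epidemic simulator (NOT the paper's model):
# the data are 50 exponential waiting times with unknown rate theta.
def toy_simulate(theta, rng):
    return rng.exponential(1.0 / theta, size=50)

def toy_prior(rng):
    return rng.uniform(0.1, 5.0)

def toy_summary(x):
    # Hand-picked summary; a learned compressor would replace this.
    return np.array([np.mean(x)])

rng = np.random.default_rng(0)
observed = toy_simulate(2.0, rng)
posterior, eps = abc_rejection(
    toy_simulate, toy_prior, toy_summary, observed,
    n_draws=2000, quantile=0.05, rng=rng,
)
```

Here `posterior` holds the accepted draws of the toy rate parameter. In the joint setting, each draw would instead be a tuple (θ′,A′,ϕ′) and the simulator would return both node statuses and network samples.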
3. Simulation study: Erdős–Rényi and log-normal degree networks
We simulate continuous-time SIR epidemics on networks of N=200 nodes. Transmission rate β and recovery rate γ are defined as continuous-time-Markov rates,
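$$\beta = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, P\!\left(W^{i}_{t+\Delta t} = I \,\middle|\, W^{i}_{t} = S,\ \sum_{j} C^{ij}_{t} W^{j}_{t} > 0\right).$$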
We observe the binary disease status of every node at intervals of 7 time units over a total of 50 time units. A node's status is recorded as 1 if it is infected and 0 if it is susceptible or recovered. We re-simulate the original epidemic 10 times for each scenario to produce posterior bands.
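A continuous-time SIR process with periodic binary snapshots can be simulated with a Gillespie-style event loop. The sketch below is illustrative only. It assumes that a susceptible node has hazard β whenever it has at least one infected neighbour (one reading of the conditioning event in the rate definition), and the values of `beta`, `gamma`, and the seed node are hypothetical.

```python
import numpy as np

def gillespie_sir(adj, beta, gamma, seed_node, t_max, obs_every, rng):
    """Continuous-time SIR on a fixed graph. Returns binary snapshots
    (1 = infected, 0 = susceptible or recovered) at regular times."""
    n = adj.shape[0]
    S, I, R = 0, 1, 2
    state = np.full(n, S)
    state[seed_node] = I
    obs_times = np.arange(0, t_max + 1e-9, obs_every)
    snapshots = np.zeros((len(obs_times), n), dtype=int)
    t, k = 0.0, 0
    while k < len(obs_times):
        infected = state == I
        # Assumption: hazard beta applies when at least one neighbour is infected.
        exposed = (state == S) & (adj[:, infected].sum(axis=1) > 0)
        rates = beta * exposed + gamma * infected
        total = rates.sum()
        t_next = t + rng.exponential(1.0 / total) if total > 0 else np.inf
        # Record every observation time that falls before the next event.
        while k < len(obs_times) and obs_times[k] < t_next:
            snapshots[k] = state == I
            k += 1
        if k == len(obs_times):
            break
        # Pick the node whose event fires, proportional to its rate.
        node = rng.choice(n, p=rates / total)
        state[node] = I if state[node] == S else R
        t = t_next
    return obs_times, snapshots

rng = np.random.default_rng(1)
n = 200
upper = np.triu(rng.random((n, n)) < 4 / (n - 1), k=1)  # ER graph, mean degree 4
adj = (upper | upper.T).astype(int)
times, snaps = gillespie_sir(adj, beta=0.3, gamma=0.1, seed_node=0,
                             t_max=50, obs_every=7, rng=rng)
```

With an observation interval of 7 over a horizon of 50, `snaps` has one row per observation time (0, 7, ..., 49) and one column per node.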
Two ground-truth network classes are used: Erdős–Rényi with mean degree 4, and log-normal-degree with mean degree 4. The inference model places an independent Bernoulli prior on each edge, $A_{ij} \sim \mathrm{Bernoulli}(\rho)$, which matches the ER generating process but is misspecified for the log-normal case. We use this contrast to characterize how robust posterior coverage is to network-prior misspecification.
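The two ground-truth classes can be generated from one routine by varying the expected-degree sequence. The sketch below uses a Chung-Lu style construction, which is an assumption on our part since the manuscript does not state how the log-normal-degree networks are built. The log-normal spread `sigma` is likewise a hypothetical value.

```python
import numpy as np

def chung_lu_graph(weights, rng):
    """Undirected graph with independent edges,
    P(edge ij) = min(1, w_i * w_j / sum(w))."""
    w = np.asarray(weights, dtype=float)
    p = np.minimum(1.0, np.outer(w, w) / w.sum())
    upper = np.triu(rng.random(p.shape) < p, k=1)  # no self-loops
    return (upper | upper.T).astype(int)

rng = np.random.default_rng(2)
n, mean_degree = 200, 4.0

# Erdős–Rényi case: every node has the same expected degree,
# so each edge is an independent Bernoulli draw with a common probability.
adj_er = chung_lu_graph(np.full(n, mean_degree), rng)

# Log-normal case: heterogeneous expected degrees rescaled to mean 4.
sigma = 0.75  # hypothetical spread, not taken from the manuscript
w_ln = rng.lognormal(mean=0.0, sigma=sigma, size=n)
w_ln *= mean_degree / w_ln.mean()
adj_ln = chung_lu_graph(w_ln, rng)

deg_er = adj_er.sum(axis=1)
deg_ln = adj_ln.sum(axis=1)
```

Both graphs have roughly the same mean degree, but `deg_ln` is far more dispersed than `deg_er`, which is exactly the feature a homogeneous Bernoulli edge prior cannot represent.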
4. Application: Tattoo skin disease in Shark Bay dolphins
We apply MDN-ABC to a real surveillance dataset on Shark Bay bottlenose dolphins, where individuals are followed across years and skin-lesion presence is recorded as a binary observation. Yearly inferred contact networks Yw for w=1,…,5 come from focal-follow association indices.
In Section 4.1 we model TSD as a discrete-time SIR process on the latent contact graph, with age-class-specific susceptibility for calves and adults. Disease status is $W^{i}_{t} \in \{S, E, I, R\}$ (with the role of E specified later in Sec. 4.1.1) and transmission probabilities are
$$\beta_c = P\!\left(W^{i}_{t+\Delta t} = E \text{ via } j \,\middle|\, W^{i}_{t} = S,\ W^{j}_{t} = I,\ C^{i}_{t} = \text{calf}\right).$$
In Section 4.2 we model the noisy yearly network observations. Observed edge counts are negative-binomially distributed conditional on whether a true edge is present in that year:
$$Y^{ij}_{w} \mid A^{ij}_{w} = 0 \sim \mathrm{NegBin}(n_{0w}, p_{0w}), \qquad Y^{ij}_{w} \mid A^{ij}_{w} = 1 \sim \mathrm{NegBin}(n_{1w}, p_{1w}).$$
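Priors are $n_{0w}, n_{1w}, p_{0w} \sim \mathrm{Gamma}(2,4)$ and $\rho_w \sim \mathrm{Beta}(1,20)$. The full vector of network observation parameters is $\phi = (n_{0,1},\dots,n_{0,5},\ p_{0,1},\dots,p_{0,5},\ n_{1,1},\dots,n_{1,5},\ p_{1,1},\dots,p_{1,5},\ \rho_1,\dots,\rho_5)$. We fix $p_{0w}=1$ for all $w$ to address weak identifiability.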
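The edge-count observation model for a single year can be sketched as a two-component negative-binomial draw. This is an illustration under assumptions, not the manuscript's code: the parameter values below are hypothetical rather than draws from the stated priors, and the counts are assumed symmetric in i and j. NumPy's `negative_binomial(n, p)` counts failures before `n` successes with success probability `p`, so its mean is n(1-p)/p.

```python
import numpy as np

def sample_edge_counts(adj, n0, p0, n1, p1, rng):
    """Symmetric observed counts: NegBin(n1, p1) where a true edge exists,
    NegBin(n0, p0) where it does not."""
    n = adj.shape[0]
    on_edge = rng.negative_binomial(n1, p1, size=(n, n))
    off_edge = rng.negative_binomial(n0, p0, size=(n, n))
    counts = np.where(adj == 1, on_edge, off_edge)
    counts = np.triu(counts, k=1)  # keep one draw per unordered pair
    return counts + counts.T

rng = np.random.default_rng(3)
n_nodes, rho = 50, 0.05  # rho is hypothetical, near the Beta(1, 20) prior mean

# Latent yearly network with independent Bernoulli(rho) edges.
upper = np.triu(rng.random((n_nodes, n_nodes)) < rho, k=1)
adj = (upper | upper.T).astype(int)

# Hypothetical observation parameters for one year.
counts = sample_edge_counts(adj, n0=0.5, p0=0.9, n1=3.0, p1=0.3, rng=rng)

pairs = np.triu_indices(n_nodes, k=1)
mean_on = counts[pairs][adj[pairs] == 1].mean()
mean_off = counts[pairs][adj[pairs] == 0].mean()
```

With these values, true edges produce counts averaging about 7 while non-edges produce counts close to 0, so the separation between the two components is what makes the latent edge indicator recoverable.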
5. Results: posterior coverage and age-resolved susceptibility
For the simulation studies, the MDN-ABC posterior recovers the true (β,γ) at nominal coverage when the network prior matches the data-generating process (ER case), and shows only mild degradation in coverage under network-prior misspecification (log-normal case). Marginalizing over network uncertainty is essential: fixing A to a point estimate produces severely overconfident posteriors on (β,γ).
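Nominal coverage can be checked by counting how often a central credible interval contains the true parameter across replicate simulations. The sketch below is a minimal illustration using a toy conjugate-normal model in which the exact posterior is known. It does not use the manuscript's posteriors, and the replicate count is chosen for a stable estimate rather than to match the study design.

```python
import numpy as np

def empirical_coverage(posterior_samples, true_values, level=0.9):
    """Fraction of replicates whose central credible interval contains
    the truth. posterior_samples: (n_replicates, n_draws)."""
    lo = np.quantile(posterior_samples, (1 - level) / 2, axis=1)
    hi = np.quantile(posterior_samples, 1 - (1 - level) / 2, axis=1)
    return float(np.mean((true_values >= lo) & (true_values <= hi)))

rng = np.random.default_rng(4)
n_rep, n_draws = 2000, 500

# Toy model: truth ~ N(0, 1), observation = truth + N(0, 1).
# The exact posterior is N(obs / 2, 1 / 2), so coverage should be nominal.
truth = rng.normal(size=n_rep)
obs = truth + rng.normal(size=n_rep)
exact = rng.normal(obs[:, None] / 2, np.sqrt(0.5), size=(n_rep, n_draws))

# An overconfident posterior: same centre, standard deviation halved.
narrow = rng.normal(obs[:, None] / 2, np.sqrt(0.5) / 2, size=(n_rep, n_draws))

cov_exact = empirical_coverage(exact, truth)
cov_narrow = empirical_coverage(narrow, truth)
```

Here `cov_exact` lands close to 0.9, while `cov_narrow` falls well below it, which is the signature of an overconfident posterior.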
For the dolphin TSD application, the joint posterior shows that calf-class susceptibility is materially higher than adult-class susceptibility, with separability driven by the differential transmission-probability priors specified in Section 4.1. We discuss the sensitivity of this conclusion to those priors in Section 6.
6. Discussion
Limitations: in the dolphin application we modeled TSD as a simple SIR disease, which assumes that the incubation period for TSD is short relative to our simulation time-steps. The age-class susceptibility result is partly driven by the choice of unequal priors over calf vs. adult transmission probabilities; sensitivity analyses with weaker priors are reported in the appendix.
Future extensions include: (i) explicit SEIR-with-latent-period dynamics; (ii) joint inference over the network observation process parameters ϕ rather than fixing per-year quantities; and (iii) extension to heterogeneous infectiousness, with the duration of infectiousness following a Weibull distribution with shape $\gamma_a$ and scale $\gamma_b$.