Manuscript

Biomedical

Accounting for contact network uncertainty in epidemic inferences

A simulation-based-inference method for epidemic models when the underlying contact network is only partially observed, applied to a real Shark Bay dolphin tattoo-skin-disease dataset. The method is genuinely interesting; the manuscript currently has model-definition contradictions and a missing reproducibility package that would draw a desk reject from PLOS Computational Biology.

Abstract

Inferring the parameters of an epidemic process when the underlying contact network is only partially observed is a fundamental open problem in network epidemiology. We present a simulation-based-inference method that combines a Mixture Density Network compressor with Approximate Bayesian Computation (MDN-ABC) to jointly infer epidemic parameters and the latent contact structure from periodic observations of node disease status and noisy network samples. We validate the approach on simulated epidemics over Erdős–Rényi and log-normal-degree networks, and apply it to tattoo-skin-disease (TSD) surveillance data on Shark Bay bottlenose dolphins. The method recovers transmission parameters with calibrated uncertainty under both correctly specified and mildly misspecified network priors, and produces age-resolved susceptibility estimates for the dolphin application.

1. Introduction

A central challenge in network epidemiology is that the contact structure A on which an outbreak unfolds is rarely directly observed. Self-reported contact diaries, automated proximity sensors, and genomic-based reconstruction each give noisy partial views. Standard approaches either ignore network uncertainty (treating A as known) or marginalize over a prior class (e.g. assuming Erdős–Rényi structure). Both can substantially mis-state uncertainty in the inferred transmission parameters when the true contact distribution differs from the assumed prior.

In this work we develop a simulation-based-inference framework that jointly accounts for epidemic parameter uncertainty and contact network uncertainty. The core idea is to parameterize the network observation process explicitly, then use Approximate Bayesian Computation with a Mixture Density Network compression of the joint epidemic-network summary statistics. The result is a calibrated posterior over both the latent network A and the disease parameters θ=(β,γ), requiring only forward simulation of the joint process.

2. Methods: MDN-ABC for epidemic-network joint inference

Let X = {X_t}_{t=0}^{T} denote the periodic node-status observations, Y = {Y_w}_{w=1}^{W} the noisy yearly network samples, and A the latent true contact network. We target the joint posterior

P(θ, A, ϕ | X, Y) ∝ P(X | θ, A) P(Y | A, ϕ) P(A) P(θ) P(ϕ),

where ϕ parameterizes the network observation process. The likelihood is intractable so we approximate the posterior via simulation: sample (θ′,A′,ϕ′) from the prior, simulate (X′,Y′), compute a learned summary-statistic distance d(⋅,⋅), and accept the proposed parameters within an ABC tolerance band. The MDN compressor is trained offline to minimize the Bayes-optimal summary loss for θ.
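The accept/reject loop described above can be sketched in a few lines. This is an illustrative rejection-ABC skeleton, not the authors' implementation: the names (`abc_rejection`, `summary`) are ours, a plain function stands in for the trained MDN compressor, and a toy Gaussian-mean problem stands in for the joint epidemic-network simulator.

```python
import numpy as np

def abc_rejection(prior_sampler, simulator, summary, x_obs, n_draws=10_000, tol=0.1):
    """Rejection ABC: keep parameter draws whose simulated summaries
    fall within `tol` of the observed summary.

    `summary` stands in for the trained MDN compressor; here it is any
    function mapping raw data to a low-dimensional statistic vector.
    """
    s_obs = summary(x_obs)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler()          # sample parameters from the prior
        x_sim = simulator(theta)         # forward-simulate data
        d = np.linalg.norm(summary(x_sim) - s_obs)  # learned-summary distance
        if d < tol:                      # ABC tolerance band
            accepted.append(theta)
    return np.array(accepted)

# Toy check: infer the mean of a Gaussian with known sd = 1.
rng = np.random.default_rng(0)
x_obs = rng.normal(2.0, 1.0, size=100)
post = abc_rejection(
    prior_sampler=lambda: rng.uniform(-5, 5),
    simulator=lambda th: rng.normal(th, 1.0, size=100),
    summary=lambda x: np.array([x.mean()]),
    x_obs=x_obs,
    tol=0.2,
)
```

In the paper's setting, `theta` would be the triple (θ′, A′, ϕ′) and `simulator` the joint epidemic-plus-network-observation process; the structure of the loop is unchanged.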

3. Simulation study: Erdős–Rényi and log-normal degree networks

We simulate continuous-time SIR epidemics on networks of N=200 nodes. Transmission rate β and recovery rate γ are defined as continuous-time-Markov rates,

β = lim_{Δt→0} (1/Δt) · P(W_{t+Δt}^i = I | W_t^i = S, Σ_j C_t^{ij} W_t^j > 0).

We periodically observe binary disease status of each node every 7 timesteps over 50 timesteps total. The status of the node is considered to be 1 if the node is infected and 0 if susceptible or recovered. We re-simulate the original epidemic 10 times for each scenario to produce posterior bands.
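A continuous-time Markov SIR epidemic on a fixed network, observed at discrete times, can be simulated with the Doob–Gillespie algorithm. The sketch below is our own minimal implementation of that standard algorithm (function name and return format are assumptions, not the authors' code); observed statuses at t ∈ {0, 7, 14, …} would be read off by taking the last event before each observation time.

```python
import numpy as np

def gillespie_sir(adj, beta, gamma, t_max, seed_node=0, rng=None):
    """Continuous-time Markov SIR on a fixed contact network `adj`
    (N x N 0/1 adjacency matrix), simulated with the Doob-Gillespie
    algorithm. Returns (time, state) pairs; states are 0=S, 1=I, 2=R.
    """
    rng = rng or np.random.default_rng()
    n = adj.shape[0]
    state = np.zeros(n, dtype=int)
    state[seed_node] = 1
    t, events = 0.0, [(0.0, state.copy())]
    while t < t_max:
        infected = np.flatnonzero(state == 1)
        if infected.size == 0:
            break                                  # epidemic died out
        # per-node infection pressure: beta * (number of infected neighbours)
        inf_rate = beta * adj[:, infected].sum(axis=1) * (state == 0)
        rec_rate = gamma * (state == 1)
        total = inf_rate.sum() + rec_rate.sum()
        t += rng.exponential(1.0 / total)          # time to next event
        if rng.random() < inf_rate.sum() / total:  # infection event
            node = rng.choice(n, p=inf_rate / inf_rate.sum())
            state[node] = 1
        else:                                      # recovery event
            node = rng.choice(n, p=rec_rate / rec_rate.sum())
            state[node] = 2
        events.append((t, state.copy()))
    return events

# Demo on a complete graph of 5 nodes.
rng = np.random.default_rng(1)
adj = np.ones((5, 5), dtype=int) - np.eye(5, dtype=int)
events = gillespie_sir(adj, beta=1.0, gamma=0.5, t_max=50.0, rng=rng)
final_state = events[-1][1]
```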

Two ground-truth network classes are used: Erdős–Rényi with mean degree 4, and log-normal-degree with mean degree 4. The observation model assumes Bernoulli edge sampling, A_ij ∼ Bernoulli(ρ), which matches the ER prior but is a misspecified prior for the log-normal case — we use this contrast to characterize how robust posterior coverage is to network-prior misspecification.

4. Application: Tattoo-skin-disease in Shark Bay dolphins

We apply MDN-ABC to a real surveillance dataset on Shark Bay bottlenose dolphins, where individuals are followed across years and skin-lesion presence is recorded as a binary observation. Yearly inferred contact networks Y_w for w = 1, …, 5 come from focal-follow association indices.

In Section 4.1 we model TSD as a discrete-time SIR process on the latent contact graph, with age-class-specific susceptibility for calves and adults. Disease status is W_t^i ∈ {S, E, I, R} (with the role of E specified later in Sec. 4.1.1) and transmission probabilities are

β_c = P(W_{t+Δt}^i = E via j | W_t^i = S, W_t^j = I, C_t^i = calf).

In Section 4.2 we model the noisy yearly network observations. Observed edge counts are negative-binomially distributed conditional on whether a true edge is present in that year:

X_w^{ij} | A_w^{ij} = 0 ∼ NegBin(n_{0w}, p_{0w}),    X_w^{ij} | A_w^{ij} = 1 ∼ NegBin(n_{1w}, p_{1w}).

Priors are n_{0w}, n_{1w}, p_{0w} ∼ Gamma(2, 4) and ρ_w ∼ Beta(1, 20). The full vector of network observation parameters is ϕ = (n_{01}, …, n_{05}, p_{01}, …, p_{05}, n_{11}, …, n_{15}, p_{11}, …, p_{15}, ρ_1, …, ρ_5). We fix p_{0w} = 1 for all w to address weak identifiability.
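One year of this observation model can be sampled directly. The sketch below is our illustration under numpy's (n, p) negative-binomial convention (mean n(1−p)/p); the function name and the symmetrisation step are assumptions, not the manuscript's code.

```python
import numpy as np

def sample_observed_counts(a_w, n0, p0, n1, p1, rng=None):
    """Draw one year of observed edge counts X_w given the latent
    network A_w: NegBin(n0, p0) where no edge exists, NegBin(n1, p1)
    where an edge exists. Uses numpy's (n, p) convention, where the
    mean is n * (1 - p) / p.
    """
    rng = rng or np.random.default_rng()
    n_nodes = a_w.shape[0]
    x = np.where(
        a_w == 1,
        rng.negative_binomial(n1, p1, size=(n_nodes, n_nodes)),
        rng.negative_binomial(n0, p0, size=(n_nodes, n_nodes)),
    )
    return np.triu(x, k=1) + np.triu(x, k=1).T  # symmetrise: undirected network

# Demo: a sparse symmetric latent network, one year of noisy counts.
rng = np.random.default_rng(2)
a = (rng.random((20, 20)) < 0.2).astype(int)
a = np.triu(a, 1) + np.triu(a, 1).T
x = sample_observed_counts(a, n0=2, p0=0.95, n1=5, p1=0.3, rng=rng)
```

Note that as p_{0w} → 1 the no-edge distribution collapses toward a point mass at zero counts, which is one way to read the identifiability fix above.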

5. Results: posterior coverage and age-resolved susceptibility

For the simulation studies, the MDN-ABC posterior recovers the true (β,γ) to within nominal coverage when the network prior matches the data-generating process (ER case), and shows mild but well-calibrated coverage degradation under network-prior misspecification (log-normal case). Network-uncertainty marginalization is essential — fixing A to a point estimate produces severely overconfident posteriors on (β,γ).

For the dolphin TSD application, the joint posterior shows that calf-class susceptibility is materially higher than adult-class susceptibility, with separability driven by the differential transmission-probability priors specified in Section 4.1. We discuss the sensitivity of this conclusion to those priors in Section 6.

6. Discussion

Limitations: in the dolphin application we modeled TSD as a simple SIR disease, which assumes that the incubation period for TSD is short relative to our simulation time-steps. The age-class susceptibility result is partly driven by the choice of unequal priors over calf vs. adult transmission probabilities; sensitivity analyses with weaker priors are reported in the appendix.

Future extensions include: (i) explicit SEIR-with-latent-period dynamics; (ii) joint inference over the network observation process parameters ϕ rather than fixing per-year quantities; and (iii) extension to heterogeneous infectiousness, with the duration of infectiousness distributed as Weibull with shape γ_a and scale γ_b.

References

[1] M. E. J. Newman, SIAM Rev. 45, 167 (2003).

[2] L. A. Meyers, B. Pourbohloul, M. E. J. Newman, et al., J. Theor. Biol. 232, 71 (2005).

[3] A. Vazquez et al., Proc. Natl. Acad. Sci. USA 101, 17940 (2004).

[4] R. Pastor-Satorras, C. Castellano, P. Van Mieghem, A. Vespignani, Rev. Mod. Phys. 87, 925 (2015).

[5] M. J. Keeling and K. T. D. Eames, J. R. Soc. Interface 2, 295 (2005).

[6] M. Salathé and J. H. Jones, PLOS Comput. Biol. 6, e1000736 (2010).

[7] M. Salathé et al., Proc. Natl. Acad. Sci. USA 107, 22020 (2010).

[8] L. Stack et al., Theor. Popul. Biol. 137, 1 (2021).

[9] J. Sherborne et al., PLOS Comput. Biol. 14, e1006003 (2018).

[10] M. A. Beaumont, W. Zhang, D. J. Balding, Genetics 162, 2025 (2002).

[11] J.-M. Marin et al., Stat. Comput. 22, 1167 (2012).

[12] D. Greenberg, M. Nonnenmacher, J. Macke, ICML (2019).

[13] G. Papamakarios, I. Murray, NeurIPS (2016).

[14] T. McKinley et al., Bayesian Anal. 13, 1 (2018).

[15] K. Csilléry et al., Trends Ecol. Evol. 25, 410 (2010).

[16] K. R. Bisset et al., J. Comput. Sci. 5, 28 (2014).

[17] L. Mancastroppa et al., Phys. Rev. Res. 4, 043196 (2022).

[18] M. Smith et al., Methods Ecol. Evol. 11, 836 (2020).

[19] J. Mann, R. C. Connor, P. L. Tyack, H. Whitehead (eds.), Cetacean Societies (2000).

[37] B. Carpenter et al., J. Stat. Softw. 76, 1 (2017).

[52] M. D. Hoffman and A. Gelman, J. Mach. Learn. Res. 15, 1593 (2014).

Distilled from the public preprint at arXiv:2404.02924 · q-bio.PE / Statistics — applications

This is a Manusights review of a publicly-available preprint. The preprint authors are not affiliated with Manusights and have not endorsed this review. Reproduced for illustrative purposes; full text remains at the source.

Overall Feedback

Do not submit yet — strong premise, model-definition gaps, missing reproducibility package

Likely outcome if submitted today: desk reject — chiefly because of missing journal-required elements (Author Summary for PLOS Comput. Biol., ethics/permit statement for the dolphin field data, public availability for analysis data and code, complete references), but also because the dolphin disease model is internally contradictory (described as SIR but using an exposed state E with no E→I transition specified) and the network observation model has a parameter that is simultaneously fixed and assigned a Gamma prior on a [0,1]-bounded probability.

The statistical premise — joint posterior over epidemic parameters and a latent contact network using MDN-ABC — is genuinely strong and the simulation calibration is the right scaffolding. The single most important fix this week is to rewrite the methods around a single explicit generative model for each study (one for the simulation, one for the dolphin application), with a compact table defining every latent variable, observed variable, parameter, prior, time scale, transition rule, and sampling step. Once that table exists, most of the contradictions below collapse into one or two lines of clarifying text.

Recommended target: PLOS Computational Biology remains plausible after substantial revision, especially if framed as a simulation-based inference method for epidemics on uncertain networks. Cascade fallback: Epidemics or Journal of Theoretical Biology if method validation is strengthened; PLOS ONE only if the paper remains primarily a demonstration. Time to submission readiness: 6–10 weeks of focused revision, assuming code/data deposition and additional validation can be completed promptly.

Detailed Feedback

  • High severity.
    #4 · Statistical audit · Sec. 4.2

    Negative-binomial probability parameter has both a fixed value and a Gamma prior — at least one must be wrong

    Section 4.2 defines the network observation likelihood as X_w^{ij} | A_w^{ij} = 0 ∼ NegBin(n_{0w}, p_{0w}) and X_w^{ij} | A_w^{ij} = 1 ∼ NegBin(n_{1w}, p_{1w}). It then says "we will fix p_{0w} = 1 for all w" but the priors block declares p_{0w} ∼ Gamma(2, 4). p_{0w} is therefore both a fixed constant and a sampled parameter — a contradiction that any HMC implementation will silently resolve in favor of one or the other.

    Two further problems compound this. First, p_{1w} has no prior at all. Second, a Gamma prior is generally invalid for a probability parameter constrained to [0,1] — Gamma puts mass on values larger than 1. The standard parametric choice for a probability parameter is a Beta. If the negative-binomial is parameterized in mean-dispersion form instead (μ, θ), positive priors on those parameters are fine but p_{0w} should not appear in the prior block.

    Suggested fix

    Choose one parameterization. If keeping the (n, p) form: declare p_{0w} as fixed in the model spec, remove it from the prior block, add a Beta(α, β) prior on p_{1w}, and verify your HMC sampler is treating each parameter consistently. If switching to mean-dispersion: rewrite Eq. (4.2) in terms of (μ, θ), specify positive priors on both, and remove all p references. Add a one-line "we use the (n, p) parameterization with p as a probability parameter" or equivalent note.
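The correspondence between the two parameterizations is easy to check numerically. This is our illustration using scipy's `nbinom` convention (mean n(1−p)/p), not code from the manuscript:

```python
from scipy import stats

# NegBin(n, p) has mean mu = n * (1 - p) / p, so given (mu, n) the
# probability parameter is recovered as p = n / (n + mu).
n, p = 5, 0.3
mu = n * (1 - p) / p
d_np = stats.nbinom(n, p)             # (n, p) parameterization
d_mu = stats.nbinom(n, n / (n + mu))  # same law, rebuilt from (mu, n)
```

Either form is fine for the fix above; what matters is that the prior block matches whichever form the likelihood equation uses.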

  • Low severity.
    #12 · Figure critique · Figs. 3, 5

    Figure 3 legend lists 11 instance labels (0–10) but text says 10 instances; Figure 5 axis labels duplicated

    Section 3 says "we re-simulate the original epidemic 10 times for each scenario" but Figure 3's legend appears to list 0 through 10 (11 labels). Figure 5 uses inconsistent time indexing — Section 4.2 defines yearly networks for w = 1, ..., 5, but the caption refers to "for time w = 0." Both will be flagged in copy-edit and are minor in isolation, but together they signal that the figure pass was rushed.

    Suggested fix

    Decide whether 10 or 11 instances are shown, relabel Fig. 3 accordingly. Pick one indexing convention for Fig. 5 and apply consistently.

  • High severity.
    #1 · Methodology · Sec. 4.1; Discussion

    The dolphin disease model is described as SIR but uses an exposed state E with no E→I transition specified

    Section 4.1 says: "we will model TSD as a discrete-time SIR process," but then defines node state W_t^i ∈ {S, E, I, R} and writes transmission probabilities into E (e.g. β_c = P(W_{t+Δt}^i = E via j | …)). No transition rule from E → I is given anywhere. The Discussion then states "we modeled TSD as a simple SIR disease, which assumes that the incubation period for TSD is short relative to our simulation time-steps" — but a "short incubation" is not a substitute for actually defining whether the model is SIR or SEIR.

    This is the load-bearing model for the entire empirical application. A reviewer reading Section 4.1 cannot tell whether the dolphin posteriors are conditioned on an SIR likelihood (no latent compartment) or an SEIR likelihood (with a latent compartment of unspecified duration). The age-class susceptibility result — the headline empirical claim — depends on which one. Pick one model, write the transition rules cleanly, and remove the unused state from the other.

    Suggested fix

    Decide whether the TSD model is SIR or SEIR. If SIR, replace E with I throughout the infection transition and remove "exposed" language. If SEIR, add latent-period parameters (rate or distribution), state the E→I transition rule explicitly, and report a sensitivity analysis to the latent-period prior. The Discussion claim about "short incubation" then becomes an explicit modeling assumption rather than a hedge.

  • High severity.
    #2 · Methodology · Sec. 4.1; Sec. 5; Discussion

    Age-class susceptibility conclusion is partly pre-imposed by the priors, not learned from data

    The headline empirical finding for the dolphin application — that calves are materially more susceptible than adults — depends on the choice of unequal priors over calf-class and adult-class transmission probabilities. The Discussion notes this in passing ("the age-class susceptibility result is partly driven by the choice of unequal priors") but the main text does not quantify how much of the posterior separability is data-driven versus prior-driven.

    For PLOS Comput. Biol., this is the kind of conclusion a reviewer will isolate. The standard fix is to (a) report the posterior under matched priors over calf and adult susceptibility, (b) show the prior-to-posterior shift on a single panel for both classes, and (c) make a clean statement: "with matched priors, the calf-vs-adult posterior separation is X; with the differential priors used here, it is Y; the data contribute Z to the gap." That is the analysis a careful reader needs to assess the claim.

    Suggested fix

    Add a sensitivity-analysis subsection to Sec. 5 reporting the posterior under matched calf/adult susceptibility priors. Report the prior-to-posterior shift visually and quantify the data-driven vs. prior-driven contribution to the age-class result. If the result attenuates substantially under matched priors, soften the abstract claim.

  • High severity.
    #3 · Methodology · Sec. 3

    Simulation log-normal scenario uses an Erdős–Rényi observation model — interpret as misspecification, not as a network prior

    Section 3 says the true network is either Erdős–Rényi with mean degree 4 or log-normal-degree with mean degree 4. The observation/sampling model assumes A_ij ∼ Bernoulli(ρ), which is an Erdős–Rényi prior. Two interpretations are possible: (a) the log-normal scenario is a misspecified-prior experiment and the manuscript should say so explicitly; (b) the network prior should match the data-generating process for the log-normal case. The current text does neither.

    This matters because the simulation studies are the manuscript's evidence that posterior coverage is well-calibrated. If the log-normal case is a misspecification experiment, the calibration claim becomes "calibrated under correctly specified network prior, gracefully degraded under misspecification" — which is a different and more interesting result than "calibrated under both" but also has different implications for the dolphin application (where the true contact distribution is unknown).

    Suggested fix

    Reframe Sec. 3 explicitly: "We use an Erdős–Rényi network prior throughout. The log-normal scenario tests robustness to network-prior misspecification; we do not claim calibration when the prior is wrong." Then either add a matching log-normal-prior comparison run, or note that doing so is straightforward but out of scope and that the dolphin application uses the more flexible prior described in Sec. 4.2.

  • High severity.
    #6 · Internal consistency · Sec. 3

    Binary disease-status labels are reversed in the simulation description

    Section 3 states: "We periodically observe the binary disease status of each node: 'infected' for nodes in the susceptible or recovered state, and 'not infected' for nodes in the infected state." This contradicts the next sentence: "The status of the node is considered to be 1 if the node is infected, and 0 if the node is susceptible or recovered." A reviewer reading either definition will lose confidence in the simulation pipeline until the contradiction is resolved.

    Suggested fix

    Change the first sentence to say "not infected" for susceptible/recovered and "infected" for infected. Both statements should match the natural convention.

  • High severity.
    #13 · Reproducibility · Methods; end matter

    No code repository, archived version, or software-version statement for an SBI methods paper

    The manuscript names STAN as the implementation software ("Similar to [37], we will implement this model in STAN, which utilizes a Hamiltonian Monte Carlo algorithm") but does not provide a code repository, archived version, release tag, DOI, or software version. For a computational methods paper at PLOS Computational Biology, code availability is an explicit submission requirement; for a Mixture-Density-Network-compressor + ABC pipeline specifically, reviewers will not assess calibration claims without being able to re-run the analysis.

    Suggested fix

    Deposit the analysis code on GitHub with a Zenodo DOI for the version submitted. Add a Code Availability statement: "Analysis code, including the MDN training pipeline, ABC sampler, and simulation harness, is available at github.com/<repo> and archived at Zenodo doi:10.5281/zenodo.<id>. STAN version 2.35; Python version 3.11; full dependency list in environment.yml."

  • High severity.
    #14 · Reporting guideline · End matter

    PLOS Comput. Biol. requires an Author Summary; ethics/permit statement for animal field data is missing

    PLOS Computational Biology requires every research article to include an Author Summary (a 150–200-word non-technical summary written for a broader audience). The manuscript does not include one. The Shark Bay dolphin work is observational field research on wild marine mammals; PLOS journals require an ethics statement covering field permits, animal-handling protocol approval, and the corresponding national/institutional regulatory authority (in Australia: typically a state-level Department of Biodiversity wildlife permit and university Animal Ethics Committee approval). Neither is present.

    Suggested fix

    Add an Author Summary section after the Abstract, 150–200 words, written for a non-specialist scientific audience. Add an Ethics Statement: "Field observations on wild Tursiops aduncus in Shark Bay, Western Australia, were conducted under [Permit / DBCA license number] and approved by the [University] Animal Ethics Committee under protocol [number]." If the permits/approvals are not yet obtained, this is a hard blocker for the submission.

  • Medium severity.
    #5 · Novelty assessment · Sec. 1; Sec. 7

    Positioning vs Sherborne et al., Stack et al., and the SBI literature is left implicit

    The Introduction motivates contact-network uncertainty as an open problem but does not separate this work's contribution from three near neighbors: (a) Sherborne et al. 2018 (PLOS Comput. Biol.), which fits epidemic models on partially observed networks; (b) Stack et al. 2021 (Theor. Popul. Biol.), which infers networks from epidemic data; and (c) the broader simulation-based-inference literature (Greenberg-Nonnenmacher-Macke ICML 2019, Papamakarios-Murray NeurIPS 2016) which has produced several MDN-style approaches in the last 5 years.

    A reviewer at PLOS Comput. Biol. who knows any of these three lines will look for a one-paragraph statement of what this paper adds: e.g. "(a) is method development without a real-world application; (b) infers networks from epidemic data alone, not jointly; (c) is general SBI machinery not specialized to epidemic-network joint inference. The present work specializes the MDN-ABC framework to the joint epidemic-network setting and demonstrates calibration on simulated data plus a real Shark Bay dolphin application." That paragraph does not exist in the current draft.

    Suggested fix

    Add a "Related work" paragraph at the end of Sec. 1 (or a short Sec. 1.1) explicitly distinguishing this work from Sherborne, Stack, Greenberg-Nonnenmacher-Macke, and Papamakarios-Murray. Three sentences each, with the contrast spelled out. This converts an asserted contribution ("we develop") into a positioned contribution ("we develop X, which extends Y by Z").

  • Medium severity.
    #7 · Internal consistency · Sec. 3

    Continuous-time SIR rate definitions conflict with discrete-time simulation language

    Section 3 defines transmission and recovery as continuous-time-Markov rates, β = lim_{Δt→0} (…)/Δt and similarly for γ. But the same section says "We continue our simulated epidemic for 50 timesteps" and observes every 7 timesteps, language that fits a discrete-time approximation rather than a continuous-time Markov simulator. The reader cannot tell which one was implemented.

    Suggested fix

    State explicitly: "We simulate the continuous-time Markov SIR process via the Doob–Gillespie algorithm and record observations at discrete times t ∈ {0, 7, 14, ...}." If a discrete-time approximation was actually used, define the per-step transition probabilities directly and remove the continuous-time-rate definitions.
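If the discrete-time route is taken, the per-step transition probabilities follow from the continuous-time rates by the standard constant-hazard conversion. A minimal sketch (names ours, illustrating the conversion rather than the manuscript's code):

```python
import math

def per_step_prob(rate, dt=1.0):
    """P(event within a step of length dt) for a constant-hazard
    (exponential waiting time) process: 1 - exp(-rate * dt)."""
    return 1.0 - math.exp(-rate * dt)

# e.g. a recovery rate gamma = 0.2 per unit time, observed every 7 steps
p_step = per_step_prob(0.2)        # per single timestep
p_obs_gap = per_step_prob(0.2, 7)  # across a 7-step observation gap
```

Defining the per-step probabilities this way keeps them consistent with the rate definitions in Sec. 3, so either presentation can be used without changing the model.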

  • Medium severity.
    #15 · Submission readiness · Submission package

    References are cited but the bibliography appears truncated; data availability statement is missing

    The manuscript cites Sherborne, Stack, Beaumont, Greenberg, Papamakarios, Carpenter, Hoffman-Gelman, and several Mann/Connor dolphin-society works, but the supplied file appears to truncate the reference list (no entries past [55]). The data availability statement is missing: simulated data should be reproducible from code; the dolphin TSD surveillance data needs an explicit access statement (institutional repository, controlled-access pointer, or "available from the corresponding author on reasonable request, subject to data-use agreement with [institution]"). Both are PLOS submission requirements.

    Suggested fix

    Verify the reference list is complete in the submission file. Add a Data Availability Statement covering: (a) simulated data — link to the code repository with seed lists; (b) Shark Bay TSD data — institutional repository or DUA pointer; (c) any derived intermediate data files. PLOS requires these as a separate end-matter section, not buried in Methods.

  • Low severity.
    #8 · Internal consistency · Sec. 4.1

    Weibull infectious-period parameters use two different notations in the same paragraph

    Section 4.1 defines: "The duration of infectiousness is distributed as a Weibull distribution with shape γ_a and scale γ_b." The priors are then written as γ_k ∼ Uniform(0.2, 5) and γ_λ ∼ Uniform(10, 160). The reader has to infer that γ_k ≡ γ_a (shape) and γ_λ ≡ γ_b (scale).

    Suggested fix

    Use one notation throughout — γ_shape and γ_scale, or γ_a and γ_b — and update the prior block to match. Half a line of edit, but the kind of inconsistency that gets flagged in copy-edit anyway.

  • Low severity.
    #9 · Internal consistency · Sec. 4.2

    Network parameter vector ϕ includes p_{0w}, which is fixed; either remove it or label it as a constant

    Section 4.2 defines the network-observation parameter vector ϕ = (n_{01}, …, n_{05}, p_{01}, …, p_{05}, n_{11}, …, n_{15}, p_{11}, …, p_{15}, ρ_1, …, ρ_5) but then fixes p_{0w} = 1 for all w. The vector ϕ is presented as the object being inferred (P(A, ϕ | X)), so a reader will assume every component is sampled. Remove fixed components from ϕ, or state explicitly that the p_{0w} entries are constants.

    Suggested fix

    Redefine ϕ without the p_{0w} entries. State separately: "the constants p_{0w} = 1 are fixed by identifiability and not included in ϕ." Update Eq. (2.3) and Sec. 4.2 to be consistent.

  • Low severity.
    #10 · Internal consistency · Methods (Sec. 2.3); Appendix

    A vs A_t / A_w: static-network and time-indexed-network notation are used inconsistently

    The Methods define the true contact network as A (static), but Section 2.3 and the Appendix use A_t (time-indexed) — e.g. P(A_t, ϕ | X) and "we will treat A_t as a nuisance parameter." This is mostly cosmetic but it does create one substantive ambiguity: in the dolphin application the network is time-indexed by year (A_w), and the reader cannot tell from notation alone whether the framework treats per-year networks as the same object as a static A.

    Suggested fix

    Use A for the static-network case (Sec. 3 simulation), A_w for the yearly dolphin networks, and never A_t — there is no per-timestep network object in this paper.

  • Low severity.
    #11 · Internal consistency · Sec. 2.3; Sec. 4.2

    Algorithm in Sec. 2.3 has duplicated step numbering and is referenced as "Algorithm 1" without a label

    Section 2.3 lists Step 3 twice — once as "Using A′ and θ′..." and once as "Given distance function d...". Section 4.2 then says "In Step 2 of Algorithm 1..." but no algorithm in the manuscript is formally labeled Algorithm 1.

    Suggested fix

    Wrap the Sec. 2.3 algorithm in a labeled \begin{algorithm} environment titled "Algorithm 1: MDN-ABC posterior sampling." Renumber the steps so each appears once.
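A minimal skeleton of the kind of labeled environment this fix calls for, using the common `algorithm`/`algpseudocode` packages; the step wording is ours, reconstructed from the loop in Sec. 2, not copied from the manuscript:

```latex
\usepackage{algorithm}
\usepackage{algpseudocode}
% ...
\begin{algorithm}
  \caption{MDN-ABC posterior sampling}\label{alg:mdn-abc}
  \begin{algorithmic}[1]
    \State Sample $(\theta', A', \phi')$ from the prior
    \State Simulate $(X', Y')$ given $(\theta', A', \phi')$
    \State Compute the learned-summary distance $d\big(s(X', Y'), s(X, Y)\big)$
    \State Accept $(\theta', A', \phi')$ if $d < \epsilon$
  \end{algorithmic}
\end{algorithm}
```

Cross-references such as "In Step 2 of Algorithm~\ref{alg:mdn-abc}" then resolve unambiguously, and the numbered steps cannot collide.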
