For each major or critical comment surfaced by the pre-submission reviewer report (5.17), here is a draft response paragraph the author can adapt for the rebuttal letter when the journal returns a major-revision decision. Each response is grounded in the same verbatim manuscript evidence the reviewer cited.
Response to Comment 0: Pixel-level predictions are extrapolated far beyond the resolution of the training supervision and lack ground-truth validation
Severity: critical | Stance: partially agrees, revises scope
Reviewer Comment:
Author Response:
We agree that the sub-spot interpretation of the dense outputs must be supported more directly. In the revision, we will temper the language around single-cell and subcellular resolution and explicitly frame the 2 µm maps as high-resolution interpolants learned from spot-level supervision. We will add validation against histological structures and cell-type marker patterns on the same slides, and we will include a matched-resolution comparison when suitable subcellular data are available in our held-out cohort. We will also report the correlation at each output scale alongside qualitative examples so that readers can judge the biological fidelity of the maps more transparently.
Manuscript Change:
Results and Discussion, section on high-resolution decoding; new supplementary validation figure comparing dense maps with histology and marker co-localization
Response to Comment 1: Cross-scale generalization experiment conflates training-data confounds and lacks a matched-resolution oracle
Severity: critical | Stance: agrees and commits
Reviewer Comment:
Author Response:
We appreciate this concern and agree that the training and evaluation protocols need to be stated more explicitly. In the revision, we will clarify exactly which PixNet weights are used for Table 5 and confirm that no Visium HD supervision leaks into those experiments. We will also add a within-domain oracle trained and tested on matched Visium HD breast cancer slides with held-out test slides, so the cross-scale transfer results can be interpreted relative to a clear upper bound. Finally, we will revise the baseline evaluation protocol so that each comparator is run in its intended mode, including iStar in a super-resolution setting, and we will add a gene-wise biological signal analysis to complement PCC.
Manuscript Change:
Methods and Results around Table 5; new supplementary oracle benchmark and baseline protocol details
Response to Comment 2: Code is promised but not provided; Visium HD data source is mis-named and unversioned
Severity: critical | Stance: agrees and commits
Reviewer Comment:
Author Response:
We agree that the current statement is insufficient for a methods paper. Before resubmission, we will release the code in a versioned public repository with a tagged commit, pinned dependencies, and a runnable example, and we will provide the repository URL in the manuscript. We will also correct the vendor name to 10x Genomics, add direct dataset URLs with access dates and version identifiers, and document the full Visium HD preprocessing and bin-selection pipeline. These changes will make the experimental workflow reproducible by reviewers and readers.
Manuscript Change:
Data and Code Availability; Methods subsection on dataset provenance and preprocessing
Response to Comment 3: Biological interpretability of predictions is not established; gene panel selection biases evaluation away from spatially variable genes
Severity: major | Stance: agrees and commits
Reviewer Comment:
Author Response:
We agree that performance on highly expressed genes alone does not establish biological utility. In the revision, we will add an evaluation on independently identified spatially variable gene (SVG) panels and report per-gene performance distributions rather than only aggregate PCC. We will also include case studies showing whether predicted maps recover known histological structure, tumor-stroma boundaries, and marker gene co-localization. This will allow readers to assess whether the method captures biologically informative spatial patterns beyond broad expression trends.
Manuscript Change:
Results section on biological interpretation; new supplementary analyses on SVG panels and per-gene performance
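For the per-gene reporting promised in this response, the following is a minimal sketch of how a per-gene PCC distribution could be computed and summarized. It assumes observed and predicted expression are available as dictionaries mapping a gene name to a list of per-spot values; that layout, the function names, and the toy gene names are illustrative assumptions, not the manuscript's evaluation code.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns NaN when either sequence is constant, since the
    correlation is undefined in that case.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def per_gene_pcc(observed, predicted):
    """Map each gene to the PCC between its observed and predicted values."""
    return {gene: pearson(observed[gene], predicted[gene]) for gene in observed}


def summarize(pccs):
    """Five-number summary of the per-gene PCCs, ignoring undefined (NaN) genes."""
    vals = sorted(v for v in pccs.values() if not math.isnan(v))
    if len(vals) < 2:
        raise ValueError("need at least 2 genes with a defined PCC")
    q1, median, q3 = statistics.quantiles(vals, n=4, method="inclusive")
    return {
        "n_genes": len(vals),
        "min": vals[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": vals[-1],
    }


if __name__ == "__main__":
    # Toy values for illustration only (hypothetical genes, not real data).
    observed = {"GENE_A": [1, 2, 3, 4], "GENE_B": [1, 2, 3, 4]}
    predicted = {"GENE_A": [2, 4, 6, 8], "GENE_B": [4, 3, 2, 1]}
    print(summarize(per_gene_pcc(observed, predicted)))
```

Reporting the quartiles and extremes alongside the mean makes it visible when a high aggregate PCC is driven by a few easy genes.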
Response to Comment 4: Headline generalization table omits standard deviations and baseline adaptation protocol
Severity: major | Stance: agrees and commits
Reviewer Comment:
Author Response:
We agree that Table 5 should report variability and protocol details more explicitly. In the revision, we will expand the table to include standard deviations, paired statistical tests across slides or folds, and confidence intervals where appropriate. We will also add a detailed description of how each baseline is adapted to the 2, 8, and 16 µm settings, and we will state when a method is evaluated in its native mode versus a super-resolution or interpolation mode. This will make the comparison fairer and easier to interpret.
Manuscript Change:
Table 5 and associated Methods text on cross-scale evaluation; expanded supplementary statistics
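One way the paired statistics promised in this response could be produced is sketched below. It assumes one PCC per slide for each of two methods, and it uses an exact sign-flip permutation test and a percentile bootstrap interval as stand-ins; the choice of test, the function names, and the toy numbers are assumptions for illustration, and the authors may prefer a different paired test.

```python
import itertools
import random
import statistics


def paired_sign_flip_test(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Enumerates all 2**n sign assignments, so it is only practical for a
    small number of pairs (roughly n <= 20), as with per-slide scores.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so floating-point ties count as "at least as extreme".
        if flipped >= observed - 1e-12:
            count += 1
    return count / total


def bootstrap_ci(diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean paired difference."""
    rng = random.Random(seed)
    n = len(diffs)
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical per-slide PCCs for two methods (not results from the paper).
    method_a = [0.52, 0.61, 0.58, 0.66, 0.49]
    method_b = [0.47, 0.55, 0.57, 0.60, 0.45]
    diffs = [x - y for x, y in zip(method_a, method_b)]
    print("mean diff:", statistics.fmean(diffs))
    print("std of diffs:", statistics.stdev(diffs))
    print("p-value:", paired_sign_flip_test(method_a, method_b))
    print("95% CI:", bootstrap_ci(diffs))
```

With only a handful of slides the smallest attainable p-value is 2 / 2**n, so the table should state the number of pairs next to each test.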
Response to Comment 5: Critical architecture and training hyperparameters are deferred to supplementary material
Severity: major | Stance: agrees and commits
Reviewer Comment:
Author Response:
We agree that the Methods section should be self-contained. In the revision, we will move the core architecture and training hyperparameters into the main text, including the number of ViT groups, the intermediate feature selection strategy, input tile resolution, patch size, batch size, augmentations, optimizer schedule, gradient clipping, and early stopping criteria. We will also cross-reference these details directly in the architecture figure caption. The supplement will then hold only additional ablations and implementation specifics.
Manuscript Change:
Main Methods and Figure caption for the architecture schematic
Response to Comment 6: Mixing locally retrained and externally borrowed baseline numbers compromises benchmark fairness
Severity: major | Stance: agrees and commits
Reviewer Comment:
Author Response:
We agree that borrowed benchmark values must be identified unambiguously. In the revision, we will annotate every externally sourced entry in Table 2 and explicitly state the preprocessing, gene panel, normalization, and split protocol used for each. Where feasible, we will rerun the baselines under a unified evaluation harness so that all methods are compared under identical conditions. If a full rerun is not computationally practical, we will reproduce representative baseline numbers locally to verify that the borrowed values are consistent with our protocol.
Manuscript Change:
Table 2 caption and Methods subsection on benchmark curation and evaluation protocol
Response to Comment 7: The 'pixels' framing overstates the effective output resolution of the dense map
Severity: major | Stance: partially agrees, revises scope
Reviewer Comment:
Author Response:
We appreciate this point and agree that the effective output stride must be stated clearly. In the revision, we will report the native output stride of the decoder in both pixels and micrometers, and we will specify how the 2 µm outputs are constructed from the underlying prediction grid. If the native stride is coarser than 2 µm, we will revise the title and claims to emphasize multi-scale aggregation and dense map generation rather than literal pixel-resolved prediction. This clarification will better align the framing with the actual representational capacity of the model.
Manuscript Change:
Title, Abstract, and Methods subsection describing decoder output stride and upsampling
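The stride reporting promised in this response reduces to a short unit conversion, sketched here. The 16-pixel stride and the 0.5 µm-per-pixel scan resolution are hypothetical placeholders, not values taken from the manuscript; the authors should substitute the decoder's actual stride and the slide's actual resolution.

```python
def stride_in_um(stride_px, um_per_px):
    """Convert a decoder output stride from image pixels to micrometers."""
    return stride_px * um_per_px


def upsampling_factor(stride_px, um_per_px, target_um=2.0):
    """How many target-resolution bins fit along one native stride.

    A value above 1 means the target grid is finer than the native
    prediction grid, so target-resolution outputs are upsampled or
    interpolated rather than predicted independently.
    """
    return stride_in_um(stride_px, um_per_px) / target_um


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    stride_px = 16    # assumed native stride in pixels
    um_per_px = 0.5   # assumed scan resolution in micrometers per pixel
    print("native stride (um):", stride_in_um(stride_px, um_per_px))
    print("factor relative to 2 um:", upsampling_factor(stride_px, um_per_px))
```

Stating both numbers in the Methods lets a reader see at once whether the 2 µm maps are finer than the native prediction grid.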
Notes for the author
- Responses are written to be concise, concrete, and non-defensive, with commitments limited to feasible revisions and clarifications.