Data Archiving Permissions
Guidance for responsible sharing of health statistics data.
Data Sharing Principles
Data archiving strengthens reproducibility and enables secondary analyses in health statistics. We encourage authors to share de-identified datasets and code whenever possible.
When data cannot be shared due to privacy or regulatory constraints, provide a clear access statement and governance details.
Health statistics benefit from transparent data stewardship and clear documentation of variables and transformations.
Provide documentation for derived variables so future users can reproduce analytic decisions.
Describe how updates or corrections are logged for shared datasets.
Transparent reporting of data provenance and governance supports reproducibility and ethical compliance in health statistics.
Summaries that connect statistical findings to health outcomes improve translation to policy and practice.
If external validation is performed, describe population differences and implications for generalizability.
Repository Guidance
Select repositories that provide persistent identifiers, access governance, and documentation support.
- Institutional repositories with long-term access
- Discipline-specific data archives
- General repositories with DOI support
- Controlled-access platforms for sensitive data
When sharing datasets, provide codebooks, provenance notes, and governance restrictions to protect privacy.
Describe anonymization or de-identification steps to protect participant privacy.
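As an illustration of the kind of steps worth describing, here is a minimal sketch of record-level de-identification. The column names (`patient_id`, `name`, `address`, `phone`, `birth_date`) and the choice of a keyed hash are assumptions made for the example, not requirements of this guidance, and these steps alone do not guarantee compliance with any formal de-identification standard.

```python
import hashlib
import hmac

# Hypothetical direct identifiers; adapt to the actual dataset.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a truncated keyed hash (HMAC-SHA256) of an identifier."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers, pseudonymize the ID, coarsen the birth date."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["patient_id"] = pseudonymize(str(record["patient_id"]), secret_key)
    # Keep only the birth year (ISO date string assumed, e.g. "1980-05-17").
    out["birth_year"] = int(out.pop("birth_date")[:4])
    return out
```

The secret key should be stored separately from the shared data. Without it, the pseudonyms cannot be regenerated from the original identifiers.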
Provide contact points for data access questions or governance approval.
Well-structured manuscripts accelerate peer review and help readers apply statistical insights to real-world health decisions.
Report software versions and packages to support reproducibility across analytic environments.
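One way to capture this information is to generate it from the analysis environment itself. The sketch below uses only the Python standard library. The function name `environment_report` is hypothetical, and the list of packages to report is left to the author.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Return interpreter, operating system, and package versions."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            report["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap explicitly rather than failing.
            report["packages"][name] = "not installed"
    return report
```

For example, `environment_report(["numpy", "pandas"])` returns a dictionary that can be saved alongside the archived code.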
Describe any model tuning or hyperparameter selection to support reproducibility in machine learning workflows.
Access Models
- Open Access: De-identified datasets shared openly with clear licensing.
- Controlled Access: Sensitive data shared through approved access requests.
- Hybrid Models: Summary datasets shared openly, with raw data kept under restricted access.
Controlled-access repositories may be appropriate for sensitive health datasets; include instructions for access requests.
Link to registered repositories using persistent identifiers for long-term access.
State whether synthetic datasets are provided to support transparency.
Provide uncertainty measures such as confidence intervals or credible intervals for key estimates and model outputs.
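As one example of an uncertainty measure, the sketch below computes a percentile bootstrap confidence interval for a mean using the standard library. The function name, the number of resamples, and the fixed seed are illustrative assumptions. Other interval methods may suit a given study design better.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return lower, upper
```

Reporting the seed and the number of resamples with the interval lets other investigators reproduce it exactly.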
When combining datasets, document linkage procedures and quality checks for matching accuracy.
If data access is restricted, describe the approval process for qualified researchers and expected timelines.
Documentation
Provide codebooks, data dictionaries, and analytic scripts to support interpretation and reuse.
Document preprocessing steps, variable definitions, and data transformations.
Share analytic code where possible to enable replication and sensitivity checks by other investigators.
Include readme files that summarize file structure and variable naming.
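A data dictionary can be drafted directly from the dataset and then completed by hand. The following is a minimal sketch, assuming the data are available as a list of dictionaries (one per record). The function name and the `UNDOCUMENTED` placeholder are hypothetical.

```python
def build_data_dictionary(rows, descriptions):
    """Summarize each variable: description, observed types, missing count."""
    # Collect variable names in first-seen order.
    variables = []
    for row in rows:
        for name in row:
            if name not in variables:
                variables.append(name)

    dictionary = []
    for name in variables:
        values = [row.get(name) for row in rows]
        # Treat None and empty strings as missing.
        present = [v for v in values if v is not None and v != ""]
        dictionary.append({
            "variable": name,
            "description": descriptions.get(name, "UNDOCUMENTED"),
            "types": sorted({type(v).__name__ for v in present}),
            "n_missing": len(values) - len(present),
        })
    return dictionary
```

Variables flagged as `UNDOCUMENTED` show where the codebook still needs a written definition before the dataset is deposited.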
Describe how data requests are reviewed and approved.
Explain how missing data were handled and why chosen strategies were appropriate for the study design.
Highlight ethical safeguards for patient privacy, especially when working with linked or sensitive datasets.
For time series analyses, describe seasonality handling and any interventions or policy changes considered.
Sharing Workflow
- Prepare: Organize datasets, codebooks, and variable definitions.
- Choose Repository: Select a platform that matches data sensitivity.
- Document Access: Describe access steps and governance requirements.
- Update: Maintain versioning notes and update links when needed.
For linked datasets, document linkage identifiers and matching accuracy to support confidence in the analysis.
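Matching accuracy can be summarized when a validated reference set of true links is available. The sketch below assumes links are represented as pairs of record identifiers. The function name and the returned keys are illustrative only.

```python
def linkage_quality(predicted_links, true_links):
    """Compare predicted record pairs against a validated set of true pairs."""
    predicted = set(predicted_links)
    truth = set(true_links)
    true_pos = len(predicted & truth)
    return {
        # Share of predicted links that are correct.
        "precision": true_pos / len(predicted) if predicted else 0.0,
        # Share of true links that were found.
        "recall": true_pos / len(truth) if truth else 0.0,
        "false_links": len(predicted - truth),
        "missed_links": len(truth - predicted),
    }
```

Reporting both precision and recall matters because false links and missed links can bias downstream estimates in different directions.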
State any embargo periods or access conditions clearly.
Clear statistical reporting improves the interpretability of health evidence for clinicians, policymakers, and research funders.
When presenting predictive models, report calibration, discrimination, and decision curve metrics where relevant.
Include a brief rationale for study design choices to support reviewer understanding and methodological transparency.
When reporting health disparities, describe how social determinants and contextual factors are measured.
Support
For data archiving questions, contact [email protected].
If data are updated regularly, describe versioning practices and how changes are tracked over time.
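One simple way to track changes between releases is a checksum manifest. The sketch below, using only the standard library, records a SHA-256 hash per file and compares two manifests. The function names are hypothetical, and repositories with built-in versioning may make this unnecessary.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's relative path to its checksum."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(old, new):
    """List files added, removed, or changed between two manifests."""
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(k for k in shared if old[k] != new[k]),
    }
```

Saving the manifest with each release gives a verifiable record, and the output of `diff_manifests` can be used as the basis for a change log entry.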
Outline data retention timelines and stewardship responsibilities.
We encourage authors to document assumptions and sensitivity analyses so conclusions remain robust across populations.
Define statistical terminology clearly for multidisciplinary readers who apply methods in clinical settings.
Use tables and figures to communicate effect sizes, uncertainty, and subgroup comparisons clearly.
Include data dictionary summaries or variable definitions for key covariates to improve interpretability.
Share Data Responsibly
Transparent data practices strengthen health statistics research integrity.