Scienza Health
Last updated: April 2026 · Reviewed quarterly

The Scienza Health real-world dataset is a proprietary clinical data lake covering 12.3 million senior care patients across 14,613 facilities, with 27 billion clinical records spanning 10+ years of longitudinal history. It includes a 3-million-patient neurodegenerative cohort and 2,500+ speech biomarkers per encounter, and it powers Proactive Decision Orders through real-time cohort benchmarking on the GIA® platform.

THE IRREPLACEABLE ASSET

27 Billion Clinical Records.
One Powerful Data Lake.

Real-world evidence from 12.3 million senior care patients across 14,613 facilities — the proprietary dataset behind GIA® AI Co-Clinician.

Key Facts

  • Patients: 12.3M
  • Facilities: 14,613
  • Records: 27B
  • Years: 10+
  • Biomarkers: 2,500+
  • Brain-health cohort: 3M
Peer-Reviewed · Editorially reviewed

This content is intended for informational purposes and does not constitute medical advice. Editorially reviewed by David Kaiser, CEO of Scienza Health, for accuracy in post-acute care operations.

This page is for partners evaluating the data layer behind the GIA® platform, Proactive Decision Orders, and clinical intelligence. Review our clinical governance framework, peer-reviewed research, and EHR integrations.

COMPREHENSIVE INTELLIGENCE

Every clinical signal, captured.

Medications, diagnoses, structured assessments, daily care outcomes — longitudinally linked and queryable in sub-second time.

  • 16.2B medication events
  • 209.9M diagnosis records
  • 61.9M MDS assessments
  • 1.3B daily outcomes
  • 60 data tables
  • 2,500+ speech biomarkers per encounter

BRAIN-HEALTH COHORT

3 Million Neurodegenerative Patients.

One of the largest real-world brain-health cohorts in healthcare — the foundation for speech biomarker validation.

  • 1.7M other dementias
  • 624K Alzheimer's
  • 420K Parkinson's
  • 268K vascular dementia

UNIQUE ADVANTAGE

Voice biomarker data — the differentiator.

Scienza Health is the only senior-care data partner with proprietary voice biomarker capture integrated directly into the longitudinal record. Each patient encounter contributes 2,500+ acoustic and linguistic features — the unstructured signal traditional EHR datasets cannot replicate.

  • Speech-derived markers from natural patient conversation, captured at point of care
  • Validated against peer-reviewed clinical endpoints (academic medical centers, MIT, Mayo, NIH consortium) — see research
  • AUC 0.97 for Parkinson’s detection from conversational speech (peer-reviewed)
  • Continuously expanding: every screening encounter adds to the longitudinal cohort
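
For readers unfamiliar with the AUC metric cited above, the following is a minimal sketch of its standard pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The page does not describe how the published figure was computed, so the function name `auc` and the scores below are illustrative assumptions (made-up toy values, not Scienza Health data).

```python
def auc(scores_pos, scores_neg):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is the textbook definition,
    not necessarily the procedure used in the cited research.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up classifier scores for illustration only.
toy_pos = [0.9, 0.8, 0.4]   # hypothetical scores for patients with the condition
toy_neg = [0.7, 0.3, 0.2]   # hypothetical scores for patients without it

toy_auc = auc(toy_pos, toy_neg)  # 8 of 9 pairs are ranked correctly
```

An AUC of 1.0 means every positive case outranks every negative case, and 0.5 is what random scoring would produce.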
SOURCES & LINKAGE

Where the signal comes from.

Structured clinical data

  • EHR (PointClickCare-integrated)
  • MDS — Section C (BIMS), G (functional), N (medications)
  • Medication administration records
  • Diagnoses (ICD-10), problem lists
  • Care plans, ADL/IADL functional measures
  • Daily clinical events — transfers, falls, behavioral incidents, vitals

Voice + multimodal data

  • 2,500+ speech biomarkers per encounter
  • Acoustic features — prosody, pitch, timing, articulation
  • Linguistic features — lexical complexity, semantic coherence
  • Computer vision signals (consented video encounters)
  • Outcome linkage — speech features tied to clinical trajectory
OUTCOMES CAPTURED

1.3 billion outcomes are not all equal.

Pharma and payor research questions hinge on the right outcomes, captured with the right structure.

  • Disease progression — functional decline, BIMS score change, ADL/IADL trajectories
  • Adverse events — falls, behavioral incidents, medication-related events, transfers
  • Hospital utilization — avoidable transfers, readmissions, length of stay
  • Mortality and discharge outcomes
  • Treatment response — medication initiation/titration tied to clinical trajectory
FROM DATA TO DECISIONS

How the data powers Proactive Decision Orders.

The dataset isn’t a backend asset that sits in cold storage. It is the engine. Every new patient encounter is benchmarked, in real time, against the millions of like-cohort patients who came before — same demographics, same diagnosis history, same functional baseline, same trajectory shape. Like-cohort outcomes condition the model. The result: highly probable clinical orders before the physician walks in.

THE CLOSED LOOP
  1. New patient encounter → voice + structured data captured
  2. Cohort match → like-patients selected from 12.3M longitudinal records
  3. Outcome distribution → probability surface for next-best clinical actions
  4. Highest-probability orders surfaced to clinician with reasoning
  5. Clinician reviews and approves — every action gates through human judgment
  6. Action and outcome feed back — the loop sharpens
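
The loop above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the GIA® implementation: the `Patient` fields, the exact-match cohort rule, the action names, and the toy history are all hypothetical, and a production system would presumably use far richer similarity matching.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical matching attributes (steps 1 and 2 of the loop).
    age_band: str
    diagnosis: str
    functional_baseline: str


def match_cohort(new_patient, history):
    """Step 2: keep historical records whose profile equals the new patient's."""
    return [(action, ok) for profile, action, ok in history if profile == new_patient]


def rank_actions(cohort):
    """Steps 3 and 4: estimate each action's good-outcome rate, highest first."""
    totals, good = Counter(), Counter()
    for action, ok in cohort:
        totals[action] += 1
        good[action] += int(ok)
    rates = {action: good[action] / totals[action] for action in totals}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


def approve(ranked, clinician_accepts):
    """Step 5: nothing proceeds unless the clinician accepts it."""
    return [action for action, _ in ranked if clinician_accepts(action)]


# Toy history: (profile, action taken, whether the outcome was good).
p = Patient("80-89", "parkinsons", "moderate")
other = Patient("70-79", "alzheimers", "mild")
history = [
    (p, "pt_referral", True),
    (p, "pt_referral", True),
    (p, "pt_referral", False),
    (p, "med_review", True),
    (p, "med_review", False),
    (other, "pt_referral", False),
]

cohort = match_cohort(p, history)
ranked = rank_actions(cohort)
approved = approve(ranked, lambda action: action == "med_review")
```

Step 6 would append each approved action and its observed outcome to `history`, so later rankings reflect the new evidence.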

Without longitudinal scale, cohort matching produces noise. Without outcome linkage, probability surfaces are flat. We have both. That is why Proactive Decision Orders work — and why they are difficult to copy.

THE MOAT

Without data, there is no AI.

We have the data.

  • 10+ years longitudinal — trajectory, not snapshots
  • 14,613 facilities across diverse settings — built-in generalizability
  • 3M+ neurodegenerative patients — statistical power where it matters
  • Continuously updated daily via native PointClickCare integration, not in periodic batches
  • Voice biomarker capture at point of care — uncopyable from claims data alone
FOR EVERY STAKEHOLDER

Real-world evidence, every angle.

Pharma & biotech

Identify trial-eligible cohorts, generate post-market evidence, and validate drug-target hypotheses against one of the largest senior-care neurodegenerative cohorts in production today.

Talk to RWE team

Payors & Medicare Advantage

Risk stratification at the patient level, HCC documentation completeness, and avoidable-utilization signals years before claims data surfaces them.

Talk to plan-partnerships team

Health systems & SNF operators

Benchmark outcomes against the cohort. Power your screening, documentation, and quality programs with the data layer behind GIA® AI Co-Clinician.

Talk to clinical-partnerships team

Research collaborators

Academic medical centers, NIH-funded consortia, brain-health foundations — partner with us on joint studies, data access, and longitudinal research.

Talk to research team
QUALITY & GOVERNANCE

Enterprise-grade. Research-ready.

  • HIPAA compliant. Fully de-identified.
  • 5-Layer Governance. AES-256. Human-in-the-Loop.
  • Continuously updated — new patient encounters and clinical events flow daily.
  • 95% compression: 8 TB raw → 400 GB queryable. Sub-second query speed via Amazon Athena.
  • Python, R, and SQL compatible. 37 dimension + 23 fact tables.
  • Demographically diverse cohorts; detailed breakdowns available under partnership.
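
As a rough illustration of the figures above, the sketch below checks the compression arithmetic and shows the kind of fact-to-dimension join a 37 dimension + 23 fact table layout supports. It uses Python's built-in `sqlite3` as a local stand-in for Athena, and the table names, column names, and rows are hypothetical, since the actual schema is not described here. The arithmetic assumes 1 TB = 1,000 GB.

```python
import sqlite3


def compression_pct(raw_gb, stored_gb):
    """Percentage size reduction from raw to stored data."""
    return 100.0 * (1 - stored_gb / raw_gb)


# 8 TB raw -> 400 GB queryable, assuming 1 TB = 1,000 GB.
reduction = compression_pct(8_000, 400)

# In-memory database standing in for the query engine (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_patient (patient_id INTEGER PRIMARY KEY, diagnosis_group TEXT);
    CREATE TABLE fact_daily_outcome (patient_id INTEGER, event_type TEXT);
    INSERT INTO dim_patient VALUES (1, 'parkinsons'), (2, 'alzheimers'), (3, 'parkinsons');
    INSERT INTO fact_daily_outcome VALUES
        (1, 'fall'), (1, 'transfer'), (2, 'fall'), (3, 'fall');
    """
)

# Count falls per diagnosis group by joining the fact table to its dimension.
rows = conn.execute(
    """
    SELECT d.diagnosis_group, COUNT(*) AS falls
    FROM fact_daily_outcome AS f
    JOIN dim_patient AS d ON d.patient_id = f.patient_id
    WHERE f.event_type = 'fall'
    GROUP BY d.diagnosis_group
    ORDER BY d.diagnosis_group
    """
).fetchall()
conn.close()
```

The same SQL shape would carry over to other engines, although dialect details and the real table names would differ.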

See the data behind GIA®.

Decision-grade evidence in 90 days. For pharma, payors, health systems, and research collaborators.

20-minute conversation. No NDA required to start.