dismech - Project Details

Schema Docs

Browse the full autogenerated schema reference and class/slot documentation.

🧭 What is Dismech?

Dismech curates disease knowledge in a structured format: mechanistic assertions, clinical features, genetics, interventions, and supporting evidence mapped to ontology terms.

Each disease page shows a section only when curated data exists for that topic. The guide below summarizes what each section is intended to capture.

Disease Page Section Guide

Pathophysiology: Granular causal events linking molecular and cellular mechanisms to downstream disease outcomes.
Phenotypes: Clinical manifestations, frequencies, and context with phenotype ontology annotation.
Genetic Associations: Genes and variants linked to disease risk, presentation, inheritance, or subtype structure.
Treatments: Therapies, intended effects, mechanism targets, and phenotype targets when available.
Environmental Factors: Exposures and contextual factors that influence onset, progression, or severity.
Biochemical Markers: Laboratory and molecular markers, including expected presence/absence and context.
Differential Diagnoses: Conditions with overlapping presentation and curated distinguishing features.
Clinical Trials: Relevant trials with phase, status, and intervention/target phenotype context.
Histopathology: Tissue-level findings and characteristic microscopic patterns when curated.
Related Datasets: Data resources supporting mechanistic or translational analyses.
Causal Graph: Visual summary of curated upstream/downstream links connecting mechanistic events and outcomes.
Source YAML: Raw, structured record used to generate the page, useful for auditing and curation review.
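
The rule that a section appears only when curated data exists can be pictured with a small sketch. This is an illustration, not the project's rendering code: the key names in `SECTION_KEYS` and the `visible_sections` helper are hypothetical, and the real slot names are defined in the schema.

```python
# Hypothetical mapping from page section titles to record keys.
# The real slot names are defined in the project schema and may differ.
SECTION_KEYS = {
    "Pathophysiology": "pathophysiology",
    "Phenotypes": "phenotypes",
    "Genetic Associations": "genetic",
    "Treatments": "treatments",
    "Environmental Factors": "environmental",
    "Biochemical Markers": "biochemical",
}


def visible_sections(record: dict) -> list[str]:
    """Return the section titles whose key holds non-empty curated data."""
    return [title for title, key in SECTION_KEYS.items() if record.get(key)]


# Placeholder record; names are illustrative, not curated content.
example = {
    "name": "Example Disease",
    "pathophysiology": [{"name": "Example mechanism"}],
    "phenotypes": [],  # empty list: the section stays hidden
    "treatments": [{"name": "Example therapy"}],
}

print(visible_sections(example))  # ['Pathophysiology', 'Treatments']
```

Missing keys and empty lists are treated the same way here, so a record never produces an empty section.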

Data Curation Model

Curated Asset	Purpose	Location (GitHub)
Disease records	Primary YAML source of truth for pathophysiology, phenotypes, genetics, treatments, and evidence.	`kb/disorders/`
Comorbidity records	Structured associations between conditions with mechanistic and evidence context.	`kb/comorbidities/`
Disease model schema	Defines required structure, allowed terms, and evidence object model.	`src/dismech/schema/dismech.yaml`
Evidence cache	Local reference cache used for repeatable citation/snippet validation.	`references_cache/`

Evidence and Provenance Workflow

Assertions use explicit support status (SUPPORT, PARTIAL, REFUTE, etc.) and evidence source categories.
Reference snippets are validated against source text to reduce hallucinated evidence.
Curation quality is monitored through schema checks, ontology checks, and reference checks.

# Validate one disease file end-to-end
just validate kb/disorders/Some_Disease.yaml

# Validate reference snippets against source content
just validate-references kb/disorders/Some_Disease.yaml

# Run cross-project quality checks
just qc
just compliance-all
just compliance-weighted
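
As a rough picture of what snippet validation involves, the sketch below checks whether a quoted snippet actually occurs in the cached source text after whitespace and case normalization. It is a minimal sketch under stated assumptions: the `Evidence` class, its field names, the `PMID:0000000` placeholder, and the matching rule are all hypothetical, and the project's actual validator may match differently.

```python
import re
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical evidence item; field names are illustrative only."""
    reference: str  # placeholder identifier, not a real citation
    supports: str   # a support status such as SUPPORT, PARTIAL, or REFUTE
    snippet: str    # text quoted from the reference


def normalize(text: str) -> str:
    """Collapse runs of whitespace and lowercase for tolerant matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def snippet_in_source(evidence: Evidence, source_text: str) -> bool:
    """True only if the non-empty snippet occurs in the source text."""
    needle = normalize(evidence.snippet)
    return bool(needle) and needle in normalize(source_text)


source = "Loss of the protein   leads to\nprogressive cellular dysfunction."
good = Evidence("PMID:0000000", "SUPPORT", "leads to progressive cellular dysfunction")
bad = Evidence("PMID:0000000", "SUPPORT", "causes rapid recovery")

print(snippet_in_source(good, source))  # True
print(snippet_in_source(bad, source))   # False
```

A snippet that cannot be found in the cached text is the signal that the evidence may have been paraphrased or invented, which is the failure mode the reference checks are meant to catch.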

Related guidance on hallucination-resistant evidence: "Make IDs Hallucination-Resistant" (AI4Curation).

Contributor Path

Contribute by curating YAML records, running validations, and submitting pull requests with evidence-backed updates. Start here: CONTRIBUTING.md.

Open issues and curation priorities: GitHub Issues.

🛠 Technical Implementation

Dismech combines deterministic validation/build infrastructure with agentic research and curation workflows. This section maps architecture components to concrete repository files and external tooling.

Rendered schema documentation is available at https://dismech.monarchinitiative.org/elements/.

Core Tools Used in This Project

Tool	Brief Explanation	How It Is Used Here
Claude and Claude Code	LLM assistant tooling used for guided curation, review responses, and targeted repository edits.	Integrated via GitHub workflows such as `claude.yml` and `claude-code-review.yml`.
DRAGON-AI	An ontology-aware agent workflow used for repository interactions and follow-up edits triggered from collaboration events.	Configured in `.github/workflows/dragon-ai.yml` for mention-driven issue/PR/review processing.
Deep Research Agents (Edison, OpenAI, Perplexity)	Provider-based research agents used to populate initial evidence candidates before structured curation and validation.	Run through the Deep Research Client and project commands in `project.justfile` (for example, `just research-disorder perplexity ...` or `just research-disorder openai ...`).
GitHub Actions	Automation platform for CI checks, scheduled compliance runs, page generation, deploys, and release exports.	Workflow definitions are maintained in `.github/workflows/`, including build/test, generation, weekly compliance, and agent workflows.
Just and Justfiles	A command runner for repeatable project tasks, similar to lightweight build recipes.	Project commands for validation, page generation, QC dashboards, research, and exports are defined in `project.justfile` and imported via `justfile`.
OAK (Ontology Access Kit)	A unified API/toolkit for ontology lookup, traversal, and term validation across ontology sources.	OAK adapters are configured in `conf/oak_config.yaml` and used by LinkML term-validation steps to check IDs/labels in curated records.
LinkML Language and Tooling	A data modeling language and ecosystem for defining schemas, generating artifacts, and validating data.	The core model is defined in `src/dismech/schema/dismech.yaml`; Dismech uses LinkML validators (schema, term, and reference validation) and related tooling in the QC pipeline.

Architecture: Foundational and Agentic Components

Layer	What It Does	Implementation Link
Data model	Disease classes, slots, enums, descriptor bindings, evidence model.	`src/dismech/schema/dismech.yaml`
Knowledge content	Curated disease and comorbidity records in YAML.	`kb/`
Ontology adapters	OAK-backed ontology resolution and term validation configuration.	`conf/oak_config.yaml`
Rendering pipeline	Generates disorder/comorbidity/classification pages and causal graph views.	`src/dismech/render.py`
Browser UI	Faceted search app, schema-driven field config, generated records.	`app/`
Commands and recipes	Validation, generation, research, and export task entry points.	`project.justfile`
Automation workflows	CI, page generation, compliance loops, docs deploy, release exports.	`.github/workflows/`
Agentic research integration	Provider-based deep research report generation used in curation loops.	Deep Research Client
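
To make the causal graph views mentioned in the rendering layer more concrete, here is a hedged sketch of how upstream/downstream links could be turned into edges and traversed. It does not reflect `src/dismech/render.py`; the `name` and `downstream` keys, the helper functions, and the example events are assumptions for illustration.

```python
from collections import deque


def build_edges(events: list[dict]) -> list[tuple[str, str]]:
    """Turn each event's downstream list into (upstream, downstream) pairs."""
    return [(e["name"], d) for e in events for d in e.get("downstream", [])]


def reachable(edges: list[tuple[str, str]], start: str) -> list[str]:
    """Breadth-first list of every node downstream of `start`."""
    adjacency: dict[str, list[str]] = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
    visited = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Placeholder events; the keys and names are hypothetical.
events = [
    {"name": "Gene loss of function", "downstream": ["Protein deficiency"]},
    {"name": "Protein deficiency", "downstream": ["Cellular dysfunction"]},
    {"name": "Cellular dysfunction", "downstream": ["Clinical outcome"]},
]

edges = build_edges(events)
print(reachable(edges, "Protein deficiency"))
# ['Cellular dysfunction', 'Clinical outcome']
```

The visited set keeps the traversal finite even if curated links happen to form a cycle.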

Build and Validation Commands

# Validate all disorders (schema + terms + references + deep-research QC)
just qc

# Generate pages and browser data
just gen-pages
just gen-browser-data
just gen-all

# Generate compliance dashboard
just gen-dashboard

# Export KGX
just export-kgx
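
For readers unfamiliar with KGX, the sketch below shows the general shape of a node and edge export in KGX-style TSV. It is not the logic behind `just export-kgx`: the record keys, the `EXAMPLE:` identifiers, and the choice of Biolink categories and predicate are assumptions made for illustration.

```python
import csv
import io


def to_kgx(record: dict) -> tuple[list[dict], list[dict]]:
    """Build KGX-style node and edge rows from a hypothetical record."""
    disease_id = record["id"]
    nodes = [{"id": disease_id, "category": "biolink:Disease", "name": record["name"]}]
    edges = []
    for phenotype in record.get("phenotypes", []):
        nodes.append({
            "id": phenotype["id"],
            "category": "biolink:PhenotypicFeature",
            "name": phenotype["name"],
        })
        edges.append({
            "subject": disease_id,
            "predicate": "biolink:has_phenotype",
            "object": phenotype["id"],
        })
    return nodes, edges


def to_tsv(rows: list[dict], columns: list[str]) -> str:
    """Serialize rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder record; identifiers are not real ontology terms.
record = {
    "id": "EXAMPLE:0001",
    "name": "Example Disease",
    "phenotypes": [{"id": "EXAMPLE:0002", "name": "Example phenotype"}],
}

nodes, edges = to_kgx(record)
print(to_tsv(nodes, ["id", "category", "name"]))
print(to_tsv(edges, ["subject", "predicate", "object"]))
```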

Automation Workflows

The workflow layer is more than standard CI. It combines deterministic quality gates with recurring agent-assisted maintenance loops, then routes results back through normal PR review.

Workflow	Trigger and Scope	Automated Actions	Human Touchpoint
`main.yaml`	Push to `main` and all PRs; path-filtered to changed source, disorder YAML, and comorbidity YAML.	Runs linting, validates only changed disorder/comorbidity files with project recipes, and runs tests when source code changes.	Reviewers inspect failing/passing checks and request edits before merge.
`generate-pages.yaml`	Push to `main` when KB/templates/render/export/QC config changes, plus manual dispatch.	Regenerates disorder/comorbidity pages, browser data, and dashboard; commits outputs and opens an automated PR when diffs exist.	Maintainers review generated-content PRs and spot-check render/dashboard correctness.
`weekly-compliance.yaml`	Weekly cron plus manual dispatch with inputs (`num_files`, `areas_for_improvement`, model).	Runs agentic compliance-improvement passes using compliance metrics and validation loops, then prepares per-file fix PRs.	Humans choose focus areas, review generated fixes, and merge or request correction.
`post-review-agent.yml`	Daily cron plus manual dispatch (`days_back`, `dry_run`, optional PR number, model choice).	Scans unresolved editorial review comments and chooses one action: suggested patch, thread reply, or new issue for broader work.	PR authors accept/reject suggested changes and continue discussion in review threads.
`dragon-ai.yml`	Issue/PR/comment events; runs only on qualifying mentions from allowlisted controllers.	Parses mention intent, builds a structured prompt, and runs headless agent execution tied to GitHub context.	Controllers direct tasks in threads; maintainers review resulting changes/PRs.
`kgx-release.yaml`	Release-oriented export flow.	Builds KGX export artifacts and attaches versioned outputs to releases.	Release maintainers verify export quality and publish release notes/artifacts.

Distinctive pattern: validation and generation are continuous, but every meaningful content change still passes through transparent GitHub PR review with explicit provenance and evidence checks.
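
The path filtering described for `main.yaml` can be illustrated with a short sketch that picks out only the changed disorder and comorbidity YAML files from a list of changed paths. This is an assumption-laden illustration rather than the workflow's actual logic: the `files_to_validate` helper and the example file names are hypothetical, and only the two `kb/` directory names come from the tables above.

```python
from pathlib import PurePosixPath

# Directory names taken from the Data Curation Model table.
KB_DIRS = {
    "kb/disorders": "disorders",
    "kb/comorbidities": "comorbidities",
}


def files_to_validate(changed_paths: list[str]) -> dict[str, list[str]]:
    """Group changed YAML files by KB directory; ignore everything else."""
    selected: dict[str, list[str]] = {kind: [] for kind in KB_DIRS.values()}
    for raw in changed_paths:
        path = PurePosixPath(raw)
        kind = KB_DIRS.get(str(path.parent))
        if kind is not None and path.suffix == ".yaml":
            selected[kind].append(raw)
    return selected


# Hypothetical list of paths changed in a pull request.
changed = [
    "kb/disorders/Some_Disease.yaml",
    "src/dismech/render.py",
    "kb/comorbidities/Some_Pair.yaml",
    "kb/disorders/notes.md",
]

print(files_to_validate(changed))
# {'disorders': ['kb/disorders/Some_Disease.yaml'],
#  'comorbidities': ['kb/comorbidities/Some_Pair.yaml']}
```

Each selected disorder file could then be passed to a recipe such as `just validate`, which keeps CI time proportional to the size of the change rather than the size of the knowledge base.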

Related Tooling and Integrations

Just
Ontology Access Kit (OAK)
LinkML
LinkML Reference Validator
LinkML Term Validator
LinkML Embeddings Explorer
LinkML Browser
Deep Research Client