AI4Healthcare Technical Workshop: Pre-Event Guide
This Saturday, AGI House and Mithrl are co-hosting a one-day healthcare AI hackathon at our Hillsborough house. We're bringing together engineers, researchers, and founders to prototype solutions for some of healthcare's most painful problems.
Healthcare represents ~18% of U.S. GDP, yet software penetration remains remarkably low. Administrative overhead alone consumes an estimated $250–300 billion annually in the U.S., and only a fraction of that work is currently automated. Prior authorization, one of the most burdensome administrative workflows, costs the system tens of billions per year with minimal software adoption.
The labor picture reinforces the urgency: the AAMC projects a shortage of 37,800–124,000 physicians by 2034, compounded by nursing shortages. AI is increasingly positioned not as an incremental improvement but as essential infrastructure.
The momentum is unmistakable. Just two days ago, OpenAI launched ChatGPT Health—a dedicated product allowing users to connect medical records and wellness apps (Apple Health, MyFitnessPal, Peloton, and others) for personalized health conversations. OpenAI reports that over 230 million people globally ask health and wellness questions on ChatGPT every week. This is a strong signal: consumer healthtech is entering a new phase where AI becomes the primary interface for health information.
Early enterprise deployments show similar promise. Ambient documentation tools, billing automation, and prior auth assistants demonstrate measurable time savings. Startups are capturing a significant share of generative AI spend in healthcare, which opens a window in which nimble entrants can out-innovate incumbents before the market consolidates.
Where AI Is Making Inroads
Clinical Documentation & Ambient AI
Ambient scribe tools—AI that listens to patient encounters and generates structured notes—have emerged as the clearest near-term ROI case. Products from Nuance (DAX), Abridge, and Suki are seeing rapid adoption. The value proposition is straightforward: physicians spend 2+ hours daily on documentation; AI can reclaim much of that time.
Medical Imaging & Diagnostics
Over 900 AI-enabled medical devices have FDA clearance, predominantly in radiology. Applications range from detecting diabetic retinopathy to flagging suspicious nodules on chest CTs. The challenge remains reimbursement—most algorithms lack dedicated payment codes, so adoption depends on demonstrating downstream efficiency or outcome improvements.
Drug Discovery & Development
Generative models can propose thousands of candidate molecules for a given target, dramatically accelerating early-stage R&D. AlphaFold's protein structure predictions have become essential infrastructure for structural biology. However, clinical validation remains the hard constraint—the vast majority of "AI-designed" drugs have yet to prove themselves in Phase II/III trials.
Scheduling & Demand Prediction
Health systems are deploying ML models to forecast patient volume, optimize OR scheduling, and reduce no-shows. These operational AI applications often fly under the radar but deliver concrete cost savings and capacity improvements.
Consumer Healthtech
The ChatGPT Health launch exemplifies a broader trend: AI as the consumer's primary health interface. Wearables generate continuous physiological data; AI synthesizes it into actionable insights. The risk is that consumer-facing AI operates outside traditional clinical guardrails—accuracy and appropriate escalation become critical.
Revenue Cycle & Administrative Automation
Prior auth, claims processing, denial management, and patient outreach represent hundreds of billions of dollars in manual labor. These workflows are ripe for automation and offer a greenfield opportunity with less direct EHR competition.
Key Dynamics Shaping the Space
EHR Platform Dominance
Epic holds ~35–40% market share among U.S. hospitals and is aggressively embedding AI: ambient note-taking, clinical decision support, and agentic workflows. This creates both a threat and an opportunity for startups: unless your solution delivers an order-of-magnitude improvement, hospitals default to incumbent add-ons. Successful entrants typically position themselves as an intelligence layer atop EHRs rather than as replacements.
The Services-to-Software Transition
The real TAM isn't the existing IT budget—it's the vast services budget (call centers, manual review, fax-based workflows) transitioning to software. Prior auth, patient outreach, referral management, and document processing represent greenfield opportunities.
AI vs. AI Arms Race
As providers deploy AI to optimize revenue (coding, appeals, auth requests), payers respond in kind with fraud detection and claim review algorithms. The equilibrium likely involves AI-to-AI negotiation, but explainability and fairness become critical—black-box denials invite regulatory and reputational risk.
Clinical AI: Efficacy vs. Deployment Gap
Technical accuracy is necessary but insufficient. Most imaging algorithms aren't directly reimbursed; adoption depends on demonstrating improved outcomes or efficiency. The implication for builders: focus on end-to-end value, not just model performance.
Workshop Tracks
1. Agentic Platforms for Compound/Therapeutic Discovery
2. Agentic Platforms for Public Biomedical Data Access
3. Agentic Platforms for Healthcare Admin Automation
4. Open Frontier in AI × Healthcare
Project Ideas
All projects below use publicly available APIs and data sources. No specialized compute required.
Project 1: Virtual Medicinal Chemist
Track 1 | Difficulty: Advanced
Build an agent that proposes drug candidates for a given biological target, filters by drug-like properties, and generates an inspectable reasoning trace with literature citations.
Stack:
- LLM orchestration: LangChain, LlamaIndex, or raw function-calling
- Compound databases: ChEMBL API, PubChem PUG REST
- Property calculation: RDKit (open source)
- Literature: PubMed E-utilities, Semantic Scholar API
- Optional generative: REINVENT, MolGPT (open source)
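The "filters by drug-like properties" step is commonly approximated with Lipinski's rule of five. Below is a minimal sketch, assuming the descriptors have already been computed upstream (for example with RDKit). The `Candidate` class, the one-violation tolerance, and the example values in the usage note are illustrative assumptions, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for descriptors computed upstream (e.g., with RDKit)."""
    name: str
    mol_weight: float  # daltons
    logp: float        # octanol-water partition coefficient
    h_donors: int      # hydrogen-bond donors
    h_acceptors: int   # hydrogen-bond acceptors


def lipinski_violations(c: Candidate) -> list:
    """Return the rule-of-five criteria that the candidate breaks."""
    violations = []
    if c.mol_weight > 500:
        violations.append("molecular weight > 500")
    if c.logp > 5:
        violations.append("logP > 5")
    if c.h_donors > 5:
        violations.append("H-bond donors > 5")
    if c.h_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def filter_drug_like(candidates, max_violations=1):
    """Keep candidates with at most `max_violations` broken criteria."""
    return [c for c in candidates if len(lipinski_violations(c)) <= max_violations]
```

For example, `Candidate("small", 180.2, 1.2, 1, 4)` passes with no violations, while `Candidate("large", 720.0, 6.5, 6, 12)` breaks all four criteria and is dropped. Returning the violated criteria as strings keeps the filter step inspectable in the agent's reasoning trace.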
Project 2: Biomedical Data Scout
Track 2 | Difficulty: Intermediate
A conversational agent that finds relevant public datasets for a research question—querying GEO, ClinicalTrials.gov, and similar repositories, then ranking and summarizing results.
Stack:
- NCBI GEO API, ClinicalTrials.gov API
- LLM: GPT-3.5/4, Claude, or open models (Llama 3, Mistral)
- Optional: cellxgene, Synapse APIs for specialized datasets
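As a starting point for the GEO query step, here is a rough sketch that uses the NCBI E-utilities `esearch` endpoint with `db=gds` (the GEO DataSets database) and JSON output. The function names are hypothetical, and the sketch ignores rate limits and API keys, which you would want to handle for anything beyond a demo.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term: str, retmax: int = 20) -> str:
    """Build an esearch URL against GEO DataSets (db=gds)."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def parse_id_list(payload: dict) -> list:
    """Pull record IDs out of an esearch JSON payload; empty list if none."""
    return payload.get("esearchresult", {}).get("idlist", [])


def search_geo(term: str, retmax: int = 20) -> list:
    """Run the search over the network and return matching record IDs."""
    with urlopen(build_geo_search_url(term, retmax), timeout=30) as resp:
        return parse_id_list(json.load(resp))
```

Keeping URL construction and response parsing separate from the network call makes both easy to test offline. The returned IDs can then be passed to `esummary` to fetch titles and summaries for ranking.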
Project 3: Ambient Clinical Note Assistant
Track 3 | Difficulty: Beginner/Intermediate
An "ambient scribe" that transcribes a doctor-patient conversation and generates structured clinical documentation (SOAP format or similar).
Stack:
- Speech-to-text: Whisper API or local Whisper
- Note generation: GPT-4, Claude, or fine-tuned open model
- Optional: QuickUMLS for terminology normalization
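One way to keep note generation honest is to ask the model for JSON with the four SOAP sections and reject anything incomplete before rendering. The sketch below assumes that output contract; `SoapNote` and `parse_soap_json` are hypothetical names, and the transcription and LLM calls themselves are left out.

```python
import json
from dataclasses import dataclass

SOAP_FIELDS = ("subjective", "objective", "assessment", "plan")


@dataclass
class SoapNote:
    subjective: str
    objective: str
    assessment: str
    plan: str

    def render(self) -> str:
        """Render the note as plain text, one labeled section per line."""
        return "\n".join(f"{name.upper()}: {getattr(self, name)}" for name in SOAP_FIELDS)


def parse_soap_json(raw: str) -> SoapNote:
    """Parse model output (assumed to be a JSON object) and require all four sections."""
    data = json.loads(raw)
    missing = [f for f in SOAP_FIELDS if not str(data.get(f, "")).strip()]
    if missing:
        raise ValueError(f"model output is missing sections: {missing}")
    return SoapNote(**{f: str(data[f]).strip() for f in SOAP_FIELDS})
```

Failing loudly on a missing section is a deliberate choice here: a note with a silently empty Plan is worse than a retry.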
Project 4: Intelligent Intake Processor
Track 3 | Difficulty: Intermediate
A pipeline that OCRs incoming healthcare documents (referrals, insurance letters, lab reports), classifies them, extracts structured fields, and routes appropriately.
Stack:
- OCR: Tesseract, or Google Vision/Amazon Textract
- Classification/extraction: LLM with structured output prompting
- Output: JSON to mock database or dashboard
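The routing step at the end of this pipeline can be sketched as follows. It assumes the LLM was prompted to return a JSON object with `doc_type`, `confidence`, and `fields` keys; that contract, the queue names, and the 0.8 threshold are all hypothetical choices for illustration.

```python
import json

# Hypothetical document types and destination queues.
ROUTES = {
    "referral": "scheduling_queue",
    "insurance_letter": "billing_queue",
    "lab_report": "clinical_inbox",
}


def route_document(llm_json: str, min_confidence: float = 0.8) -> dict:
    """Validate the classifier output and pick a queue; fall back to manual review."""
    try:
        result = json.loads(llm_json)
    except json.JSONDecodeError:
        return {"queue": "manual_review", "reason": "unparseable model output"}
    if not isinstance(result, dict):
        return {"queue": "manual_review", "reason": "expected a JSON object"}

    doc_type = result.get("doc_type")
    confidence = result.get("confidence")
    if not isinstance(doc_type, str) or doc_type not in ROUTES:
        return {"queue": "manual_review", "reason": f"unknown doc_type: {doc_type!r}"}
    if not isinstance(confidence, (int, float)) or confidence < min_confidence:
        return {"queue": "manual_review", "reason": "low or missing confidence"}
    return {"queue": ROUTES[doc_type], "fields": result.get("fields", {})}
```

Every failure path lands in a manual review queue rather than raising, so one malformed document does not stall the rest of the batch.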
Project 5: Clinical Trial Matching Agent
Track 4 | Difficulty: Intermediate
Given a patient summary, search ClinicalTrials.gov and rank trials by eligibility match, flagging inclusion/exclusion criteria conflicts.
Stack:
- ClinicalTrials.gov API
- LLM for eligibility parsing and patient-criteria matching
- Optional: Trialstreamer dataset for pre-indexed trials
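Before handing free-text criteria to an LLM, it can help to rule out trials on the structured eligibility fields. The sketch below assumes each study record follows the shape of the ClinicalTrials.gov v2 API, with `minimumAge`, `maximumAge`, and `sex` under `protocolSection.eligibilityModule`; verify the field names against a live response. The patient dictionary format is hypothetical.

```python
# Units per year, used to convert strings such as "6 Months" into years.
UNITS_PER_YEAR = {"year": 1, "month": 12, "week": 52, "day": 365}


def parse_age_years(text):
    """Convert strings like '18 Years' or '6 Months' to years; None if absent."""
    if not text:
        return None
    value, unit = text.split()[:2]
    return float(value) / UNITS_PER_YEAR[unit.lower().rstrip("s")]


def structured_conflicts(patient: dict, study: dict) -> list:
    """List conflicts between a patient and a study's structured eligibility fields."""
    elig = study.get("protocolSection", {}).get("eligibilityModule", {})
    conflicts = []

    min_age = parse_age_years(elig.get("minimumAge"))
    max_age = parse_age_years(elig.get("maximumAge"))
    if min_age is not None and patient["age"] < min_age:
        conflicts.append(f"age {patient['age']} is below minimum {elig['minimumAge']}")
    if max_age is not None and patient["age"] > max_age:
        conflicts.append(f"age {patient['age']} is above maximum {elig['maximumAge']}")

    sex = elig.get("sex", "ALL")
    if sex != "ALL" and sex != patient["sex"].upper():
        conflicts.append(f"trial enrolls {sex} participants only")
    return conflicts
```

An empty list means the trial survives the cheap checks and is worth the more expensive LLM pass over its inclusion and exclusion text.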
Project 6: AI Literature Review Assistant
Track 4 | Difficulty: Beginner
An assistant that retrieves PubMed abstracts for a query and synthesizes findings into a structured summary with citations.
Stack:
- PubMed E-utilities
- LLM for summarization
- Citation tracking via PMID
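Citation tracking via PMID can double as a hallucination check: every PMID the summary cites should be one you actually retrieved. Here is a minimal sketch, assuming the LLM is instructed to cite in the form `PMID: 12345678`; that citation format and the function name are assumptions.

```python
import re

# Matches citations written as "PMID: 12345678" (the format we ask the LLM to use).
PMID_PATTERN = re.compile(r"PMID:\s*(\d+)")


def check_citations(summary: str, retrieved_pmids: set) -> dict:
    """Compare PMIDs cited in the summary against the set that was retrieved."""
    cited = set(PMID_PATTERN.findall(summary))
    return {
        "cited": sorted(cited),
        "unsupported": sorted(cited - retrieved_pmids),  # cited but never retrieved
        "unused": sorted(retrieved_pmids - cited),       # retrieved but not cited
    }
```

A non-empty `unsupported` list is a signal to regenerate or flag the summary, since the model cited something that was not in its context.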
Project 7: Insurance Denial Appeal Generator
Track 3 | Difficulty: Intermediate
Given a denial letter and relevant clinical documentation, generate a structured appeal letter citing medical necessity guidelines and supporting evidence.
Stack:
- Document parsing: PyMuPDF or pdfplumber
- LLM for appeal generation with template adherence
- Optional: RAG over CMS coverage guidelines (publicly available)
Project 8: Medication Interaction Checker with Explanations
Track 4 | Difficulty: Beginner
An agent that takes a medication list, queries drug interaction databases, and provides patient-friendly explanations of risks and alternatives.
Stack:
- RxNorm API (NLM) for normalizing drug names
- OpenFDA Drug API for label text and adverse event reports
- DailyMed, or DrugBank (check license terms)
- LLM for explanation generation
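The core loop here is a pairwise check over the normalized medication list. The sketch below uses a two-entry toy lookup table in place of a real interaction database; the table and function name are for illustration only, and a real build would populate the lookup from one of the sources above.

```python
from itertools import combinations

# Toy lookup table for illustration only; not a clinical reference.
INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
    frozenset({"sildenafil", "nitroglycerin"}): "severe drop in blood pressure",
}


def find_interactions(medications):
    """Check every unordered pair of medications against the lookup table."""
    normalized = sorted({m.strip().lower() for m in medications})
    hits = []
    for a, b in combinations(normalized, 2):
        note = INTERACTIONS.get(frozenset({a, b}))
        if note:
            hits.append((a, b, note))
    return hits
```

Using `frozenset` keys makes the lookup order-independent, so the warfarin and aspirin pair matches regardless of which drug comes first in the list. The resulting tuples are what you would pass to the LLM for a patient-friendly explanation.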
Project 9: Radiology Report Structuring Agent
Track 4 | Difficulty: Intermediate
Convert free-text radiology reports into structured data (findings, impressions, measurements, follow-up recommendations) with standardized terminology mapping.
Stack:
- Sample reports: MIMIC-III/IV (with credentialing) or synthetic
- LLM for extraction with structured output
- RadLex ontology for terminology normalization
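A regex baseline is useful both as a first pass and as a cross-check on what the LLM extracts. This sketch assumes reports mark sections with `FINDINGS:` and `IMPRESSION:` at the start of a line, which varies by institution, and it only captures simple single-value measurements in mm or cm.

```python
import re

SECTION_RE = re.compile(r"^(FINDINGS|IMPRESSION):\s*", re.MULTILINE)
MEASUREMENT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)


def split_sections(report: str) -> dict:
    """Split a report into sections keyed by lowercase header name."""
    parts = SECTION_RE.split(report)
    # With one capture group, re.split yields [preamble, name1, body1, name2, body2, ...].
    return {name.lower(): body.strip() for name, body in zip(parts[1::2], parts[2::2])}


def extract_measurements_mm(text: str) -> list:
    """Return all measurements found in the text, converted to millimeters."""
    results = []
    for value, unit in MEASUREMENT_RE.findall(text):
        factor = 10 if unit.lower() == "cm" else 1
        results.append(round(float(value) * factor, 3))
    return results
```

Multi-dimensional sizes such as "3 x 4 mm" are only partially captured by this pattern, which is exactly the kind of case where the LLM extraction step earns its keep.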
Project 10: Patient Education Content Generator
Track 4 | Difficulty: Beginner
Given a diagnosis or procedure code, generate personalized, reading-level-appropriate patient education materials in multiple languages.
Stack:
- ICD-10 lookup via public APIs (CPT codes are licensed by the AMA, so check terms before using a CPT source)
- LLM for content generation with reading level control
- Translation: LLM-based or LibreTranslate (open source)
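Reading level control needs a way to measure reading level. One common choice is the Flesch-Kincaid grade formula, sketched below with a rough vowel-group syllable heuristic. The heuristic is an approximation of our own, and the formula is calibrated for English only, so translated output would need a different check.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Estimate the U.S. school grade level needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59
```

A simple loop then works: generate, score, and ask the LLM to simplify again if the grade exceeds your target. Short plain sentences like "Take one pill each day." score far lower than jargon-heavy ones.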
Project 11: Social Determinants of Health Extractor
Track 3 | Difficulty: Intermediate
Parse clinical notes to identify social determinants of health (housing instability, food insecurity, transportation barriers) and map each finding to the corresponding ICD-10 Z-code.
Stack:
- Sample notes: MIMIC or synthetic
- LLM for extraction
- SDOH ontologies and Z-code mappings (publicly available)
Project 12: Smart Scheduling Assistant
Track 3 | Difficulty: Intermediate
An agent that optimizes appointment scheduling by predicting no-show risk, suggesting overbooking strategies, and automating patient reminders.
Stack:
- Historical appointment data (synthetic is the practical choice; MIMIC covers hospital and ICU stays rather than outpatient scheduling)
- Classification model for no-show prediction
- LLM for natural language reminder generation
- Calendar/scheduling API integration
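Once a classifier produces per-appointment no-show probabilities, a simple overbooking suggestion follows from expected attendance. This is a deliberately conservative sketch: it assumes independent no-shows, ignores the chance that the extra patients also miss their slots, and both function names are hypothetical.

```python
import math


def expected_attendance(no_show_probs) -> float:
    """Expected number of patients who show up, given per-patient no-show probabilities."""
    return sum(1.0 - p for p in no_show_probs)


def suggest_overbook(no_show_probs, capacity: int) -> int:
    """Extra slots to book so expected attendance fills capacity without exceeding it."""
    gap = capacity - expected_attendance(no_show_probs)
    return max(0, math.floor(gap))
```

For a five-slot session with no-show probabilities of 0.5, 0.25, 0.25, 0.5, and 0.5, expected attendance is 3.0, so the sketch suggests two extra bookings. A fuller version would weigh the cost of an idle slot against the cost of patient wait time.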
Project 13: Prior Authorization Auto-Submitter
Track 3 | Difficulty: Advanced
Given a proposed procedure and patient chart, automatically extract required clinical justification, populate payer-specific forms, and draft the submission.
Stack:
- Document parsing for clinical notes
- LLM for medical necessity extraction
- Template filling for common payer forms
- Optional: FHIR integration for structured data
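For the template-filling step, it is worth refusing to draft when a required field is missing rather than submitting a form with blanks. The sketch below uses the standard library's `string.Template`; the form layout and field names are hypothetical and do not correspond to any real payer's form.

```python
from string import Template

# Hypothetical payer form; real forms differ by payer and procedure.
REQUIRED = ("patient_name", "procedure", "diagnosis", "justification")
FORM = Template(
    "Patient: $patient_name\n"
    "Procedure: $procedure\n"
    "Diagnosis: $diagnosis\n"
    "Medical necessity: $justification"
)


def draft_submission(extracted: dict) -> dict:
    """Fill the form if every required field is present; otherwise flag for review."""
    missing = [f for f in REQUIRED if not str(extracted.get(f) or "").strip()]
    if missing:
        return {"status": "needs_review", "missing": missing}
    return {"status": "draft", "text": FORM.substitute(extracted)}
```

The `missing` list gives a human reviewer, or a follow-up LLM pass over the chart, a precise target for what still needs to be extracted.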
Opportunities for Builders
Healthcare AI is past the hype cycle and into the deployment phase—but massive gaps remain. Here's where we see the most compelling opportunities for founders and technical teams:
1. The "Last Mile" of Clinical AI
Hundreds of FDA-cleared algorithms exist, but most sit unused. The opportunity isn't building another diagnostic model—it's building the integration, workflow, and business model infrastructure that gets AI into clinical practice. Think: deployment platforms, EHR middleware, and outcome measurement systems.
2. Revenue Cycle Automation at Scale
Prior authorization, denial management, and claims follow-up remain shockingly manual. Early entrants like Cohere Health have proven the market; there's room for vertical-specific solutions (oncology auth, imaging auth, DME) and for platforms that work across the payer-provider boundary.
3. AI-Native Clinical Operations
Scheduling, capacity planning, staffing optimization—these operational problems have been tackled with traditional ML, but LLM-powered agents that can reason about constraints and communicate with staff represent a step change. The opportunity is end-to-end automation, not just prediction.
4. Consumer Health Infrastructure
ChatGPT Health validates the demand for AI-powered consumer health tools. But OpenAI isn't building the disease-specific apps, the chronic condition management platforms, or the tools that integrate with clinical care. There's room for verticalized consumer experiences (diabetes management, mental health, maternal health) that go deeper than a general-purpose chatbot.
5. Data Infrastructure for Healthcare AI
Every healthcare AI company rebuilds the same data pipelines: EHR extraction, deidentification, FHIR normalization, and synthetic data generation. Platform companies that solve these problems once, as Databricks did for analytics, could capture enormous value.
6. AI for Drug Development Beyond Discovery
Generative chemistry gets attention, but the harder problems are downstream: trial design optimization, patient recruitment, regulatory submission automation, real-world evidence generation. These are massive cost centers with limited software penetration.
7. Payer-Side Intelligence
Most healthcare AI investment flows to the provider side. But payers face analogous challenges: fraud detection, utilization management, member engagement, care gap identification. The payer market is concentrated (easier sales motion) and less encumbered by EHR incumbents.
8. Global Health Markets
U.S. healthcare is the largest market, but regulatory and reimbursement complexity creates moats for incumbents. Emerging markets—India, Southeast Asia, Latin America, Africa—have different constraints (mobile-first, low-resource settings, different disease burdens) and may be more amenable to AI-native solutions.
9. Trust and Safety Infrastructure
As AI handles more clinical and administrative decisions, the need for audit trails, explainability, bias detection, and human oversight grows. Companies building the "compliance layer" for healthcare AI—analogous to what Vanta does for SOC 2—could become essential infrastructure.
10. Clinician-in-the-Loop Tools
The most successful healthcare AI augments rather than replaces. Tools that make physicians, nurses, and pharmacists more effective—surfacing relevant information, reducing cognitive load, catching errors—will see faster adoption than autonomous systems. The sweet spot is high-value human oversight on AI-generated outputs.
Guiding Principles
- Augmentation over replacement. The best healthcare AI empowers clinicians and administrators; it doesn't claim to replace judgment.
- End-to-end value. Accurate models aren't enough. Demonstrate workflow integration and tangible benefit.
- Explainability matters. Healthcare demands transparency. Black boxes don't fly.
- Start narrow. A working demo on one document type beats a broken prototype attempting everything.
See you at the workshop.


