MedWeight Clinical Assessment System

Architecture, Algorithms, and Clinical Integration Report 
Prepared for Dr. Michael Lyon, March 2026
CONFIDENTIAL — Internal Use Only
Executive Summary
The MedWeight Clinical Assessment System introduces a three-tier adaptive assessment architecture that transforms how patient profiles are constructed, how coaching sessions are personalized, and how clinical oversight is maintained. Unlike traditional screening workflows where patients complete standardized instruments in isolation, this system embeds assessment into every touchpoint of the patient journey — from onboarding through active coaching — producing a continuously evolving clinical profile that drives real-time coaching adaptation.
The system is built around a foundational principle established during the clinical instrument design process: assessments should not feel like assessments. They should feel like guided reflection inside a coaching dialogue. This principle shapes every design decision, from the MI-reworded conversational delivery of assessment items to the phenotype-driven coaching stance recommendations that flow directly into the AI coaching prompt.
This report covers the complete system: the scoring algorithms, the phenotype classification engine, the three-source profile unification model, the dynamic coaching prompt architecture, and the validation test results from the first instrument deployment (MDOA — Mental Drivers of Obesity Assessment, 64 items).
Key Capabilities Delivered
01
Twenty-plus custom clinical instruments
Designed for obesity medicine (non-copyrighted).
02
Dual delivery
Web-form (onboarding/assigned) and conversational (during coaching sessions).
03
Automated phenotype classification
With coaching stance recommendations.
04
Three-source profile unification
Formal assessments + conversational micro-assessments + screening signal accumulation.
05
Dynamic coaching prompt injection
Claude receives the full patient profile at every coaching interaction.
06
Governance and audit
Every score, every extraction, every profile computation is logged with timestamps.
System Architecture
Three-Tier Adaptive Assessment Model
The assessment system operates across three tiers, each serving a distinct clinical purpose while feeding into the same unified patient profile.
1
Tier 1: Universal Onboarding Assessment
Every patient entering the coaching program completes the MDOA (Mental Drivers of Obesity Assessment) as their onboarding instrument. This 64-item assessment is delivered via a web form accessed through an SMS link. The patient can exit and resume at any point — progress is saved by section. The onboarding provides the initial phenotype classification, identifies primary and secondary drivers, and triggers recommendations for deeper assessment modules. If the patient does not complete onboarding within 3 days, automated SMS reminders are sent at 3-day intervals, with coaching enrollment paused after three unanswered reminders.
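The reminder cadence described above can be sketched as a small state function. This is an illustration only: the function name `onboarding_status` and the state labels are hypothetical, and the sketch assumes that the pause takes effect one further 3-day interval after the third reminder goes unanswered, which the text above does not state explicitly.

```python
REMINDER_INTERVAL_DAYS = 3  # stated above: reminders every 3 days
MAX_REMINDERS = 3           # stated above: pause after three unanswered reminders


def onboarding_status(days_since_invite: int, completed: bool) -> tuple[str, int]:
    """Return (state, reminders_sent_so_far) for a patient's onboarding."""
    if completed:
        return ("complete", 0)
    reminders = min(days_since_invite // REMINDER_INTERVAL_DAYS, MAX_REMINDERS)
    # Assumption: enrollment pauses one interval after the final reminder.
    pause_day = REMINDER_INTERVAL_DAYS * (MAX_REMINDERS + 1)
    state = "paused" if days_since_invite >= pause_day else "pending"
    return (state, reminders)
```

Under these assumptions, a patient who has not completed onboarding is still pending with three reminders sent on day 9 and is paused from day 12 onward.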
2
Tier 2: Triggered Deep-Dive Assessments
Based on the MDOA results, the system identifies which domain-specific instruments should be deployed. For example, if the MDOA mood domain scores at or above 2.0, the system recommends deploying the DOMM (Depression-Obesity Mechanism Measure) and SEIM (Stress-Eating Intensity Measure) for deeper assessment. These triggered modules are tracked in the patient profile and can be assigned by clinical staff via the admin panel or deployed conversationally during coaching sessions.
3
Tier 3: Conversational Micro-Assessments
During every coaching session, the AI coach embeds 2–3 assessment check-in items drawn from each instrument's short-form sentinel set. These items are reworded in Motivational Interviewing (MI) style so they feel like natural coaching questions rather than clinical screening. The patient's responses are scored in real time by the AI, extracted as structured JSON, and fed into the scoring engine. Over time, this produces a longitudinal signal that tracks domain-level change without requiring the patient to complete additional formal instruments.
Data Flow Architecture
Every assessment interaction — regardless of delivery channel — follows the same pipeline: response capture, scoring engine, profile refresh, prompt injection.
Unified Patient Profile
The patient profile is not a single-source snapshot. It is a composite view assembled from three independent data streams, each contributing different clinical information.
When the profile is refreshed, all three sources are merged into a unified dimensions map. Each clinical dimension (depression, anxiety/stress, reward sensitivity, satiety physiology, loss of control, executive function, shame/stigma, sleep apnea, trauma/PTSD, alcohol/substance) carries data from whichever sources are available. A dimension might have a mechanism score from the MDOA, a formal score from a companion questionnaire, and a conversational signal count with trend — or it might have only one of these. The coaching prompt includes all available data so the AI can make clinically informed decisions.
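A minimal sketch of the merge described above, assuming each source arrives as a plain dictionary keyed by dimension name. The function name `merge_profile_sources` and the keys `mechanism`, `formal`, and `conversational` are hypothetical and are not taken from the production code; the sample values mirror the three-source example in the validation tests.

```python
from typing import Any


def merge_profile_sources(
    mechanism: dict[str, dict[str, Any]],
    formal: dict[str, dict[str, Any]],
    conversational: dict[str, dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Merge three per-dimension sources into one dimensions map.

    A dimension appears in the result if at least one source has data
    for it; sources with no data for that dimension are simply absent.
    """
    sources = {
        "mechanism": mechanism,            # e.g. MDOA domain scores
        "formal": formal,                  # e.g. PHQ-9, STOP-BANG
        "conversational": conversational,  # signal counts with trend
    }
    dimensions: dict[str, dict[str, Any]] = {}
    for source_name, per_dimension in sources.items():
        for dimension, data in per_dimension.items():
            dimensions.setdefault(dimension, {})[source_name] = data
    return dimensions


profile = merge_profile_sources(
    mechanism={"depression": {"instrument": "MDOA_MOOD", "score": 2.8}},
    formal={
        "depression": {"instrument": "PHQ9", "score": 14},
        "sleep_apnea": {"instrument": "STOPBANG", "score": 5},
    },
    conversational={
        "depression": {"mentions": 4, "trend": "rising"},
        "anxiety_stress": {"mentions": 1, "trend": "present"},
    },
)
```

In this example the depression dimension carries all three sources, while sleep apnea carries only a formal score and anxiety/stress only a conversational signal.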
The MDOA Instrument
The Mental Drivers of Obesity Assessment (MDOA) is the foundational instrument of the system. It is a 64-item, mixed-method assessment organized into 8 sections covering 6 Likert-scored domains, 10 categorical pattern items, and 6 binary risk flags.
Section Structure
The Six Binary Risk Flags
Section H contains six yes/no items that serve as direct clinical triggers. Each flag maps to a specific deep-dive assessment module and acts as a confirmatory signal for phenotype classification. When a Likert domain score is elevated and its corresponding risk flag is also endorsed, the phenotype classification confidence increases.
The risk flags play two distinct roles in the system. First, they are branching triggers: a Yes response recommends deployment of the corresponding deep-dive module. Second, they are phenotype confirmers: a mood domain score of 2.5 alone produces a provisional phenotype classification, but a mood domain score of 2.5 plus item 60 = Yes produces a confirmed classification with a higher confidence weight. This dual role ensures that phenotype detection is both dimensionally sensitive (via Likert scores) and clinically anchored (via patient-confirmed symptom significance).
Scoring Algorithms
Domain Scoring
Each of the six Likert domains (mood, reward, satiety, loc, executive, shame) is scored using the arithmetic mean of its 8 constituent items. The scale runs from 0 (Not at all) to 4 (Very often). Domain scores therefore range from 0.00 to 4.00. This mean-based approach ensures that partial completions (as occur during conversational micro-assessments where only 2–3 items may be captured per session) still produce interpretable domain-level scores, albeit with wider confidence intervals.
Global Score
The global score is the mean of all 48 Likert items (Sections A through F). It represents the overall burden of psychobehavioral drivers affecting the patient's obesity. Items from Sections G (categorical) and H (binary flags) do not contribute to the global score.
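The domain and global scoring described above can be sketched as follows. The item-to-domain mapping is an assumption: the sketch assigns items 1 to 48 to the six domains in consecutive blocks of 8, in the order the domains are listed, which is consistent with the examples in this report (mdoa_02 scoring under mood, mdoa_17 under satiety) but is not confirmed by it. The function names are hypothetical.

```python
# Assumption: items 1-48 map to domains in blocks of 8, in listed order.
DOMAINS = ["mood", "reward", "satiety", "loc", "executive", "shame"]
ITEMS_PER_DOMAIN = 8


def domain_of(item_number: int) -> str:
    """Return the Likert domain for an item number in 1..48."""
    if not 1 <= item_number <= len(DOMAINS) * ITEMS_PER_DOMAIN:
        raise ValueError(f"item {item_number} is not a Likert item")
    return DOMAINS[(item_number - 1) // ITEMS_PER_DOMAIN]


def score_responses(responses: dict[int, int]) -> dict:
    """Score the answered Likert items (0-4); unanswered items are skipped."""
    by_domain: dict[str, list[int]] = {}
    for item, value in responses.items():
        if not 0 <= value <= 4:
            raise ValueError(f"item {item}: response {value} outside 0-4")
        by_domain.setdefault(domain_of(item), []).append(value)
    values = list(responses.values())
    return {
        # Mean over answered items only, so partial completions still score.
        "domains": {d: round(sum(v) / len(v), 2) for d, v in by_domain.items()},
        "global": round(sum(values) / len(values), 2) if values else None,
        "items_scored": len(values),
    }


full = score_responses({i: 2 for i in range(1, 49)})  # every item answered 2
partial = score_responses({2: 3, 17: 3})              # two-item micro check-in
```

Because each mean is taken over answered items only, the two-item check-in yields scores for just the mood and satiety domains, and its global score rests on 2 of 48 items.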
Severity Classification
Both global and domain scores are classified into four severity levels using identical thresholds:
Phenotype Classification
The phenotype engine evaluates domain scores and risk flags against a priority-ordered rule set. Eight phenotype classifications are defined, from single-domain phenotypes (mood-driven, reward-dominant, etc.) to complex multi-domain patterns (mixed-pattern, high-complexity). The classification algorithm processes rules in priority order and assigns the highest-strength match as the primary phenotype.
Single-Domain Phenotypes
Each single-domain phenotype requires a domain score at or above 2.5. If the corresponding risk flag is also endorsed (Yes), the match strength is boosted by 0.5 points, indicating a confirmed phenotype.
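As an illustration of the single-domain rule, the sketch below treats the domain score itself as the base match strength and adds the 0.5 boost when the matching flag is endorsed. Using the domain score as the base strength is an assumption, as is identifying each risk flag by its domain name; the function name `single_domain_matches` is hypothetical.

```python
SINGLE_DOMAIN_THRESHOLD = 2.5  # cutoff stated above
FLAG_BOOST = 0.5               # boost stated above


def single_domain_matches(
    domain_scores: dict[str, float], endorsed_flags: set[str]
) -> list[dict]:
    """Return single-domain matches, strongest first."""
    matches = []
    for domain, score in domain_scores.items():
        if score < SINGLE_DOMAIN_THRESHOLD:
            continue
        confirmed = domain in endorsed_flags
        matches.append({
            "domain": domain,
            # Assumption: base strength is the domain score itself.
            "strength": score + (FLAG_BOOST if confirmed else 0.0),
            "status": "confirmed" if confirmed else "provisional",
        })
    # The sort is stable, so equal strengths keep their input order.
    matches.sort(key=lambda m: m["strength"], reverse=True)
    return matches


provisional = single_domain_matches({"mood": 2.5, "reward": 1.8}, set())
confirmed = single_domain_matches({"mood": 2.5, "reward": 1.8}, {"mood"})
```

With a mood score of 2.5, the match is provisional at strength 2.5 without the flag and confirmed at strength 3.0 with it.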
Complex Multi-Domain Phenotypes
Mixed-Pattern Obesity
Rule: 3 or more domains score ≥ 2.0
Clinical Significance: Multiple mechanisms active; staged, multimodal intervention recommended
High-Complexity Obesity
Rule: Global score ≥ 2.5 AND 1+ risk flags AND 3+ domains ≥ 2.0
Clinical Significance: Severe, multi-mechanism burden; clinical escalation, medication review, intensive follow-up
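The two multi-domain rules above can be sketched as a single check. The high-complexity rule is evaluated first because its conditions are a strict superset of the mixed-pattern rule; that ordering is an assumption, though it matches the validation results reported later (Test 1 yields mixed-pattern, Test 2 yields high-complexity). The function name and the returned labels are hypothetical identifiers.

```python
from typing import Optional


def complex_phenotype(
    global_score: float,
    domain_scores: dict[str, float],
    flags_endorsed: int,
) -> Optional[str]:
    """Return a multi-domain phenotype label, or None if neither rule fires."""
    elevated = sum(1 for score in domain_scores.values() if score >= 2.0)
    # High-complexity: global >= 2.5 AND 1+ risk flags AND 3+ domains >= 2.0.
    if global_score >= 2.5 and flags_endorsed >= 1 and elevated >= 3:
        return "high_complexity"
    # Mixed-pattern: 3 or more domains >= 2.0.
    if elevated >= 3:
        return "mixed_pattern"
    return None


domains = ["mood", "reward", "satiety", "loc", "executive", "shame"]
test_1 = complex_phenotype(2.0, {d: 2.0 for d in domains}, flags_endorsed=1)
test_2 = complex_phenotype(3.0, {d: 3.0 for d in domains}, flags_endorsed=6)
```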
Coaching Stance Determination
After phenotype classification, the system recommends a coaching stance — the therapeutic approach the AI coach should prioritize. The stance is determined by priority-ordered rules that consider domain severity, risk flags, and phenotype.
Clinical Review Triggers
Global Score ≥ 3.0
Severe global burden triggers automatic clinical review flag.
Any Domain ≥ 3.5
Critical single-domain score triggers automatic clinical review flag.
4+ Risk Flags Endorsed
Four or more simultaneous binary risk flags trigger automatic clinical review.
Flagged patients are marked with an escalation badge in the admin panel and the escalation reason is included in the coaching prompt so the AI coach is aware of the clinical concern.
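A minimal sketch of the three review triggers, using the thresholds stated above. The function name `clinical_review_reasons` and the wording of the reason strings are hypothetical; an empty result means no review flag is raised.

```python
def clinical_review_reasons(
    global_score: float,
    domain_scores: dict[str, float],
    flags_endorsed: int,
) -> list[str]:
    """Return the reasons a profile needs clinical review (empty if none)."""
    reasons = []
    if global_score >= 3.0:
        reasons.append(f"global score {global_score:.1f} >= 3.0")
    for domain, score in domain_scores.items():
        if score >= 3.5:
            reasons.append(f"{domain} domain score {score:.1f} >= 3.5")
    if flags_endorsed >= 4:
        reasons.append(f"{flags_endorsed} risk flags endorsed")
    return reasons


domains = ["mood", "reward", "satiety", "loc", "executive", "shame"]
moderate = clinical_review_reasons(2.0, {d: 2.0 for d in domains}, 1)
severe = clinical_review_reasons(3.0, {d: 3.0 for d in domains}, 6)
```

For the two synthetic patients in the validation tests, the all-2 profile with one flag produces no reasons, while the all-3 profile with six flags produces two: the global score and the flag count.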
How Assessment Data Affects Coaching
Dynamic Prompt Injection
Every coaching session prompt is constructed dynamically. When a patient enters a coaching session, the system loads their current assessment profile and injects it directly into the AI's system prompt. The AI coach (Claude) receives the following structured block at the start of every coaching interaction:
=== PATIENT ASSESSMENT PROFILE ===
Primary Phenotype: satiety_glycemic
Primary Driver: satiety/glycemic instability | Secondary: mood/emotional burden
Recommended Coaching Stance: meal stabilization + ACT willingness
Action Size: standard
Domain Scores:
  Depression: PHQ9=8 (mild) MDOA_MOOD=1.6 (mild) [conv: 2 mentions, present]
  Satiety Physiology: MDOA_SATIETY=2.9 (moderate)
  Reward Sensitivity: MDOA_REWARD=1.8 (mild)
  Sleep Apnea: STOPBANG=5 (high_risk)
Active Risk Flags: glycemic instability
=== END ASSESSMENT PROFILE ===
This profile block gives the AI coach precise clinical context. It knows this patient's primary mechanism is satiety/glycemic instability, that they also have mild mood involvement, that their PHQ-9 is mild, that their STOP-BANG indicates high OSA risk, and that glycemic instability has been confirmed by the patient as clinically significant. The recommended coaching stance ('meal stabilization + ACT willingness') instructs the AI to focus on protein-forward meal architecture, glycemic stabilization, and acceptance-based approaches to difficult eating urges.
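To illustrate how one line of the Domain Scores section could be assembled, the sketch below formats a single dimension from its available sources. It reproduces the layout of the Depression line in the block above; the function name `format_dimension_line` and its argument shapes are hypothetical, not the production interface.

```python
from typing import Optional


def format_dimension_line(
    label: str,
    scores: list[tuple[str, float, str]],
    conv: Optional[tuple[int, str]] = None,
) -> str:
    """Format one dimension line for the assessment profile block.

    scores: (instrument, value, severity) tuples, in display order.
    conv:   (mention_count, trend) from signal accumulation, if any.
    """
    parts = [f"{name}={value} ({severity})" for name, value, severity in scores]
    if conv is not None:
        mentions, trend = conv
        noun = "mention" if mentions == 1 else "mentions"
        parts.append(f"[conv: {mentions} {noun}, {trend}]")
    return f"  {label}: " + " ".join(parts)


line = format_dimension_line(
    "Depression",
    [("PHQ9", 8, "mild"), ("MDOA_MOOD", 1.6, "mild")],
    conv=(2, "present"),
)
```

Dimensions with only one source, such as the Sleep Apnea line, pass a single score tuple and omit `conv`.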
Micro-Assessment During Sessions
In addition to the profile injection, each coaching session includes a set of MI-reworded sentinel items that the AI is instructed to embed conversationally. The AI does not present these as questionnaire items. Instead, it weaves them into the natural flow of a coaching conversation.
Example: MI-Reworded Assessment Item
Original item (mdoa_02): 'Stress increases my drive to eat.'
MI-reworded version the AI uses: 'When things get stressful lately, have you noticed it pushing you toward eating — even when you're not really hungry?'
Patient responds: 'Yeah, most evenings when I'm stressed I just grab whatever is in the kitchen.'
AI scores this as: mdoa_02 = 3 (Often), confidence = high, and captures the patient's exact language for the clinical record.
When the patient's response is vague, the AI follows a structured escalation: first, it asks one follow-up question to clarify; if the answer is still unclear, it confirms with an explicit rating on the instrument's 0-to-4 scale (e.g., 'Would it be fair to score that as a 3 out of 4, where 4 means very often?'). Each scored item includes a confidence level (high, medium, or low) so clinicians can assess the reliability of conversationally derived scores.
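The structured output for a scored item might look like the payload below. The report does not reproduce the actual JSON format, so the field names (`items`, `item_id`, `score`, `confidence`, `patient_language`) are assumptions chosen for illustration; the validation simply enforces the 0-to-4 scale and the three confidence levels named above.

```python
import json

VALID_CONFIDENCE = {"high", "medium", "low"}


def parse_micro_assessment(payload: str) -> list[dict]:
    """Parse and validate the scored items in one extraction payload."""
    items = json.loads(payload)["items"]
    for item in items:
        if item["score"] not in range(5):  # Likert scale runs 0-4
            raise ValueError(f"{item['item_id']}: score out of range")
        if item["confidence"] not in VALID_CONFIDENCE:
            raise ValueError(f"{item['item_id']}: unknown confidence level")
    return items


# Hypothetical payload for the mdoa_02 exchange shown above.
payload = json.dumps({
    "items": [{
        "item_id": "mdoa_02",
        "score": 3,
        "confidence": "high",
        "patient_language": "most evenings when I'm stressed I just grab "
                            "whatever is in the kitchen",
    }]
})
items = parse_micro_assessment(payload)
```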
Conversational Signal Accumulation
Beyond formal and micro-assessments, every regular conversation between the patient and the AI (including casual, non-coaching interactions) contributes to the patient profile through signal accumulation. When the AI's existing screening extraction detects mentions of depression, stress, sleep problems, loss of control, or other clinical concerns, these are tallied in a rolling 14-day window. Trends are computed automatically: 1–2 mentions in the window = 'present', 3–4 = 'rising', 5+ = 'worsening'. These trends appear in the coaching prompt alongside formal scores, giving the AI coach a sense of trajectory even between formal assessment points.
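The trend rule above can be sketched as a count over a rolling window. The cutoffs come from the text; treating the window as the 14 calendar days ending today (inclusive) is an assumption, and the function name `signal_trend` is hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def signal_trend(
    mention_dates: list[date], today: date, window_days: int = 14
) -> tuple[int, Optional[str]]:
    """Count mentions in the rolling window and label the trend."""
    # Assumption: the window is the last `window_days` days, including today.
    cutoff = today - timedelta(days=window_days)
    count = sum(1 for d in mention_dates if cutoff < d <= today)
    if count >= 5:
        trend = "worsening"
    elif count >= 3:
        trend = "rising"
    elif count >= 1:
        trend = "present"
    else:
        trend = None  # no mentions in the window
    return count, trend


today = date(2026, 3, 15)
mentions = [
    date(2026, 3, 14),
    date(2026, 3, 10),
    date(2026, 3, 3),
    date(2026, 3, 2),
    date(2026, 2, 20),  # older than 14 days, so not counted
]
result = signal_trend(mentions, today)
```

Here four of the five mentions fall inside the window, which the rule labels as rising.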
Impact on Coaching Quality
The cumulative effect of these three data streams is that the AI coach operates with continuously improving clinical context. On Day 1, a new patient's coaching prompt might contain only their MDOA onboarding results. By Week 3, the prompt reflects the MDOA phenotype, supplemented by 2–3 micro-assessment snapshots that show domain-level trend, plus conversational signal counts indicating whether mood and stress mentions are increasing or stable, plus any formal questionnaire scores (PHQ-9, GAD-7, etc.) that have been completed. The coaching stance may have shifted from the initial recommendation as the profile evolves.
Critically, the AI coach is never told to change its approach in a heavy-handed way. The assessment context is provided as background information, and the coaching prompt's instructions guide the AI to use all three methodologies (MI, CBT, ACT) fluidly. The phenotype and coaching stance recommendations influence which therapeutic techniques the AI emphasizes, not which ones it uses. A patient classified as 'mood-driven' will receive more emotional validation and behavioral activation focus; a patient classified as 'executive chaos' will receive more concrete planning tools and simpler action steps. But both patients receive warm, empathic coaching grounded in all three modalities.
Validation Test Results
1
Test 1: Moderate Severity (All Items Scored 2)
A synthetic patient with all 48 Likert items scored at 2 ('Sometimes') and the LOC risk flag endorsed.
Domain Scores: mood: 2.0, reward: 2.0, satiety: 2.0, loc: 2.0, executive: 2.0, shame: 2.0
Global Score: 2.0 | Severity: Moderate
Primary Phenotype: Mixed-Pattern Obesity (3+ domains ≥ 2.0)
Coaching Stance: Multimodal: MI + CBT + skills
Triggered Modules: 4 — DOMM, SEIM, LOCEA, CEFRA
Clinical Review: No | Items Scored: 48/48
2
Test 2: Severe Scores (All Items Scored 3, All Flags Yes)
A synthetic patient with severe clinical burden: all 48 Likert items scored at 3 ('Often') and all 6 risk flags endorsed.
Domain Scores: mood: 3.0, reward: 3.0, satiety: 3.0, loc: 3.0, executive: 3.0, shame: 3.0
Global Score: 3.0 | Severity: Severe
Primary Phenotype: High-Complexity Obesity (global ≥ 2.5, 1+ flags, 3+ domains ≥ 2.0)
Coaching Stance: CBT reframing + self-compassion (shame = severe takes priority)
Triggered Modules: 6 — DOMM, SEIM, LOCEA, CEFRA, NRAF-EF, BIWSSA
Clinical Review: Yes — Global score severe (3.0); 6 risk flags active | Items Scored: 48/48
3
Test 3: Conversational Micro-Checkin (2 Items)
Simulated coaching session where the AI extracted 2 items from a conversational exchange: mdoa_02 (stress/eating, scored 3) and mdoa_17 (hunger return, scored 3).
Domain Scores: mood: 3.0 (1 item), satiety: 3.0 (1 item)
Global Score: 3.0 (from 2 items only) | Items Scored: 2/48 (partial)
Phenotype: mood_driven (mood ≥ 2.5, highest strength match)
Coaching Stance: MI + behavioral activation
Clinical Significance: Partial scores indicate domain-level concern; full assessment recommended
4
Test 4: Three-Source Profile Unification
Simulated profile combining MDOA domain scores, a completed PHQ-9, and conversational signal data.
Depression: PHQ9=14 (moderate), MDOA_MOOD=2.8 (moderate), conversational: 4 mentions (rising)
Anxiety/Stress: conversational: 1 mention (present)
Reward Sensitivity: MDOA_REWARD=2.2 (moderate)
Sleep Apnea: STOPBANG=5 (high_risk)
This demonstrates how the unified profile synthesizes formal psychiatric screening (PHQ-9), mechanism-specific assessment (MDOA domain scores), and naturalistic conversational observation into a single clinical picture.
5
Test 5: Full Coaching Prompt Assembly
Verified that the coaching prompt correctly assembles all 8 dynamic components: coach identity, patient name, session duration, specialization, pathway, RAG content, assessment profile, and micro-assessment items. The JSON example format for the AI's assessment output was validated as syntactically correct. All prompt sections (Adaptive Assessment instructions, Confidence Levels, MI/CBT/ACT methodology guidance, and GUIDELINES) are present and properly formatted.
Admin Panel and Reporting
Patient Detail View
The admin patient detail page now includes an Assessment Profile panel that displays the patient's current phenotype classification (as a colored badge), coaching stance, primary and secondary drivers, domain score bars (color-coded, showing scores from both the MDOA mechanism assessment and companion questionnaires), active risk flag badges, escalation alerts, onboarding status, and a list of recent assessments with their type (full, short-form, or micro-checkin). A manual 'Recalculate Profile' button allows clinicians to force a profile refresh.
Governance and Audit Trail
Every assessment result is stored with complete provenance: who assigned it, how it was administered (web-form vs conversational), the raw responses, the computed domain scores, the phenotype classification, and timestamps for creation, start, completion, and scoring. For conversational micro-assessments, the AI's confidence level for each scored item is preserved, along with the patient's exact language. Coaching session prompts include an assessment_context_json snapshot so auditors can verify exactly what clinical data the AI saw during any given session.
Clinical Escalation Workflow
When the system detects scores that warrant clinical review (global ≥ 3.0, any domain ≥ 3.5, or 4+ risk flags), the patient is flagged with an escalation badge visible in the admin panel. The escalation reason is included in the coaching prompt so the AI coach is aware, and the patient's profile shows the specific scores and flags that triggered the escalation. Clinical staff can review and clear escalation flags after assessment.
Instrument Roadmap
The MDOA is the first of twenty-plus planned instruments. All instruments follow the same JSON definition format, use the same scoring engine, and feed into the same unified patient profile. Future instruments are organized by clinical domain:
Each new instrument is deployed by placing its JSON definition in the assessment_definitions directory and running the loader. No code changes are required. The scoring engine, phenotype classifier, and profile unification logic handle arbitrary instruments and domains automatically.
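A loader of this kind could be as simple as reading every JSON file in the directory into a registry. The sketch below is an assumption about how such a loader might work, not the production implementation: the definition schema is not described in this report, so the `instrument_id` key and the sample definition are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_definitions(directory: str) -> dict[str, dict]:
    """Load every *.json instrument definition found in a directory."""
    definitions: dict[str, dict] = {}
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            definition = json.load(handle)
        # "instrument_id" is a hypothetical key used to register the instrument.
        definitions[definition["instrument_id"]] = definition
    return definitions


# Usage: write one illustrative definition to a temporary directory, then load it.
with tempfile.TemporaryDirectory() as tmp:
    sample = {"instrument_id": "mdoa", "item_count": 64}
    (Path(tmp) / "mdoa.json").write_text(json.dumps(sample), encoding="utf-8")
    loaded = load_definitions(tmp)
```

Adding an instrument then amounts to dropping another file into the directory and rerunning the loader, which matches the no-code-change deployment described above.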
Summary
The MedWeight Clinical Assessment System represents a fundamental shift from episodic screening to continuous, adaptive clinical profiling. The key innovations are:
Assessments embedded in coaching dialogue
Patients experience guided reflection, not clinical paperwork.
Phenotype-driven coaching
The AI adapts its therapeutic approach based on the patient's dominant mechanism, not a one-size-fits-all protocol.
Three-source profile unification
Formal instruments, conversational micro-assessments, and naturalistic signal accumulation produce a rich, evolving clinical picture.
Continuous refinement
Every interaction makes the profile more accurate without requiring additional patient burden.
Clinical safety nets
Automated escalation flags, audit trails, and governance hashing ensure responsible AI-assisted care.
The system is now live with the MDOA instrument and ready for clinical testing. The remaining planned instruments can be deployed incrementally using the same infrastructure, with each addition immediately enriching the patient profile and expanding the AI coach's clinical context.