feat(charka): charka-shadow — validador en sombra + corpus COBOL

El pipeline COBOL->Rust queda completo (7 crates) y validado de punta a punta. charka-shadow certifica que el transpilador preserva la semántica del COBOL original con una ejecución sombra: un intérprete que corre el Ir directamente sobre charka-runtime, sin compilar nada. Es una segunda ruta de ejecución, independiente del código que emite charka-codegen — si la sombra y el transpilado divergieran, sería un bug. - interpret(&Ir) -> Outcome ejecuta el IR y captura las líneas de DISPLAY; run_source(&str) corre el pipeline completo. - Tope de pasos (Halt::StepLimit): un bucle que no termina se corta en vez de colgarse. - Módulos: field (datos -> campos vivos) / interp (el motor). Corpus nuevo crates/modules/charka/corpus/ — 7 programas COBOL de complejidad graduada (01-hola .. 07-clasificar) con sus salidas esperadas verificadas a mano: DISPLAY, aritmética con GIVING, IF/ELSE, PERFORM TIMES/UNTIL, grupos, COMPUTE con paréntesis, ROUNDED, IF anidado con AND. Material de prueba del pipeline entero. 11 tests (los 7 del corpus + fuente vacío, STOP RUN, tope de pasos, error de léxico); fmt + clippy limpios. No hay GnuCOBOL en la máquina: la referencia v1 es el corpus; un modo futuro diferenciará contra el compilador real. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 21:23:07 +00:00
parent e52b3fb572
commit 4d9ce11b1e
23 changed files with 1039 additions and 9 deletions
@@ -2355,6 +2355,17 @@ dependencies = [
 "charka-bcd",
 ]

+[[package]]
+name = "charka-shadow"
+version = "0.1.0"
+dependencies = [
+ "charka-ir",
+ "charka-lexer",
+ "charka-parser",
+ "charka-runtime",
+ "thiserror 2.0.18",
+]
+
 [[package]]
 name = "chasqui-card"
 version = "0.1.0"
@@ -162,6 +162,7 @@ members = [
    "crates/modules/charka/charka-ir",
    "crates/modules/charka/charka-runtime",
    "crates/modules/charka/charka-codegen",
+    "crates/modules/charka/charka-shadow",

    # ============================================================
    # modules/mirada/ — Compositor Wayland
@@ -16,6 +16,7 @@ embebido, dialectos IBM Enterprise) es un esfuerzo multi-mes.
 | `charka-ir`      | lib  | Representación intermedia: el AST con los statements del PROCEDURE ya tipados |
 | `charka-runtime` | lib  | Soporte de ejecución de los programas transpilados: campos `Num` y `Text` |
 | `charka-codegen` | lib  | Emisión de Rust: IR → fuente Rust sobre `charka-runtime` |
+| `charka-shadow`  | lib  | Validador en sombra: intérprete del IR + corpus de prueba |

 ## charka-bcd

@@ -136,15 +137,36 @@ del programa COBOL.
 - Fuera de alcance v1: grupos como campo propio, `REDEFINES`,
  `OCCURS`/tablas, `PERFORM ... THRU` como rango, E/S de ficheros.

+## charka-shadow
+
+El validador: certifica que el pipeline preserva la semántica del
+COBOL original. Lo hace con una **ejecución sombra** — un intérprete
+que corre el `Ir` directamente sobre `charka-runtime`, sin compilar.
+
+- `interpret(&Ir) -> Outcome` — ejecuta el IR y captura las líneas de
+  `DISPLAY`. `run_source(&str)` corre el pipeline completo.
+- El intérprete es una segunda ruta de ejecución, independiente del
+  código que emite `charka-codegen`: si la sombra y el transpilado
+  divergieran, eso delataría un bug.
+- Tope de pasos: un bucle que no termina se corta con
+  `Halt::StepLimit` en vez de colgarse.
+- La referencia v1 es el **corpus** (`corpus/`): 7 programas COBOL de
+  complejidad graduada con sus salidas esperadas verificadas a mano.
+  Un modo futuro, con GnuCOBOL, diferenciará contra el compilador real.
+
+## El corpus
+
+`crates/modules/charka/corpus/` — 7 programas COBOL graduados
+(`01-hola` … `07-clasificar`), cada uno con su `.expected`. Ejercita
+el pipeline completo de punta a punta. Ver su `README.md`.
+
 ## Estado

-`charka-bcd` (22 tests), `charka-lexer` (17 tests), `charka-parser`
-(15 tests), `charka-ir` (17 tests), `charka-runtime` (17 tests) y
-`charka-codegen` (14 tests) implementados y verdes. El pipeline
-COBOL→Rust corre de punta a punta. **Pendiente** — el último crate:
+Pipeline **completo** — `charka-bcd` (22 tests), `charka-lexer` (17),
+`charka-parser` (15), `charka-ir` (17), `charka-runtime` (17),
+`charka-codegen` (14) y `charka-shadow` (11) implementados y verdes.
+COBOL → Rust corre de punta a punta, validado contra el corpus.

-| crate pendiente   | rol                                                  |
-| ----------------- | ---------------------------------------------------- |
-| `charka-shadow`   | validador en sombra (original vs transpilado)        |
-
-Hito intermedio sugerido: subconjunto COBOL'85 puro antes de CICS/SQL.
+Próximo hito mayor: salir del subconjunto COBOL'85 puro hacia CICS,
+SQL embebido y los dialectos IBM Enterprise; ampliar el codegen
+(grupos, `REDEFINES`, `OCCURS`/tablas, E/S de ficheros).
@@ -0,0 +1,16 @@
+[package]
+name = "charka-shadow"
+version.workspace = true
+edition.workspace = true
+rust-version.workspace = true
+license.workspace = true
+authors.workspace = true
+publish.workspace = true
+description = "charka-shadow — validador en sombra del transpilador: un intérprete del IR que ejecuta el programa COBOL y compara su salida contra una referencia."
+
+[dependencies]
+charka-lexer = { path = "../charka-lexer" }
+charka-parser = { path = "../charka-parser" }
+charka-ir = { path = "../charka-ir" }
+charka-runtime = { path = "../charka-runtime" }
+thiserror = { workspace = true }
@@ -0,0 +1,128 @@
+//! El estado de los datos durante la ejecución sombra: el árbol de
+//! `DataItem` del IR se aplana a un mapa de campos vivos.
+//!
+//! La clasificación de PICTURE refleja la de `charka-codegen` — un
+//! futuro refactor la unificaría en `charka-runtime`.
+
+use std::collections::HashMap;
+
+use charka_ir::DataItem;
+use charka_runtime::{Num, Picture, Text};
+
+/// Un campo vivo: numérico o alfanumérico.
+pub(crate) enum Cell {
+    Num(Num),
+    Text(Text),
+}
+
+/// Aplana el árbol de datos en un mapa `nombre COBOL → campo`.
+pub(crate) fn build_fields(data: &[DataItem]) -> HashMap<String, Cell> {
+    let mut map = HashMap::new();
+    collect(data, &mut map);
+    map
+}
+
+/// Recorre el árbol: los grupos no son campos (se recurre en sus
+/// hijos); se saltan los niveles 88/66 y los `FILLER`.
+fn collect(items: &[DataItem], map: &mut HashMap<String, Cell>) {
+    for it in items {
+        if it.level == 88 || it.level == 66 {
+            continue;
+        }
+        if !it.children.is_empty() {
+            collect(&it.children, map);
+            continue;
+        }
+        if it.name == "FILLER" {
+            continue;
+        }
+        if let Some(cell) = make_cell(it.picture.as_deref(), it.value.as_deref()) {
+            map.entry(it.name.to_uppercase()).or_insert(cell);
+        }
+    }
+}
+
+/// Construye un campo desde su PICTURE y su cláusula `VALUE`.
+fn make_cell(pic: Option<&str>, value: Option<&str>) -> Option<Cell> {
+    let up = pic?.to_uppercase();
+    if up.contains('X') || up.contains('A') {
+        return Some(Cell::Text(Text::with_value(
+            pic_width(&up).max(1),
+            &text_value(value),
+        )));
+    }
+    if let Ok(p) = Picture::parse(&up) {
+        return Some(Cell::Num(Num::with_value(p, &numeric_value(value))));
+    }
+    // PICTURE de edición → campo de texto de presentación.
+    Some(Cell::Text(Text::with_value(
+        pic_width(&up).max(1),
+        &text_value(value),
+    )))
+}
+
+/// Cuenta las posiciones de presentación de una PICTURE, expandiendo
+/// la repetición `C(n)`. `S` y `V` no ocupan posición.
+fn pic_width(up: &str) -> usize {
+    let chars: Vec<char> = up.chars().collect();
+    let mut i = 0;
+    let mut total = 0usize;
+    while i < chars.len() {
+        let c = chars[i];
+        i += 1;
+        if c == 'S' || c == 'V' {
+            continue;
+        }
+        let mut count = 1usize;
+        if chars.get(i) == Some(&'(') {
+            i += 1;
+            let start = i;
+            while i < chars.len() && chars[i].is_ascii_digit() {
+                i += 1;
+            }
+            if let Ok(n) = chars[start..i].iter().collect::<String>().parse::<usize>() {
+                count = n;
+            }
+            if chars.get(i) == Some(&')') {
+                i += 1;
+            }
+        }
+        total += count;
+    }
+    total
+}
+
+/// Normaliza el `VALUE` de un campo numérico a un literal parseable.
+fn numeric_value(v: Option<&str>) -> String {
+    let Some(raw) = v else {
+        return "0".to_string();
+    };
+    if matches!(raw.to_uppercase().as_str(), "ZERO" | "ZEROS" | "ZEROES") {
+        return "0".to_string();
+    }
+    if charka_runtime::Decimal::parse(raw).is_ok() {
+        raw.to_string()
+    } else {
+        "0".to_string()
+    }
+}
+
+/// Normaliza el `VALUE` de un campo de texto. El parser envuelve los
+/// literales de texto en comillas simples; aquí se desenvuelven.
+fn text_value(v: Option<&str>) -> String {
+    let Some(raw) = v else {
+        return String::new();
+    };
+    let up = raw.to_uppercase();
+    if matches!(up.as_str(), "SPACE" | "SPACES") {
+        return String::new();
+    }
+    if matches!(up.as_str(), "ZERO" | "ZEROS" | "ZEROES") {
+        return "0".to_string();
+    }
+    if raw.len() >= 2 && raw.starts_with('\'') && raw.ends_with('\'') {
+        raw[1..raw.len() - 1].to_string()
+    } else {
+        raw.to_string()
+    }
+}
@@ -0,0 +1,497 @@
+//! El intérprete del IR: la ejecución «sombra» del programa COBOL.
+//!
+//! Ejecuta el [`Ir`] directamente sobre los tipos de `charka-runtime`,
+//! sin compilar nada. Es una segunda ruta de ejecución, independiente
+//! del código que emite `charka-codegen` — eso es lo que lo hace un
+//! validador: si el intérprete y el transpilado divergen, hay un bug.
+
+use std::collections::HashMap;
+
+use charka_ir::{
+    BinOp, CmpOp, Cond, Expr, Figurative, Ir, Operand, Perform, PerformControl, PerformTarget, Stmt,
+};
+use charka_runtime::{cobol_text_cmp, Decimal, Rounding};
+
+use crate::field::{build_fields, Cell};
+
+/// Tope de pasos: corta los bucles que no terminan (un `PERFORM UNTIL`
+/// con una condición que nunca se cumple) en vez de colgarse.
+const STEP_BUDGET: u64 = 5_000_000;
+
+/// Escala intermedia de la división dentro de una expresión.
+const DIV_SCALE: u8 = 9;
+
+/// El resultado de ejecutar un statement: cómo sigue el control.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum Flow {
+    /// Sigue con el statement siguiente.
+    Normal,
+    /// Sale del párrafo actual (`EXIT`).
+    Exit,
+    /// Termina el programa (`STOP RUN`, `GOBACK`, tope de pasos).
+    Stop,
+}
+
+/// La máquina sombra: el estado y el motor de ejecución.
+pub(crate) struct Machine<'a> {
+    ir: &'a Ir,
+    fields: HashMap<String, Cell>,
+    para_index: HashMap<String, usize>,
+    pub output: Vec<String>,
+    budget: u64,
+    pub step_limit_hit: bool,
+    pub stopped: bool,
+}
+
+impl<'a> Machine<'a> {
+    /// Prepara la máquina: aplana los datos e indexa los párrafos.
+    pub(crate) fn new(ir: &'a Ir) -> Self {
+        let mut para_index = HashMap::new();
+        for (i, proc) in ir.procedures.iter().enumerate() {
+            para_index.entry(proc.name.to_uppercase()).or_insert(i);
+        }
+        Self {
+            ir,
+            fields: build_fields(&ir.data),
+            para_index,
+            output: Vec::new(),
+            budget: STEP_BUDGET,
+            step_limit_hit: false,
+            stopped: false,
+        }
+    }
+
+    /// Corre el programa: encadena los párrafos en orden (el «caer» de
+    /// COBOL) hasta un `STOP RUN` o el final.
+    pub(crate) fn run(&mut self) {
+        let ir = self.ir;
+        for i in 0..ir.procedures.len() {
+            if let Flow::Stop = self.exec_block(&ir.procedures[i].body) {
+                self.stopped = true;
+                break;
+            }
+        }
+    }
+
+    // ── Ejecución ─────────────────────────────────────────────────
+
+    /// Consume un paso del presupuesto. `true` si se agotó.
+    fn tick(&mut self) -> bool {
+        if self.budget == 0 {
+            self.step_limit_hit = true;
+            return true;
+        }
+        self.budget -= 1;
+        false
+    }
+
+    fn exec_block(&mut self, stmts: &'a [Stmt]) -> Flow {
+        for s in stmts {
+            match self.exec_stmt(s) {
+                Flow::Normal => {}
+                other => return other,
+            }
+        }
+        Flow::Normal
+    }
+
+    fn exec_stmt(&mut self, stmt: &'a Stmt) -> Flow {
+        if self.tick() {
+            return Flow::Stop;
+        }
+        match stmt {
+            Stmt::Move { from, to } => {
+                for t in to {
+                    self.do_move(from, t);
+                }
+                Flow::Normal
+            }
+            Stmt::Display { items } => {
+                let line: String = items.iter().map(|o| self.eval_text(o)).collect();
+                self.output.push(line);
+                Flow::Normal
+            }
+            Stmt::Accept { .. } => Flow::Normal, // sin entrada: deja el campo igual
+            Stmt::Compute {
+                targets,
+                rounded,
+                expr,
+            } => {
+                let value = self.eval_expr(expr);
+                for t in targets {
+                    self.store(t, value, *rounded);
+                }
+                Flow::Normal
+            }
+            Stmt::Add {
+                addends,
+                to,
+                giving,
+                rounded,
+            } => {
+                let sum = self.fold_sum(addends);
+                if giving.is_empty() {
+                    for t in to {
+                        let cur = self.field_value(t);
+                        self.store(t, cur.add(&sum), *rounded);
+                    }
+                } else {
+                    let base = match to.first() {
+                        Some(first) => sum.add(&self.field_value(first)),
+                        None => sum,
+                    };
+                    for g in giving {
+                        self.store(g, base, *rounded);
+                    }
+                }
+                Flow::Normal
+            }
+            Stmt::Subtract {
+                amounts,
+                from,
+                giving,
+                rounded,
+            } => {
+                let sum = self.fold_sum(amounts);
+                if giving.is_empty() {
+                    for t in from {
+                        let cur = self.field_value(t);
+                        self.store(t, cur.sub(&sum), *rounded);
+                    }
+                } else {
+                    let minuend = from
+                        .first()
+                        .map(|f| self.field_value(f))
+                        .unwrap_or_else(Decimal::zero);
+                    let value = minuend.sub(&sum);
+                    for g in giving {
+                        self.store(g, value, *rounded);
+                    }
+                }
+                Flow::Normal
+            }
+            Stmt::Multiply {
+                left,
+                by,
+                giving,
+                rounded,
+            } => {
+                let value = self.eval_decimal(left).mul(&self.eval_decimal(by));
+                if giving.is_empty() {
+                    if let Operand::Data(name) = by {
+                        self.store(name, value, *rounded);
+                    }
+                } else {
+                    for g in giving {
+                        self.store(g, value, *rounded);
+                    }
+                }
+                Flow::Normal
+            }
+            Stmt::Divide {
+                left,
+                right,
+                by_form,
+                giving,
+                rounded,
+            } => {
+                let (num, den) = if *by_form {
+                    (self.eval_decimal(left), self.eval_decimal(right))
+                } else {
+                    (self.eval_decimal(right), self.eval_decimal(left))
+                };
+                if giving.is_empty() {
+                    if let Operand::Data(name) = right {
+                        let v = divide(num, den, self.target_scale(name));
+                        self.store(name, v, *rounded);
+                    }
+                } else {
+                    for g in giving {
+                        let v = divide(num, den, self.target_scale(g));
+                        self.store(g, v, *rounded);
+                    }
+                }
+                Flow::Normal
+            }
+            Stmt::If {
+                cond,
+                then_branch,
+                else_branch,
+            } => {
+                if self.eval_cond(cond) {
+                    self.exec_block(then_branch)
+                } else {
+                    self.exec_block(else_branch)
+                }
+            }
+            Stmt::Perform(p) => self.exec_perform(p),
+            Stmt::GoTo { target } => {
+                // Aproximación: ejecuta el destino y sale del párrafo.
+                match self.run_paragraph(target) {
+                    Flow::Stop => Flow::Stop,
+                    _ => Flow::Exit,
+                }
+            }
+            Stmt::StopRun | Stmt::Goback => Flow::Stop,
+            Stmt::Exit => Flow::Exit,
+            Stmt::Continue => Flow::Normal,
+            Stmt::Unknown { .. } => Flow::Normal, // verbo no soportado: se omite
+        }
+    }
+
+    fn exec_perform(&mut self, p: &'a Perform) -> Flow {
+        match &p.control {
+            PerformControl::Once => self.run_target(&p.target),
+            PerformControl::Times(n) => {
+                let count = self.count_of(n);
+                for _ in 0..count {
+                    if self.tick() {
+                        return Flow::Stop;
+                    }
+                    if let Flow::Stop = self.run_target(&p.target) {
+                        return Flow::Stop;
+                    }
+                }
+                Flow::Normal
+            }
+            PerformControl::Until(cond) => loop {
+                if self.tick() {
+                    return Flow::Stop;
+                }
+                if self.eval_cond(cond) {
+                    return Flow::Normal;
+                }
+                if let Flow::Stop = self.run_target(&p.target) {
+                    return Flow::Stop;
+                }
+            },
+        }
+    }
+
+    /// Ejecuta una vez el cuerpo de un `PERFORM`. Un `EXIT` dentro de
+    /// él termina esa pasada, no el programa.
+    fn run_target(&mut self, target: &'a PerformTarget) -> Flow {
+        let flow = match target {
+            PerformTarget::Paragraph { name, .. } => self.run_paragraph(name),
+            PerformTarget::Inline(body) => self.exec_block(body),
+        };
+        match flow {
+            Flow::Stop => Flow::Stop,
+            _ => Flow::Normal,
+        }
+    }
+
+    fn run_paragraph(&mut self, name: &str) -> Flow {
+        let Some(&idx) = self.para_index.get(&name.to_uppercase()) else {
+            return Flow::Normal;
+        };
+        let ir = self.ir;
+        match self.exec_block(&ir.procedures[idx].body) {
+            Flow::Stop => Flow::Stop,
+            _ => Flow::Normal,
+        }
+    }
+
+    /// `MOVE from` a un solo campo destino.
+    fn do_move(&mut self, from: &Operand, target: &str) {
+        let key = target.to_uppercase();
+        match self.fields.get(&key) {
+            Some(Cell::Num(_)) => {
+                let v = self.eval_decimal(from);
+                if let Some(Cell::Num(n)) = self.fields.get_mut(&key) {
+                    n.store(v);
+                }
+            }
+            Some(Cell::Text(_)) => {
+                if let Operand::Figurative(fig) = from {
+                    let ch = figurative_fill(*fig);
+                    if let Some(Cell::Text(t)) = self.fields.get_mut(&key) {
+                        t.fill(ch);
+                    }
+                } else {
+                    let s = self.eval_text(from);
+                    if let Some(Cell::Text(t)) = self.fields.get_mut(&key) {
+                        t.store(&s);
+                    }
+                }
+            }
+            None => {}
+        }
+    }
+
+    /// Almacena un valor en un campo, conformándolo a su tipo.
+    fn store(&mut self, name: &str, value: Decimal, rounded: bool) {
+        match self.fields.get_mut(&name.to_uppercase()) {
+            Some(Cell::Num(n)) => {
+                if rounded {
+                    n.store_rounded(value);
+                } else {
+                    n.store(value);
+                }
+            }
+            Some(Cell::Text(t)) => t.store(&value.to_string()),
+            None => {}
+        }
+    }
+
+    // ── Evaluación ────────────────────────────────────────────────
+
+    fn eval_decimal(&self, op: &Operand) -> Decimal {
+        match op {
+            Operand::Num(n) => Decimal::parse(n).unwrap_or_else(|_| Decimal::zero()),
+            Operand::Str(s) => Decimal::parse(s).unwrap_or_else(|_| Decimal::zero()),
+            Operand::Figurative(_) => Decimal::zero(),
+            Operand::Data(name) => match self.fields.get(&name.to_uppercase()) {
+                Some(Cell::Num(n)) => n.value(),
+                Some(Cell::Text(t)) => {
+                    Decimal::parse(t.as_str().trim()).unwrap_or_else(|_| Decimal::zero())
+                }
+                None => Decimal::zero(),
+            },
+        }
+    }
+
+    fn eval_text(&self, op: &Operand) -> String {
+        match op {
+            Operand::Str(s) => s.clone(),
+            Operand::Num(n) => n.clone(),
+            Operand::Figurative(f) => figurative_text(*f).to_string(),
+            Operand::Data(name) => match self.fields.get(&name.to_uppercase()) {
+                Some(Cell::Num(n)) => n.display(),
+                Some(Cell::Text(t)) => t.display(),
+                None => String::new(),
+            },
+        }
+    }
+
+    fn eval_expr(&self, e: &Expr) -> Decimal {
+        match e {
+            Expr::Operand(op) => self.eval_decimal(op),
+            Expr::Neg(inner) => Decimal::zero().sub(&self.eval_expr(inner)),
+            Expr::Binary { op, lhs, rhs } => {
+                let l = self.eval_expr(lhs);
+                let r = self.eval_expr(rhs);
+                match op {
+                    BinOp::Add => l.add(&r),
+                    BinOp::Sub => l.sub(&r),
+                    BinOp::Mul => l.mul(&r),
+                    BinOp::Div => divide(l, r, DIV_SCALE),
+                    BinOp::Pow => pow(&l, &r),
+                }
+            }
+        }
+    }
+
+    fn eval_cond(&self, c: &Cond) -> bool {
+        match c {
+            Cond::Compare { lhs, op, rhs } => {
+                let ord = if self.is_text(lhs) || self.is_text(rhs) {
+                    cobol_text_cmp(&self.eval_text(lhs), &self.eval_text(rhs))
+                } else {
+                    self.eval_decimal(lhs).cmp(&self.eval_decimal(rhs))
+                };
+                match op {
+                    CmpOp::Eq => ord.is_eq(),
+                    CmpOp::Ne => ord.is_ne(),
+                    CmpOp::Lt => ord.is_lt(),
+                    CmpOp::Gt => ord.is_gt(),
+                    CmpOp::Le => ord.is_le(),
+                    CmpOp::Ge => ord.is_ge(),
+                }
+            }
+            Cond::Named(_) => false, // nombres de condición (88): no soportado
+            Cond::Not(inner) => !self.eval_cond(inner),
+            Cond::And(a, b) => self.eval_cond(a) && self.eval_cond(b),
+            Cond::Or(a, b) => self.eval_cond(a) || self.eval_cond(b),
+        }
+    }
+
+    fn is_text(&self, op: &Operand) -> bool {
+        match op {
+            Operand::Str(_) => true,
+            Operand::Data(name) => {
+                matches!(self.fields.get(&name.to_uppercase()), Some(Cell::Text(_)))
+            }
+            _ => false,
+        }
+    }
+
+    /// La suma de una lista de operandos.
+    fn fold_sum(&self, ops: &[Operand]) -> Decimal {
+        let mut acc = Decimal::zero();
+        for o in ops {
+            acc = acc.add(&self.eval_decimal(o));
+        }
+        acc
+    }
+
+    /// El valor actual de un campo por nombre.
+    fn field_value(&self, name: &str) -> Decimal {
+        match self.fields.get(&name.to_uppercase()) {
+            Some(Cell::Num(n)) => n.value(),
+            Some(Cell::Text(t)) => {
+                Decimal::parse(t.as_str().trim()).unwrap_or_else(|_| Decimal::zero())
+            }
+            None => Decimal::zero(),
+        }
+    }
+
+    /// Los dígitos fraccionarios de un campo numérico destino.
+    fn target_scale(&self, name: &str) -> u8 {
+        match self.fields.get(&name.to_uppercase()) {
+            Some(Cell::Num(n)) => n.picture().fraction_digits,
+            _ => 4,
+        }
+    }
+
+    /// El número de repeticiones de un `PERFORM ... TIMES`.
+    fn count_of(&self, op: &Operand) -> usize {
+        let m = self
+            .eval_decimal(op)
+            .rescale(0, Rounding::Truncate)
+            .mantissa();
+        if m < 0 {
+            0
+        } else {
+            m as usize
+        }
+    }
+}
+
+/// División con escala fija; una división por cero da cero.
+fn divide(num: Decimal, den: Decimal, scale: u8) -> Decimal {
+    num.div(&den, scale, Rounding::Truncate)
+        .unwrap_or_else(|_| Decimal::zero())
+}
+
+/// Potencia con exponente entero no negativo; en otro caso da 1.
+fn pow(base: &Decimal, exp: &Decimal) -> Decimal {
+    let e = exp.rescale(0, Rounding::Truncate).mantissa();
+    if !(0..=256).contains(&e) {
+        return Decimal::from_integer(1);
+    }
+    let mut acc = Decimal::from_integer(1);
+    for _ in 0..e {
+        acc = acc.mul(base);
+    }
+    acc
+}
+
+/// El texto que representa una constante figurativa.
+fn figurative_text(f: Figurative) -> &'static str {
+    match f {
+        Figurative::Zero => "0",
+        Figurative::Space => " ",
+        Figurative::Quote => "\"",
+        Figurative::HighValue | Figurative::LowValue | Figurative::Null => "",
+    }
+}
+
+/// El carácter de relleno de una figurativa, para `Text::fill`.
+fn figurative_fill(f: Figurative) -> char {
+    match f {
+        Figurative::Zero => '0',
+        Figurative::Quote => '"',
+        _ => ' ',
+    }
+}
@@ -0,0 +1,154 @@
+//! `charka-shadow` — el validador en sombra del transpilador.
+//!
+//! Certifica que el pipeline de charka (lexer → parser → IR → codegen)
+//! preserva la semántica del programa COBOL original. Lo hace con una
+//! **ejecución sombra**: un intérprete que corre el [`Ir`] directamente
+//! sobre los tipos de `charka-runtime`, sin compilar nada.
+//!
+//! El intérprete es una segunda ruta de ejecución, independiente del
+//! código que emite `charka-codegen`. Si la sombra y el transpilado
+//! produjeran salidas distintas, eso delataría un bug del codegen.
+//!
+//! - [`interpret`] — ejecuta un `Ir` y devuelve su salida.
+//! - [`run_source`] — el pipeline completo, de fuente COBOL a salida.
+//!
+//! La referencia contra la que se comparan los resultados es, en la
+//! v1, un conjunto de salidas esperadas verificadas a mano (el corpus
+//! en `crates/modules/charka/corpus/`). Cuando haya un GnuCOBOL
+//! disponible, un modo futuro podrá diferenciar contra el compilador
+//! de COBOL real — la validación «original vs transpilado» plena.
+//!
+//! El intérprete tiene un tope de pasos: un bucle que no termina se
+//! corta con [`Halt::StepLimit`] en vez de colgarse.
+
+#![forbid(unsafe_code)]
+
+mod field;
+mod interp;
+
+use charka_ir::Ir;
+use interp::Machine;
+
+/// Cómo terminó una ejecución sombra.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Halt {
+    /// Cayó por el final del PROCEDURE division.
+    Normal,
+    /// Un `STOP RUN` o `GOBACK`.
+    StopRun,
+    /// Se agotó el tope de pasos (un bucle que no termina).
+    StepLimit,
+}
+
+/// El resultado de una ejecución sombra.
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub struct Outcome {
+    /// Las líneas que el programa emitió por `DISPLAY`.
+    pub lines: Vec<String>,
+    /// Cómo terminó.
+    pub halt: Halt,
+}
+
+/// Falla del pipeline previo al intérprete.
+#[derive(Debug, Clone, PartialEq, Eq, thiserror::Error)]
+pub enum ShadowError {
+    #[error("error de léxico: {0}")]
+    Lex(#[from] charka_lexer::LexError),
+    #[error("error de parseo: {0}")]
+    Parse(#[from] charka_parser::ParseError),
+}
+
+/// Ejecuta un [`Ir`] en sombra y captura su salida.
+pub fn interpret(ir: &Ir) -> Outcome {
+    let mut machine = Machine::new(ir);
+    machine.run();
+    let halt = if machine.step_limit_hit {
+        Halt::StepLimit
+    } else if machine.stopped {
+        Halt::StopRun
+    } else {
+        Halt::Normal
+    };
+    Outcome {
+        lines: machine.output,
+        halt,
+    }
+}
+
+/// Corre el pipeline completo: fuente COBOL (formato libre) → salida.
+pub fn run_source(cobol: &str) -> Result<Outcome, ShadowError> {
+    let tokens = charka_lexer::lex(cobol, charka_lexer::SourceFormat::Free)?;
+    let program = charka_parser::parse(&tokens)?;
+    let ir = charka_ir::lower(&program);
+    Ok(interpret(&ir))
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    /// Verifica un programa del corpus contra su salida esperada. La
+    /// comparación ignora los espacios finales de cada línea.
+    fn check(cobol: &str, expected: &str) {
+        let outcome = run_source(cobol).expect("el pipeline no debe fallar");
+        let got: Vec<&str> = outcome.lines.iter().map(|l| l.trim_end()).collect();
+        let want: Vec<&str> = expected.lines().map(|l| l.trim_end()).collect();
+        assert_eq!(got, want, "salida sombra distinta de la esperada");
+    }
+
+    /// Declara un test que corre un programa del corpus.
+    macro_rules! corpus_test {
+        ($name:ident, $file:literal) => {
+            #[test]
+            fn $name() {
+                check(
+                    include_str!(concat!("../../corpus/", $file, ".cob")),
+                    include_str!(concat!("../../corpus/", $file, ".expected")),
+                );
+            }
+        };
+    }
+
+    corpus_test!(corpus_01_hola, "01-hola");
+    corpus_test!(corpus_02_aritmetica, "02-aritmetica");
+    corpus_test!(corpus_03_condicional, "03-condicional");
+    corpus_test!(corpus_04_bucle, "04-bucle");
+    corpus_test!(corpus_05_factorial, "05-factorial");
+    corpus_test!(corpus_06_nomina, "06-nomina");
+    corpus_test!(corpus_07_clasificar, "07-clasificar");
+
+    #[test]
+    fn empty_source_runs_clean() {
+        let outcome = run_source("").expect("pipeline OK");
+        assert!(outcome.lines.is_empty());
+        assert_eq!(outcome.halt, Halt::Normal);
+    }
+
+    #[test]
+    fn stop_run_is_reported() {
+        let outcome = run_source("PROCEDURE DIVISION.\nMAIN.\n DISPLAY 'X'.\n STOP RUN.\n")
+            .expect("pipeline OK");
+        assert_eq!(outcome.lines, vec!["X".to_string()]);
+        assert_eq!(outcome.halt, Halt::StopRun);
+    }
+
+    #[test]
+    fn endless_loop_is_cut_by_the_step_limit() {
+        // `PERFORM UNTIL 1 = 0` nunca se cumple — el tope lo corta.
+        let outcome = run_source(
+            "PROCEDURE DIVISION.\n\
+             MAIN.\n\
+                 PERFORM UNTIL 1 = 0\n\
+                     CONTINUE\n\
+                 END-PERFORM.\n",
+        )
+        .expect("pipeline OK");
+        assert_eq!(outcome.halt, Halt::StepLimit);
+    }
+
+    #[test]
+    fn lex_error_surfaces() {
+        let err = run_source("PROCEDURE DIVISION.\nMAIN.\n DISPLAY 'sin cerrar.\n").unwrap_err();
+        assert!(matches!(err, ShadowError::Lex(_)));
+    }
+}
@@ -0,0 +1,7 @@
+* corpus charka — nivel 1: el programa mínimo
+IDENTIFICATION DIVISION.
+PROGRAM-ID. HOLA.
+PROCEDURE DIVISION.
+MAIN.
+    DISPLAY 'HOLA, MUNDO'.
+    STOP RUN.
@@ -0,0 +1 @@
+HOLA, MUNDO
@@ -0,0 +1,19 @@
+* corpus charka — nivel 2: datos y aritmética con GIVING
+IDENTIFICATION DIVISION.
+PROGRAM-ID. ARITMETICA.
+DATA DIVISION.
+WORKING-STORAGE SECTION.
+01 WS-A     PIC 9(4) VALUE 120.
+01 WS-B     PIC 9(4) VALUE 35.
+01 WS-SUMA  PIC 9(5).
+01 WS-RESTA PIC 9(5).
+01 WS-PROD  PIC 9(8).
+PROCEDURE DIVISION.
+MAIN.
+    ADD WS-A WS-B GIVING WS-SUMA.
+    SUBTRACT WS-B FROM WS-A GIVING WS-RESTA.
+    MULTIPLY WS-A BY WS-B GIVING WS-PROD.
+    DISPLAY 'SUMA=' WS-SUMA.
+    DISPLAY 'RESTA=' WS-RESTA.
+    DISPLAY 'PRODUCTO=' WS-PROD.
+    STOP RUN.
@@ -0,0 +1,3 @@
+SUMA=00155
+RESTA=00085
+PRODUCTO=00004200
@@ -0,0 +1,17 @@
+* corpus charka — nivel 3: condicionales IF / ELSE
+IDENTIFICATION DIVISION.
+PROGRAM-ID. CONDICIONAL.
+DATA DIVISION.
+WORKING-STORAGE SECTION.
+01 WS-NOTA PIC 9(3) VALUE 72.
+PROCEDURE DIVISION.
+MAIN.
+    IF WS-NOTA >= 60
+        DISPLAY 'APROBADO'
+    ELSE
+        DISPLAY 'REPROBADO'
+    END-IF.
+    IF WS-NOTA > 90
+        DISPLAY 'CON HONORES'
+    END-IF.
+    STOP RUN.
@@ -0,0 +1 @@
+APROBADO
@@ -0,0 +1,15 @@
+* corpus charka — nivel 4: PERFORM n TIMES y un párrafo aparte
+IDENTIFICATION DIVISION.
+PROGRAM-ID. BUCLE.
+DATA DIVISION.
+WORKING-STORAGE SECTION.
+01 WS-CONT  PIC 9(3) VALUE 0.
+01 WS-TOTAL PIC 9(5) VALUE 0.
+PROCEDURE DIVISION.
+MAIN.
+    PERFORM SUMAR 5 TIMES.
+    DISPLAY 'TOTAL=' WS-TOTAL.
+    STOP RUN.
+SUMAR.
+    ADD 1 TO WS-CONT.
+    ADD WS-CONT TO WS-TOTAL.
@@ -0,0 +1 @@
+TOTAL=00015
@@ -0,0 +1,16 @@
+* corpus charka — nivel 4: PERFORM UNTIL en línea (factorial)
+IDENTIFICATION DIVISION.
+PROGRAM-ID. FACTORIAL.
+DATA DIVISION.
+WORKING-STORAGE SECTION.
+01 WS-N    PIC 9(2) VALUE 6.
+01 WS-I    PIC 9(2) VALUE 1.
+01 WS-FACT PIC 9(8) VALUE 1.
+PROCEDURE DIVISION.
+MAIN.
+    PERFORM UNTIL WS-I > WS-N
+        MULTIPLY WS-I BY WS-FACT
+        ADD 1 TO WS-I
+    END-PERFORM.
+    DISPLAY 'FACTORIAL DE 6 = ' WS-FACT.
+    STOP RUN.
@@ -0,0 +1 @@
+FACTORIAL DE 6 = 00000720
@@ -0,0 +1,32 @@
+* corpus charka — nivel 5: grupos, COMPUTE con paréntesis, ROUNDED
+IDENTIFICATION DIVISION.
+PROGRAM-ID. NOMINA.
+DATA DIVISION.
+WORKING-STORAGE SECTION.
+01 WS-EMPLEADO.
+   05 WS-NOMBRE  PIC X(10)   VALUE 'ANA'.
+   05 WS-HORAS   PIC 9(3)    VALUE 45.
+   05 WS-TARIFA  PIC 9(3)V99 VALUE 8.50.
+01 WS-BRUTO     PIC 9(6)V99 VALUE 0.
+01 WS-EXTRA     PIC 9(6)V99 VALUE 0.
+01 WS-IMPUESTO  PIC 9(6)V99 VALUE 0.
+01 WS-NETO      PIC 9(6)V99 VALUE 0.
+PROCEDURE DIVISION.
+MAIN-PARA.
+    PERFORM CALCULAR-BRUTO.
+    PERFORM CALCULAR-IMPUESTO.
+    COMPUTE WS-NETO = WS-BRUTO - WS-IMPUESTO.
+    DISPLAY 'EMPLEADO: ' WS-NOMBRE.
+    DISPLAY 'BRUTO: ' WS-BRUTO.
+    DISPLAY 'IMPUESTO: ' WS-IMPUESTO.
+    DISPLAY 'NETO: ' WS-NETO.
+    STOP RUN.
+CALCULAR-BRUTO.
+    IF WS-HORAS > 40
+        COMPUTE WS-EXTRA = (WS-HORAS - 40) * WS-TARIFA
+        COMPUTE WS-BRUTO = 40 * WS-TARIFA + WS-EXTRA
+    ELSE
+        COMPUTE WS-BRUTO = WS-HORAS * WS-TARIFA
+    END-IF.
+CALCULAR-IMPUESTO.
+    COMPUTE WS-IMPUESTO ROUNDED = WS-BRUTO * 0.15.
@@ -0,0 +1,4 @@
+EMPLEADO: ANA
+BRUTO: 00038250
+IMPUESTO: 00005738
+NETO: 00032512
@@ -0,0 +1,23 @@
+* corpus charka — nivel 5: IF anidado y condiciones con AND
+IDENTIFICATION DIVISION.
+PROGRAM-ID. CLASIFICAR.
+DATA DIVISION.
+WORKING-STORAGE SECTION.
+01 WS-EDAD   PIC 9(3) VALUE 0.
+01 WS-INDICE PIC 9(2) VALUE 1.
+PROCEDURE DIVISION.
+MAIN-PARA.
+    PERFORM CLASIFICAR-PARA 4 TIMES.
+    STOP RUN.
+CLASIFICAR-PARA.
+    COMPUTE WS-EDAD = WS-INDICE * 25.
+    IF WS-EDAD < 18
+        DISPLAY 'MENOR DE EDAD'
+    ELSE
+        IF WS-EDAD >= 18 AND WS-EDAD < 65
+            DISPLAY 'ADULTO'
+        ELSE
+            DISPLAY 'ADULTO MAYOR'
+        END-IF
+    END-IF.
+    ADD 1 TO WS-INDICE.
@@ -0,0 +1,4 @@
+ADULTO
+ADULTO
+ADULTO MAYOR
+ADULTO MAYOR
@@ -0,0 +1,32 @@
+# Corpus COBOL de charka
+
+Programas COBOL de prueba, de complejidad **graduada**, para ejercitar
+el pipeline completo del transpilador (lexer → parser → IR → codegen) y
+el validador en sombra `charka-shadow`.
+
+Cada programa `NN-nombre.cob` viene con su `NN-nombre.expected`: la
+salida correcta, una línea por `DISPLAY`.
+
+| programa            | nivel | qué ejercita                                       |
+| ------------------- | ----- | -------------------------------------------------- |
+| `01-hola`           | 1     | el programa mínimo — un `DISPLAY` de literal       |
+| `02-aritmetica`     | 2     | datos, `ADD`/`SUBTRACT`/`MULTIPLY` con `GIVING`    |
+| `03-condicional`    | 3     | `IF` / `ELSE` / `END-IF`                           |
+| `04-bucle`          | 4     | `PERFORM n TIMES`, un párrafo aparte, `ADD` in situ|
+| `05-factorial`      | 4     | `PERFORM UNTIL` en línea, `MULTIPLY` in situ       |
+| `06-nomina`         | 5     | grupos, `COMPUTE` con paréntesis, `ROUNDED`, V99   |
+| `07-clasificar`     | 5     | `IF` anidado, condiciones con `AND`                |
+
+## Formato
+
+Los fuentes están en **formato libre** de COBOL (la línea entera es
+código; `*` al inicio es comentario). El comparador del validador
+**ignora los espacios finales** de cada línea — un campo `PIC X(n)` se
+muestra con su relleno de espacios, que aquí se omite por legibilidad.
+
+## La salida esperada
+
+Las `.expected` se derivaron a mano de la semántica de COBOL'85. Cuando
+GnuCOBOL esté disponible, `charka-shadow` podrá regenerarlas desde el
+compilador de referencia y diferenciar contra él — el modo «sombra»
+pleno (original vs transpilado).
@@ -3,6 +3,31 @@
 Transpilador COBOL → Rust. El módulo más grande del ecosistema (Fase D
 del plan macro) — el parser COBOL completo es un esfuerzo multi-mes.

+### feat(charka-shadow): validador en sombra + corpus COBOL
+
+Crate nuevo `crates/modules/charka/charka-shadow` y un corpus de prueba
+`crates/modules/charka/corpus/` — el pipeline COBOL→Rust queda
+completo y validado de punta a punta.
+
+- `charka-shadow` certifica que el transpilador preserva la semántica
+  del COBOL original con una **ejecución sombra**: un intérprete que
+  corre el `Ir` directamente sobre `charka-runtime`, sin compilar nada.
+  Es una segunda ruta de ejecución, independiente del código que emite
+  `charka-codegen`.
+- `interpret(&Ir) -> Outcome` ejecuta el IR y captura las líneas de
+  `DISPLAY`; `run_source(&str)` corre el pipeline completo (lexer →
+  parser → IR → intérprete).
+- Tope de pasos (`Halt::StepLimit`): un bucle que no termina se corta
+  en vez de colgar la ejecución.
+- **Corpus**: 7 programas COBOL de complejidad graduada — `01-hola`
+  (un `DISPLAY`), `02-aritmetica` (`ADD`/`SUBTRACT`/`MULTIPLY`),
+  `03-condicional` (`IF`/`ELSE`), `04-bucle` (`PERFORM TIMES`),
+  `05-factorial` (`PERFORM UNTIL`), `06-nomina` (grupos, `COMPUTE` con
+  paréntesis, `ROUNDED`, `V99`), `07-clasificar` (`IF` anidado, `AND`).
+  Cada uno con su `.expected` verificada a mano.
+- 11 tests: los 7 programas del corpus + fuente vacío, `STOP RUN`,
+  corte por tope de pasos y propagación de error de léxico.
+
 ### feat(charka-codegen): emisión de Rust desde el IR

 Crate nuevo `crates/modules/charka/charka-codegen` — la etapa final del