feat(charka): file I/O: SELECT/FD/OPEN/READ/WRITE/CLOSE

The big gap that remained for real-world COBOL: sequential file processing. A vertical slice through the six crates.

- charka-parser: the ENVIRONMENT division is no longer ignored; FILE-CONTROL is parsed (SELECT name ASSIGN TO "path"), and in the FILE SECTION each FD is associated with its 01 record. Program::files.
- charka-runtime: CobFile type, a "line sequential" file (each record is one line). Reading loads the file into memory. Writing accumulates lines and flushes them on close.
- charka-ir: Ir::files and the Open/Close/Read/Write statements. READ carries its AT END / NOT AT END blocks.
- charka-codegen: one CobFile field per file in the Program struct; the verbs emit calls into the runtime.
- charka-shadow: the interpreter performs real file I/O.
- Corpus: new program 18-fichero, which writes three lines, re-reads them with READ ... AT END and displays them. Verified: the shadow interpreter and the crate compiled via scaffold produce the same output.

v1 scope: line sequential organization only; no indexed or relative files, no FILE STATUS.

Tests: charka-parser 17, charka-runtime 19, charka-ir 30, charka-codegen 25, charka-shadow 23. fmt + clippy clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 22:47:26 +00:00
parent f250fd0765
commit b3278bdb0c
17 changed files with 663 additions and 22 deletions
@@ -239,6 +239,14 @@ fn collect_unknowns(stmts: &[Stmt], out: &mut Vec<String>) {
                }
                collect_unknowns(other, out);
            }
            Stmt::Read {
                at_end,
                not_at_end,
                ..
            } => {
                collect_unknowns(at_end, out);
                collect_unknowns(not_at_end, out);
            }
            Stmt::Perform(p) => {
                if let PerformTarget::Inline(body) = &p.target {
                    collect_unknowns(body, out);
@@ -111,7 +111,9 @@ Third stage: `Program` → `Ir`. Here each raw `Sentence` is parsed
  members (`DataModel::groups`).
 - `SET cond... TO TRUE`: the write side of condition names
  (level 88); it assigns the 88's value to its parent data item.
- Out of scope for v1: file I/O, CICS, embedded SQL.
+- File I/O: `Stmt::Open`/`Close`/`Read`/`Write` and `Ir::files`
  (from `SELECT` + `FD`). "Line sequential" organization only.
 - Out of scope for v1: CICS, embedded SQL.
 ## charka-runtime
@@ -128,6 +130,9 @@ COBOL at run time.
  left-justifies and pads/truncates; `fill` moves figurative constants
  (`SPACES`, `ZEROS`).
 - `cobol_text_cmp`: alphanumeric comparison with space padding.
 - `CobFile`: a "line sequential" file. `open_input` loads the
  lines into memory, `open_output` accumulates, `read`/`write` operate
  line by line, `close` flushes what was written to disk.
 - Re-exports `Decimal`/`Picture`/`Rounding` from `charka-bcd` so that
  generated code only needs `use charka_runtime::*;`.
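
The `CobFile` lifecycle described in the list above can be sketched in isolation. This is a minimal illustration, not the runtime's code: `LineFile` and `roundtrip` are hypothetical names, and the type only mirrors the documented behavior (load on `open_input`, accumulate on `write`, flush on `close`).

```rust
use std::collections::VecDeque;
use std::fs;

/// Hypothetical stand-in for the runtime's file type.
enum State {
    Closed,
    /// Open for reading: the lines not yet consumed.
    Reading(VecDeque<String>),
    /// Open for writing: the lines accumulated so far.
    Writing(Vec<String>),
}

struct LineFile {
    path: String,
    state: State,
}

impl LineFile {
    fn new(path: &str) -> Self {
        Self { path: path.to_string(), state: State::Closed }
    }

    /// Loads the whole file; an unreadable file behaves as empty.
    fn open_input(&mut self) {
        let lines: VecDeque<String> = fs::read_to_string(&self.path)
            .map(|s| s.lines().map(str::to_string).collect())
            .unwrap_or_default();
        self.state = State::Reading(lines);
    }

    fn open_output(&mut self) {
        self.state = State::Writing(Vec::new());
    }

    /// Next line, or `None` at end of file (or when not open for reading).
    fn read(&mut self) -> Option<String> {
        match &mut self.state {
            State::Reading(lines) => lines.pop_front(),
            _ => None,
        }
    }

    fn write(&mut self, line: &str) {
        if let State::Writing(buf) = &mut self.state {
            buf.push(line.to_string());
        }
    }

    /// Flushes accumulated lines (if any) and closes the file.
    fn close(&mut self) {
        if let State::Writing(buf) = &self.state {
            let body: String = buf.iter().map(|l| format!("{l}\n")).collect();
            let _ = fs::write(&self.path, body); // errors ignored in this sketch
        }
        self.state = State::Closed;
    }
}

/// Writes `records`, then re-reads them with the READ ... AT END loop shape.
fn roundtrip(path: &str, records: &[&str]) -> Vec<String> {
    let mut f = LineFile::new(path);
    f.open_output();
    for r in records {
        f.write(r);
    }
    f.close();

    f.open_input();
    let mut seen = Vec::new();
    let mut at_end = false; // plays the role of a level-88 end-of-data flag
    while !at_end {
        match f.read() {
            Some(line) => seen.push(line), // NOT AT END branch
            None => at_end = true,         // AT END branch
        }
    }
    f.close();
    seen
}

fn main() {
    let path = std::env::temp_dir().join("charka-sketch.dat");
    let path = path.to_str().expect("temp path should be valid UTF-8");
    for line in roundtrip(path, &["PRIMERA LINEA", "SEGUNDA LINEA"]) {
        println!("{line}");
    }
    let _ = fs::remove_file(path);
}
```

Nothing reaches disk until `close`, so a program that opens a file for output and never closes it leaves no file behind under this model.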
@@ -158,8 +163,10 @@ of the COBOL program.
 - Verified end to end: a demo COBOL program transpiles to Rust
  that compiles against `charka-runtime` and produces the
  correct output.
 - "Line sequential" file I/O: one `CobFile` field per file;
  `OPEN`/`CLOSE`/`READ`/`WRITE` emit calls into `charka-runtime`.
 - Out of scope for v1: groups as a field of their own, `REDEFINES`,
-  group `OCCURS`, file I/O.
+  group `OCCURS`.
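
As a rough illustration of what "emit calls into `charka-runtime`" means for `READ`, the sketch below renders the `match` shape as text. `render_read` is a hypothetical helper, not part of `charka-codegen`; the branch bodies are passed in as already-rendered lines, and the identifiers `file_datos` and `registro` are only examples.

```rust
/// Renders the Rust text for one COBOL `READ`.
/// `Some(line)` maps to NOT AT END, `None` maps to AT END.
fn render_read(
    file_ident: &str,
    record_ident: Option<&str>,
    not_at_end: &[&str],
    at_end: &[&str],
) -> String {
    let mut out = Vec::new();
    out.push(format!("match self.{file_ident}.read() {{"));
    out.push("    Some(__line) => {".to_string());
    // The line just read is stored into the file's record before the
    // NOT AT END body runs.
    if let Some(rec) = record_ident {
        out.push(format!("        self.{rec}.store(__line.as_str());"));
    }
    for stmt in not_at_end {
        out.push(format!("        {stmt}"));
    }
    out.push("    }".to_string());
    out.push("    None => {".to_string());
    for stmt in at_end {
        out.push(format!("        {stmt}"));
    }
    out.push("    }".to_string());
    out.push("}".to_string());
    out.join("\n")
}

fn main() {
    let text = render_read(
        "file_datos",
        Some("registro"),
        &["// NOT AT END statements go here"],
        &["// AT END statements go here"],
    );
    println!("{text}");
}
```

Mapping end of file to `None` keeps the generated code free of status flags, which fits the v1 decision to leave `FILE STATUS` out of scope.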
 ## charka-shadow
@@ -182,8 +189,8 @@ that runs the `Ir` directly on `charka-runtime`, without compiling.
 ## The corpus
-`crates/modules/charka/corpus/`: 17 graded COBOL programs
+`crates/modules/charka/corpus/`: 18 graded COBOL programs
-(`01-hola` … `17-rangopar`), each with its `.expected`. It exercises the
+(`01-hola` … `18-fichero`), each with its `.expected`. It exercises the
 full pipeline end to end. See its `README.md`.
 ## The CLI
@@ -206,4 +213,5 @@ match.
 Next major milestone: move beyond the pure COBOL'85 subset towards CICS,
 embedded SQL and the IBM Enterprise dialects; extend the codegen
-(groups as fields, `REDEFINES`, group `OCCURS`, file I/O).
+(groups as fields, `REDEFINES`, group `OCCURS`, indexed
 files, `CALL` to subprograms).
@@ -81,6 +81,9 @@ fn emit_struct(em: &mut Emitter, sym: &Symbols) {
        };
        em.line(&format!("{}: {ty},", f.ident));
    }
    for fs in &sym.files {
        em.line(&format!("{}: CobFile,", fs.ident));
    }
    em.dedent();
    em.line("}");
    em.blank();
@@ -99,6 +102,13 @@ fn emit_impl(em: &mut Emitter, sym: &Symbols, ir: &Ir) {
    for f in &sym.fields {
        em.line(&format!("{}: {},", f.ident, field_init(f)));
    }
    for fs in &sym.files {
        em.line(&format!(
            "{}: CobFile::new({}),",
            fs.ident,
            rust_str(&fs.path)
        ));
    }
    em.dedent();
    em.line("}");
    em.dedent();
@@ -438,6 +448,32 @@ mod tests {
        assert!(out.matches("self.p_b();").count() >= 2);
    }
    #[test]
    fn file_io_emits_open_read_write_close() {
        let out = gen("ENVIRONMENT DIVISION.\n\
             INPUT-OUTPUT SECTION.\n\
             FILE-CONTROL.\n\
                 SELECT ARCH ASSIGN TO 'x.dat'.\n\
             DATA DIVISION.\n\
             FILE SECTION.\n\
             FD ARCH.\n\
             01 REG PIC X(10).\n\
             PROCEDURE DIVISION.\n\
             MAIN.\n\
                 OPEN OUTPUT ARCH.\n\
                 WRITE REG FROM 'HI'.\n\
                 CLOSE ARCH.\n\
                 OPEN INPUT ARCH.\n\
                 READ ARCH AT END CONTINUE END-READ.\n\
                 CLOSE ARCH.\n");
        assert!(out.contains("file_arch: CobFile,"));
        assert!(out.contains("CobFile::new(\"x.dat\")"));
        assert!(out.contains("self.file_arch.open_output();"));
        assert!(out.contains("self.file_arch.write("));
        assert!(out.contains("self.file_arch.read()"));
        assert!(out.contains("self.file_arch.close();"));
    }
    #[test]
    fn empty_program_still_compiles_shape() {
        let out = gen("");
@@ -2,8 +2,8 @@
 //! one or more lines of Rust code on top of `charka-runtime`.
 use charka_ir::{
-    CmpOp, Cond, InspectOp, Operand, Perform, PerformControl, PerformTarget, Stmt, WhenBranch,
+    CmpOp, Cond, FileMode, InspectOp, Operand, Perform, PerformControl, PerformTarget, Stmt,
-    WhenTest,
+    WhenBranch, WhenTest,
 };
 use crate::emit::Emitter;
@@ -88,6 +88,14 @@ pub(crate) fn emit_stmt(em: &mut Emitter, sym: &Symbols, stmt: &Stmt) {
        Stmt::Inspect { target, op } => emit_inspect(em, sym, target, op),
        Stmt::Initialize { targets } => emit_initialize(em, sym, targets),
        Stmt::SetTrue { conditions } => emit_set_true(em, sym, conditions),
        Stmt::Open { mode, files } => emit_open(em, sym, *mode, files),
        Stmt::Close { files } => emit_close(em, sym, files),
        Stmt::Read {
            file,
            at_end,
            not_at_end,
        } => emit_read(em, sym, file, at_end, not_at_end),
        Stmt::Write { record, from } => emit_write(em, sym, record, from.as_ref()),
        Stmt::Perform(p) => emit_perform(em, sym, p),
        Stmt::GoTo { target } => {
            em.line(&format!(
@@ -493,6 +501,77 @@ fn emit_initialize(em: &mut Emitter, sym: &Symbols, targets: &[Operand]) {
    }
 }
 /// `OPEN {INPUT|OUTPUT} files...`
 fn emit_open(em: &mut Emitter, sym: &Symbols, mode: FileMode, files: &[String]) {
    let method = match mode {
        FileMode::Input => "open_input",
        FileMode::Output => "open_output",
    };
    for f in files {
        match sym.file(f) {
            Some(fs) => em.line(&format!("self.{}.{method}();", fs.ident)),
            None => em.line("// charka: OPEN de fichero no resuelto"),
        }
    }
 }
 /// `CLOSE files...`
 fn emit_close(em: &mut Emitter, sym: &Symbols, files: &[String]) {
    for f in files {
        match sym.file(f) {
            Some(fs) => em.line(&format!("self.{}.close();", fs.ident)),
            None => em.line("// charka: CLOSE de fichero no resuelto"),
        }
    }
 }
 /// `READ file [AT END ...] [NOT AT END ...]`: reads the next line
 /// into the file's record.
 fn emit_read(em: &mut Emitter, sym: &Symbols, file: &str, at_end: &[Stmt], not_at_end: &[Stmt]) {
    let Some(fs) = sym.file(file) else {
        em.line("// charka: READ de fichero no resuelto");
        return;
    };
    let record_ident = sym.lookup(&fs.record).map(|r| r.ident.clone());
    em.line(&format!("match self.{}.read() {{", fs.ident));
    em.indent();
    em.line("Some(__line) => {");
    em.indent();
    if let Some(rec) = &record_ident {
        em.line(&format!("self.{rec}.store(__line.as_str());"));
    }
    emit_block(em, sym, not_at_end);
    em.dedent();
    em.line("}");
    em.line("None => {");
    em.indent();
    emit_block(em, sym, at_end);
    em.dedent();
    em.line("}");
    em.dedent();
    em.line("}");
 }
 /// `WRITE record [FROM from]`: writes the record to its file.
 fn emit_write(em: &mut Emitter, sym: &Symbols, record: &str, from: Option<&Operand>) {
    if let Some(src) = from {
        if let Some((lref, _)) = field_ref(sym, &Operand::Data(record.to_string())) {
            em.line(&format!("{lref}.store({});", operand_str(sym, src)));
        }
    }
    match sym.file_of_record(record) {
        Some(fs) => {
            if let Some(rec) = sym.lookup(record) {
                em.line(&format!(
                    "self.{}.write(&self.{}.display());",
                    fs.ident, rec.ident
                ));
            }
        }
        None => em.line("// charka: WRITE de registro no resuelto"),
    }
 }
 /// `SET cond... TO TRUE`: assigns to each parent data item the value
 /// that makes its condition name (level 88) true.
 fn emit_set_true(em: &mut Emitter, sym: &Symbols, conditions: &[String]) {
@@ -24,8 +24,20 @@ pub(crate) struct Field {
    pub occurs: Option<u32>,
 }
-/// The program's fields, their condition names, their groups and
+/// A file of the generated program.
-/// their paragraphs.
+pub(crate) struct FileSym {
    /// COBOL name of the file.
    pub cobol: String,
    /// Rust identifier of the `CobFile` field (prefix `file_`).
    pub ident: String,
    /// Path the file is assigned to.
    pub path: String,
    /// COBOL name of the associated record (under its `FD`).
    pub record: String,
 }
 /// The program's fields, their condition names, their groups, their
 /// paragraphs and their files.
 pub(crate) struct Symbols {
    pub fields: Vec<Field>,
    by_name: HashMap<String, usize>,
@@ -33,6 +45,8 @@ pub(crate) struct Symbols {
    groups: HashMap<String, Vec<String>>,
    /// The paragraphs in order: `(COBOL name, Rust method name)`.
    pub paragraphs: Vec<(String, String)>,
    /// The declared files.
    pub files: Vec<FileSym>,
 }
 impl Symbols {
@@ -79,15 +93,38 @@ impl Symbols {
                (proc.name.to_uppercase(), method)
            })
            .collect();
        let files = ir
            .files
            .iter()
            .map(|f| FileSym {
                cobol: f.name.clone(),
                ident: format!("file_{}", sanitize_ident(&f.name)),
                path: f.path.clone(),
                record: f.record.clone(),
            })
            .collect();
        Self {
            fields,
            by_name,
            conditions,
            groups,
            paragraphs,
            files,
        }
    }
    /// Looks up a file by its COBOL name.
    pub(crate) fn file(&self, name: &str) -> Option<&FileSym> {
        let up = name.to_uppercase();
        self.files.iter().find(|f| f.cobol == up)
    }
    /// Looks up the file whose `FD` record is `record`.
    pub(crate) fn file_of_record(&self, record: &str) -> Option<&FileSym> {
        let up = record.to_uppercase();
        self.files.iter().find(|f| f.record == up)
    }
    /// The methods to call for a `PERFORM name [THRU thru]`: the
    /// range of paragraphs from `name` through `thru`, inclusive.
    pub(crate) fn paragraph_range(&self, name: &str, thru: Option<&str>) -> Vec<String> {
@@ -1,7 +1,7 @@
 //! The IR types: the COBOL program with its PROCEDURE division already
 //! parsed into typed statements.
-pub use charka_parser::{DataItem, Token};
+pub use charka_parser::{DataItem, FileEntry, Token};
 /// A COBOL program in intermediate representation.
 #[derive(Debug, Clone, PartialEq, Default)]
@@ -14,6 +14,8 @@ pub struct Ir {
    /// The resolved data model: the flattened elementary data items and
    /// the condition names (level 88).
    pub model: crate::model::DataModel,
    /// The declared files (`SELECT` + `FD`).
    pub files: Vec<FileEntry>,
    /// The PROCEDURE paragraphs, with their statements already typed.
    pub procedures: Vec<Procedure>,
 }
@@ -191,6 +193,21 @@ pub enum Stmt {
    /// `SET cond-name... TO TRUE`: makes those condition names
    /// (level 88) true by assigning the 88's value to their parent data item.
    SetTrue { conditions: Vec<String> },
    /// `OPEN {INPUT|OUTPUT} files...`
    Open { mode: FileMode, files: Vec<String> },
    /// `CLOSE files...`
    Close { files: Vec<String> },
    /// `READ file [AT END at_end] [NOT AT END not_at_end] [END-READ]`
    Read {
        file: String,
        at_end: Vec<Stmt>,
        not_at_end: Vec<Stmt>,
    },
    /// `WRITE record [FROM from]`
    Write {
        record: String,
        from: Option<Operand>,
    },
    /// `PERFORM ...`: see [`Perform`].
    Perform(Perform),
    /// `GO TO target`
@@ -208,6 +225,15 @@ pub enum Stmt {
    Unknown { verb: String, tokens: Vec<Token> },
 }
 /// The mode a file is opened in.
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 pub enum FileMode {
    /// `OPEN INPUT`: for reading.
    Input,
    /// `OPEN OUTPUT`: for writing (creates the file from scratch).
    Output,
 }
 /// The operation of an `INSPECT`.
 #[derive(Debug, Clone, PartialEq)]
 pub enum InspectOp {
@@ -19,7 +19,8 @@
 //! `AND`/`OR`/`NOT` conditions), `EVALUATE`/`WHEN`, `STRING`,
 //! `UNSTRING`, `INSPECT`, `PERFORM` (out-of-line, inline,
 //! `TIMES`, `UNTIL`, `VARYING`), `GO TO`, `STOP RUN`, `GOBACK`,
-//! `EXIT`, `CONTINUE`. Out of scope: file I/O, CICS and SQL.
+//! `EXIT`, `CONTINUE`, file I/O (`OPEN`/`READ`/`WRITE`/`CLOSE`).
 //! Out of scope: CICS and embedded SQL.
 #![forbid(unsafe_code)]
@@ -58,6 +59,7 @@ pub fn lower(program: &Program) -> Ir {
        program_id: program.program_id.clone().unwrap_or_default(),
        data: program.data.clone(),
        model: model::resolve_data(&program.data),
        files: program.files.clone(),
        procedures,
    }
 }
@@ -471,6 +473,39 @@ mod tests {
        }
    }
    #[test]
    fn file_io_statements_parse() {
        let program = ir("ENVIRONMENT DIVISION.\n\
             INPUT-OUTPUT SECTION.\n\
             FILE-CONTROL.\n\
                 SELECT ARCH ASSIGN TO 'datos.dat'.\n\
             DATA DIVISION.\n\
             FILE SECTION.\n\
             FD ARCH.\n\
             01 REG PIC X(20).\n\
             PROCEDURE DIVISION.\n\
             MAIN.\n\
                 OPEN OUTPUT ARCH.\n\
                 WRITE REG FROM 'HOLA'.\n\
                 CLOSE ARCH.\n\
                 OPEN INPUT ARCH.\n\
                 READ ARCH AT END CONTINUE NOT AT END DISPLAY REG END-READ.\n\
                 CLOSE ARCH.\n");
        assert_eq!(program.files.len(), 1);
        assert_eq!(program.files[0].record, "REG");
        let body = &program.procedures[0].body;
        assert!(matches!(
            body[0],
            Stmt::Open {
                mode: FileMode::Output,
                ..
            }
        ));
        assert!(matches!(&body[1], Stmt::Write { record, .. } if record == "REG"));
        assert!(matches!(body[2], Stmt::Close { .. }));
        assert!(matches!(&body[4], Stmt::Read { not_at_end, .. } if not_at_end.len() == 1));
    }
    #[test]
    fn several_statements_in_one_sentence() {
        let b = body("MOVE 1 TO X DISPLAY X STOP RUN.");
@@ -6,7 +6,8 @@
 use charka_parser::TokenKind;
 use crate::ast::{
-    InspectOp, Operand, Perform, PerformControl, PerformTarget, Stmt, WhenBranch, WhenTest,
+    FileMode, InspectOp, Operand, Perform, PerformControl, PerformTarget, Stmt, WhenBranch,
    WhenTest,
 };
 use crate::cursor::{parse_operand, Cursor};
 use crate::expr::{parse_cond, parse_expr};
@@ -46,6 +47,10 @@ fn parse_one_stmt(c: &mut Cursor, stops: &[&str]) -> Stmt {
        "INSPECT" => parse_inspect(c),
        "INITIALIZE" => parse_initialize(c),
        "SET" => parse_set(c),
        "OPEN" => parse_open(c),
        "CLOSE" => parse_close(c),
        "READ" => parse_read(c),
        "WRITE" => parse_write(c),
        "PERFORM" => parse_perform(c),
        "GO" => parse_goto(c),
        "STOP" => parse_stop(c),
@@ -418,6 +423,79 @@ fn parse_initialize(c: &mut Cursor) -> Stmt {
    Stmt::Initialize { targets }
 }
 fn parse_open(c: &mut Cursor) -> Stmt {
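    c.bump(); // OPEN
    // v1 approximation: there is no separate mode for EXTEND or I-O.
    // EXTEND is lowered to OUTPUT (the runtime starts from an empty buffer,
    // so existing content is not preserved) and I-O is lowered to INPUT.
    let mode = if c.eat_word("OUTPUT") || c.eat_word("EXTEND") {
        FileMode::Output
    } else {
        c.eat_word("INPUT");
        c.eat_word("I-O");
        FileMode::Input
    };
    let mut files = Vec::new();
    while let Some(w) = c.peek_word() {
        // Stop at a second mode keyword: v1 handles one mode per OPEN, and
        // the remainder is skipped by `skip_to_stmt_boundary` below.
        if is_boundary(&w) || matches!(w.as_str(), "INPUT" | "OUTPUT" | "EXTEND" | "I-O") {
            break;
        }
        c.bump();
        files.push(w);
    }
    skip_to_stmt_boundary(c);
    Stmt::Open { mode, files }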
 }
 fn parse_close(c: &mut Cursor) -> Stmt {
    c.bump(); // CLOSE
    let mut files = Vec::new();
    while let Some(name) = parse_one_name(c) {
        files.push(name);
    }
    Stmt::Close { files }
 }
 fn parse_read(c: &mut Cursor) -> Stmt {
    c.bump(); // READ
    let file = parse_one_name(c).unwrap_or_default();
    c.eat_word("NEXT");
    c.eat_word("RECORD");
    if c.eat_word("INTO") {
        let _ = parse_operand(c); // `READ ... INTO`: parsed but ignored in v1
    }
    let mut at_end = Vec::new();
    let mut not_at_end = Vec::new();
    loop {
        if c.eat_word("AT") {
            c.eat_word("END");
            at_end = parse_statements(c, &["NOT", "END-READ"]);
        } else if c.eat_word("NOT") {
            c.eat_word("AT");
            c.eat_word("END");
            not_at_end = parse_statements(c, &["END-READ"]);
        } else {
            break;
        }
    }
    c.eat_word("END-READ");
    Stmt::Read {
        file,
        at_end,
        not_at_end,
    }
 }
 fn parse_write(c: &mut Cursor) -> Stmt {
    c.bump(); // WRITE
    let record = parse_one_name(c).unwrap_or_default();
    let from = if c.eat_word("FROM") {
        Some(parse_operand(c))
    } else {
        None
    };
    skip_to_stmt_boundary(c); // e.g. `AFTER ADVANCING`
    c.eat_word("END-WRITE");
    Stmt::Write { record, from }
 }
 fn parse_inspect(c: &mut Cursor) -> Stmt {
    c.bump(); // INSPECT
    let target = parse_operand(c);
@@ -34,10 +34,21 @@ pub struct Program {
    pub program_id: Option<String>,
    /// The root items of the DATA division (each `01`/`77` with its tree).
    pub data: Vec<DataItem>,
    /// The declared files (`SELECT` + `FD`).
    pub files: Vec<FileEntry>,
    /// The paragraphs of the PROCEDURE division, in order of appearance.
    pub paragraphs: Vec<Paragraph>,
 }
 /// A declared file: its logical name, the path it is assigned to
 /// (`ASSIGN TO`) and the associated record item (the `01` under its `FD`).
 #[derive(Debug, Clone, PartialEq, Eq, Default)]
 pub struct FileEntry {
    pub name: String,
    pub path: String,
    pub record: String,
 }
 /// A data item of the DATA division: a level number, a name and,
 /// optionally, the `PICTURE` and `VALUE` clauses. Items with a higher
 /// level number hang as `children` of the item that contains them.
@@ -110,8 +121,17 @@ pub fn parse(tokens: &[Token]) -> Result<Program, ParseError> {
        let body = &tokens[lo..next.max(lo)];
        match kind {
            DivKind::Identification => parse_identification(body, &mut program),
-            DivKind::Environment => {} // v1 ignores the ENVIRONMENT division
+            DivKind::Environment => parse_environment(body, &mut program),
-            DivKind::Data => program.data = parse_data(body)?,
+            DivKind::Data => {
                let (data, fd_records) = parse_data(body)?;
                program.data = data;
                // Associate each `FD` with the `01` record that follows it.
                for (fd, record) in fd_records {
                    if let Some(f) = program.files.iter_mut().find(|f| f.name == fd) {
                        f.record = record;
                    }
                }
            }
            DivKind::Procedure => program.paragraphs = parse_procedure(body),
        }
    }
@@ -165,20 +185,76 @@ fn parse_identification(body: &[Token], program: &mut Program) {
 }
 /// Parses the body of the DATA division into a tree of [`DataItem`].
-fn parse_data(body: &[Token]) -> Result<Vec<DataItem>, ParseError> {
+/// Parses the `FILE-CONTROL` paragraph of the ENVIRONMENT division: each
 /// `SELECT name ASSIGN TO "path"` is registered as a file.
 fn parse_environment(body: &[Token], program: &mut Program) {
    for sent in split_sentences(body) {
        if kw(sent.first()).as_deref() != Some("SELECT") {
            continue;
        }
        let Some(name_tok) = sent.get(1) else {
            continue;
        };
        if name_tok.kind != TokenKind::Word {
            continue;
        }
        let mut path = String::new();
        let mut i = 2;
        while i < sent.len() {
            if kw(sent.get(i)).as_deref() == Some("ASSIGN") {
                i += 1;
                if kw(sent.get(i)).as_deref() == Some("TO") {
                    i += 1;
                }
                if let Some(t) = sent.get(i) {
                    path = t.text.clone();
                }
                break;
            }
            i += 1;
        }
        program.files.push(FileEntry {
            name: name_tok.text.to_uppercase(),
            path,
            record: String::new(),
        });
    }
 }
 /// The result of parsing the DATA division: the data tree and the
 /// `(FD, record)` pairs, where the record is the `01` that follows each `FD`.
 type DataResult = (Vec<DataItem>, Vec<(String, String)>);
 /// Parses the DATA division.
 fn parse_data(body: &[Token]) -> Result<DataResult, ParseError> {
    let mut flat = Vec::new();
    let mut fd_records = Vec::new();
    let mut pending_fd: Option<String> = None;
    for sent in split_sentences(body) {
        let Some(first) = sent.first() else { continue };
-        // Only sentences that start with a level number are
+        // Sentences that start with a level number are
-        // data entries; SECTION headers and FD/SD
+        // data entries; the rest are SECTION or
-        // entries start with a word and are ignored.
+        // `FD`/`SD` headers.
        if first.kind != TokenKind::Number {
            if matches!(kw(Some(first)).as_deref(), Some("FD") | Some("SD")) {
                pending_fd = sent
                    .get(1)
                    .filter(|t| t.kind == TokenKind::Word)
                    .map(|t| t.text.to_uppercase());
            }
            continue;
        }
        let level = parse_level(first)?;
-        flat.push(parse_data_entry(level, &sent)?);
+        let entry = parse_data_entry(level, &sent)?;
        // The first `01` after an `FD` is its record.
        if level == 1 {
            if let Some(fd) = pending_fd.take() {
                fd_records.push((fd, entry.name.clone()));
            }
-    Ok(build_tree(flat))
+        }
        flat.push(entry);
    }
    Ok((build_tree(flat), fd_records))
 }
 /// Validates that the token is a COBOL level number (01-49, 66, 77, 88).
@@ -624,6 +700,26 @@ mod tests {
        assert_eq!(p.data[0].children[1].name, "FILLER");
    }
    #[test]
    fn select_and_fd_captured() {
        let p = parse_src(
            "ENVIRONMENT DIVISION.\n\
             INPUT-OUTPUT SECTION.\n\
             FILE-CONTROL.\n\
                 SELECT CLIENTES ASSIGN TO 'clientes.dat'.\n\
             DATA DIVISION.\n\
             FILE SECTION.\n\
             FD CLIENTES.\n\
             01 REG-CLIENTE PIC X(40).\n\
             WORKING-STORAGE SECTION.\n\
             01 WS-FIN PIC X.\n",
        );
        assert_eq!(p.files.len(), 1);
        assert_eq!(p.files[0].name, "CLIENTES");
        assert_eq!(p.files[0].path, "clientes.dat");
        assert_eq!(p.files[0].record, "REG-CLIENTE");
    }
    #[test]
    fn occurs_clause_captured() {
        let p = parse_src(
@@ -0,0 +1,102 @@
 //! `CobFile`: a line-sequential file for the COBOL runtime.
 use std::collections::VecDeque;
 /// A file with "line sequential" organization: each record is one
 /// line of text. Reading loads the whole file into memory; writing
 /// accumulates lines and flushes them on close.
 #[derive(Debug)]
 pub struct CobFile {
    path: String,
    state: State,
 }
 #[derive(Debug)]
 enum State {
    Closed,
    /// Open for reading: the lines still to be read.
    Reading(VecDeque<String>),
    /// Open for writing: the lines accumulated so far.
    Writing(Vec<String>),
 }
 impl CobFile {
    /// A new, closed file assigned to the path `path`.
    pub fn new(path: &str) -> Self {
        Self {
            path: path.to_string(),
            state: State::Closed,
        }
    }
    /// `OPEN INPUT`: loads the file into memory. If it cannot be read (for
    /// example, it does not exist), it is left open and empty, so the
    /// first read reports end of file.
    pub fn open_input(&mut self) {
        let lines = std::fs::read_to_string(&self.path)
            .map(|s| s.lines().map(str::to_string).collect())
            .unwrap_or_default();
        self.state = State::Reading(lines);
    }
    /// `OPEN OUTPUT`: starts a new, empty file.
    pub fn open_output(&mut self) {
        self.state = State::Writing(Vec::new());
    }
    /// `READ`: the next line, or `None` at end of file.
    pub fn read(&mut self) -> Option<String> {
        match &mut self.state {
            State::Reading(lines) => lines.pop_front(),
            _ => None,
        }
    }
    /// `WRITE`: appends a line (only if the file is open for writing).
    pub fn write(&mut self, line: &str) {
        if let State::Writing(buf) = &mut self.state {
            buf.push(line.to_string());
        }
    }
    /// `CLOSE`: if the file was open for writing, flushes the content to
    /// disk. Write errors are silently ignored.
    pub fn close(&mut self) {
        if let State::Writing(buf) = &self.state {
            let body: String = buf.iter().map(|l| format!("{l}\n")).collect();
            let _ = std::fs::write(&self.path, body);
        }
        self.state = State::Closed;
    }
 }
 #[cfg(test)]
 mod tests {
    use super::*;
    #[test]
    fn write_then_read_roundtrip() {
        let path = std::env::temp_dir().join("charka-cobfile-test.dat");
        let path = path.to_str().unwrap();
        let mut f = CobFile::new(path);
        f.open_output();
        f.write("PRIMERA");
        f.write("SEGUNDA");
        f.close();
        let mut g = CobFile::new(path);
        g.open_input();
        assert_eq!(g.read().as_deref(), Some("PRIMERA"));
        assert_eq!(g.read().as_deref(), Some("SEGUNDA"));
        assert_eq!(g.read(), None); // end of file
        g.close();
        let _ = std::fs::remove_file(path);
    }
    #[test]
    fn missing_file_reads_as_empty() {
        let mut f = CobFile::new("/charka/no/existe/jamas.dat");
        f.open_input();
        assert_eq!(f.read(), None);
    }
 }
@@ -17,10 +17,12 @@
 #![forbid(unsafe_code)]
 mod file;
 mod num;
 mod text;
 pub use charka_bcd::{Decimal, Picture, Rounding};
 pub use file::CobFile;
 pub use num::Num;
 pub use text::Text;
@@ -8,10 +8,10 @@
 use std::collections::HashMap;
 use charka_ir::{
-    BinOp, CmpOp, Cond, ConditionName, Expr, Figurative, InspectOp, Ir, Operand, Perform,
+    BinOp, CmpOp, Cond, ConditionName, Expr, Figurative, FileMode, InspectOp, Ir, Operand, Perform,
    PerformControl, PerformTarget, Stmt, WhenTest,
 };
-use charka_runtime::{cobol_text_cmp, Decimal, Num, Rounding, Text};
+use charka_runtime::{cobol_text_cmp, CobFile, Decimal, Num, Rounding, Text};
 use crate::field::{build_fields, Cell};
@@ -39,6 +39,7 @@ pub(crate) struct Machine<'a> {
    fields: HashMap<String, Cell>,
    para_index: HashMap<String, usize>,
    conditions: HashMap<String, ConditionName>,
    files: HashMap<String, CobFile>,
    pub output: Vec<String>,
    budget: u64,
    pub step_limit_hit: bool,
@@ -58,11 +59,17 @@ impl<'a> Machine<'a> {
            .iter()
            .map(|c| (c.name.clone(), c.clone()))
            .collect();
        let files = ir
            .files
            .iter()
            .map(|f| (f.name.to_uppercase(), CobFile::new(&f.path)))
            .collect();
        Self {
            ir,
            fields: build_fields(&ir.model),
            para_index,
            conditions,
            files,
            output: Vec::new(),
            budget: STEP_BUDGET,
            step_limit_hit: false,
@@ -320,6 +327,69 @@ impl<'a> Machine<'a> {
                }
                Flow::Normal
            }
            Stmt::Open { mode, files } => {
                for f in files {
                    if let Some(cf) = self.files.get_mut(&f.to_uppercase()) {
                        match mode {
                            FileMode::Input => cf.open_input(),
                            FileMode::Output => cf.open_output(),
                        }
                    }
                }
                Flow::Normal
            }
            Stmt::Close { files } => {
                for f in files {
                    if let Some(cf) = self.files.get_mut(&f.to_uppercase()) {
                        cf.close();
                    }
                }
                Flow::Normal
            }
            Stmt::Read {
                file,
                at_end,
                not_at_end,
            } => {
                let line = self
                    .files
                    .get_mut(&file.to_uppercase())
                    .and_then(|cf| cf.read());
                match line {
                    Some(text) => {
                        let record = self
                            .ir
                            .files
                            .iter()
                            .find(|f| f.name.eq_ignore_ascii_case(file))
                            .map(|f| f.record.clone());
                        if let Some(rec) = record {
                            self.store_text(&Operand::Data(rec), &text);
                        }
                        self.exec_block(not_at_end)
                    }
                    None => self.exec_block(at_end),
                }
            }
            Stmt::Write { record, from } => {
                if let Some(src) = from {
                    let text = self.eval_text(src);
                    self.store_text(&Operand::Data(record.clone()), &text);
                }
                let file = self
                    .ir
                    .files
                    .iter()
                    .find(|f| f.record.eq_ignore_ascii_case(record))
                    .map(|f| f.name.to_uppercase());
                if let Some(file) = file {
                    let line = self.eval_text(&Operand::Data(record.clone()));
                    if let Some(cf) = self.files.get_mut(&file) {
                        cf.write(&line);
                    }
                }
                Flow::Normal
            }
            Stmt::Perform(p) => self.exec_perform(p),
            Stmt::GoTo { target } => {
                // Approximation: runs the target and exits the paragraph.
@@ -126,6 +126,7 @@ mod tests {
    corpus_test!(corpus_15_resetear, "15-resetear");
    corpus_test!(corpus_16_bandera, "16-bandera");
    corpus_test!(corpus_17_rangopar, "17-rangopar");
    corpus_test!(corpus_18_fichero, "18-fichero");
    #[test]
    fn empty_source_runs_clean() {
@@ -0,0 +1,35 @@
 * charka corpus, level 7: file I/O (write and re-read)
 IDENTIFICATION DIVISION.
 PROGRAM-ID. FICHERO.
 ENVIRONMENT DIVISION.
 INPUT-OUTPUT SECTION.
 FILE-CONTROL.
    SELECT DATOS ASSIGN TO '/tmp/charka-corpus-18.dat'
        ORGANIZATION IS LINE SEQUENTIAL.
 DATA DIVISION.
 FILE SECTION.
 FD DATOS.
 01 REGISTRO PIC X(20).
 WORKING-STORAGE SECTION.
 01 WS-FIN  PIC X VALUE 'N'.
   88 FIN-DATOS VALUE 'S'.
 01 WS-CONT PIC 9(3) VALUE 0.
 PROCEDURE DIVISION.
 MAIN.
    OPEN OUTPUT DATOS.
    WRITE REGISTRO FROM 'PRIMERA LINEA'.
    WRITE REGISTRO FROM 'SEGUNDA LINEA'.
    WRITE REGISTRO FROM 'TERCERA LINEA'.
    CLOSE DATOS.
    OPEN INPUT DATOS.
    PERFORM UNTIL FIN-DATOS
        READ DATOS
            AT END SET FIN-DATOS TO TRUE
            NOT AT END
                ADD 1 TO WS-CONT
                DISPLAY REGISTRO
        END-READ
    END-PERFORM.
    CLOSE DATOS.
    DISPLAY 'LINEAS LEIDAS = ' WS-CONT.
    STOP RUN.
@@ -0,0 +1,4 @@
 PRIMERA LINEA
 SEGUNDA LINEA
 TERCERA LINEA
 LINEAS LEIDAS = 003
@@ -26,6 +26,7 @@ correct output, one line per `DISPLAY`.
 | `15-resetear`       | 6     | `INITIALIZE`: reset data items and groups     |
 | `16-bandera`        | 5     | `SET` condition names (level 88) to `TRUE`    |
 | `17-rangopar`       | 5     | `PERFORM ... THRU`: a range of paragraphs     |
 | `18-fichero`        | 7     | File I/O: `SELECT`/`FD`/`OPEN`/`READ`/`WRITE` |
 ## Format
@@ -3,6 +3,29 @@
 COBOL → Rust transpiler. The largest module in the ecosystem (Phase D
 of the master plan); the full COBOL parser is a multi-month effort.
 ### feat(charka): file I/O: SELECT / FD / OPEN / READ / WRITE / CLOSE
 The big gap that remained for real-world COBOL: sequential file
 processing. A vertical slice through the six crates.
 - `charka-parser`: the ENVIRONMENT division is no longer ignored;
  `FILE-CONTROL` is parsed (`SELECT name ASSIGN TO "path"`), and in the
  FILE SECTION each `FD` is associated with its `01` record.
  `Program::files`.
 - `charka-runtime`: `CobFile` type, a "line sequential" file
  (each record is one line). Reading loads the file into memory.
  Writing accumulates lines and flushes them on close.
 - `charka-ir`: `Ir::files` and the `Open`/`Close`/`Read`/`Write`
  statements. `READ` carries its `AT END` / `NOT AT END` blocks.
 - `charka-codegen`: one `CobFile` field per file in the `struct
  Program`; the verbs emit calls into the runtime.
 - `charka-shadow`: the interpreter performs real file I/O.
 - Corpus: new program `18-fichero`, which writes three lines to a
  file, closes it, re-reads it with `READ ... AT END` and displays them.
  Verified: the shadow interpreter and the compiled crate produce the
  same output.
 - v1 scope: "line sequential" organization only; no indexed or
  relative files, no `FILE STATUS`.
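
The `FD`-to-`01` association mentioned in the `charka-parser` bullet can be sketched as a single pass with a "pending `FD`" slot. This is a simplified illustration: `fd_records` is a hypothetical function, and sentences are modelled as word lists rather than the parser's token slices.

```rust
/// Pairs each `FD`/`SD` with the first level-01 item that follows it.
/// Returns `(file name, record name)` pairs, both uppercased.
fn fd_records(sentences: &[Vec<&str>]) -> Vec<(String, String)> {
    let mut pairs = Vec::new();
    let mut pending_fd: Option<String> = None;
    for sent in sentences {
        let Some(first) = sent.first() else { continue };
        if first.eq_ignore_ascii_case("FD") || first.eq_ignore_ascii_case("SD") {
            // Remember the file name until its record shows up.
            pending_fd = sent.get(1).map(|w| w.to_uppercase());
            continue;
        }
        // A level number starts a data entry; anything else is a header.
        let Ok(level) = first.parse::<u32>() else { continue };
        if level == 1 {
            if let (Some(fd), Some(name)) = (pending_fd.take(), sent.get(1)) {
                pairs.push((fd, name.to_uppercase()));
            }
        }
    }
    pairs
}

fn main() {
    let sentences = vec![
        vec!["FILE", "SECTION"],
        vec!["FD", "CLIENTES"],
        vec!["01", "REG-CLIENTE", "PIC", "X(40)"],
        vec!["WORKING-STORAGE", "SECTION"],
        vec!["01", "WS-FIN", "PIC", "X"],
    ];
    for (fd, record) in fd_records(&sentences) {
        println!("{fd} -> {record}");
    }
}
```

Because the pending slot is consumed by `take()`, only the first `01` after an `FD` becomes its record. The sketch does not clear the slot at a SECTION header, so an `FD` with no record of its own would pick up the next `01` it meets.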
 ### feat(charka): PERFORM ... THRU as a real range of paragraphs
 `PERFORM A THRU C` runs A, B and C; previously the transpiler only