feat(minga): multi-language parser: Python, TypeScript, JavaScript, Go

Minga is no longer Rust-only. Any of the five dialects (Rust + 4 new
ones) is ingested into the CAS through its normalized AST, hashed
structurally, and synced over the DHT like any other node.
Auto-detection by extension makes minga ingest file.{py,ts,js,go}
"just work".

New API in minga_core::parse:
- Per-dialect functions: python, typescript, javascript, go (~6 LOC
  each on top of the shared parse_with), plus the existing rust.
- Dialect enum with parse(source) and name() for logging.
- detect_by_extension(ext) -> Option<Dialect>: rs/py/pyi/ts/js/mjs/
  cjs/go (case-insensitive). None for unknown extensions.
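
A minimal sketch of the extension mapping described above, using only
the standard library. The variant names and the strings returned by
name() are assumptions made for illustration; the real enum also
carries parse(source), which is omitted here because it depends on the
tree-sitter grammars.

```rust
/// Hypothetical mirror of the `Dialect` enum; variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Dialect {
    Rust,
    Python,
    TypeScript,
    JavaScript,
    Go,
}

impl Dialect {
    /// Short name for logging (the exact strings are an assumption).
    pub fn name(self) -> &'static str {
        match self {
            Dialect::Rust => "rust",
            Dialect::Python => "python",
            Dialect::TypeScript => "typescript",
            Dialect::JavaScript => "javascript",
            Dialect::Go => "go",
        }
    }
}

/// Maps an extension (without the leading dot) to a dialect,
/// ignoring ASCII case. Unknown extensions yield `None`.
pub fn detect_by_extension(ext: &str) -> Option<Dialect> {
    match ext.to_ascii_lowercase().as_str() {
        "rs" => Some(Dialect::Rust),
        "py" | "pyi" => Some(Dialect::Python),
        "ts" => Some(Dialect::TypeScript),
        "js" | "mjs" | "cjs" => Some(Dialect::JavaScript),
        "go" => Some(Dialect::Go),
        _ => None,
    }
}
```

Lowercasing once before the match keeps the table to a single arm per
dialect while still accepting FILE.PY or Main.Go.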

Wiring in minga-cli:
- cmd_ingest no longer hardcodes parse::rust; it uses
  detect_dialect(file)?.parse(...).
- initial_scan + cmd_watch switch from is_rs_file to
  is_supported_source.
- New CliError::UnsupportedLanguage { path, extension }; its message
  lists the recognized extensions.
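
The CLI side can be sketched roughly as follows. This is not the
actual minga-cli code: the SUPPORTED table, the check_supported helper,
and the message wording are hypothetical, and only the names
is_supported_source and CliError::UnsupportedLanguage { path, extension }
come from the commit itself.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Extensions named in the commit message (assumed to be the full list).
const SUPPORTED: &[&str] = &["rs", "py", "pyi", "ts", "js", "mjs", "cjs", "go"];

#[derive(Debug, PartialEq)]
pub enum CliError {
    UnsupportedLanguage { path: PathBuf, extension: String },
}

impl fmt::Display for CliError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The wording is illustrative; the point is that the
            // recognized extensions appear in the message.
            CliError::UnsupportedLanguage { path, extension } => write!(
                f,
                "unsupported language for {} (extension {:?}); recognized: {}",
                path.display(),
                extension,
                SUPPORTED.join(", ")
            ),
        }
    }
}

/// Lowercased extension of `path`, or an empty string if there is none.
fn lowercase_extension(path: &Path) -> String {
    path.extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase())
        .unwrap_or_default()
}

/// Filter used when scanning or watching a directory.
pub fn is_supported_source(path: &Path) -> bool {
    let ext = lowercase_extension(path);
    SUPPORTED.iter().any(|s| *s == ext)
}

/// Hypothetical helper: turns an unsupported path into a `CliError`.
pub fn check_supported(path: &Path) -> Result<(), CliError> {
    if is_supported_source(path) {
        Ok(())
    } else {
        Err(CliError::UnsupportedLanguage {
            path: path.to_path_buf(),
            extension: lowercase_extension(path),
        })
    }
}
```

Scanning and watching only need the boolean filter, while ingesting a
single file benefits from the richer error, which is presumably why the
commit introduces both.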

Notes on hashing:
- Structural hashing (cas::hash_node) works for every dialect. It is
  NOT alpha-equivalent.
- Alpha-equivalent hashing (alpha::hash_node_alpha) remains Rust-only:
  each language has different rules for binder vs constructor, so a
  per-language implementation is left as future work (it requires
  deep knowledge of each grammar).
- Sanity test structural_hash_distinguishes_languages verifies that
  "x = 1" parsed as Python != JS: the grammars do not share kinds, so
  the hashes differ. Important to avoid collisions.
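
To illustrate why differing node kinds yield differing hashes, here is
a toy model, not the real cas::hash_node. The Node type, the helper
names, and the use of std's DefaultHasher are stand-ins chosen for this
sketch, and the node kinds only approximate what the Python and
JavaScript grammars emit for "x = 1".

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy normalized AST node: grammar kind, optional leaf text, children.
pub struct Node {
    pub kind: &'static str,
    pub text: Option<&'static str>,
    pub children: Vec<Node>,
}

fn leaf(kind: &'static str, text: &'static str) -> Node {
    Node { kind, text: Some(text), children: Vec::new() }
}

fn branch(kind: &'static str, children: Vec<Node>) -> Node {
    Node { kind, text: None, children }
}

/// Feeds kind, leaf text, and child count, then recurses into children.
fn feed(node: &Node, hasher: &mut DefaultHasher) {
    node.kind.hash(hasher);
    node.text.hash(hasher);
    node.children.len().hash(hasher);
    for child in &node.children {
        feed(child, hasher);
    }
}

/// Stand-in structural hash (the real one is presumably cryptographic).
pub fn structural_hash(node: &Node) -> u64 {
    let mut hasher = DefaultHasher::new();
    feed(node, &mut hasher);
    hasher.finish()
}

/// "<name> = 1" with Python-like node kinds (approximate).
pub fn python_like(name: &'static str) -> Node {
    branch("module", vec![branch("expression_statement", vec![
        branch("assignment", vec![leaf("identifier", name), leaf("integer", "1")]),
    ])])
}

/// "<name> = 1" with JavaScript-like node kinds (approximate).
pub fn javascript_like(name: &'static str) -> Node {
    branch("program", vec![branch("expression_statement", vec![
        branch("assignment_expression", vec![leaf("identifier", name), leaf("number", "1")]),
    ])])
}
```

In this model python_like("x") and javascript_like("x") hash
differently because their kinds differ, and python_like("x") and
python_like("y") also hash differently because identifier text is part
of the hash, which is the sense in which a structural hash is not
alpha-equivalent.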

New deps (workspace + minga-core):
- tree-sitter-python 0.23, tree-sitter-typescript 0.23
  (LANGUAGE_TYPESCRIPT mode, not TSX), tree-sitter-javascript 0.23,
  tree-sitter-go 0.23.

Tests: 9 new in parse::tests (basic parse for 5 dialects +
detect_by_extension canonical/case-insensitive + name() +
structural_hash_distinguishes_languages). 108 green in minga-core,
10 in minga-cli, no regressions.

Pending: per-language alpha-hashing; the remaining alpha-Rust gaps
are documented in alpha.rs (if let, while let, let-else, let-chains,
or_pattern with bindings).
Sergio
2026-05-09 16:06:31 +00:00
parent f9a3c33586
commit 4db168253c
7 changed files with 357 additions and 12 deletions
+44
@@ -5874,7 +5874,11 @@ dependencies = [
"serde-big-array",
"thiserror 2.0.18",
"tree-sitter",
"tree-sitter-go",
"tree-sitter-javascript",
"tree-sitter-python",
"tree-sitter-rust",
"tree-sitter-typescript",
]
[[package]]
@@ -10335,12 +10339,42 @@ dependencies = [
"tree-sitter-language",
]
[[package]]
name = "tree-sitter-go"
version = "0.23.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b13d476345220dbe600147dd444165c5791bf85ef53e28acbedd46112ee18431"
dependencies = [
"cc",
"tree-sitter-language",
]
[[package]]
name = "tree-sitter-javascript"
version = "0.23.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bf40bf599e0416c16c125c3cec10ee5ddc7d1bb8b0c60fa5c4de249ad34dc1b1"
dependencies = [
"cc",
"tree-sitter-language",
]
[[package]]
name = "tree-sitter-language"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "009994f150cc0cd50ff54917d5bc8bffe8cad10ca10d81c34da2ec421ae61782"
[[package]]
name = "tree-sitter-python"
version = "0.23.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3d065aaa27f3aaceaf60c1f0e0ac09e1cb9eb8ed28e7bcdaa52129cffc7f4b04"
dependencies = [
"cc",
"tree-sitter-language",
]
[[package]]
name = "tree-sitter-rust"
version = "0.23.3"
@@ -10351,6 +10385,16 @@ dependencies = [
"tree-sitter-language",
]
[[package]]
name = "tree-sitter-typescript"
version = "0.23.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6c5f76ed8d947a75cc446d5fccd8b602ebf0cde64ccf2ffa434d873d7a575eff"
dependencies = [
"cc",
"tree-sitter-language",
]
[[package]]
name = "trice"
version = "0.4.0"