Commit Graph

11 Commits

Author SHA1 Message Date
sergio a9124449f9 feat(shipote): health endpoint + audit log + real token bucket (phase R)
- Request::Health → Response::Health { version, uptime_ms, alive_*,
  active_flows, dirty }. CLI: shipote health.
- handle_client reads peer_uid once at accept. audit_request emits
  info!(target: "audit", uid, action, detail) per mutation (create/stop/
  run/pipeline.*/flow.drop). Reads are omitted. Filterable with
  SHIPOTE_LOG=warn,audit=info.
- A real TokenBucket replaces rate_limit_sleep: refill by wall time,
  capacity = 1s of rate, negative debt triggers a proportional sleep.
  Allows real bursts instead of uniform chunk-by-chunk pacing.
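
The token-bucket behavior described above (refill by wall time, capacity of one second of rate, negative debt turned into a proportional sleep) can be sketched roughly as follows. This is an illustration only, not the shipote implementation: the struct layout and the method names `refill` and `consume` are assumptions, and elapsed time is passed in explicitly so the logic can be exercised without real sleeping.

```rust
use std::time::Duration;

/// Hypothetical token bucket measured in bytes. The balance may go
/// negative ("debt"), which is what allows a burst followed by a pause.
pub struct TokenBucket {
    rate: f64,     // bytes per second
    capacity: f64, // maximum stored tokens: one second worth of rate
    tokens: f64,   // current balance; negative means debt
}

impl TokenBucket {
    /// Start with a full bucket so the first second of traffic can burst.
    pub fn new(rate_bytes_per_sec: f64) -> Self {
        Self {
            rate: rate_bytes_per_sec,
            capacity: rate_bytes_per_sec,
            tokens: rate_bytes_per_sec,
        }
    }

    /// Credit tokens for `elapsed` wall time, clamped to the capacity.
    pub fn refill(&mut self, elapsed: Duration) {
        let credited = self.tokens + self.rate * elapsed.as_secs_f64();
        self.tokens = credited.min(self.capacity);
    }

    /// Spend `bytes` and return how long the caller should sleep to
    /// repay any resulting debt (zero while the balance is non-negative).
    pub fn consume(&mut self, bytes: usize) -> Duration {
        self.tokens -= bytes as f64;
        if self.tokens < 0.0 {
            Duration::from_secs_f64(-self.tokens / self.rate)
        } else {
            Duration::ZERO
        }
    }
}
```

In a real splitter task the caller would presumably measure elapsed time with `Instant`, call `refill`, then sleep for whatever `consume` returns.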

85 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 26, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:58:10 +00:00
sergio 18c0344a52 feat(shipote): throughput card + rate-limit + incremental snapshot (phase Q)
- shipote-shell: the Flow channels card now shows bytes_total + bytes/s
  per socket. A lookup helper avoids borrows in closures.
- DiscernPolicy.max_bytes_per_sec: the splitter task sleeps proportionally
  to the chunk size after each broadcast. Simple token bucket, v1.
- WorkspaceManager.dirty: AtomicBool. mark_dirty() on mutations that
  affect the snapshot. save_snapshot skips if clean and the path exists.
  restore_snapshot resets dirty=false (hydration is not a mutation).
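
A minimal sketch of the dirty-flag idea in the last bullet, assuming a standalone `DirtyFlag` type (hypothetical; in the commit the flag lives on WorkspaceManager). The write itself is abstracted as a closure, and the "path exists" check is passed in as a boolean.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for the `dirty` field on WorkspaceManager.
pub struct DirtyFlag {
    dirty: AtomicBool,
}

impl DirtyFlag {
    pub fn new() -> Self {
        Self { dirty: AtomicBool::new(false) }
    }

    /// Called by every mutation that affects the snapshot.
    pub fn mark_dirty(&self) {
        self.dirty.store(true, Ordering::Release);
    }

    /// Called after restoring a snapshot: hydration is not a mutation.
    pub fn reset(&self) {
        self.dirty.store(false, Ordering::Release);
    }

    /// Runs `write` unless the state is clean and the snapshot file
    /// already exists. Returns true if a write was performed.
    pub fn save_if_needed(&self, path_exists: bool, write: impl FnOnce()) -> bool {
        // Clear the flag before writing so that a mutation racing with
        // the write marks the state dirty again for the next save.
        let was_dirty = self.dirty.swap(false, Ordering::AcqRel);
        if !was_dirty && path_exists {
            return false;
        }
        write();
        true
    }
}
```

Clearing the flag before the write rather than after is a design choice of this sketch; the commit does not say which order shipote uses.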

85 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 26, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:20:50 +00:00
sergio 3486949d24 feat(shipote): throughput + persistent stats + peer auth (phase P)
- FlowMeter (atomic u64 + 32-sample rolling window) on each FlowChannel.
  flow_throughput() → (socket, bytes_total, bytes_per_sec). CLI:
  shipote flow throughput. Idle threshold 5s = rate 0.0.
- Snapshot v4 with persistent stats_history per workspace (cap 16).
  Separate PersistedStats to avoid Instant. Restore hydrates the VecDeque
  with source="persisted".
- SO_PEERCRED auth: the daemon rejects peers whose uid differs from its
  own. SHIPOTE_TRUST_ANYONE=1 = documented escape hatch.
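
One way to picture the FlowMeter from the first bullet is a bounded window of (timestamp, cumulative bytes) samples. The sketch below is single-threaded and uses explicit timestamps, whereas the commit mentions an atomic u64; how the real meter derives the rate from its window is not stated, so the "first sample to last sample" formula here is an assumption.

```rust
use std::collections::VecDeque;
use std::time::Duration;

const WINDOW: usize = 32; // rolling window size from the commit
const IDLE: Duration = Duration::from_secs(5); // idle threshold from the commit

/// Hypothetical, simplified throughput meter.
pub struct FlowMeter {
    bytes_total: u64,
    // (time since start, cumulative bytes at that time)
    samples: VecDeque<(Duration, u64)>,
}

impl FlowMeter {
    pub fn new() -> Self {
        Self { bytes_total: 0, samples: VecDeque::with_capacity(WINDOW) }
    }

    /// Record `bytes` observed at time `now`, evicting the oldest sample
    /// once the window is full.
    pub fn record(&mut self, now: Duration, bytes: u64) {
        self.bytes_total += bytes;
        if self.samples.len() == WINDOW {
            self.samples.pop_front();
        }
        self.samples.push_back((now, self.bytes_total));
    }

    pub fn bytes_total(&self) -> u64 {
        self.bytes_total
    }

    /// Average rate across the window; 0.0 when idle or underdetermined.
    pub fn bytes_per_sec(&self, now: Duration) -> f64 {
        match (self.samples.front(), self.samples.back()) {
            (Some(&(t0, b0)), Some(&(t1, b1))) => {
                let span = t1.saturating_sub(t0).as_secs_f64();
                if now.saturating_sub(t1) >= IDLE || span == 0.0 {
                    0.0
                } else {
                    (b1 - b0) as f64 / span
                }
            }
            _ => 0.0,
        }
    }
}
```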

84 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 25, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:58:41 +00:00
sergio 1cce50b290 feat(shipote): collision detection + stats history server-side (phase O)
- Flow socket names use the full pipeline_id (26-char ULID) + edge_idx.
  Zero collisions between pipelines (a ULID is globally unique). Falls
  back to a -N suffix if the path exists (cap 1000 retries).
- WorkspaceState.stats_history (VecDeque cap 64): workspace_stats
  appends on every call. API workspace_stats_history(id, tail). Protocol
  WorkspaceStatsHistory. The shell requests the history on the first
  probe → sparkline hydrated at boot, survives a shell restart.
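
The socket-naming fallback in the first bullet could look roughly like this. The exact name format (`<pipeline_id>-e<edge_idx>.sock`) and the function name `pick_socket_name` are assumptions made for illustration; only the full ULID, the edge index, the `-N` suffix and the 1000-retry cap come from the commit message. The existence check is injected as a closure so the sketch does not touch the filesystem.

```rust
/// Retry cap taken from the commit message.
const MAX_RETRIES: u32 = 1000;

/// Hypothetical helper: returns the first free socket name, or None if
/// the base name and all 1000 suffixed candidates are taken.
pub fn pick_socket_name(
    pipeline_id: &str,
    edge_idx: usize,
    exists: impl Fn(&str) -> bool,
) -> Option<String> {
    // Assumed layout: full ULID plus the edge index.
    let base = format!("{pipeline_id}-e{edge_idx}");
    let first = format!("{base}.sock");
    if !exists(&first) {
        return Some(first);
    }
    // Fallback: append -1, -2, ... until a free name is found.
    (1..=MAX_RETRIES)
        .map(|n| format!("{base}-{n}.sock"))
        .find(|candidate| !exists(candidate))
}
```

In practice `exists` would likely wrap something like `std::path::Path::exists` on the socket directory.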

84 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 25, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 10:55:21 +00:00
sergio a823c40fe1 feat(shipote): drain shutdown + persist live pipelines + batched query (phase N)
- Daemon SIGTERM/SIGINT: snapshot FIRST, then stop_with_grace(1s) on all
  workspaces. The grace period allows app-level cleanup.
- Snapshot v3 with live_pipelines: pipeline_supervisors are persisted;
  the daemon relaunches them on restore with their resources
  (Incarnator+DiscernPipeline). Separate RestoreOutcome so that core
  does not need the incarnator. Forward-compat v1/v2 via
  #[serde(default)].
- WorkspaceFullSummary: stats+quota+commands+flow_sockets in 1 roundtrip.
  The shell goes from N×4 requests/probe down to N×1 + 4 global ones.

83 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 24, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 10:48:11 +00:00
sergio c3f9c9e36a feat(shipote): pipeline backoff + quota card + logs follow (phase M)
- PipelineSpec.restart_backoff_ms + restart_max_backoff_ms + restart_max:
  exponential backoff between relaunches (anti-thrash).
  take_pending_restarts applies restart_max (0 = unlimited); once
  exceeded, the supervisor is dropped with a warning. The daemon does
  tokio::sleep(backoff) before the relaunch and scales current_backoff
  x2 up to the cap.
- shipote-shell "Quota breaches" card: the probe now includes
  WorkspaceQuota per workspace. Red if there are breaches, green if not.
- shipote logs --follow: polls the daemon every 200ms and prints the new
  suffix until the command finishes. No protocol changes. Best-effort:
  if the ring rotates faster than the poll, bytes are lost.
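
The restart policy in the first bullet amounts to a small state machine. The sketch below reuses the three field names from the commit, but the `Supervisor` type and its `next_restart` method are hypothetical, and the real code sleeps with tokio rather than returning the delay.

```rust
use std::time::Duration;

/// Field names follow PipelineSpec as described in the commit.
pub struct RestartPolicy {
    pub restart_backoff_ms: u64,
    pub restart_max_backoff_ms: u64,
    pub restart_max: u32, // 0 = unlimited
}

/// Hypothetical per-pipeline restart state.
pub struct Supervisor {
    policy: RestartPolicy,
    current_backoff_ms: u64,
    restarts: u32,
}

impl Supervisor {
    pub fn new(policy: RestartPolicy) -> Self {
        let start = policy.restart_backoff_ms.min(policy.restart_max_backoff_ms);
        Self { policy, current_backoff_ms: start, restarts: 0 }
    }

    /// Returns the delay to wait before the next relaunch, or None when
    /// the restart budget is exhausted and the supervisor should be dropped.
    pub fn next_restart(&mut self) -> Option<Duration> {
        let max = self.policy.restart_max;
        if max != 0 && self.restarts >= max {
            return None;
        }
        self.restarts += 1;
        let delay = self.current_backoff_ms;
        // Double the backoff for the following attempt, up to the cap.
        self.current_backoff_ms = delay
            .saturating_mul(2)
            .min(self.policy.restart_max_backoff_ms);
        Some(Duration::from_millis(delay))
    }
}
```

With a 200 ms base, a 1000 ms cap and `restart_max = 4`, the delays would be 200, 400, 800 and 1000 ms, after which the supervisor gives up.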

83 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 24, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 10:34:27 +00:00
sergio 4c9d1b4c1d feat(shipote): quota enforce + cgroup memory.max + pipeline restart (phase L)
- WorkspaceSpec.quota_enforce: QuotaAction (None|Log|Kill) per resource
  (mem, nproc). reap_dead applies the policy; Kill uses
  stop_with_grace(ZERO).
- ente_incarnate::cgroup::apply_rlimits_to_cgroup writes memory.max and
  pids.max. WorkspaceManager::create_with_id invokes it if
  soma.cgroup.path is set and delegation is available. The kernel
  OOM-kills on excess; fails silently if there is no delegation.
- PipelineSpec.restart_on_failure: bool. register_pipeline_supervisor
  retains the spec; reap_dead detects all-dead + any-failed → pushes to
  a queue; the daemon reaper drains it and relaunches the WHOLE pipeline
  (the intermediate pipes do not allow a partial restart).
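
As an illustration of the per-resource policy in the first bullet, the sketch below picks the strongest action among the breached resources. `QuotaAction` and its variants come from the commit; the `QuotaEnforce`, `Limits` and `Usage` structs, the `decide` function, and the "strongest action wins" rule are assumptions of this example.

```rust
/// Variants as named in the commit; ordering None < Log < Kill is an
/// assumption used to pick the strongest action.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum QuotaAction {
    None,
    Log,
    Kill,
}

/// Hypothetical per-resource policy (mem, nproc).
pub struct QuotaEnforce {
    pub mem: QuotaAction,
    pub nproc: QuotaAction,
}

/// Hypothetical limits; None means "no limit configured".
pub struct Limits {
    pub mem_bytes: Option<u64>,
    pub nproc: Option<u64>,
}

/// Hypothetical current accounting for a workspace.
pub struct Usage {
    pub mem_bytes: u64,
    pub nproc: u64,
}

fn breached(limit: Option<u64>, used: u64) -> bool {
    matches!(limit, Some(max) if used > max)
}

/// Returns the strongest configured action among breached resources.
pub fn decide(policy: &QuotaEnforce, limits: &Limits, usage: &Usage) -> QuotaAction {
    let mem = if breached(limits.mem_bytes, usage.mem_bytes) {
        policy.mem
    } else {
        QuotaAction::None
    };
    let nproc = if breached(limits.nproc, usage.nproc) {
        policy.nproc
    } else {
        QuotaAction::None
    };
    mem.max(nproc)
}
```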

82 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 24, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 10:22:46 +00:00
sergio 324a0c2d5d feat(shipote): multi-core CPU% + quota report + restart-on-failure (phase K)
- WorkspaceStats.cpu_cores via cached sysconf. The CLI shows
  `cpu_pct: 98.7 % (24.7% total / 4 cores)`.
- workspace_quota compares SomaSpec.rlimits against current accounting.
  Reports human-readable breaches. NO automatic enforcement in v1.
- run_with_options(.., restart_on_failure): if exit != 0, the reaper
  relaunches with exponential backoff 200ms → 30s cap.
  Inner.restart_specs persists the spec between attempts.
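
The two numbers in the CLI line of the first bullet relate as follows: CPU% is the CPU time consumed between two samples divided by the wall time between them (100% = one saturated core), and the "total" figure divides that by the core count. The `CpuSample` and `CpuTracker` types below are hypothetical; only the arithmetic is meant to mirror the commit.

```rust
/// Hypothetical sample: cumulative CPU time and a wall-clock timestamp,
/// both in microseconds.
#[derive(Clone, Copy)]
pub struct CpuSample {
    pub cpu_usec: u64,
    pub wall_usec: u64,
}

/// Hypothetical tracker holding the previous sample and the core count.
pub struct CpuTracker {
    last: Option<CpuSample>,
    cores: u32,
}

impl CpuTracker {
    pub fn new(cores: u32) -> Self {
        Self { last: None, cores: cores.max(1) }
    }

    /// Returns (cpu_pct, total_pct). None for the first sample (no
    /// baseline) or when no wall time has elapsed.
    pub fn update(&mut self, now: CpuSample) -> Option<(f64, f64)> {
        let prev = self.last.replace(now)?;
        let wall = now.wall_usec.saturating_sub(prev.wall_usec);
        if wall == 0 {
            return None;
        }
        let cpu = now.cpu_usec.saturating_sub(prev.cpu_usec);
        let cpu_pct = cpu as f64 * 100.0 / wall as f64;
        Some((cpu_pct, cpu_pct / self.cores as f64))
    }
}
```

For example, 987 000 µs of CPU over 1 000 000 µs of wall time on 4 cores gives 98.7 % of one core, or about 24.7 % of the machine, matching the CLI output quoted above.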

81 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 22, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 01:32:39 +00:00
sergio d8727a3038 feat(shipote): CPU% + pipeline live-tail + replay by bytes (phase J)
- CPU% derived server-side between samples
  (WorkspaceState.last_cpu_sample). 100% = 1 saturated core. The first
  sample returns None (no baseline).
- shipote pipeline run --tail: after launching, subscribes to the first
  flow_socket and dumps bytes until EOF. Automatically implies --tap.
- DiscernPolicy.replay_bytes: additional byte cap for the FlowChannel
  replay buffer. evict_for_incoming takes the incoming chunk into
  account so that post-push the buffer NEVER exceeds the caps.
- shipote-shell: stats history extends the sparkline with %CPU.
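
The invariant stated for evict_for_incoming (the buffer never exceeds its caps after a push) can be sketched with a deque bounded by both chunk count and byte count. The `ReplayBuffer` type is hypothetical, and dropping a chunk that is larger than the byte cap is an assumption of this sketch; the commit does not say how that case is handled.

```rust
use std::collections::VecDeque;

/// Hypothetical replay buffer capped by chunk count and total bytes.
pub struct ReplayBuffer {
    chunks: VecDeque<Vec<u8>>,
    bytes: usize,
    max_chunks: usize,
    max_bytes: usize,
}

impl ReplayBuffer {
    pub fn new(max_chunks: usize, max_bytes: usize) -> Self {
        Self { chunks: VecDeque::new(), bytes: 0, max_chunks, max_bytes }
    }

    /// Evict oldest chunks until the incoming one fits under both caps.
    fn evict_for_incoming(&mut self, incoming_len: usize) {
        while !self.chunks.is_empty()
            && (self.chunks.len() + 1 > self.max_chunks
                || self.bytes + incoming_len > self.max_bytes)
        {
            if let Some(old) = self.chunks.pop_front() {
                self.bytes -= old.len();
            }
        }
    }

    pub fn push(&mut self, chunk: Vec<u8>) {
        self.evict_for_incoming(chunk.len());
        // Assumption: a chunk that can never fit is simply not retained.
        if self.max_chunks > 0 && chunk.len() <= self.max_bytes {
            self.bytes += chunk.len();
            self.chunks.push_back(chunk);
        }
    }

    pub fn len(&self) -> usize {
        self.chunks.len()
    }

    pub fn bytes(&self) -> usize {
        self.bytes
    }
}
```

Evicting before the push, with the incoming length included in the check, is what keeps the caps from being exceeded even momentarily.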

80 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 21, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 00:57:04 +00:00
sergio 36dac00c8d feat(shipote): data plane + DAG fan-in/out + stats + lifecycle (phases F-I)
Pipeline runtime:
- Fan-out 1→N (a splitter task replicates to the N consumers) and fan-in
  N→1 (a merger task with mpsc + reader-per-input). Non-linear DAGs are
  supported.
- Flow channels: Unix socket + tokio broadcast with a replay buffer
  configurable per pipeline (DiscernPolicy.replay_chunks). External
  subscribers via `shipote flow tail <socket>`.
- Templating in specs with `${KEY}` (CLI `--var KEY=VALUE`). Recursive
  walk over serde_json::Value; supports every string in the schema.
- Saved pipelines (`pipeline save/saved-list/drop/run-saved`)
  persist with the snapshot.
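
The `${KEY}` templating in the third bullet can be illustrated with a recursive walk over a tree of values. To stay dependency-free, the sketch uses a tiny `Value` enum as a stand-in for serde_json::Value. The function names `substitute` and `expand`, and the choice to leave unknown keys and unterminated placeholders untouched, are assumptions of this example.

```rust
use std::collections::HashMap;

/// Minimal stand-in for serde_json::Value (assumption for this sketch).
#[derive(Debug, PartialEq)]
pub enum Value {
    Num(f64),
    Str(String),
    List(Vec<Value>),
    Map(Vec<(String, Value)>),
}

/// Replace every `${KEY}` whose KEY is present in `vars`.
pub fn substitute(input: &str, vars: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let key = &after[..end];
                match vars.get(key) {
                    Some(value) => out.push_str(value),
                    // Unknown key: keep the placeholder verbatim.
                    None => out.push_str(&rest[start..start + 2 + end + 1]),
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unterminated placeholder: copy the remainder as-is.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// Walk the tree and substitute inside every string leaf.
pub fn expand(value: &mut Value, vars: &HashMap<String, String>) {
    match value {
        Value::Str(s) => *s = substitute(s, vars),
        Value::List(items) => items.iter_mut().for_each(|v| expand(v, vars)),
        Value::Map(entries) => entries.iter_mut().for_each(|(_, v)| expand(v, vars)),
        Value::Num(_) => {}
    }
}
```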

Command lifecycle:
- Per-stream log capture (stdout/stderr separate) via O_CLOEXEC pipe +
  AsyncFd. CLI `shipote logs <ws> <cmd> --stream {stdout,stderr,both}`.
- Graceful stop with configurable timeout: SIGTERM → grace → SIGKILL.
  Both at the workspace level and for an individual pipeline.
- The existing TTL auto-stop (Phase C) keeps working.

ente-incarnate:
- Declarative ChildStdio (Phase C) + new declarative ChildPreExec:
  NoNewPrivs, ParentDeathSig, Dumpable, NewSession, Chdir, Umask.
- Async-signal-safe pre-execve application on both paths (plain via
  Command::pre_exec, namespaced via the clone(2) callback).

Observability:
- WorkspaceStats: RSS + RSS peak (VmHWM or cgroup memory.peak) + CPU usec
  + uptime. Source is per-proc or cgroup depending on delegation.
- shipote-shell with an ASCII sparkline per workspace (history cap 24),
  a card of active flow channels, and a view of commands + saved
  pipelines.
- Tap → broker: each edge enriched with a TypeRef is announced as an
  ephemeral Card via SidecarPool (graceful if the broker is not running).

Discern:
- Integrated into yahweh-provider-fs (mime_type in EntityNode).
- Integrated into nouser-core::cluster::pick_lens as a fallback when the
  extension falls through to Lens::Grid.

79 tests pass: ente-incarnate (16), nouser-core (27), shipote-card (8),
shipote-core (20), shipote-discern (5), yahweh-provider-fs (3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 00:29:46 +00:00
sergio c22d2480b9 shell 2026-05-10 21:58:16 +00:00