feat(shipote): quota enforce + cgroup memory.max + pipeline restart (phase L)

- WorkspaceSpec.quota_enforce: QuotaAction (None|Log|Kill) per resource
  (mem, nproc). reap_dead applies the policy; Kill uses
  stop_with_grace(ZERO).
- ente_incarnate::cgroup::apply_rlimits_to_cgroup writes memory.max and
  pids.max. WorkspaceManager::create_with_id calls it when
  soma.cgroup.path and delegation are present. The kernel OOM-kills when
  memory.max is exceeded; without delegation the write fails without
  returning an error (only a warning is logged).
- PipelineSpec.restart_on_failure: bool. register_pipeline_supervisor
  retains the spec; reap_dead detects all-dead + any-failed → pushes to a
  queue; the daemon reaper drains it and relaunches the ENTIRE pipeline
  (the intermediate pipes do not allow a partial restart).
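
A minimal sketch of the two reap-time decisions described above, not the
code from this commit. Only QuotaAction (None|Log|Kill), the mem/nproc
resources and restart_on_failure come from the message; QuotaEnforce,
Usage, Limits, StageState, decide and should_restart are hypothetical
names, and both "exceeded means used > limit" and "failed means non-zero
exit code" are assumptions.

```rust
/// Action taken when a quota is exceeded. Variants are ordered so that
/// `max` picks the stronger one: None < Log < Kill.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum QuotaAction {
    None,
    Log,
    Kill,
}

/// Hypothetical per-resource policy (one action for mem, one for nproc).
#[derive(Debug, Clone, Copy)]
pub struct QuotaEnforce {
    pub mem: QuotaAction,
    pub nproc: QuotaAction,
}

/// Hypothetical usage snapshot and configured limits.
pub struct Usage {
    pub mem_bytes: u64,
    pub nproc: u64,
}
pub struct Limits {
    pub mem_bytes: Option<u64>,
    pub nproc: Option<u64>,
}

/// Returns the strongest action triggered by any exceeded resource.
/// Assumption: a resource is "exceeded" when usage is strictly above its limit.
pub fn decide(policy: QuotaEnforce, usage: &Usage, limits: &Limits) -> QuotaAction {
    let checks = [
        (limits.mem_bytes, usage.mem_bytes, policy.mem),
        (limits.nproc, usage.nproc, policy.nproc),
    ];
    let mut out = QuotaAction::None;
    for (limit, used, action) in checks {
        if matches!(limit, Some(l) if used > l) {
            out = out.max(action);
        }
    }
    out
}

/// State of one pipeline stage as seen by the reaper (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StageState {
    Running,
    Exited(i32),
}

/// Whole-pipeline restart condition: every stage is dead and at least one
/// failed. Assumption: "failed" means a non-zero exit code.
pub fn should_restart(restart_on_failure: bool, stages: &[StageState]) -> bool {
    let all_dead = stages.iter().all(|s| matches!(s, StageState::Exited(_)));
    let any_failed = stages
        .iter()
        .any(|s| matches!(s, StageState::Exited(code) if *code != 0));
    restart_on_failure && !stages.is_empty() && all_dead && any_failed
}
```

Under these assumptions, a workspace that exceeds both limits with
mem=Kill and nproc=Log resolves to Kill, and a pipeline with one stage
still running is never queued for restart, even if another stage has
already failed.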

82 tests pass (ente-incarnate 16, nouser-core 27, shipote-card 8,
shipote-core 24, shipote-discern 5, yahweh-provider-fs 3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
sergio
2026-05-11 10:22:46 +00:00
parent 324a0c2d5d
commit 4c9d1b4c1d
7 changed files with 401 additions and 5 deletions
+28 -3
@@ -1,8 +1,8 @@
//! Resolution and creation of cgroups v2 for the child.
use crate::error::IncarnateError;
use brahman_card::CgroupSpec;
use std::path::PathBuf;
use brahman_card::{CgroupSpec, ResourceLimits};
use std::path::{Path, PathBuf};
/// Current cgroup of the calling process. We use it as a prefix for paths
/// declared as relative in `CgroupSpec.path`.
@@ -58,8 +58,33 @@ pub fn ensure_cgroup(spec: &CgroupSpec) -> Result<PathBuf, IncarnateError> {
Ok(abs)
}
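/// Writes `memory.max` and `pids.max` to the cgroup according to `rlimits`.
/// Write failures (for example, the files are not writable because the
/// cgroup is not delegated) are logged as warnings and otherwise ignored.
/// The kernel OOM-kills when `memory.max` is exceeded and blocks forks
/// when `pids.max` is reached.
///
/// `memory.max` accepts `max` or a number in bytes. `pids.max` likewise
/// accepts `max` or a number (a task count). Returns the settings that
/// were actually applied.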
pub fn apply_rlimits_to_cgroup(cgroup_abs: &Path, rlimits: &ResourceLimits) -> Vec<String> {
let mut applied = Vec::new();
if let Some(mem) = rlimits.mem_bytes {
let path = cgroup_abs.join("memory.max");
match std::fs::write(&path, format!("{mem}\n")) {
Ok(_) => applied.push(format!("memory.max={mem}")),
Err(e) => tracing::warn!(?e, path = %path.display(), "memory.max write failed"),
}
}
if let Some(np) = rlimits.nproc {
let path = cgroup_abs.join("pids.max");
match std::fs::write(&path, format!("{np}\n")) {
Ok(_) => applied.push(format!("pids.max={np}")),
Err(e) => tracing::warn!(?e, path = %path.display(), "pids.max write failed"),
}
}
applied
}
/// Moves `pid` into `cgroup_abs/cgroup.procs`.
pub fn move_to_cgroup(cgroup_abs: &std::path::Path, pid: nix::unistd::Pid) -> Result<(), IncarnateError> {
pub fn move_to_cgroup(cgroup_abs: &Path, pid: nix::unistd::Pid) -> Result<(), IncarnateError> {
let procs = cgroup_abs.join("cgroup.procs");
std::fs::write(&procs, format!("{}\n", pid.as_raw())).map_err(|e| match e.kind() {
std::io::ErrorKind::PermissionDenied => IncarnateError::CgroupNotWritable {