Results for "recursive value"
Fundamental recursive relationship defining optimal value functions.
Combines value estimation (critic) with policy learning (actor).
Average value under a distribution.
Sample mean converges to the expected value as the sample size grows.
Stores past attention keys and values to speed up autoregressive decoding.
Expected cumulative reward from a state or state-action pair.
Decomposes a matrix into orthogonal components; used in embeddings and compression.
Model optimizes objectives misaligned with human values.
Maximum expected loss under normal conditions, at a given confidence level over a given time horizon.
Inferring and aligning with human preferences.
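
Example for the first result (recursive relationship defining optimal value functions): a minimal sketch, assuming a made-up two-state deterministic problem. The table TRANSITIONS, the discount GAMMA, and the function name value_iteration are all hypothetical and chosen only for illustration. Each sweep replaces a state's value with the best immediate reward plus the discounted value of the next state, and repeats until the values stop changing.

```python
# Hypothetical two-state problem: TRANSITIONS[state][action] = (next_state, reward).
TRANSITIONS = {
    0: {0: (0, 0.0), 1: (1, 1.0)},
    1: {0: (1, 2.0), 1: (0, 0.0)},
}
GAMMA = 0.9  # assumed discount factor


def value_iteration(transitions, gamma, tol=1e-10, max_iters=10_000):
    """Apply the recursive backup repeatedly until values change by less than tol."""
    values = {state: 0.0 for state in transitions}
    for _ in range(max_iters):
        # Recursive backup: best (reward + discounted value of the next state).
        new_values = {
            state: max(
                reward + gamma * values[next_state]
                for next_state, reward in actions.values()
            )
            for state, actions in transitions.items()
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    return values


optimal_values = value_iteration(TRANSITIONS, GAMMA)
print(optimal_values)
```

In this toy setup, staying in state 1 earns a reward of 2 on every step, so its value approaches 2 / (1 - 0.9) = 20, and state 0 approaches 1 + 0.9 * 20 = 19 by moving there first.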