Results for "shared reward"
Learning only from data generated by the current policy.
Balancing exploration of new behaviors against exploitation of known rewards.
Ensuring AI systems pursue intended human goals.
Ensuring the learned behavior matches the intended objective.
Model behaves well during training but not during deployment.
Learning policies from expert demonstrations.
Tendency to acquire control or resources.
Inferring and aligning with human preferences.
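The exploration/exploitation balance above can be made concrete with a small sketch. This is a minimal epsilon-greedy agent on a hypothetical multi-armed bandit (the arm reward means, epsilon, and step count are illustrative assumptions, not from the definitions): with probability epsilon the agent explores a random arm, otherwise it exploits the arm with the best current reward estimate.

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy on a multi-armed bandit (illustrative sketch).

    true_means: hypothetical expected reward per arm, an assumption
    chosen for this example.
    """
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n       # number of pulls per arm
    estimates = [0.0] * n  # running mean reward estimate per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm to learn about its reward.
            arm = rng.randrange(n)
        else:
            # Exploit: pick the arm with the highest estimated reward.
            arm = max(range(n), key=lambda a: estimates[a])
        # Noisy reward sampled around the arm's true mean.
        reward = true_means[arm] + rng.gauss(0, 0.1)
        counts[arm] += 1
        # Incremental update of the running mean estimate.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

With enough steps, the best arm ends up pulled most often, while the small epsilon keeps the agent sampling the other arms occasionally so it never stops learning about them.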