Results for "learned objectives"

46 results

Inner Alignment Advanced

Ensuring that the objective a trained model actually pursues matches the objective specified for training.

AI Safety & Alignment
Mesa-Optimizer Advanced

Learned subsystem that optimizes its own objective.

AI Safety & Alignment
Hyperparameters Intermediate

Configuration choices that govern training or architecture and are typically set before training rather than learned from data.

Optimization
Highway Network Intermediate

Early architecture using learned gates for skip connections.

AI Economics & Strategy
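
As a rough illustration of the "learned gates" in the Highway Network entry, the sketch below computes one highway layer as y = T(x) * H(x) + (1 - T(x)) * x, where T is a sigmoid gate. The function names, the use of tanh for the transform H, and the plain-list weight format are assumptions made for this example, not part of the glossary entry.

```python
import math


def sigmoid(z):
    """Logistic function, used here as the gate activation."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer: a gated mix of a transform H(x) and the input x.

    Assumption: H uses tanh and input/output dimensions are equal,
    so the skip path can carry x through unchanged.
    """
    h = [math.tanh(a + b) for a, b in zip(matvec(W_h, x), b_h)]  # transform H(x)
    t = [sigmoid(a + b) for a, b in zip(matvec(W_t, x), b_t)]    # gate T(x) in (0, 1)
    # Gate near 1 -> output follows H(x); gate near 0 -> input is carried through.
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]
```

With a strongly negative gate bias the layer behaves almost like an identity mapping, which is the skip-connection behavior the entry refers to.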
Specification Gaming Advanced

Model exploits poorly specified objectives.

AI Safety & Alignment
Latent Space Intermediate

The internal space where learned representations live; operations here often correlate with semantics or generative factors.

Foundations & Theory
Representation Learning Intermediate

Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.

Machine Learning
Model Intermediate

A parameterized mapping from inputs to outputs; includes its architecture and learned parameters.

Foundations & Theory
Overgeneralization Intermediate

Applying learned patterns to cases where they do not hold.

Model Failure Modes
Model-Based RL Advanced

RL using learned or known environment models.

Reinforcement Learning
World Model Frontier

Learned model of environment dynamics.

World Models & Cognition
Deceptive Alignment Advanced

Model behaves as if aligned during training while pursuing a different objective once deployed.

AI Safety & Alignment
Reward Hacking Advanced

Maximizing the reward signal without fulfilling the intended goal.

AI Safety & Alignment
Instrumental Convergence Advanced

Tendency for agents to pursue similar subgoals, such as acquiring resources or self-preservation, regardless of final goal.

AI Safety & Alignment
Value Misalignment Advanced

Model optimizes objectives misaligned with human values.

AI Safety & Alignment
Outer Alignment Advanced

Specifying a training objective that correctly captures the intended goal.

AI Safety & Alignment
Corrigibility Advanced

Willingness of system to accept correction or shutdown.

AI Safety & Alignment
Competitive Game Advanced

Agents have opposing objectives.

Agents & Autonomy
Parameters Intermediate

The learned numeric values of a model adjusted during training to minimize a loss function.

Foundations & Theory
Multi-Head Attention Intermediate

Attention mechanism that lets a model attend to information from different representation subspaces simultaneously.

AI Economics & Strategy
Computational Learning Theory Intermediate

A theoretical framework analyzing what classes of functions can be learned, how efficiently, and with what guarantees.

AI Economics & Strategy
Memory Augmentation Intermediate

Extending agents with long-term memory stores.

AI Economics & Strategy
Attention Head Intermediate

A single attention mechanism within multi-head attention.

AI Economics & Strategy
Generative Model Advanced

Models that learn to generate samples resembling training data.

Diffusion & Generative Models
Variational Autoencoder Advanced

Autoencoder using probabilistic latent variables and KL regularization.

Diffusion & Generative Models
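
To make the "KL regularization" in the Variational Autoencoder entry concrete, here is a minimal sketch of the closed-form KL divergence between a diagonal Gaussian posterior N(mu, sigma^2) and a standard normal prior. The diagonal-Gaussian posterior, the N(0, I) prior, the log-variance parameterization, and the function name are assumptions for this example; the entry itself does not specify them.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    Closed form per dimension: 0.5 * (mu^2 + sigma^2 - log(sigma^2) - 1).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0
        for m, lv in zip(mu, log_var)
    )
```

The term is zero when the posterior equals the prior and grows as the encoder's mean or variance drifts away from it; in a typical VAE loss it is added to the reconstruction error.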
Zero-Shot Prompting Intro

Task instruction without examples.

Prompting & Instructions
Catastrophic Forgetting Intermediate

Loss of old knowledge when learning new tasks.

Model Failure Modes
Inverse Reinforcement Learning Advanced

Inferring reward function from observed behavior.

Reinforcement Learning
Lifelong Learning Advanced

Learning continually from new tasks without catastrophically forgetting earlier ones.

Agents & Autonomy
Multi-Agent System Intermediate

Multiple agents interacting cooperatively or competitively.

AI Economics & Strategy
