Fast approximation of costly simulations.
Configuration choices, typically set rather than learned during training, that govern the training procedure or model architecture.
Uses an exponential moving average of gradients to speed convergence and reduce oscillation.
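A minimal sketch of this momentum technique on a toy quadratic; the decay rate `beta`, step size `lr`, and iteration count are illustrative choices, not prescribed values.

```python
# Gradient descent with momentum on f(x) = x^2 (toy example).
# beta and lr are illustrative, not prescribed values.
def grad(x):
    return 2.0 * x  # derivative of x^2

x, v = 5.0, 0.0
beta, lr = 0.9, 0.1
for _ in range(300):
    v = beta * v + (1 - beta) * grad(x)  # exponential moving average of gradients
    x -= lr * v                          # update along the smoothed direction
```

The velocity `v` smooths out sign flips in the raw gradient, which is what damps oscillation.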
Controls the size of parameter updates; too high causes divergence, too low trains slowly or gets stuck.
Variability introduced by minibatch sampling during SGD.
Limiting gradient magnitude to prevent exploding gradients.
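A sketch of clipping by global norm; the helper name is hypothetical (frameworks ship equivalents, e.g. `torch.nn.utils.clip_grad_norm_`).

```python
import math

# Rescale the whole gradient vector if its L2 norm exceeds max_norm.
def clip_by_global_norm(grads, max_norm):
    total = math.sqrt(sum(g * g for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads

clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)  # original norm is 5.0
```

Rescaling the whole vector preserves the gradient's direction, unlike clipping each component independently.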
Matrix of second derivatives describing local curvature of loss.
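A numerical sketch: estimating this matrix of second derivatives for a toy function via central finite differences (the function and step size `h` are illustrative).

```python
# Finite-difference Hessian estimate for f(x, y) = x^2 + 3xy,
# whose exact Hessian is [[2, 3], [3, 0]].
def f(p):
    x, y = p
    return x * x + 3.0 * x * y

def hessian(f, p, h=1e-4):
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # central difference over the (i, j) pair of coordinates
            pp = list(p); pp[i] += h; pp[j] += h
            pm = list(p); pm[i] += h; pm[j] -= h
            mp = list(p); mp[i] -= h; mp[j] += h
            mm = list(p); mm[i] -= h; mm[j] -= h
            H[i][j] = (f(pp) - f(pm) - f(mp) + f(mm)) / (4.0 * h * h)
    return H

H = hessian(f, [1.0, 1.0])
```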
Optimizing policies directly via gradient ascent on expected reward.
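A REINFORCE-style sketch of this idea on a 2-armed bandit; the rewards, learning rate, and iteration count are all illustrative.

```python
import math
import random

random.seed(0)
theta = [0.0, 0.0]     # action preferences; policy = softmax(theta)
rewards = [1.0, 0.0]   # action 0 pays off, action 1 does not
lr = 0.1

def softmax(t):
    exps = [math.exp(v) for v in t]
    total = sum(exps)
    return [e / total for e in exps]

for _ in range(2000):
    p = softmax(theta)
    a = 0 if random.random() < p[0] else 1   # sample an action from the policy
    r = rewards[a]
    # gradient ascent on expected reward: grad log pi(a | theta) = onehot(a) - p
    for i in range(len(theta)):
        theta[i] += lr * r * ((1.0 if i == a else 0.0) - p[i])
```

Over training, the policy shifts probability mass toward the rewarded action.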
Matrix of curvature information.
Measure of vector magnitude; used in regularization and optimization.
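The two norms most often seen in this role, as a short sketch; the squared L2 norm is the usual weight-decay penalty in regularization.

```python
import math

# L1 norm: sum of absolute values (encourages sparsity when penalized).
def l1_norm(v):
    return sum(abs(x) for x in v)

# L2 norm: Euclidean length (its square is the weight-decay penalty).
def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))
```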
Optimization under uncertainty.
Lowest possible loss.
Model optimizes objectives misaligned with human values.
The field of building systems that perform tasks associated with human intelligence (perception, reasoning, language, planning, and decision-making) via algorithms.
A subfield of AI where models learn patterns from data to make predictions or decisions, improving with experience rather than explicit rule-coding.
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
Training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions like smoothness/cluster structure.
Learning where data arrives sequentially and the model updates continuously, often under changing distributions.
Methods that learn training procedures or initializations so models can adapt quickly to new tasks with little data.
Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.
The learned numeric values of a model adjusted during training to minimize a loss function.
A scalar measure optimized during training, typically expected loss over data, sometimes with regularization terms.
Average of squared residuals; common regression objective.
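The standard definition as a short sketch; the sample values are illustrative.

```python
# Mean squared error: average of squared residuals over paired
# targets and predictions.
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

error = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # residuals 0, 0, -2
```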
Iterative method that updates parameters in the direction of negative gradient to minimize loss.
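A minimal sketch of the update rule on a toy objective; the learning rate and iteration count are illustrative.

```python
# Gradient descent on f(x) = (x - 3)^2, minimized at x = 3.
def grad(x):
    return 2.0 * (x - 3.0)

x, lr = 0.0, 0.1
for _ in range(200):
    x -= lr * grad(x)  # step in the direction of the negative gradient
```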
Popular optimizer combining momentum and per-parameter adaptive step sizes via first/second moment estimates.
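A sketch of this update for a single scalar parameter on a toy quadratic; the hyperparameter values are the commonly used defaults, and the objective is illustrative.

```python
import math

def grad(x):
    return 2.0 * x  # derivative of the toy objective f(x) = x^2

x, m, v = 5.0, 0.0, 0.0
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 1001):
    g = grad(x)
    m = b1 * m + (1 - b1) * g        # first moment: EMA of gradients (momentum)
    v = b2 * v + (1 - b2) * g * g    # second moment: EMA of squared gradients
    m_hat = m / (1 - b1 ** t)        # bias correction for zero initialization
    v_hat = v / (1 - b2 ** t)
    x -= lr * m_hat / (math.sqrt(v_hat) + eps)  # per-parameter adaptive step
```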
A gradient method using random minibatches for efficient training on large datasets.
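A minibatch SGD sketch for a one-parameter least-squares fit; the dataset, batch size, and learning rate are illustrative.

```python
import random

random.seed(0)
data = [(float(x), 2.0 * x) for x in range(1, 21)]  # y = w * x with true w = 2

w, lr, batch_size = 0.0, 0.001, 4
for _ in range(2000):
    minibatch = random.sample(data, batch_size)  # random minibatch
    # gradient of mean squared error over the minibatch w.r.t. w
    g = sum(2.0 * (w * x - y) * x for x, y in minibatch) / batch_size
    w -= lr * g  # noisy but unbiased gradient step
```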
One complete pass over the entire training dataset.
A parameterized function composed of interconnected units organized in layers with nonlinear activations.
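A forward pass through a tiny two-layer instance of such a function, as a sketch; all weights and inputs are illustrative.

```python
# Two-layer network: linear map -> ReLU nonlinearity -> linear map.
def relu(x):
    return max(0.0, x)

def forward(x, W1, b1, W2, b2):
    # hidden layer: one ReLU unit per row of W1
    h = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    # output layer: scalar linear combination of hidden units
    return sum(w * hi for w, hi in zip(W2, h)) + b2

y = forward([1.0, 2.0],
            W1=[[1.0, -1.0], [0.5, 0.5]], b1=[0.0, 0.0],
            W2=[2.0, 1.0], b2=0.5)
```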
Gradients grow too large, causing divergence; mitigated by gradient clipping, normalization, and careful initialization.
Nonlinear functions enabling networks to approximate complex mappings; ReLU variants dominate modern DL.
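The standard definitions of ReLU and one common variant; `alpha` below is a typical leaky-ReLU default, not a mandated value.

```python
# ReLU: zero for negative inputs, identity for positive inputs.
def relu(x):
    return max(0.0, x)

# Leaky ReLU: small negative slope avoids "dead" units.
def leaky_relu(x, alpha=0.01):
    return x if x > 0 else alpha * x
```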