Results for "direct optimization"
Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.
Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Ordering training samples from easier to harder to improve convergence or generalization.
Training across many devices/silos without centralizing raw data; aggregates updates, not data.
Converts logits to probabilities by exponentiation and normalization; common in classification and LMs.
Practices for operationalizing ML: versioning, CI/CD, monitoring, retraining, and reliable production management.
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
Maliciously inserting or altering training data to implant backdoors or degrade performance.
Reconstructing a model or its capabilities via API queries or leaked artifacts.
AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.
Assigning labels per pixel (semantic) or per instance (instance segmentation) to map object boundaries.
Measures how much information an observable random variable carries about unknown parameters.
Estimating parameters by maximizing likelihood of observed data.
A narrow minimum of the loss landscape, often associated with poorer generalization.
A wide basin of the loss landscape, often correlated with better generalization.
Gradually increasing learning rate at training start to avoid divergence.
Attention mechanisms that reduce the quadratic cost of full self-attention with respect to sequence length.
Recovering training data from gradients.
Inferring sensitive features of training data.
Simultaneous localization and mapping: building a map of an unknown environment while tracking the robot's position within it.
Recovering 3D structure from images.
Predicting future values from past observations.
Using production outcomes to improve models.
Compute, time, energy, and monetary cost of training a model.
Measures similarity and projection between vectors.
Sensitivity of a function to input perturbations.
Matrix of first-order derivatives for vector-valued functions.
Vector of partial derivatives of a scalar function; points in the direction of steepest ascent.
Measures joint variability between variables.
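
A few of the entries above describe small computations that may be easier to follow in code. The sketch below uses plain Python for three of them: converting logits to probabilities, increasing the learning rate at the start of training, and measuring joint variability. The function names (`softmax`, `warmup_lr`, `covariance`) are hypothetical, the linear warmup shape is an assumption (the entry does not name a schedule), and the covariance uses the sample (n - 1) denominator, which is one common convention.

```python
import math


def softmax(logits):
    """Convert logits to probabilities: exponentiate, then normalize."""
    # Subtracting the max is a common numerical-stability step;
    # it leaves the resulting probabilities unchanged.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def warmup_lr(step, base_lr, warmup_steps):
    """Linear warmup (assumed schedule): ramp up to base_lr, then hold."""
    if step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


if __name__ == "__main__":
    print(softmax([1.0, 2.0, 3.0]))
    print([warmup_lr(s, 0.1, 5) for s in range(7)])
    print(covariance([1, 2, 3], [2, 4, 6]))
```

Equal logits give a uniform distribution under `softmax`, and `covariance` is positive when the two sequences tend to rise together. Other warmup shapes (for example exponential) would fit the same entry.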