The model relies on signals that are irrelevant to the task but happen to correlate with the target in the training data.
Performance drop when moving from simulation to reality.
Differences between training and deployed patient populations.
Learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
A structured collection of examples used to train/evaluate models; quality, bias, and coverage often dominate outcomes.
A conceptual framework describing error as the sum of systematic error (bias) and sensitivity to data (variance).
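For squared error, this decomposition is commonly written as (with true function f, learned predictor f-hat, and irreducible noise sigma squared):

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\Big]}_{\text{variance}}
+ \sigma^2
```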
Popular optimizer combining momentum and per-parameter adaptive step sizes via first/second moment estimates.
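A minimal sketch of one Adam update over plain Python lists (function name and list-based interface are illustrative, not any library's API):

```python
import math

def adam_step(params, grads, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. m and v are running first/second moment estimates,
    updated in place; t is the 1-based step count used for bias correction."""
    out = []
    for i, (p, g) in enumerate(zip(params, grads)):
        m[i] = b1 * m[i] + (1 - b1) * g        # momentum (first moment)
        v[i] = b2 * v[i] + (1 - b2) * g * g    # per-parameter scale (second moment)
        m_hat = m[i] / (1 - b1 ** t)           # correct initialization bias
        v_hat = v[i] / (1 - b2 ** t)
        out.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return out
```

Run repeatedly on a gradient, e.g. minimizing x^2, and the iterate approaches zero with adaptive step sizes.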
Nonlinear functions enabling networks to approximate complex mappings; ReLU variants dominate modern DL.
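As a minimal illustration, ReLU and its leaky variant are one-liners (scalar versions for clarity):

```python
def relu(x):
    """max(0, x): passes positives through, zeroes out negatives."""
    return x if x > 0.0 else 0.0

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but keeps a small slope alpha for negative inputs."""
    return x if x > 0.0 else alpha * x
```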
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions.
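A single-head sketch in pure Python, with queries, keys, and values all derived from the same sequence X (the weight matrices and tiny dimensions are illustrative assumptions):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over the rows (tokens) of X.
    Wq/Wk/Wv are square weight matrices given as lists of rows."""
    matvec = lambda W, x: [sum(w * xi for w, xi in zip(row, x)) for row in W]
    Q = [matvec(Wq, x) for x in X]   # queries, keys, values all come
    K = [matvec(Wk, x) for x in X]   # from the same sequence X
    V = [matvec(Wv, x) for x in X]
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)          # attention weights over all tokens
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(d)])
    return out
```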
An RNN variant using gates to mitigate vanishing gradients and capture longer context.
Converting text into discrete units (tokens) for modeling; subword tokenizers balance vocabulary size and coverage.
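A greedy longest-match sketch in the spirit of WordPiece-style subword tokenizers (the toy vocabulary and fallback token are assumptions for illustration):

```python
def tokenize(word, vocab):
    """Greedily take the longest prefix of the remaining string
    that appears in the vocabulary; fail to [UNK] if none matches."""
    tokens, i = [], 0
    while i < len(word):
        j = len(word)
        while j > i and word[i:j] not in vocab:
            j -= 1                 # shrink the candidate until it's in-vocab
        if j == i:
            return ["[UNK]"]       # no subword covers this position
        tokens.append(word[i:j])
        i = j
    return tokens
```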
The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.
Generates sequences one token at a time, conditioning on past tokens.
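A greedy decoding loop makes the conditioning explicit (the `next_token_probs` callback standing in for a trained model is an assumption):

```python
def generate(next_token_probs, prompt, max_new=5, eos=None):
    """Greedy autoregressive decoding: at each step, feed everything
    generated so far back in and take the most likely next token."""
    seq = list(prompt)
    for _ in range(max_new):
        probs = next_token_probs(seq)        # dict: token -> probability
        tok = max(probs, key=probs.get)
        seq.append(tok)
        if tok == eos:
            break
    return seq
```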
A model that assigns probabilities to sequences of tokens; often trained by next-token prediction.
Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.
Model-generated content that is fluent but unsupported by evidence or incorrect; mitigated by grounding and verification.
Systematic differences in model outcomes across groups; arises from data, labels, and deployment context.
Rules and controls around generation (filters, validators, structured outputs) to reduce unsafe or invalid behavior.
Human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.
Measure of consistency across labelers; low agreement indicates ambiguous tasks or poor guidelines.
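Cohen's kappa is one standard such measure for two labelers: observed agreement corrected for agreement expected by chance. A minimal sketch:

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two labelers' labels over the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(a)
    labels = set(a) | set(b)
    po = sum(x == y for x, y in zip(a, b)) / n                 # observed
    pe = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)  # chance
    return (po - pe) / (1 - pe)
```

Kappa is 1 for perfect agreement and about 0 when agreement is no better than chance.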
Standardized documentation describing intended use, performance, limitations, data, and ethical considerations.
Practices for operationalizing ML: versioning, CI/CD, monitoring, retraining, and reliable production management.
Training a smaller “student” model to mimic a larger “teacher,” often improving efficiency while retaining performance.
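The soft-target part of a distillation objective can be sketched as cross-entropy between temperature-softened teacher and student distributions (function names and the temperature value are illustrative):

```python
import math

def softened(logits, T):
    """Softmax over logits divided by temperature T (stable via max-shift)."""
    m = max(l / T for l in logits)
    es = [math.exp(l / T - m) for l in logits]
    s = sum(es)
    return [e / s for e in es]

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy of the student's softened distribution against the
    teacher's softened distribution (the 'soft label' loss term)."""
    p = softened(teacher_logits, T)   # teacher soft targets
    q = softened(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is minimized when the student's softened distribution matches the teacher's; in practice it is usually mixed with a standard hard-label loss.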
Converts logits to probabilities by exponentiation and normalization; common in classification and LMs.
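A minimal, numerically stable version subtracts the max logit before exponentiating:

```python
import math

def softmax(logits):
    """Exponentiate and normalize; shifting by the max avoids overflow
    without changing the result."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```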
System for running consistent evaluations across tasks, versions, prompts, and model settings.
System design where humans validate or guide model outputs, especially for high-stakes decisions.
Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.
AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.
A hypothesis class is PAC-learnable if a learner can, with high probability, output an approximately correct hypothesis from a finite (polynomially bounded) number of samples.
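For a finite hypothesis class in the realizable setting, a standard sample-complexity bound makes this concrete: with probability at least 1 - delta, a consistent learner achieves error at most epsilon once

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\right)
```

samples have been seen, where H is the hypothesis class.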