Results for "positive predictive value"
Gradient clipping: Limiting gradient magnitude to prevent exploding gradients.
Energy-based models: Models that define an energy landscape rather than explicit probabilities.
Cross-attention: Attention computed between different modalities or sequences.
Singular value decomposition (SVD): Decomposes a matrix into orthogonal components; used in embeddings and compression.
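A quick sketch of the decomposition and its use for compression via a truncated (low-rank) reconstruction, using NumPy's `linalg.svd` on a small toy matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

# A = U @ diag(s) @ Vt, with orthonormal columns in U and rows in Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keeping only the top-k singular values gives the best rank-k
# approximation (in the least-squares sense) -- the basis of compression.
k = 1
A_rank1 = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Using all singular values recovers A up to floating-point error.
A_full = U @ np.diag(s) @ Vt
```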
Expected value: Average value of a random variable under a distribution.
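For a discrete distribution this is simply a probability-weighted sum (the values and probabilities below are toy numbers):

```python
# E[X] = sum over outcomes of value * probability.
values = [0, 1, 2]
probs = [0.5, 0.25, 0.25]  # must sum to 1

expectation = sum(v * p for v, p in zip(values, probs))
# 0*0.5 + 1*0.25 + 2*0.25 = 0.75
```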
Cooperative multi-agent learning: Agents optimize collective outcomes rather than purely individual rewards.
Cross-entropy loss: Penalizes confident wrong predictions heavily; the standard loss for classification and language modeling.
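The loss is the negative log of the probability the model assigned to the true class, so it grows sharply as that probability approaches zero. A minimal sketch (probabilities are made up):

```python
import math

def cross_entropy(probs, target_index, eps=1e-12):
    """Negative log probability assigned to the true class."""
    return -math.log(max(probs[target_index], eps))

# Confident and correct: small loss (-log 0.9 ~ 0.105).
low = cross_entropy([0.9, 0.05, 0.05], 0)

# Confident and wrong: the true class got only 5% (-log 0.05 ~ 3.0).
high = cross_entropy([0.9, 0.05, 0.05], 1)
```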
Saddle point: A point where the gradient is zero but that is neither a maximum nor a minimum; common in deep networks.
Vanishing gradients: Gradients shrink as they propagate back through layers, slowing learning in early layers; mitigated by ReLU, residual connections, and normalization.
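The shrinkage can be seen directly with sigmoid activations: the sigmoid derivative s·(1−s) never exceeds 0.25, and backpropagation through n such layers multiplies n of these factors. A toy illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Even at x = 0, where the sigmoid derivative is at its maximum
# (s = 0.5, so s*(1-s) = 0.25), twenty layers shrink the gradient
# by a factor of 0.25**20, roughly 9e-13.
grad = 1.0
for _ in range(20):
    s = sigmoid(0.0)
    grad *= s * (1.0 - s)
```

ReLU (derivative 1 on the active side) and residual connections avoid this geometric decay, which is why the entry lists them as mitigations.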
Hessian: Matrix of second derivatives describing the local curvature of the loss.
Dot product: Measures similarity and projection between vectors.
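Dividing the dot product by the vectors' lengths gives cosine similarity, the standard similarity measure for embeddings. A small self-contained sketch:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Dot product normalized by both vector lengths: 1 = parallel, 0 = orthogonal."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

orth = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # perpendicular vectors
par = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # one is a multiple of the other
```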
Covariance: Measures joint variability between variables.
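Concretely, the sample covariance averages the products of each variable's deviations from its mean (toy data below):

```python
def covariance(xs, ys):
    """Sample covariance: average product of deviations from the means."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

# ys moves exactly with xs (ys = 2 * xs), so covariance is positive.
cov = covariance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```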
Plateau: Flat high-dimensional regions of the loss surface that slow training.
Spurious correlation: The model relies on irrelevant signals that happen to correlate with the target.
Feedback loop: Using a system's output to adjust its future inputs.
Sensitivity (recall): The ability to correctly detect positives, e.g., diseased cases; TP / (TP + FN).
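Sensitivity is easy to confuse with positive predictive value (the query term): sensitivity asks "of the actual positives, how many did we catch?", while PPV asks "of the positive calls, how many were right?". A sketch with hypothetical screening-test counts:

```python
# Hypothetical confusion-matrix counts from a screening test.
tp, fn, fp, tn = 80, 20, 30, 870

# Sensitivity (recall): fraction of actual positives detected.
sensitivity = tp / (tp + fn)

# Positive predictive value (precision): fraction of positive
# calls that are truly positive.
ppv = tp / (tp + fp)
```

With these counts the test catches 80% of cases (sensitivity 0.8), yet only about 73% of its positive calls are correct (PPV ≈ 0.727), because false positives dilute the positive calls.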
Concept drift: The relationship between inputs and outputs changes over time, requiring monitoring and model updates.
Feature: A measurable property or attribute used as model input (raw or engineered), such as age, pixel intensity, or token ID.
Feature engineering: Designing input features to expose useful structure (e.g., ratios, lags, aggregations); often crucial outside deep learning.
Bias-variance tradeoff: A conceptual framework describing error as the sum of systematic error (bias) and sensitivity to data (variance).
Generalization: How well a model performs on new data drawn from the same (or a similar) distribution as the training data.
Train/validation/test split: Separating data into training (fit), validation (tuning), and test (final estimate) sets to avoid leakage and optimism bias.
Cross-validation: A robust evaluation technique that trains and evaluates across multiple splits to estimate performance variability.
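The k-fold variant partitions the data into k folds and uses each fold once as the validation set while training on the rest. A minimal index-generating sketch (function name is illustrative):

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k roughly equal folds."""
    indices = list(range(n))
    fold_size = n // k
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any remainder so every point is used.
        stop = start + fold_size if i < k - 1 else n
        val = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, val

folds = list(k_fold_indices(10, 5))
```

Each data point lands in exactly one validation fold, so every example contributes to both fitting and evaluation across the k runs; the spread of the k scores estimates performance variability.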
Accuracy: Fraction of correct predictions; can be misleading on imbalanced datasets.
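A quick synthetic illustration of why accuracy misleads on imbalanced data: a model that always predicts the majority class scores 99% accuracy while detecting nothing.

```python
# 990 of 1000 examples are negative (label 0), 10 are positive (label 1).
labels = [1] * 10 + [0] * 990

# A degenerate "model" that always predicts negative.
preds = [0] * 1000

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
recall = sum(p == 1 and y == 1 for p, y in zip(preds, labels)) / 10
# accuracy is 0.99 even though recall is 0.0
```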
Calibration: The degree to which predicted probabilities match true frequencies (e.g., predictions of 0.8 should be correct ~80% of the time).
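A common way to check this is binning: group predictions by predicted probability and compare each bin's mean prediction to the observed fraction of positives. A toy sketch (data and bin edges are made up):

```python
preds  = [0.1, 0.2, 0.15, 0.8, 0.9, 0.85]
labels = [0,   0,   1,    1,   1,   0]

def bin_calibration(preds, labels, lo, hi):
    """Mean predicted probability vs. observed positive rate in [lo, hi)."""
    pairs = [(p, y) for p, y in zip(preds, labels) if lo <= p < hi]
    mean_pred = sum(p for p, _ in pairs) / len(pairs)
    frac_pos = sum(y for _, y in pairs) / len(pairs)
    return mean_pred, frac_pos

low_bin = bin_calibration(preds, labels, 0.0, 0.5)
high_bin = bin_calibration(preds, labels, 0.5, 1.0)
```

A well-calibrated model has `mean_pred` close to `frac_pos` in every bin; here the high bin predicts 0.85 on average but only 2 of 3 are positive, a sign of overconfidence.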
Causal inference: Framework for reasoning about cause-effect relationships beyond correlation, often using structural assumptions and experiments.
Class imbalance: When some classes are rare, requiring reweighting, resampling, or specialized metrics.
Confounding: A hidden variable influences both cause and effect, biasing naive estimates of causal impact.
Perplexity: Exponential of the average negative log-likelihood; lower means better predictive fit, not necessarily better utility.
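The definition translates directly into code. A useful sanity check: a model that assigns every token probability 1/4 has perplexity exactly 4, i.e., it is as uncertain as a uniform choice among four options.

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-likelihood per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Uniform uncertainty over 4 options -> perplexity 4.
pp = perplexity([0.25, 0.25, 0.25, 0.25])
```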
Variance: Error due to sensitivity to fluctuations in the training dataset.