Results for "data → model"
Loss of old knowledge when learning new tasks.
Ability to inspect and verify AI decisions.
Grouping patients by predicted outcomes.
Inferring the agent’s internal state from noisy sensor data.
Unequal performance across demographic groups.
Predicting borrower default risk.
AI discovering new compounds/materials.
A table summarizing classification outcomes (true/false positives and negatives); foundational for metrics like precision, recall, and specificity.
Harmonic mean of precision and recall; useful when balancing false positives/negatives matters.
Plots true positive rate vs false positive rate across thresholds; the area under it (AUC) summarizes separability.
The degree to which predicted probabilities match observed frequencies (e.g., of all predictions made with probability 0.8, about 80% should be correct).
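A minimal numpy sketch tying the four metric entries above together, on made-up toy labels and scores (all values here are hypothetical):

```python
import numpy as np

# Toy binary labels and predicted probabilities (hypothetical data).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3])
y_pred = (y_prob >= 0.5).astype(int)        # threshold at 0.5

# Confusion matrix cells.
tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean

# ROC-AUC: probability that a random positive outscores a random negative
# (ignoring ties, of which this toy data has none).
pos, neg = y_prob[y_true == 1], y_prob[y_true == 0]
auc = np.mean(pos[:, None] > neg[None, :])

# Calibration check: within a confidence bucket, the empirical positive
# rate should be close to the mean predicted probability.
bucket = y_prob >= 0.6
print(precision, recall, f1, auc)
print(y_true[bucket].mean(), y_prob[bucket].mean())
```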
Iterative method that updates parameters in the direction of negative gradient to minimize loss.
Controls the size of parameter updates; too high and training diverges, too low and it trains slowly or gets stuck.
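A one-variable sketch of gradient descent on the hypothetical loss f(w) = (w - 3)^2, whose gradient is 2(w - 3):

```python
# Gradient descent on f(w) = (w - 3)**2; the minimum is at w = 3.
w, lr = 0.0, 0.1              # initial parameter and learning rate
for step in range(50):
    grad = 2 * (w - 3)        # analytic gradient of the toy loss
    w -= lr * grad            # step in the direction of the negative gradient
print(w)                      # approaches 3.0; with lr = 1.5 the iterates diverge
```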
Injects sequence order into Transformers, since attention alone is permutation-invariant.
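For concreteness, a numpy sketch of the sinusoidal scheme from "Attention Is All You Need", one common way to inject order:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Even dims get sin, odd dims get cos, at geometrically spaced frequencies."""
    pos = np.arange(seq_len)[:, None]           # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]        # (1, d_model / 2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe   # added to token embeddings before the first attention layer
```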
Training objective where the model predicts the next token given previous tokens (causal modeling).
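A sketch of the objective with made-up token ids and random logits (vocab size and values are hypothetical):

```python
import numpy as np

tokens = np.array([2, 7, 1, 5])                   # one toy sequence of token ids
vocab = 10
logits = np.random.randn(len(tokens) - 1, vocab)  # model outputs at positions 0..T-2

targets = tokens[1:]                              # each position's label is the next token
z = logits - logits.max(axis=-1, keepdims=True)   # stabilized log-softmax
log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
loss = -np.mean(log_probs[np.arange(len(targets)), targets])
```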
Techniques to understand model decisions (global or local), important in high-stakes and regulated settings.
PEFT method injecting trainable low-rank matrices into layers, enabling efficient fine-tuning.
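A numpy sketch of the idea, with hypothetical dimensions; W stays frozen while only the small factors A and B train:

```python
import numpy as np

d_in, d_out, r = 512, 512, 8           # hypothetical sizes; rank r << d
W = np.random.randn(d_out, d_in)       # pretrained weight, frozen
A = np.random.randn(r, d_in) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-init so the update starts at 0

def lora_forward(x, scale=1.0):
    # Full forward is W x plus the low-rank correction B (A x).
    return W @ x + scale * (B @ (A @ x))
```

Only A and B (2 * r * d parameters here instead of d * d) receive gradients, which is what makes the method parameter-efficient.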
Reducing numeric precision of weights/activations to speed inference and reduce memory with acceptable accuracy loss.
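A sketch of symmetric post-training int8 quantization of one weight tensor (shapes and data are made up):

```python
import numpy as np

w = np.random.randn(256, 256).astype(np.float32)
scale = np.abs(w).max() / 127.0                    # map the largest magnitude to the int8 range
w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_deq = w_q.astype(np.float32) * scale             # dequantize to inspect the error
print("max abs error:", np.abs(w - w_deq).max())   # bounded by roughly scale / 2
```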
Raw model outputs before conversion to probabilities; adjusted during decoding and calibration (e.g., by temperature scaling).
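A sketch of the usual logits-to-probabilities step, with temperature as one example of a decoding-time adjustment (values are hypothetical):

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])

def softmax(z, temperature=1.0):
    z = z / temperature
    z = z - z.max()           # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

print(softmax(logits))        # standard probabilities
print(softmax(logits, 0.5))   # lower temperature: sharper distribution
print(softmax(logits, 2.0))   # higher temperature: flatter distribution
```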
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
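A sketch of the unstructured variant, magnitude pruning, on a made-up weight matrix:

```python
import numpy as np

w = np.random.randn(128, 128)
sparsity = 0.9                                   # zero out the smallest 90% by magnitude
threshold = np.quantile(np.abs(w), sparsity)
mask = np.abs(w) >= threshold
w_pruned = w * mask
print("fraction zeroed:", 1 - mask.mean())       # ~0.9
```

Structured pruning instead removes whole rows, columns, or neurons, which maps better onto hardware speedups.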
Stress-testing models for failures, vulnerabilities, policy violations, and harmful behaviors before release.
Measures divergence between true and predicted probability distributions.
Quantifies shared information between random variables.
Measures how much information an observable random variable carries about unknown parameters.
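Small worked examples of all three quantities on hypothetical discrete distributions (the Fisher line uses the closed form for a single Bernoulli observation):

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])          # "true" distribution
q = np.array([0.4, 0.4, 0.2])          # predicted distribution
kl = np.sum(p * np.log(p / q))         # KL(p || q): asymmetric, always >= 0

joint = np.array([[0.3, 0.1],          # joint distribution of (X, Y)
                  [0.1, 0.5]])
px = joint.sum(axis=1, keepdims=True)  # marginals
py = joint.sum(axis=0, keepdims=True)
mi = np.sum(joint * np.log(joint / (px * py)))   # mutual information I(X; Y)

theta = 0.3
fisher = 1 / (theta * (1 - theta))     # Fisher information of Bernoulli(theta)
```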
Gradually increasing learning rate at training start to avoid divergence.
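A minimal linear-warmup schedule (base rate and step counts are hypothetical; a decay phase usually follows and is omitted here):

```python
def lr_at_step(step, base_lr=3e-4, warmup_steps=1000):
    """Ramp linearly from ~0 to base_lr over warmup_steps, then hold."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

print(lr_at_step(10), lr_at_step(5000))   # tiny early rate vs. full rate
```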
Prevents attention to future tokens during training/inference.
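A numpy sketch of causal masking: future positions are set to -inf before the softmax so their attention weights come out exactly zero:

```python
import numpy as np

T = 4
scores = np.random.randn(T, T)                     # raw attention scores (hypothetical)
mask = np.triu(np.ones((T, T), dtype=bool), k=1)   # True strictly above the diagonal = future
scores = np.where(mask, -np.inf, scores)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax; future weights are 0
```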
Stores past key/value projections to speed up autoregressive decoding.
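A simplified sketch of the idea: at each decoding step, only the new token is projected, and its key/value are appended rather than recomputed for the whole prefix (dimensions and projections are hypothetical):

```python
import numpy as np

d = 64
Wk, Wv = np.random.randn(d, d), np.random.randn(d, d)   # key/value projections
k_cache, v_cache = [], []

def decode_step(x_new):
    k_cache.append(Wk @ x_new)          # O(1) new projection work per step
    v_cache.append(Wv @ x_new)
    K, V = np.stack(k_cache), np.stack(v_cache)
    return K, V                         # attention then runs over all cached K, V
```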
Encodes positional information via rotation in embedding space.
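A numpy sketch of the rotation, following the common "rotate-half" formulation (equivalent to the original pairing up to a permutation of dimensions):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotate feature pairs of x by position-dependent angles; dot products
    of rotated queries and keys then depend only on relative position."""
    half = x.shape[-1] // 2
    freqs = base ** (-np.arange(half) / half)    # per-pair rotation frequencies
    angle = pos * freqs
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * np.cos(angle) - x2 * np.sin(angle),
                           x1 * np.sin(angle) + x2 * np.cos(angle)], axis=-1)

q = rope(np.random.randn(8), pos=3)              # toy 8-dim query at position 3
```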
Legal or policy requirement to explain AI decisions.
GNN using attention to weight neighbor contributions dynamically.
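A single-head sketch of the mechanism in numpy (the LeakyReLU slope follows the original GAT paper; shapes and data are made up):

```python
import numpy as np

def gat_layer(X, adj, W, a):
    """X: (N, d_in) node features; adj: (N, N) 0/1 adjacency with self-loops;
    W: (d_in, d_out) projection; a: (2 * d_out,) attention vector."""
    H = X @ W                                            # project node features
    N = H.shape[0]
    # e[i, j] = LeakyReLU(a . [H_i || H_j]) for every node pair.
    e = (np.concatenate([np.repeat(H, N, 0), np.tile(H, (N, 1))], 1) @ a).reshape(N, N)
    e = np.where(e > 0, e, 0.2 * e)                      # LeakyReLU, slope 0.2
    e = np.where(adj > 0, e, -np.inf)                    # mask non-neighbors
    alpha = np.exp(e - e.max(1, keepdims=True))
    alpha /= alpha.sum(1, keepdims=True)                 # per-node attention weights
    return alpha @ H                                     # attention-weighted neighbor sum
```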