Q-value: Expected return of taking an action in a state (and following the policy thereafter).
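This action-value idea can be made concrete with a tabular Q-learning update; the two-state environment and all constants below are made-up illustration values:

```python
# Tabular Q-learning on a made-up 2-state, 2-action problem:
# Q[s][a] estimates the expected return of taking action a in state s.
def q_learning_step(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    # Temporal-difference update toward r + gamma * max_a' Q(s_next, a')
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

Q = {0: {0: 0.0, 1: 0.0}, 1: {0: 0.0, 1: 0.0}}
q_learning_step(Q, s=0, a=1, r=1.0, s_next=1)
print(Q[0][1])  # 0.1 * (1.0 + 0.9 * 0.0 - 0.0) = 0.1
```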
Boltzmann machine: Probabilistic energy-based neural network with hidden variables.
Restricted Boltzmann machine: Simplified Boltzmann machine with a bipartite structure (no connections within a layer).
Scaling: Increasing performance via more data (and, typically, more parameters and compute).
Loss landscape: Visualization of the optimization surface a model is trained over.
Plateau: Flat high-dimensional region of the loss surface that slows training.
Stochastic optimization: Optimization under uncertainty, e.g. with noisy gradient estimates.
Model collapse: A model trained on its own outputs degrades in quality over successive generations.
Model-based RL: Reinforcement learning using learned or known environment models.
Dynamics model: Predicts the next state given the current state and action.
Behavior cloning: Learning a state-to-action mapping directly from demonstrations.
Fraud detection: Identifying suspicious transactions.
AI for science: AI applied to scientific problems.
Metacognition: Awareness and regulation of one's own internal (cognitive) processes.
Distribution shift: A mismatch between training and deployment data distributions that can degrade model performance.
Concept drift: The relationship between inputs and outputs changes over time, requiring monitoring and model updates.
Model: A parameterized mapping from inputs to outputs; includes the architecture plus its learned parameters.
Hyperparameters: Configuration choices not learned directly (or not typically learned) that govern training or architecture.
Bias-variance decomposition: A conceptual framework describing expected error as the sum of squared systematic error (bias), sensitivity to the training data (variance), and irreducible noise.
Adam: Popular optimizer combining momentum and per-parameter adaptive step sizes via first/second moment estimates.
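A single-parameter sketch of such an update, using the widely cited default constants (the gradient value is invented):

```python
import math

# One Adam-style update on a single scalar parameter; b1/b2/eps are
# the commonly used defaults, and the gradient value is invented.
def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad           # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * grad ** 2      # second-moment (scale) estimate
    m_hat = m / (1 - b1 ** t)              # bias correction for step t
    v_hat = v / (1 - b2 ** t)
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)  # per-parameter step size
    return theta, m, v

theta, m, v = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(theta)  # first step has magnitude ~lr regardless of gradient scale
```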
Stochastic gradient descent (SGD): A gradient method using random minibatches for efficient training on large datasets.
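A minimal sketch of minibatch gradient descent on a toy 1-D least-squares problem (dataset, learning rate, and batch size are all invented):

```python
import random

# Minibatch SGD on a toy 1-D least-squares fit: each step uses the
# gradient on a random minibatch rather than the full dataset.
random.seed(0)                                  # for reproducibility
data = [(x, 3.0 * x) for x in range(1, 21)]     # invented targets y = 3x
w = 0.0
for _ in range(200):
    batch = random.sample(data, 4)
    # gradient of mean squared error (w * x - y)^2 over the minibatch
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    w -= 0.001 * grad                           # small fixed learning rate
print(round(w, 3))  # converges to ~3.0
```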
ReLU: The activation max(0, x); improves gradient flow and training speed in deep nets.
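The activation and its (sub)gradient in a few lines:

```python
# ReLU and its gradient: identity for positive inputs, zero otherwise,
# so gradients pass through unchanged wherever the unit is active.
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    return 1.0 if x > 0 else 0.0   # subgradient 0 chosen at x == 0

print([relu(x) for x in (-2.0, 0.0, 3.0)])  # [0.0, 0.0, 3.0]
```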
Vanishing gradients: Gradients shrink as they propagate back through layers, slowing learning in early layers; mitigated by ReLU, residual connections, and normalization.
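A back-of-the-envelope illustration of the shrinkage: backprop multiplies one derivative factor per layer, and sigmoid's derivative never exceeds 0.25, so depth compounds the decay (the layer count below is arbitrary):

```python
# Backprop multiplies one derivative factor per layer; sigmoid's
# derivative is at most 0.25, so depth shrinks gradients exponentially.
grad = 1.0
for _ in range(20):       # 20 layers, each contributing the best case 0.25
    grad *= 0.25
print(grad)  # 0.25**20 ~= 9.1e-13: early layers barely learn
```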
Normalization layers: Techniques that stabilize and speed up training by normalizing activations; LayerNorm is common in Transformers.
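A minimal sketch of layer normalization over one feature vector, with the learnable scale and shift left at their usual initial values:

```python
import math

# Layer normalization of one feature vector: subtract the mean, divide
# by the standard deviation, then apply a learnable scale/shift (gamma,
# beta), here left at their usual initial values of 1 and 0.
def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]

out = layer_norm([1.0, 2.0, 3.0])
print(out)  # zero mean, unit variance: roughly [-1.22, 0.0, 1.22]
```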
System prompt: A high-priority instruction layer setting overarching behavior constraints for a chat model.
Fine-tuning: Updating a pretrained model’s weights on task-specific data to improve performance or adapt style/behavior.
Direct preference optimization (DPO): A preference-based training method optimizing policies directly from pairwise comparisons without explicit RL loops.
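A sketch of the pairwise loss at the core of this approach, assuming per-response log-probabilities under the trained policy and a frozen reference model (all numeric values below are invented):

```python
import math

# DPO-style pairwise loss: push the policy's log-probability margin for
# the chosen response over the rejected one past the reference model's
# margin. Loss is -log sigmoid(beta * margin difference).
def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    margin = (pi_chosen - pi_rejected) - (ref_chosen - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

loss = dpo_loss(pi_chosen=-1.0, pi_rejected=-3.0,
                ref_chosen=-1.5, ref_rejected=-2.5)
print(round(loss, 4))  # below log(2): the policy already prefers "chosen"
```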
Reward model: A model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.
Guardrails: Rules and controls around generation (filters, validators, structured outputs) to reduce unsafe or invalid behavior.
Bias (fairness): Systematic differences in model outcomes across groups; arises from data, labels, and deployment context.