Logits: Raw model outputs before conversion to probabilities; manipulated during decoding (e.g., temperature scaling) and calibration.
Model extraction: Reconstructing a model or its capabilities via API queries or leaked artifacts.
Membership inference and reconstruction attacks: Attacks that infer whether specific records were in the training data, or that reconstruct sensitive training examples.
VC dimension: A measure of a model class's expressive capacity based on its ability to shatter datasets.
MAP estimation (maximum a posteriori): Bayesian parameter estimation using the mode of the posterior distribution.
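A minimal sketch of MAP estimation for the simplest closed-form case: estimating a Gaussian mean under a Gaussian prior, where the posterior mode is a precision-weighted average of the prior mean and the sample mean. The function name and parameters are illustrative, not from the source.

```python
import numpy as np

def map_gaussian_mean(x, sigma2, mu0, tau2):
    """MAP estimate of mu for x_i ~ N(mu, sigma2) with prior mu ~ N(mu0, tau2).

    The posterior is Gaussian, so its mode has a closed form:
    a precision-weighted average of the prior mean and the data.
    """
    n = len(x)
    return (mu0 / tau2 + x.sum() / sigma2) / (1.0 / tau2 + n / sigma2)

rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.0, size=100)
est = map_gaussian_mean(x, sigma2=1.0, mu0=0.0, tau2=1.0)
# The estimate is shrunk from the sample mean toward the prior mean 0;
# with a flat prior (tau2 -> inf) it would reduce to plain MLE.
```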
Saddle point: A point where the gradient is zero but that is neither a maximum nor a minimum; common in deep networks.
Sharp minimum: A narrow minimum in the loss landscape, often associated with poorer generalization.
Gradient clipping: Limiting gradient magnitude to prevent exploding gradients.
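A hedged sketch of one common variant, global-norm clipping: if the combined L2 norm of all gradients exceeds a threshold, every gradient is rescaled by the same factor so the total norm equals the threshold. The helper name is illustrative.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients so their combined L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(float(np.sum(g * g)) for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-12))  # eps avoids div-by-zero
    return [g * scale for g in grads], total_norm

grads = [np.array([3.0, 4.0]), np.array([0.0, 12.0])]  # combined norm = 13
clipped, norm = clip_by_global_norm(grads, max_norm=1.0)
# All gradients are scaled by the same 1/13 factor, preserving direction.
```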
Second-order optimization: Optimization using curvature (Hessian) information; often too expensive at scale.
Highway network: An early architecture using learned gates for skip connections.
Causal (look-ahead) mask: Prevents attention to future tokens during training and autoregressive inference.
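A minimal sketch of how the mask works: future positions get an additive score of negative infinity before the softmax, so their attention weight is exactly zero. Shapes and names are illustrative.

```python
import numpy as np

def causal_attention_weights(scores):
    """Softmax over attention scores with positions j > i masked out."""
    T = scores.shape[-1]
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)  # True = future position
    masked = np.where(mask, -np.inf, scores)
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))  # stable softmax
    return e / e.sum(axis=-1, keepdims=True)

w = causal_attention_weights(np.zeros((4, 4)))
# With uniform scores, row i spreads weight evenly over positions 0..i,
# and every entry above the diagonal (the future) is exactly 0.
```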
Efficient attention: Attention mechanisms that reduce the quadratic complexity of standard self-attention.
State space: The set of all possible configurations an agent may encounter.
Router (MoE gating network): Chooses which experts process each token.
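A hedged sketch of top-k routing, a common mixture-of-experts design: a score per expert is computed for each token, the k highest-scoring experts are kept, and their softmaxed scores become combination weights. Function and variable names are illustrative.

```python
import numpy as np

def top_k_route(router_logits, k):
    """Pick the top-k experts per token and softmax their scores.

    router_logits: (num_tokens, num_experts) scores from a gating layer.
    Returns (expert_indices, combination_weights), each (num_tokens, k).
    """
    top = np.argpartition(-router_logits, k - 1, axis=-1)[:, :k]
    picked = np.take_along_axis(router_logits, top, axis=-1)
    e = np.exp(picked - picked.max(axis=-1, keepdims=True))
    return top, e / e.sum(axis=-1, keepdims=True)

logits = np.array([[1.0, 3.0, 2.0, 0.0]])  # one token, four experts
experts, weights = top_k_route(logits, k=2)
# Experts 1 and 2 are selected; their weights sum to 1 for the token.
```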
Actor-critic: Combines value estimation (the critic) with policy learning (the actor).
Tool-use models: Models trained to decide when to call external tools.
Watermarking: Embedding signals in a model or its outputs to prove ownership.
Energy-based models: Models that define an energy landscape rather than explicit probabilities.
Conditional random field (CRF): A probabilistic graphical model for structured prediction.
Diffusion model: A generative model that learns to reverse a gradual noising process.
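A hedged sketch of the forward (noising) half of a DDPM-style diffusion process, the part the generative model learns to reverse: clean data is mixed with Gaussian noise according to a schedule, and by the last step almost no signal remains. The linear schedule and constants here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # per-step noise variances
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def noise_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0): scaled signal plus scaled Gaussian noise."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x0 = np.ones(8)
x_early = noise_sample(x0, 10)    # still mostly signal
x_late = noise_sample(x0, T - 1)  # alpha_bar is near 0: almost pure noise
```

Training then amounts to predicting the added noise at a random step t, so that sampling can run the corruption in reverse from pure noise.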
Mode collapse: The generator produces only a limited variety of outputs.
Image classification: Assigning category labels to whole images.
Semantic segmentation: Pixel-wise classification of image regions.
Vocoder: Generates audio waveforms from spectrograms.
Hidden Markov model: Models time evolution via hidden states.
Temporal convolutional networks: CNNs applied to time series.
Instrumental variable: A variable enabling causal inference despite unobserved confounding.
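A hedged sketch of two-stage least squares (2SLS), the standard way an instrument is used: when treatment x is confounded, a variable z that drives x but affects y only through x recovers the causal effect. The data below is simulated, with a true effect of 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
u = rng.normal(size=n)                 # unobserved confounder
z = rng.normal(size=n)                 # instrument: independent of u
x = z + u + rng.normal(size=n)         # treatment, confounded by u
y = 2.0 * x + u + rng.normal(size=n)   # outcome; true causal effect = 2

# Stage 1: project x onto the instrument z (all variables are mean-zero here,
# so no intercept is needed).
x_hat = z * (z @ x) / (z @ z)
# Stage 2: regress y on the fitted values; confounding via u is removed
# because x_hat varies only through z.
beta_iv = (x_hat @ y) / (x_hat @ x_hat)
# Naive OLS of y on x would be biased upward by u; beta_iv is close to 2.
```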
Online inference: Low-latency prediction served per request.
Feature drift: A shift in feature distributions over time.
Data scaling: Improving performance by training on more data.