Results for "complexity"
Rademacher Complexity
Intermediate. Measures a model’s ability to fit random noise; used to bound generalization error.
Rademacher Complexity is a way to measure how well a learning model can adapt to random patterns in data. Imagine you have a set of points and you randomly assign labels to them, like flipping a coin for each point. Rademacher Complexity helps us understand how well a model can fit those random labels: the better a model class can match pure noise, the higher its complexity, and the looser its generalization guarantees.
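The coin-flipping picture above can be turned into a small Monte Carlo estimate. This is an illustrative sketch, not taken from the glossary: the function `empirical_rademacher`, the toy threshold hypothesis class, and all parameter names are assumptions made for the example.

```python
import numpy as np

def empirical_rademacher(hypotheses, X, n_trials=200, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity
    of a finite hypothesis set on a fixed sample X."""
    rng = np.random.default_rng(seed)
    n = len(X)
    total = 0.0
    for _ in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=n)  # random "coin flip" labels
        # sup over hypotheses of average agreement with the random labels
        total += max(float(np.mean(sigma * h(X))) for h in hypotheses)
    return total / n_trials

# Toy hypothesis class: 1-D threshold classifiers h_t(x) = sign(x - t).
X = np.linspace(-1.0, 1.0, 50)
thresholds = np.linspace(-1.2, 1.2, 9)
hypotheses = [lambda X, t=t: np.sign(X - t) for t in thresholds]
complexity = empirical_rademacher(hypotheses, X)  # lies in [0, 1]
```

A richer hypothesis class (more thresholds, or arbitrary labelings) drives the estimate up, which is exactly why Rademacher complexity appears in generalization bounds.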
Attention variants that reduce self-attention’s quadratic complexity in sequence length.
When a model fits noise/idiosyncrasies of training data and performs poorly on unseen data.
A measure of a model class’s expressive capacity based on its ability to shatter datasets.
A theoretical framework analyzing what classes of functions can be learned, how efficiently, and with what guarantees.
When a model cannot capture underlying structure, performing poorly on both training and test data.
A conceptual framework describing error as the sum of systematic error (bias) and sensitivity to data (variance).
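The decomposition in the entry above is usually written, for squared loss with target y = f(x) + ε and noise variance σ² (a standard formulation, assumed here rather than quoted from the glossary):

```latex
\mathbb{E}\!\left[(y - \hat f(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat f(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\left(\hat f(x) - \mathbb{E}[\hat f(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The expectation is over training sets drawn from the data distribution; only the first two terms depend on the model class.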
A datastore optimized for similarity search over embeddings, enabling semantic retrieval at scale.
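At its core, similarity search over embeddings is a nearest-neighbor query under cosine similarity. The brute-force sketch below is illustrative only (production vector stores use approximate indexes such as HNSW); the function name `top_k_cosine` and the toy data are assumptions.

```python
import numpy as np

def top_k_cosine(query, store, k=3):
    """Return indices of the k stored embeddings most similar
    (by cosine similarity) to the query embedding."""
    store_n = store / np.linalg.norm(store, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    sims = store_n @ query_n          # cosine similarity to every row
    return np.argsort(-sims)[:k]      # indices of the k best matches

rng = np.random.default_rng(0)
store = rng.normal(size=(1000, 64))              # 1000 stored 64-d embeddings
query = store[42] + 0.01 * rng.normal(size=64)   # near-duplicate of row 42
hits = top_k_cosine(query, store, k=3)           # hits[0] == 42
```

Approximate indexes trade a small amount of recall for sublinear query time, which is what makes this workable "at scale".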
Hardware resources used for training/inference; constrained by memory bandwidth, FLOPs, and parallelism.
A model is PAC-learnable if it can, with high probability, learn an approximately correct hypothesis from finite samples.
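In symbols (standard PAC notation, not taken from the entry itself: A is the learning algorithm, m_\mathcal{H} the sample-complexity function, and err_\mathcal{D} the true error under distribution \mathcal{D}):

```latex
\forall\, \varepsilon, \delta \in (0,1),\ \forall\, \mathcal{D}:\quad
m \ge m_{\mathcal{H}}\!\left(\tfrac{1}{\varepsilon}, \tfrac{1}{\delta}\right)
\;\Longrightarrow\;
\Pr_{S \sim \mathcal{D}^m}\!\left[\operatorname{err}_{\mathcal{D}}\big(A(S)\big) \le \varepsilon\right] \ge 1 - \delta
```

Here ε is the accuracy parameter ("approximately correct") and δ the confidence parameter ("probably").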
Error due to sensitivity to fluctuations in the training dataset.
Optimization with multiple local minima/saddle points; typical in neural networks.
A wide, flat basin in the loss landscape, often correlated with better generalization.
Matrix of second derivatives describing local curvature of loss.
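Concretely, for a loss \mathcal{L}(\theta) over parameters θ (standard notation, assumed here):

```latex
H_{ij} \;=\; \frac{\partial^2 \mathcal{L}(\theta)}{\partial \theta_i \,\partial \theta_j}
```

The eigenvalues of H give the curvature along its principal directions; near a minimum, small eigenvalues correspond to the flat basins described above, large ones to sharp directions.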
The range of functions a model can represent.
Tradeoffs between depth (many layers) and width (many neurons per layer).
Stores past attention keys and values to speed up autoregressive decoding.
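The caching idea can be sketched in a few lines. This is a toy, assumption-laden version: the key/value projections of a real transformer are omitted (the token's hidden state stands in for both), and `attend` is a single-query scaled dot-product attention.

```python
import numpy as np

def attend(q, K, V):
    """Scaled dot-product attention for a single query vector."""
    scores = K @ q / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ V

d = 8
rng = np.random.default_rng(0)
K_cache, V_cache, outputs = [], [], []
for step in range(4):                  # one iteration per generated token
    x = rng.normal(size=d)             # stand-in for the new token's hidden state
    K_cache.append(x)                  # in a real model: x @ W_k
    V_cache.append(x)                  # in a real model: x @ W_v
    # attend over ALL cached keys/values; past ones are never recomputed
    outputs.append(attend(x, np.stack(K_cache), np.stack(V_cache)))
```

Each decoding step does O(t) work against the cache instead of recomputing keys and values for the whole prefix, which is where the speedup comes from.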
Techniques to handle longer documents without quadratic cost.
All possible configurations an agent may encounter.
Set of all actions available to the agent.
Decomposing goals into sub-tasks.
Simple agent responding directly to inputs.
Cost to run models in production.
Cost of model training.
Measure of vector magnitude; used in regularization and optimization.
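The common norms are one-liners in NumPy; the final comment shows the regularization connection (`lam` is a hypothetical penalty weight, not from the entry).

```python
import numpy as np

x = np.array([3.0, -4.0])
l1   = np.abs(x).sum()           # L1 norm: sum of absolute values -> 7.0
l2   = np.sqrt((x ** 2).sum())   # L2 (Euclidean) norm -> 5.0
linf = np.abs(x).max()           # L-infinity norm: largest absolute entry -> 4.0
# L2-style weight decay adds a penalty like lam * l2**2 to the training loss
```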
A model optimizes objectives that are misaligned with human values.
Classifying models by impact level.
Maximum system processing rate.
Mathematical guarantees of system behavior.
AI-assisted review of legal documents.