Results for "data → model"
A measurable property or attribute used as model input (raw or engineered), such as age, pixel intensity, or token ID.
Systematic differences in model outcomes across groups; arises from data, labels, and deployment context.
Selecting the most informative samples to label (e.g., uncertainty sampling) to reduce labeling cost.
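As an illustration of uncertainty sampling, here is a minimal sketch that ranks an unlabeled pool by predictive entropy and selects the most uncertain examples; `toy_predict_proba` and the pool values are invented for this example.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labeling(unlabeled, predict_proba, k):
    """Rank unlabeled examples by predictive entropy; return the k most uncertain."""
    return sorted(unlabeled, key=lambda x: entropy(predict_proba(x)), reverse=True)[:k]

# Toy binary model: inputs near 0.5 get the most uncertain predictions.
def toy_predict_proba(x):
    return [x, 1 - x]

pool = [0.05, 0.5, 0.9, 0.45]
print(select_for_labeling(pool, toy_predict_proba, 2))  # → [0.5, 0.45]
```

The examples near the decision boundary are chosen first, which is exactly where a label is most informative.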
Attacks that infer whether specific records were in training data, or reconstruct sensitive training examples.
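One common membership-inference baseline is a loss-threshold attack: because models often fit training records more tightly, an unusually low loss hints the record was in the training set. The sketch below fakes the loss function and threshold for illustration.

```python
def membership_inference(loss_on, example, threshold):
    """Loss-threshold attack: low loss on a record suggests it was in the training data."""
    return loss_on(example) < threshold

# Toy setup: the "model" memorized its training points, so their loss is near zero.
train = {1.0, 2.0}
loss_on = lambda x: 0.01 if x in train else 0.8

print(membership_inference(loss_on, 1.0, 0.1))  # → True  (likely a member)
print(membership_inference(loss_on, 5.0, 0.1))  # → False (likely not)
```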
Methods to protect model/data during inference (e.g., trusted execution environments) from operators/attackers.
A narrow (sharp) minimum of the loss landscape, often associated with poorer generalization than flatter minima.
Systematic error introduced by simplifying assumptions in a learning algorithm.
Models that score configurations with a scalar energy (lower = more plausible) rather than defining explicit normalized probabilities.
Startup latency for model-serving systems, e.g. the time to load weights before the first request can be served.
Automatically discovering closed-form mathematical expressions that fit observed data.
The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.
Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.
Using the same parameters across different parts of a model (e.g. tied embeddings or shared convolutional filters) to reduce size and improve generalization.
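A minimal sketch of parameter sharing via weight tying: the same embedding table maps token IDs to vectors on the way in and produces output logits on the way out. The tiny table `E` is invented for illustration.

```python
# One parameter matrix E serves two roles: input embedding and output projection.
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # vocab_size=3, dim=2

def embed(token_id):
    """Look up a token's embedding row."""
    return E[token_id]

def logits(h):
    """Project a hidden state back onto the vocabulary using the same table (E^T h)."""
    return [sum(hi * ei for hi, ei in zip(h, row)) for row in E]

print(logits(embed(2)))  # → [1.0, 1.0, 2.0]
```

Tying halves the parameter count of these two layers and is standard in many language models.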
A probabilistic graphical model for structured prediction over interdependent outputs, such as sequence labeling.
A single attention mechanism within multi-head attention.
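A single head computes scaled dot-product attention, softmax(QKᵀ/√d_k)V. The sketch below implements that formula in plain Python for tiny matrices; the specific Q, K, V values are invented for illustration.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_head(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)  # one probability distribution over keys per query
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0], [0.0]]
print(attention_head(Q, K, V))  # attends mostly to the first key, so the output is closer to 1 than 0
```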
The cost of running models in production, driven by compute, memory, and serving infrastructure.
A subfield of AI where models learn patterns from data to make predictions or decisions, improving with experience rather than explicit rule-coding.
Minimizing average loss on training data; can overfit when data is limited or biased.
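Empirical risk is just the average loss over the training sample. A minimal sketch, using squared loss and a constant predictor (for which the sample mean is the known minimizer); the data points are invented for illustration.

```python
def empirical_risk(predict, data, loss):
    """Average loss over the training sample — the quantity ERM minimizes."""
    return sum(loss(predict(x), y) for x, y in data) / len(data)

squared = lambda yhat, y: (yhat - y) ** 2
data = [(0, 1.0), (0, 2.0), (0, 3.0)]

# For a constant predictor under squared loss, the empirical minimizer is the sample mean.
mean = sum(y for _, y in data) / len(data)
risk_at_mean = empirical_risk(lambda x: mean, data, squared)
risk_at_zero = empirical_risk(lambda x: 0.0, data, squared)
print(risk_at_mean, risk_at_zero)  # the mean predictor achieves the lower risk
```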
Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.
Forcing predictable formats for downstream systems; reduces parsing errors and supports validation/guardrails.
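A common way to enforce structured output is to parse the model's text as JSON and validate required fields before anything downstream consumes it. A minimal sketch; the `REQUIRED` schema and field names are hypothetical.

```python
import json

REQUIRED = {"name": str, "score": float}  # hypothetical downstream schema

def parse_structured(raw):
    """Parse model output as JSON and validate field names/types before use."""
    obj = json.loads(raw)
    for field, typ in REQUIRED.items():
        if not isinstance(obj.get(field), typ):
            raise ValueError(f"missing or mistyped field: {field}")
    return obj

ok = parse_structured('{"name": "widget", "score": 0.92}')
print(ok["score"])  # → 0.92
```

Rejecting malformed output at this boundary is what keeps parsing errors out of downstream systems.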
Learning a target policy from data generated by a different (behavior) policy, as in off-policy reinforcement learning.
A trend present within each subgroup that reverses or vanishes when the subgroups are aggregated, typically because group sizes or a confounder differ across groups.
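The reversal can be shown with the classic kidney-stone numbers: treatment A has the higher success rate in every subgroup, yet the lower rate overall, because the subgroup sizes differ.

```python
# (successes, total) per stone-size subgroup, from the classic kidney-stone study.
A = {"small": (81, 87), "large": (192, 263)}
B = {"small": (234, 270), "large": (55, 80)}

rate = lambda st: st[0] / st[1]

for g in ("small", "large"):
    assert rate(A[g]) > rate(B[g])  # A wins within every subgroup...

def overall(d):
    """Aggregate success rate across subgroups (pooled counts)."""
    return sum(s for s, _ in d.values()) / sum(t for _, t in d.values())

print(overall(A), overall(B))       # ...yet A loses after aggregation
assert overall(A) < overall(B)
```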
The updated probability distribution over hypotheses or parameters after observing data, obtained from the prior via Bayes' rule.
A legal or policy requirement to preserve relevant data, constraining deletion and storage practices.
Training one model on multiple tasks simultaneously to improve generalization through shared structure.
A scalar measure optimized during training, typically expected loss over data, sometimes with regularization terms.
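A minimal sketch of such an objective for a 1-D linear model: the empirical (average) squared loss plus an L2 regularization term. The data and the weight value are invented for illustration.

```python
def objective(w, data, lam):
    """Training objective: average squared loss of y ≈ w*x, plus an L2 penalty lam*w^2."""
    risk = sum((w * x - y) ** 2 for x, y in data) / len(data)
    return risk + lam * w ** 2

data = [(1.0, 2.0), (2.0, 4.0)]      # perfectly fit by w = 2
print(objective(2.0, data, 0.1))     # → 0.4 (risk is zero; only the penalty remains)
```

The penalty trades a little training fit for smaller weights, which is the usual point of the regularization term.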
Updating a pretrained model’s weights on task-specific data to improve performance or adapt style/behavior.
Fine-tuning on (prompt, response) pairs to align a model with instruction-following behaviors.
Automated testing and deployment processes for models and data workflows, extending DevOps to ML artifacts.
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
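One standard way to craft such inputs is the fast gradient sign method (FGSM): perturb the input in the direction that increases the loss. A minimal 1-D sketch on a linear model with squared loss; `w`, `x`, `y`, and `eps` are invented for illustration.

```python
def fgsm_1d(w, x, y, eps):
    """Fast-gradient-sign step on a 1-D linear model f(x) = w*x with squared loss."""
    grad_x = 2 * (w * x - y) * w           # d/dx of (w*x - y)^2
    sign = 1.0 if grad_x > 0 else -1.0
    return x + eps * sign                  # small nudge chosen to *increase* the loss

w, x, y = 2.0, 1.0, 1.0                    # model predicts 2.0, target is 1.0
x_adv = fgsm_1d(w, x, y, 0.25)
print(abs(w * x - y), abs(w * x_adv - y))  # → 1.0 1.5 (error grows after the perturbation)
```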