Results for "data distribution"
Large language model (LLM): a high-capacity model trained on massive text corpora, exhibiting broad generalization and emergent behaviors.
Tool use (function calling): letting an LLM call external functions or APIs to fetch data, compute, or take actions, improving reliability on tasks the model handles poorly alone.
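The tool-use loop can be sketched as a registry of callable functions plus a dispatcher; the tool name, call structure, and `get_weather` function below are illustrative assumptions, not any particular vendor's API.

```python
def get_weather(city: str) -> str:
    """Stand-in for an external API call."""
    return f"Sunny in {city}"

# Hypothetical tool registry mapping names the model may request to functions.
TOOLS = {"get_weather": get_weather}

def run_tool_call(call: dict) -> str:
    """Dispatch a model-requested tool call to the matching function."""
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A structured tool call as a model might emit it (format is an assumption):
result = run_tool_call({"name": "get_weather", "arguments": {"city": "Paris"}})
print(result)
```

In a full system, the tool's return value is fed back to the model so it can continue reasoning with the fetched data.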
Vector database: a datastore optimized for similarity search over embeddings, enabling semantic retrieval at scale.
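A minimal sketch of the core operation, brute-force cosine-similarity search over a tiny in-memory store; production vector databases replace the linear scan with approximate nearest-neighbor indexes (e.g. HNSW) to scale.

```python
import numpy as np

def top_k(query: np.ndarray, store: np.ndarray, k: int = 2) -> np.ndarray:
    """Return indices of the k stored vectors most cosine-similar to query."""
    store_n = store / np.linalg.norm(store, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    sims = store_n @ query_n           # cosine similarity per stored vector
    return np.argsort(-sims)[:k]       # indices of the k most similar

store = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
print(top_k(np.array([1.0, 0.1]), store))  # [0 2]
```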
Reinforcement learning from human feedback (RLHF): uses human preference data to train a reward model, then optimizes the policy against it.
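The reward-model step is typically trained with a pairwise Bradley-Terry loss: the probability that the chosen response outscores the rejected one is a sigmoid of the reward margin. A minimal sketch, with scalar rewards standing in for reward-model outputs:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Negative log-likelihood that the chosen response beats the rejected one."""
    # Bradley-Terry: P(chosen > rejected) = sigmoid(r_chosen - r_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# The loss shrinks as the reward margin grows:
print(preference_loss(2.0, 0.0) < preference_loss(0.5, 0.0))  # True
```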
MLOps: practices for operationalizing machine learning, including versioning, CI/CD, monitoring, retraining, and reliable production management.
Model registry: a central system that stores model versions, metadata, approvals, and deployment state.
Reproducibility: the ability to replicate results given the same code and data; harder in distributed training and with nondeterministic operations.
Prompt injection: attacks that manipulate model instructions, especially via retrieved content, to override system goals or exfiltrate data.
Model monitoring: observing model inputs, outputs, latency, cost, and quality over time to catch regressions and drift.
Multimodal models: models that process or generate multiple modalities, enabling vision-language tasks, speech, and video understanding.
Computational learning theory: a theoretical framework analyzing which classes of functions can be learned, how efficiently, and with what guarantees.
Bayesian inference: updating beliefs about parameters by combining prior distributions with observed evidence.
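A worked example using the conjugate Beta-Binomial pair: a Beta(a, b) prior over a coin's bias, updated with observed flips, yields a Beta(a + heads, b + tails) posterior.

```python
def update_beta(a: float, b: float, heads: int, tails: int):
    """Conjugate update: Beta(a, b) prior + Binomial data -> Beta posterior."""
    return a + heads, b + tails

# Uniform prior Beta(1, 1), then observe 7 heads and 3 tails:
a, b = update_beta(1.0, 1.0, heads=7, tails=3)
posterior_mean = a / (a + b)   # 8 / 12
print(posterior_mean)
```

The posterior mean sits between the prior mean (0.5) and the observed frequency (0.7), pulled toward the data as evidence accumulates.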
Parameter sharing: using the same parameters across different parts of a model, as in convolutional filters or tied embeddings.
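A 1-D convolution makes the idea concrete: one 3-tap kernel (a single set of weights) is applied at every position of the input, rather than learning separate weights per position.

```python
import numpy as np

def conv1d(x: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide the same kernel over x: identical weights reused at each position."""
    n, k = len(x), len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(n - k + 1)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([0.25, 0.5, 0.25])   # shared across all positions
print(conv1d(x, kernel))               # [2. 3. 4.]
```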
Efficient attention: techniques for handling longer documents without the quadratic cost of full self-attention.
Attention head: a single attention mechanism within multi-head attention, with its own learned projections.
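One head computes scaled dot-product attention over its (Q, K, V) projections; a numpy sketch with tiny hand-picked matrices:

```python
import numpy as np

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

Q = np.array([[1.0, 0.0]])
K = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[10.0, 0.0], [0.0, 10.0]])
print(attention(Q, K, V))   # output leans toward the first value row
```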
Emergent abilities: capabilities that appear only beyond certain model sizes.
Router (gating network): in a mixture-of-experts layer, chooses which experts process each token.
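A top-k router can be sketched as follows: per-token logits over the experts, keep the k highest, and renormalize with a softmax to get mixture weights (a simplified sketch; real MoE routers add load-balancing losses and capacity limits).

```python
import numpy as np

def route(logits: np.ndarray, k: int = 2):
    """Pick each token's top-k experts and softmax-normalize their weights."""
    top = np.argsort(-logits, axis=-1)[:, :k]           # chosen expert ids
    chosen = np.take_along_axis(logits, top, axis=-1)
    w = np.exp(chosen - chosen.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                  # mixture weights
    return top, w

logits = np.array([[2.0, 0.1, 1.0]])   # one token, three experts
experts, weights = route(logits)
print(experts)                          # [[0 2]]
```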
ARIMA: a classical statistical time-series model combining autoregressive, differencing, and moving-average components.
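As a minimal sketch of the autoregressive piece, an AR(1) model y_t = c + phi * y_{t-1} + noise can be fit by ordinary least squares; the coefficient 0.8 below is a simulation choice, not a property of any real series.

```python
import numpy as np

def fit_ar1(y: np.ndarray):
    """Least-squares fit of y_t = c + phi * y_{t-1}."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    c, phi = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return c, phi

rng = np.random.default_rng(0)
y = np.zeros(500)
for t in range(1, 500):                # simulate with phi = 0.8
    y[t] = 0.8 * y[t - 1] + rng.normal()

c, phi = fit_ar1(y)
print(phi)                             # estimate near 0.8 for a long series
```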
Model catalog: a central catalog of deployed and experimental models.
Trend: persistent directional movement in a series over time.
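A linear trend is commonly estimated as the least-squares slope of the series against time; a sketch on a synthetic series with a known slope:

```python
import numpy as np

t = np.arange(10, dtype=float)
y = 3.0 * t + 5.0                      # series with a known upward trend
slope, intercept = np.polyfit(t, y, deg=1)
print(slope)                           # recovers the trend slope, 3.0
```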
Causal model: models the effects of interventions, written do(X=x) in the do-calculus.
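The observational/interventional gap can be simulated in a tiny structural causal model (the graph Z -> X, Z -> Y, X -> Y and all coefficients below are illustrative assumptions): conditioning on X=1 inherits Z's confounding, while do(X=1) severs Z's arrow into X.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
Z = rng.binomial(1, 0.5, n)                        # confounder
X = np.where(Z == 1, rng.binomial(1, 0.9, n),      # Z strongly drives X
                     rng.binomial(1, 0.1, n))
Y = Z + X + rng.normal(0, 0.1, n)                  # Y depends on both

observed = Y[X == 1].mean()        # E[Y | X=1]: inflated, since X=1 implies Z likely 1
X_do = np.ones(n, dtype=int)       # do(X=1): set X regardless of Z
Y_do = Z + X_do + rng.normal(0, 0.1, n)
print(observed, Y_do.mean())       # observed mean exceeds interventional mean
```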
Inference pipeline: the model execution path in production.
Blackboard architecture: agents communicate indirectly through shared state.
Inference cost: the cost of running models in production.
Linear algebra: the mathematical foundation of ML, covering vector spaces, matrices, and linear transformations.
Singular value decomposition (SVD): decomposes a matrix into orthogonal factors and singular values; used in embeddings and compression.
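A small sketch with numpy: A factors as U diag(S) V^T, and the product reconstructs A exactly; compression comes from truncating the smallest singular values.

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
U, S, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(S) @ Vt        # exact reconstruction from the factors
print(S)                               # singular values, largest first
print(np.allclose(A, A_rebuilt))
```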
Matrix rank: the number of linearly independent rows or columns of a matrix.
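For example, a matrix whose second row is a multiple of the first has rank 1, while the identity has full rank:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])             # second row = 2 * first row
B = np.array([[1.0, 0.0],
              [0.0, 1.0]])             # linearly independent rows
print(np.linalg.matrix_rank(A))        # 1
print(np.linalg.matrix_rank(B))        # 2
```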
Zero-shot prompting: giving a task instruction without any examples.
Overgeneralization: applying learned patterns where they do not hold.
AI center of excellence: a centralized group of AI expertise serving the wider organization.