Results for "data → model"
Standardized documentation describing intended use, performance, limitations, data, and ethical considerations.
Required descriptions of model behavior and limits.
A mismatch between training and deployment data distributions that can degrade model performance.
Empirical power laws linking model size, dataset size, and compute to performance.
Designing input features to expose useful structure (e.g., ratios, lags, aggregations), often crucial outside deep learning.
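A small sketch of the three feature types named above (ratios, lags, aggregations) on a hypothetical daily sales series; all names and numbers are illustrative.

```python
# Hypothetical raw series: daily sales and ad spend.
sales = [120, 135, 128, 150, 160, 155, 170]
ads = [10, 12, 11, 15, 14, 13, 16]

rows = []
for t in range(1, len(sales)):
    rows.append({
        "ratio_sales_per_ad": sales[t] / ads[t],            # ratio feature
        "lag1_sales": sales[t - 1],                         # lag feature
        "mean_sales_to_date": sum(sales[:t + 1]) / (t + 1), # aggregation
    })
```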
Combining simulation and real-world data.
Reconstructing a model or its capabilities via API queries or leaked artifacts.
Running a new model in parallel with the production model on live traffic, without exposing its outputs to users.
Model-generated content that is fluent but unsupported by evidence or incorrect; mitigated by grounding and verification.
Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.
Central catalog of deployed and experimental models.
Applying learned patterns to inputs or contexts where they do not hold.
Learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
Information that can identify an individual (directly or indirectly); requires careful handling and compliance.
Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.
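A sketch of the pairwise (Bradley-Terry style) loss commonly used to train such preference models; the scores below are illustrative placeholders.

```python
import math

def pairwise_preference_loss(r_chosen, r_rejected):
    """-log sigmoid(r_chosen - r_rejected): low when the reward model
    scores the human-preferred output above the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

agree = pairwise_preference_loss(2.0, -1.0)    # model agrees with the label
disagree = pairwise_preference_loss(-1.0, 2.0) # model disagrees
assert agree < disagree
```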
A model that predicts the next state given the current state and an action.
Built-in assumptions guiding learning efficiency and generalization.
A mismatch between the training environment and the test or deployment environment.
Learning physical parameters from data.
Training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions like smoothness/cluster structure.
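One common approach is self-training with pseudo-labels; a toy 1-D sketch under a cluster assumption, with made-up points and class names.

```python
# Small labeled seed set plus a larger unlabeled pool (values illustrative).
labeled = [(0.0, "a"), (1.0, "a"), (9.0, "b"), (10.0, "b")]
unlabeled = [0.5, 1.2, 9.5, 8.8]

def centroid(points):
    return sum(points) / len(points)

# Fit class centroids on the labeled data only.
cents = {c: centroid([x for x, y in labeled if y == c]) for c in ("a", "b")}

# Pseudo-label each unlabeled point with its nearest centroid, then refit.
pseudo = [(x, min(cents, key=lambda c: abs(x - cents[c]))) for x in unlabeled]
cents = {c: centroid([x for x, y in labeled + pseudo if y == c])
         for c in ("a", "b")}
```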
Recovering training examples from shared gradients; a privacy attack relevant to federated learning.
Diffusion performed in a compressed latent space rather than pixel space, for efficiency.
The risk of losses or bad decisions caused by incorrect or misused models, especially in finance.
Probabilistic model for sequential data with latent states.
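The standard way to score a sequence under such a model is the forward algorithm, which sums over latent state paths; a tiny two-state sketch with illustrative parameters.

```python
# Tiny 2-state HMM; all probabilities below are made-up examples.
states = [0, 1]
start = [0.6, 0.4]                 # initial state distribution
trans = [[0.7, 0.3], [0.4, 0.6]]   # transition probabilities
emit = [[0.9, 0.1], [0.2, 0.8]]    # P(observation | state)

def forward(obs):
    """Return P(obs) by summing over all latent state sequences."""
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in states) * emit[s][o]
                 for s in states]
    return sum(alpha)

p = forward([0, 1, 0])  # probability of observing the sequence 0, 1, 0
```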
Prompting with a task instruction alone, without any worked examples.
Requirement to reveal AI usage in legal decisions.
Models that learn to generate samples resembling training data.
Reinforcement learning that optimizes a policy or value function without an explicit model of environment dynamics.
How well a model performs on new data drawn from the same (or similar) distribution as training.
Systematic review of model/data processes to ensure performance, fairness, security, and policy compliance.