Results for "model-based"
Model-Based RL
Advanced
RL using learned or known environment models.
Model-based reinforcement learning is like having a map while exploring a new city. Instead of wandering around aimlessly, you can look at the map to plan your route and make better decisions about where to go next. In this type of learning, an AI agent first learns how the environment works—like...
Measures divergence between true and predicted probability distributions.
Quantifies shared information between random variables.
Measures how much information an observable random variable carries about unknown parameters.
A narrow minimum often associated with poorer generalization.
Gradually increasing learning rate at training start to avoid divergence.
Stores past attention states to speed up autoregressive decoding.
Encodes positional information via rotation in embedding space.
Empirical laws linking model size, data, compute to performance.
Legal or policy requirement to explain AI decisions.
Graphical model expressing factorization of a probability distribution.
Models that learn to generate samples resembling training data.
Aligns transcripts with audio timestamps.
Models time evolution via hidden states.
Shift in feature distribution over time.
Maintaining alignment under new conditions.
Using limited human feedback to guide large models.
Maximum system processing rate.
Requirement to provide explanations.
Startup latency for services.
High-fidelity virtual model of a physical system.
Combining simulation and real-world data.
Predicting disease progression or survival.
Mechanics of price formation.
Quantifying financial risk.
Agents copy others’ actions.
Groups adopting extreme positions.
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
Methods that learn training procedures or initializations so models can adapt quickly to new tasks with little data.
Of predicted positives, the fraction that are truly positive; sensitive to false positives.
Of true positives, the fraction correctly identified; sensitive to false negatives.
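The last two entries match what are commonly called precision and recall; those names are an assumption here, since the listing shows only the definitions. A minimal Python sketch of both ratios, using a hypothetical helper `precision_recall` and made-up labels:

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against division by zero when there are no predicted or actual positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative data only, not taken from the source.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Each extra false positive lowers the first ratio and each extra false negative lowers the second, which is the sensitivity the two definitions describe.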