Results for "meaning-based retrieval"
Dynamic resource allocation.
Continuous loop adjusting actions based on state feedback.
Algorithm computing control actions.
Achieving task performance by providing a small number of examples inside the prompt without weight updates.
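A minimal sketch of the idea in code, assuming a hypothetical sentiment-classification task; the demonstrations and the prompt format are invented for illustration:

```python
# Sketch of building a few-shot prompt: demonstrations are placed in the context
# and the model is asked to continue the pattern; all examples are invented.
examples = [
    ("The movie was a delight.", "positive"),
    ("I want my money back.", "negative"),
]
query = "The acting felt flat."

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"
print(prompt)
```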
Architecture based on self-attention and feedforward layers; foundation of modern LLMs and many multimodal models.
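A minimal NumPy sketch of the scaled dot-product self-attention at the core of this architecture, for a single head; the dimensions and random weights are illustrative:

```python
# Single-head scaled dot-product self-attention, sketched with NumPy.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Attend over a sequence x of shape (seq_len, d_model)."""
    q = x @ w_q                      # queries
    k = x @ w_k                      # keys
    v = x @ w_v                      # values
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # pairwise attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v               # weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # 4 tokens, model dim 8
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```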
Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.
A preference-based training method optimizing policies directly from pairwise comparisons without explicit RL loops.
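A sketch of the pairwise objective under stated assumptions: sequence log-probabilities for the chosen and rejected responses are available under both the trained policy and a frozen reference model, and `beta` scales the implicit reward; the numbers are invented:

```python
# Direct preference optimization loss for one (chosen, rejected) pair, sketched with NumPy.
import numpy as np

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Negative log-sigmoid of the scaled log-ratio margin between chosen and rejected."""
    margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log sigmoid(margin)

# Hypothetical log-probabilities for a single preference pair:
print(dpo_loss(-12.3, -15.8, -13.0, -14.9))
```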
Feature attribution method grounded in cooperative game theory, commonly used to explain predictions of tabular models.
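A sketch of the underlying game-theoretic definition on a toy two-feature linear model, computing exact Shapley values as the average marginal contribution over feature orderings; the model and baseline are illustrative:

```python
# Exact Shapley values for a tiny model via the cooperative-game definition.
from itertools import permutations
import numpy as np

def model(x1, x2):
    return 3.0 * x1 + 1.0 * x2       # toy linear model (illustrative)

baseline = (0.0, 0.0)                # "missing" features take baseline values
instance = (2.0, 4.0)                # the prediction being explained

def value(coalition):
    """Model output with features in `coalition` set to the instance, others to baseline."""
    x = [instance[i] if i in coalition else baseline[i] for i in range(2)]
    return model(*x)

shapley = np.zeros(2)
for order in permutations(range(2)):
    included = set()
    for feature in order:
        before = value(included)
        included.add(feature)
        shapley[feature] += (value(included) - before) / 2  # average over 2! orderings
print(shapley)  # [6. 4.] for this linear model
```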
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
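A sketch of the unstructured variant, assuming simple magnitude-based selection: the smallest-magnitude fraction of a layer's weights is zeroed out; function and parameter names are illustrative:

```python
# Unstructured magnitude pruning of a single weight matrix, sketched with NumPy.
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Return a copy of `weights` with the lowest-|w| fraction set to zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

w = np.random.default_rng(0).normal(size=(4, 4))
print(magnitude_prune(w, sparsity=0.75))
```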
Samples from the smallest set of tokens whose cumulative probability reaches p, adapting the size of the candidate set to the context.
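A NumPy sketch of the procedure: sort tokens by probability, keep the smallest prefix whose cumulative mass reaches `p`, renormalize, and sample; the probabilities are illustrative:

```python
# Nucleus (top-p) sampling from a categorical token distribution.
import numpy as np

def nucleus_sample(probs, p=0.9, rng=np.random.default_rng()):
    """Sample from the smallest set of tokens whose cumulative probability reaches p."""
    order = np.argsort(probs)[::-1]              # tokens sorted by descending probability
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # size of the nucleus
    nucleus = order[:cutoff]
    renormed = probs[nucleus] / probs[nucleus].sum()
    return rng.choice(nucleus, p=renormed)

probs = np.array([0.5, 0.2, 0.15, 0.1, 0.05])
print(nucleus_sample(probs, p=0.8))
```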
Identifying and localizing objects in images, typically outputting bounding boxes and confidence scores.
Learning only from data generated by the current policy.
Generating speech audio from text, with control over prosody, speaker identity, and style.
Continuous cycle of observation, reasoning, action, and feedback.
Chooses which experts process each token.
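A sketch of a top-k router under common assumptions (a learned gating matrix and a softmax over the selected experts' logits); shapes and names are illustrative:

```python
# Top-k routing for a mixture-of-experts layer, sketched with NumPy.
import numpy as np

def route_tokens(token_states, router_weights, k=2):
    """Return, per token, the chosen expert indices and their mixing weights."""
    logits = token_states @ router_weights            # (tokens, num_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]        # indices of the k largest logits
    chosen = np.take_along_axis(logits, topk, axis=-1)
    weights = np.exp(chosen - chosen.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the chosen experts
    return topk, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 16))        # 5 tokens, hidden dim 16
router = rng.normal(size=(16, 8))        # 8 experts
print(route_tokens(tokens, router, k=2))
```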
Separates planning from execution in agent architectures.
Detecting unauthorized model outputs or data leaks.
Models that define an energy landscape rather than explicit probabilities.
Probabilistic energy-based neural network with hidden variables.
Simultaneous Localization and Mapping for robotics.
Monte Carlo method for state estimation.
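A minimal bootstrap particle filter sketch for a 1-D random walk observed with Gaussian noise; the predict/weight/resample loop is the core of the method, and all constants are illustrative:

```python
# Bootstrap particle filter for a 1-D random walk with noisy observations.
import numpy as np

rng = np.random.default_rng(0)
num_particles, steps = 1000, 20
process_noise, obs_noise = 0.5, 1.0

# Simulate a true trajectory and noisy observations (illustrative data).
true_state = np.cumsum(rng.normal(0, process_noise, steps))
observations = true_state + rng.normal(0, obs_noise, steps)

particles = np.zeros(num_particles)
for z in observations:
    # Predict: propagate each particle through the motion model.
    particles += rng.normal(0, process_noise, num_particles)
    # Update: weight particles by the likelihood of the observation.
    weights = np.exp(-0.5 * ((z - particles) / obs_noise) ** 2)
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights.
    particles = rng.choice(particles, size=num_particles, p=weights)
    print(f"obs={z:+.2f}  estimate={particles.mean():+.2f}")
```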
Flat, high-dimensional regions of the loss surface that slow training.
Distributed agents producing emergent intelligence.
Classifying models by impact level.
Optimization methods such as Adam that adapt per-parameter learning rates during training.
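A sketch of one Adam update step with the usual default hyperparameters, applied to a toy quadratic objective; variable names follow the standard first/second-moment notation:

```python
# One Adam step: update biased moment estimates, correct the bias, then update parameters.
import numpy as np

def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grads
    v = beta2 * v + (1 - beta2) * grads ** 2
    m_hat = m / (1 - beta1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)          # bias-corrected second moment
    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v

params = np.array([1.0, -2.0])
m = v = np.zeros_like(params)
for t in range(1, 4):
    grads = 2 * params                    # gradient of f(x) = sum(x**2)
    params, m, v = adam_step(params, grads, m, v, t)
print(params)
```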
Guaranteed response times.
Software simulating physical laws.
Artificial environment for training/testing agents.
Predicts next state given current state and action.
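A sketch of learning such a model in the simplest linear case, fitting s' ≈ A s + B a by least squares from randomly collected transitions; the true system and noise levels are invented:

```python
# Fit a one-step linear dynamics model from (state, action, next_state) transitions.
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.95]])
B_true = np.array([[0.0], [0.2]])

states, actions, next_states = [], [], []
s = np.zeros(2)
for _ in range(500):
    a = rng.normal(size=1)                              # random exploratory action
    s_next = A_true @ s + B_true @ a + rng.normal(0, 0.01, size=2)
    states.append(s); actions.append(a); next_states.append(s_next)
    s = s_next

X = np.hstack([np.array(states), np.array(actions)])    # (N, 3): state ++ action
Y = np.array(next_states)                               # (N, 2)
theta, *_ = np.linalg.lstsq(X, Y, rcond=None)           # learned [A | B] transposed
print(theta.T.round(2))                                  # close to [A_true | B_true]
```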
Directly optimizing control policies with respect to expected return.
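A REINFORCE-style sketch on a toy 3-armed bandit with a softmax policy, assuming invented reward means; each update ascends the reward times the gradient of the log-probability of the taken action:

```python
# Policy-gradient (REINFORCE) update on a 3-armed bandit with a softmax policy.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])   # hidden reward means (illustrative)
logits = np.zeros(3)                      # policy parameters
lr = 0.1

for step in range(2000):
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax policy over actions
    action = rng.choice(3, p=probs)
    reward = rng.normal(true_means[action], 0.1)
    # Gradient of log pi(action) w.r.t. the logits is one_hot(action) - probs.
    grad_log_pi = -probs
    grad_log_pi[action] += 1.0
    logits += lr * reward * grad_log_pi   # ascend the expected-reward gradient

print(probs)  # should concentrate on the highest-mean arm
```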