Results for "sequence modeling"
Samples from the k highest-probability tokens to limit unlikely outputs.
Samples from the smallest set of tokens whose cumulative probability reaches at least p, so the set size adapts to context.
Raw model outputs before converting to probabilities; manipulated during decoding and calibration.
A system that perceives state, selects actions, and pursues goals—often combining LLM reasoning with tools and memory.
Coordinating tools, models, and steps (retrieval, calls, validation) to deliver reliable end-to-end behavior.
Stores the attention keys and values of previous tokens to speed up autoregressive decoding.
Encodes positional information via rotation in embedding space.
Techniques to handle longer documents without quadratic cost.
Attention mechanisms that reduce the quadratic cost of standard attention in sequence length.
Continuous cycle of observation, reasoning, action, and feedback.
Separates planning from execution in agent architectures.
Sequential data indexed by time.
Monte Carlo method for state estimation.
Model execution path in production.
Control without feedback after execution begins.
Optimizing continuous action sequences.
Computing collision-free trajectories.
Fabrication of cases or statutes by LLMs.
Predicting protein 3D structure from sequence.
Learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
The field of building systems that perform tasks associated with human intelligence—perception, reasoning, language, planning, and decision-making—via algori...
The degree to which predicted probabilities match true frequencies (e.g., predictions made with 0.8 confidence are correct about 80% of the time).
Iterative method that updates parameters in the direction of the negative gradient to minimize a loss.
A parameterized function composed of interconnected units organized in layers with nonlinear activations.
Framework for reasoning about cause-effect relationships beyond correlation, often using structural assumptions and experiments.
Tracking where data came from and how it was transformed; key for debugging and compliance.
Generating speech audio from text, with control over prosody, speaker identity, and style.
Updating beliefs about parameters using observed evidence and prior distributions.
Using the same parameters across different parts of a model.
Allows a model to attend to information from different representation subspaces simultaneously.
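
The sampling entries near the top of these results (the k highest-probability tokens, the smallest set reaching probability p, and raw outputs before conversion to probabilities) can be illustrated with a minimal sketch. It uses only the Python standard library. The function names `softmax`, `top_k_candidates`, `top_p_candidates`, and `sample_from` are hypothetical and chosen for illustration; they are not taken from any particular library.

```python
import math
import random


def softmax(logits):
    """Convert raw model outputs (logits) to probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_candidates(probs, k):
    """Return the indices of the k highest-probability tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


def top_p_candidates(probs, p):
    """Return the smallest high-probability prefix whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept


def sample_from(probs, candidates, rng):
    """Sample one token index from the candidates, weighted by probability."""
    weights = [probs[i] for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Illustrative logits for a four-token vocabulary (made-up values).
probs = softmax([2.0, 1.0, 0.0, -1.0])
rng = random.Random(0)
token_k = sample_from(probs, top_k_candidates(probs, 2), rng)
token_p = sample_from(probs, top_p_candidates(probs, 0.9), rng)
```

With these made-up logits, the top-k set has a fixed size of 2, while the top-p set grows or shrinks with how peaked the distribution is, which is the "adapting set size by context" behavior described above.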