Results for "fixed positions"
Rotary position embedding (RoPE): encodes positional information by rotating embedding-space vectors through position-dependent angles.
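A minimal sketch of this rotation, in the style of rotary position embedding (RoPE); the function name and frequency base are illustrative assumptions:

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate consecutive dimension pairs of `vec` by position-dependent
    angles (assumes an even-dimensional vector)."""
    out = []
    d = len(vec)
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # rotation frequency falls off with dimension index
        x, y = vec[i], vec[i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out
```

Because each dimension pair undergoes a pure rotation, the dot product of two rotated vectors depends only on their relative position offset, which is what makes this useful inside attention.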
Positional encoding: injects sequence order into Transformer inputs, since attention alone is permutation-invariant.
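The original Transformer's fixed sinusoidal encoding is one concrete way to inject that order; a pure-Python sketch (assumes an even `d_model`):

```python
import math

def sinusoidal_encoding(pos, d_model):
    """Fixed sinusoidal positional encoding: alternating sin/cos at
    geometrically spaced frequencies, as in 'Attention Is All You Need'."""
    enc = []
    for i in range(0, d_model, 2):
        freq = 1.0 / (10000 ** (i / d_model))
        enc.append(math.sin(pos * freq))
        enc.append(math.cos(pos * freq))
    return enc
```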
Group polarization: the tendency of deliberating groups to adopt positions more extreme than their members' initial views.
Context window: maximum number of tokens the model can attend to in one forward pass; constrains long-document reasoning.
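When input exceeds the window, a common fallback is to keep only the most recent tokens; a trivial sketch (the helper name is illustrative):

```python
def truncate_to_window(tokens, max_len):
    """Keep only the most recent `max_len` tokens, discarding the oldest."""
    return tokens[-max_len:] if max_len > 0 else []
```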
Beam search: decoding algorithm that keeps the top-k highest-scoring partial sequences at each step; it tends to improve likelihood but can reduce diversity.
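A toy beam search over a user-supplied next-token scoring function; all names here are illustrative:

```python
def beam_search(next_log_probs, beam_width, length):
    """Keep the `beam_width` best partial sequences at every step.

    `next_log_probs(seq)` returns a dict mapping each candidate next
    token to its log-probability given the sequence so far.
    """
    beams = [((), 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(length):
        candidates = []
        for seq, score in beams:
            for tok, lp in next_log_probs(seq).items():
                candidates.append((seq + (tok,), score + lp))
        # Sort all expansions by score and keep only the best beam_width.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams
```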
Nucleus (top-p) sampling: samples from the smallest set of tokens whose probabilities sum to at least p, adapting the candidate set size to the context.
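A small sketch of that sampling rule over a token-to-probability dict; the function name is illustrative:

```python
import random

def nucleus_sample(probs, p, rng=random):
    """Sample from the smallest prefix of tokens (by descending
    probability) whose mass reaches at least `p`."""
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, total = [], 0.0
    for tok, pr in items:
        nucleus.append((tok, pr))
        total += pr
        if total >= p:
            break
    # Draw proportionally to probability within the nucleus.
    r = rng.random() * total
    for tok, pr in nucleus:
        r -= pr
        if r <= 0:
            return tok
    return nucleus[-1][0]
```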
Graph neural networks (GNNs): neural networks that operate on graph-structured data by propagating information along edges.
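One round of sum-aggregation message passing over a directed edge list, the core propagation step; a sketch only, since real GNNs add learned transforms and nonlinearities:

```python
def message_pass(features, edges):
    """One propagation round: each node's new (scalar) feature is its own
    value plus the sum of its in-neighbors' values."""
    out = {node: feat for node, feat in features.items()}
    for src, dst in edges:
        out[dst] = out[dst] + features[src]  # message flows src -> dst
    return out
```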
Denoising diffusion model: generative model trained to remove noise step by step, reversing a gradual noising process.
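One reverse step of a DDPM-style sampler for scalar data, computing the posterior mean given a predicted noise value; a deterministic sketch that omits the added noise term and the schedule bookkeeping:

```python
import math

def ddpm_denoise_step(x_t, eps_pred, beta_t, alpha_bar_t):
    """Posterior mean of one DDPM reverse step:
    mu = (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps_pred) / sqrt(alpha_t),
    where alpha_t = 1 - beta_t and eps_pred is the model's noise estimate."""
    alpha_t = 1.0 - beta_t
    return (x_t - beta_t / math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_t)
```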
Graph convolution: extension of convolution to graph domains using the adjacency structure.
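The symmetric-normalized propagation rule from Kipf & Welling's GCN, shown here for scalar node features and without the learned weight matrix:

```python
import math

def gcn_propagate(adj, feats):
    """Apply D^{-1/2} (A + I) D^{-1/2} to node features, where A is a 0/1
    adjacency matrix and D the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [
        sum(a_hat[i][j] / math.sqrt(deg[i] * deg[j]) * feats[j] for j in range(n))
        for i in range(n)
    ]
```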
Variational autoencoder (VAE): autoencoder with probabilistic latent variables and a KL-divergence regularizer toward the prior.
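The two distinctive pieces, sketched for a single latent dimension with a standard-normal prior; function names are illustrative:

```python
import math
import random

def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1): the
    reparameterization trick keeps sampling differentiable in mu, sigma."""
    return mu + math.exp(0.5 * log_var) * rng.gauss(0.0, 1.0)

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)) for one latent dimension."""
    return 0.5 * (math.exp(log_var) + mu * mu - 1.0 - log_var)
```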
Graph attention network (GAT): GNN that uses attention to weight neighbor contributions dynamically.
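Softmax-weighted aggregation of neighbor features, the mechanism behind that dynamic weighting; note GAT proper learns an additive scoring function, while this sketch uses a plain dot product for brevity:

```python
import math

def attention_aggregate(query, neighbors):
    """Weight each neighbor feature vector by the softmax of its
    dot-product score against `query`, then sum."""
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in neighbors]
    m = max(scores)  # subtract max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    dim = len(query)
    return [sum(w / z * feat[i] for w, feat in zip(weights, neighbors)) for i in range(dim)]
```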
Vision Transformer (ViT): a Transformer applied to an image split into fixed-size patches treated as tokens.
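The patching step that turns an image into a token sequence, sketched for a 2-D grid of scalar pixels whose sides divide evenly by the patch size:

```python
def image_to_patches(img, patch):
    """Split an H x W grid into non-overlapping patch x patch blocks,
    flattening each block into one token (assumes patch divides H and W)."""
    h, w = len(img), len(img[0])
    tokens = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tokens.append([img[r + i][c + j] for i in range(patch) for j in range(patch)])
    return tokens
```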
Swarm intelligence: distributed agents whose local interactions produce emergent collective intelligence.
Proprioception: internal sensing of joint positions, velocities, and forces.
Kinematics: the study of motion without regard to the forces that cause it.
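A standard worked example is the forward kinematics of a planar two-link arm, mapping joint angles to end-effector position purely geometrically:

```python
import math

def forward_kinematics_2link(theta1, theta2, l1, l2):
    """End-effector (x, y) of a planar two-link arm with link lengths
    l1, l2 and joint angles theta1 (base) and theta2 (elbow, relative)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y
```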