Results for "compute-data-performance"
Increasing model capacity via compute.
Hardware resources used for training/inference; constrained by memory bandwidth, FLOPs, and parallelism.
Regulating access to large-scale compute.
Scaling law for the compute-optimal trade-off between model size and training data.
Empirical laws linking model size, data, compute to performance.
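A minimal sketch of such a parametric law in Python; the functional form follows the Chinchilla-style fit L(N, D) = E + A/N^alpha + B/D^beta, with constants that are illustrative placeholders loosely based on published fits, not authoritative values.

```python
# Chinchilla-style parametric scaling law: loss as a function of
# parameter count N and training tokens D. Constants are illustrative.

def predicted_loss(n_params: float, n_tokens: float,
                   E: float = 1.69, A: float = 406.4, B: float = 410.7,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    return E + A / n_params**alpha + B / n_tokens**beta

# Doubling the data at fixed model size lowers the predicted loss:
print(predicted_loss(7e9, 1.4e11))  # baseline
print(predicted_loss(7e9, 2.8e11))  # 2x tokens
```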
Accumulated compute or algorithmic advances that could enable rapid capability jumps once exploited.
Storing results to reduce compute.
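A concrete example of trading storage for compute, using Python's built-in memoization decorator:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # memoize: each distinct n is computed only once
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(200))  # fast; without the cache this recursion is exponential in n
```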
Control using real-time sensor feedback.
Exact-likelihood generative models using invertible transforms.
Number of samples per gradient update; impacts compute efficiency, generalization, and stability.
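A NumPy sketch of minibatch SGD showing where batch size enters; the data, learning rate, and linear model are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=1024)

w = np.zeros(8)
batch_size, lr = 64, 0.1  # larger batches: less gradient noise, fewer updates per epoch
for epoch in range(20):
    perm = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)  # minibatch gradient
        w -= lr * grad
print(np.mean((X @ w - y) ** 2))  # training MSE after 20 epochs
```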
Techniques that fine-tune small additional components rather than all weights to reduce compute and storage.
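A LoRA-style low-rank adapter sketched in NumPy (shapes, rank, and initialization are illustrative assumptions): only the two small matrices would be trained, while the large pretrained weight stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8                       # hidden size, low adapter rank (r << d)
W = rng.normal(size=(d, d))         # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection; zero-init so the adapter starts as a no-op

def adapted_forward(x):
    # Base path plus low-rank update: only A and B (2*d*r parameters)
    # would be trained, versus d*d for full fine-tuning.
    return x @ W.T + x @ A.T @ B.T

x = rng.normal(size=(4, d))
print(adapted_forward(x).shape)  # (4, 512)
```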
Directly optimizing control policies.
GNN using attention to weight neighbor contributions dynamically.
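A single-head graph-attention layer sketched in NumPy, using the standard decomposition of the score a^T[Wh_i || Wh_j] into source and target halves; graph, shapes, and weights are illustrative.

```python
import numpy as np

def gat_layer(H, adj, W, a, leaky=0.2):
    """Single-head graph attention: score LeakyReLU(a^T [Wh_i || Wh_j]),
    softmax over each node's neighbors (adjacency mask), weighted sum."""
    Z = H @ W                                  # (n, d') projected features
    d = Z.shape[1]
    src, dst = Z @ a[:d], Z @ a[d:]            # a split into source/target halves
    e = src[:, None] + dst[None, :]            # pairwise raw scores
    e = np.where(e > 0, e, leaky * e)          # LeakyReLU
    e = np.where(adj > 0, e, -1e9)             # mask non-edges
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)  # attention weights per neighbor
    return alpha @ Z

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 4))
adj = np.eye(5) + np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)  # path graph + self-loops
out = gat_layer(H, adj, rng.normal(size=(4, 8)), rng.normal(size=16))
print(out.shape)  # (5, 8)
```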
Letting an LLM call external functions/APIs to fetch data, run computations, or take actions, improving factual reliability.
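A schematic tool-use loop; `query_llm`, the JSON tool-call format, and the toy tools are hypothetical stand-ins, not any specific vendor API.

```python
import json

TOOLS = {
    "add": lambda args: args["a"] + args["b"],
    "lookup": lambda args: {"paris": "France"}.get(args["city"].lower(), "unknown"),
}

def try_parse_tool_call(text):
    try:
        msg = json.loads(text)
        return msg if "tool" in msg else None
    except json.JSONDecodeError:
        return None  # plain text: treat as a final answer

def run_with_tools(query_llm, user_msg, max_steps=5):
    history = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        reply = query_llm(history)                  # model may answer or request a tool
        call = try_parse_tool_call(reply)
        if call is None:
            return reply                            # final answer
        result = TOOLS[call["tool"]](call["args"])  # execute the requested tool
        history.append({"role": "tool", "content": json.dumps({"result": result})})

# Toy stub standing in for a real model: requests "add" once, then answers.
def stub_llm(history):
    if not any(m["role"] == "tool" for m in history):
        return json.dumps({"tool": "add", "args": {"a": 2, "b": 3}})
    return "The sum is 5."

print(run_with_tools(stub_llm, "What is 2 + 3?"))  # -> "The sum is 5."
```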
Variability introduced by minibatch sampling during SGD.
Measures similarity between vectors and gives the projection of one vector onto another.
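A quick NumPy illustration:

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

dot = u @ v                                               # 3.0
cos_sim = dot / (np.linalg.norm(u) * np.linalg.norm(v))   # cosine similarity: 0.6
proj = (dot / (v @ v)) * v                                # projection of u onto v: [3. 0.]
print(dot, cos_sim, proj)
```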
Attention mechanisms that reduce quadratic complexity.
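One such family is kernelized (linear) attention, which reorders the computation to avoid materializing the n x n attention matrix; this sketch uses the elu(x)+1 feature map as an illustrative choice.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized attention: phi(Q) @ (phi(K)^T V) costs O(n * d^2),
    versus O(n^2 * d) for full softmax attention."""
    phi = lambda x: np.where(x > 0, x + 1, np.exp(x))  # elu(x) + 1 > 0
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                    # (d, d_v): aggregate keys/values first
    z = Qp @ Kp.sum(axis=0)          # (n,): per-query normalizer
    return (Qp @ kv) / (z[:, None] + eps)

rng = np.random.default_rng(0)
n, d = 1000, 16
out = linear_attention(rng.normal(size=(n, d)), rng.normal(size=(n, d)),
                       rng.normal(size=(n, d)))
print(out.shape)  # (1000, 16)
```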
Approximating expectations via random sampling.
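The classic example, estimating pi from uniform samples:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.uniform(-1, 1, size=(n, 2))
inside = (x ** 2).sum(axis=1) <= 1.0  # points falling inside the unit circle
pi_hat = 4 * inside.mean()            # area ratio times 4 estimates pi
print(pi_hat)                         # ~3.14; error shrinks like 1/sqrt(n)
```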
Optimizing policies directly via gradient ascent on expected reward.
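A minimal REINFORCE sketch on a toy multi-armed bandit (reward means, learning rate, and step count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])  # 3-armed bandit reward means
logits = np.zeros(3)                    # policy parameters
lr = 0.1

for step in range(2000):
    p = np.exp(logits - logits.max()); p /= p.sum()  # softmax policy
    a = rng.choice(3, p=p)                           # sample an action
    r = rng.normal(true_means[a], 0.1)               # sampled reward
    grad_logp = -p; grad_logp[a] += 1.0              # gradient of log pi(a) w.r.t. logits
    logits += lr * r * grad_logp                     # REINFORCE: ascend expected reward

print(p.round(3))  # probability mass concentrates on the best arm (index 2)
```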
Internal representation of environment layout.
When information from evaluation data improperly influences training, inflating reported performance.
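A common concrete case, sketched with scikit-learn: fitting preprocessing statistics on the full dataset before splitting leaks test-set information into training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.random.default_rng(0).normal(size=(200, 5))
y = np.arange(200) % 2

# LEAKY: scaler statistics are computed on all rows, including future test rows.
X_leaky = StandardScaler().fit_transform(X)
Xl_tr, Xl_te, yl_tr, yl_te = train_test_split(X_leaky, y, random_state=0)

# CORRECT: fit preprocessing on training rows only, then apply to test rows.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)
```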
Processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.
Increasing performance via more data.
Tracking where data came from and how it was transformed; key for debugging and compliance.
Observing model inputs/outputs, latency, cost, and quality over time to catch regressions and drift.
Maliciously inserting or altering training data to implant backdoors or degrade performance.
Separating data into training (fit), validation (tune), and test (final estimate) to avoid leakage and optimism bias.
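A sketch using scikit-learn's `train_test_split` applied twice to get a 60/20/20 split:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)
y = np.arange(100)

# First carve off the final test set, then split the rest into train/validation.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)
print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```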
Artificially created data used to train/test models; helpful for privacy and coverage, risky if unrealistic.
When a model fits noise/idiosyncrasies of training data and performs poorly on unseen data.
Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.
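A minimal NumPy sketch (the flip probability and noise scale are illustrative choices):

```python
import numpy as np

def augment(image, rng):
    """Return a randomly transformed copy: horizontal flip plus Gaussian pixel noise."""
    out = image[:, ::-1] if rng.random() < 0.5 else image  # random horizontal flip
    return out + rng.normal(0.0, 0.05, size=out.shape)     # small additive noise

rng = np.random.default_rng(0)
image = rng.random((32, 32))                               # stand-in for a grayscale image
batch = np.stack([augment(image, rng) for _ in range(8)])  # 8 augmented views
print(batch.shape)  # (8, 32, 32)
```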