Learning only from the current policy's data.
Detecting unauthorized model outputs or data leaks.
Models that define an energy landscape rather than explicit probabilities.
Learns the score ∇_x log p(x), the gradient of the log-density, to enable generative sampling.
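For a known density the score has a closed form, which makes the idea concrete: for a Gaussian N(μ, σ²), ∇_x log p(x) = (μ − x)/σ². A minimal sketch under that assumption (the function name is illustrative, not from any library):

```python
import numpy as np

def gaussian_score(x, mu=0.0, sigma=1.0):
    """Score of N(mu, sigma^2): gradient of the log-density w.r.t. x."""
    return (mu - x) / sigma**2

# At the mean the score is zero; elsewhere it points back toward the mean.
x = np.array([-2.0, 0.0, 2.0])
scores = gaussian_score(x)  # → [ 2.,  0., -2.]
```

Score-based generative models learn this vector field from data (no closed form available) and follow it during sampling.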
CNNs applied to time series.
Two-network setup in which a generator learns to fool a discriminator.
Belief before observing data.
Attention between different modalities.
Software pipeline converting raw sensor data into structured representations.
Running predictions on large datasets periodically.
Models estimating recidivism risk.
Learning physical parameters from data.
The relationship between inputs and outputs changes over time, requiring monitoring and model updates.
Of predicted positives, the fraction that are truly positive; sensitive to false positives.
Of actual positives, the fraction correctly identified; sensitive to false negatives.
Of actual negatives, the fraction correctly identified; sensitive to false positives.
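The three metrics above follow directly from confusion-matrix counts. A minimal sketch (function names are illustrative):

```python
def precision(tp, fp):
    """Of predicted positives, the fraction that are truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Of actual positives, the fraction correctly identified."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Of actual negatives, the fraction correctly identified."""
    return tn / (tn + fp)

# Example counts: 8 TP, 2 FP, 4 FN, 86 TN.
tp, fp, fn, tn = 8, 2, 4, 86
p = precision(tp, fp)     # 0.8
r = recall(tp, fn)        # ≈ 0.667
s = specificity(tn, fp)   # ≈ 0.977
```

Note how adding false positives lowers precision and specificity but leaves recall untouched, and vice versa for false negatives.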
Penalizes confident wrong predictions heavily; standard for classification and language modeling.
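For a one-hot target, cross-entropy reduces to the negative log of the probability assigned to the true class, which shows why confident wrong predictions are penalized so heavily:

```python
import math

def cross_entropy(p_true_class):
    """Negative log-likelihood of the true class (one-hot target)."""
    return -math.log(p_true_class)

low = cross_entropy(0.9)    # ≈ 0.105: confident and correct, small loss
high = cross_entropy(0.01)  # ≈ 4.605: confident and wrong, large loss
```

The loss grows without bound as the true-class probability approaches zero, so a single confident mistake can dominate a batch.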
Uses an exponential moving average of gradients to speed convergence and reduce oscillation.
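A minimal sketch of the classic heavy-ball momentum update, one family of this idea (Adam additionally tracks a second-moment average; the names here are illustrative):

```python
def momentum_step(w, v, grad, lr=0.1, beta=0.9):
    """One momentum update: v accumulates a decaying running sum of gradients."""
    v = beta * v + grad
    w = w - lr * v
    return w, v

# On the quadratic loss 0.5 * w**2 (gradient = w), the accumulated velocity
# drives w toward the minimum at 0 faster than plain SGD would.
w, v = 1.0, 0.0
for _ in range(5):
    w, v = momentum_step(w, v, grad=w)
# w has shrunk from 1.0 to near 0
```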
Methods to set starting weights to preserve signal/gradient scales across layers.
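One common scheme is He (Kaiming) initialization, which draws weights with variance 2/fan_in so that ReLU activations keep roughly unit variance layer to layer. A sketch (the function name is illustrative):

```python
import numpy as np

def he_init(fan_in, fan_out, seed=0):
    """He normal init: std = sqrt(2 / fan_in), suited to ReLU layers."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

W = he_init(512, 256)
# Empirical std should be close to sqrt(2 / 512) ≈ 0.0625.
```

Xavier/Glorot initialization is the analogous recipe for tanh/sigmoid layers, using 1/fan_in (or an average of fan_in and fan_out) instead of 2/fan_in.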
Techniques that stabilize and speed training by normalizing activations; LayerNorm is common in Transformers.
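The core of LayerNorm is normalizing each example across its feature dimension; a minimal sketch with the learned gain/bias omitted:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance over features."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.array([[1.0, 2.0, 3.0, 4.0]])
y = layer_norm(x)  # each row now has mean ≈ 0 and std ≈ 1
```

BatchNorm instead normalizes each feature across the batch, which is why LayerNorm is preferred when batch statistics are unreliable (small batches, autoregressive decoding).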
The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.
A high-priority instruction layer setting overarching behavior constraints for a chat model.
Breaking documents into pieces for retrieval; chunk size/overlap strongly affect RAG quality.
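A minimal character-level sketch of fixed-size chunking with overlap (real pipelines usually chunk by tokens or semantic boundaries; this only illustrates the size/overlap mechanics):

```python
def chunk(text, size=20, overlap=5):
    """Split text into fixed-size pieces whose boundaries overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

pieces = chunk("0123456789" * 5, size=20, overlap=5)
# 50 chars, step 15 → 3 overlapping chunks of length 20 covering the whole text
```

Larger chunks preserve context but dilute retrieval precision; overlap guards against answers being split across a chunk boundary.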
Controlled experiment comparing variants by random assignment to estimate causal effects of changes.
When some classes are rare, requiring reweighting, resampling, or specialized metrics.
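A common reweighting recipe weights each class inversely to its frequency (the same heuristic scikit-learn calls "balanced"); a sketch:

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: rare classes get proportionally larger weight."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

w = class_weights([0] * 90 + [1] * 10)
# w ≈ {0: 0.556, 1: 5.0}: the 9×-rarer class is upweighted 9×
```

These weights can scale the loss per example, or guide resampling; either way, accuracy alone stays misleading, so metrics like recall per class matter.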
Policies and practices for approving, monitoring, auditing, and documenting models in production.
Reducing numeric precision of weights/activations to speed inference and reduce memory with acceptable accuracy loss.
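A minimal sketch of symmetric int8 quantization, one of the simplest schemes: a single scale maps floats into [−127, 127], and rounding error is bounded by half a scale step (function names are illustrative):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric int8 quantization: one scale per tensor."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.array([0.1, -0.5, 0.25, 1.27], dtype=np.float32)
q, s = quantize_int8(x)
x_hat = dequantize(q, s)  # close to x, within half a quantization step
```

Production schemes refine this with per-channel scales, asymmetric zero points, or calibration data, trading implementation complexity for accuracy.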
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
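The unstructured variant is often plain magnitude pruning: zero out the smallest-magnitude fraction of weights. A sketch (the function name is illustrative):

```python
import numpy as np

def magnitude_prune(w, sparsity=0.5):
    """Unstructured pruning: zero the smallest-|w| fraction of weights."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    thresh = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

w = np.array([0.05, -0.8, 0.01, 0.6, -0.02, 0.3])
wp = magnitude_prune(w, 0.5)  # → [0., -0.8, 0., 0.6, 0., 0.3]
```

Structured pruning removes whole neurons, heads, or channels instead, which shrinks dense compute directly rather than relying on sparse kernels.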
Reduction in uncertainty achieved by observing a variable; used in decision trees and active learning.
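For a binary label, information gain is the parent's entropy minus the weighted entropy of the children after a split; a perfect split of a balanced node recovers the full 1 bit:

```python
import math

def entropy(p):
    """Binary entropy in bits of a class-probability p."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def info_gain(p_parent, splits):
    """Entropy reduction from a split; splits = [(weight, p_positive), ...]."""
    return entropy(p_parent) - sum(w * entropy(p) for w, p in splits)

# Balanced parent (p = 0.5) split into two pure children of equal size:
gain = info_gain(0.5, [(0.5, 1.0), (0.5, 0.0)])  # → 1.0 bit
```

Decision trees greedily pick the split with the highest gain; active learning uses the same quantity to choose which example to label next.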
Hidden behavior activated by specific triggers, causing targeted mispredictions or undesired outputs.