Results for "dataset documentation"
Data labeling: Human or automated process of assigning targets to examples; label quality, consistency, and clear guidelines matter heavily.
Inter-annotator agreement: Measure of consistency across labelers; low agreement indicates ambiguous tasks or poor guidelines.
Data augmentation: Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.
Differential privacy: A formal privacy framework ensuring outputs do not reveal much about any single individual’s data contribution.
Synthetic data: Artificially created data used to train or test models; helpful for privacy and coverage, risky if unrealistic.
Knowledge distillation: Training a smaller “student” model to mimic a larger “teacher,” often improving efficiency while retaining most of the performance.
Data poisoning: Maliciously inserting or altering training data to implant backdoors or degrade performance.
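For two labelers, agreement is often quantified with Cohen’s kappa, which corrects raw agreement for chance. A minimal pure-Python sketch (the labels and items are made up for illustration):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    return (observed - expected) / (1 - expected)

# Two annotators labeling 10 items
a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "pos"]
b = ["pos", "pos", "neg", "pos", "pos", "neg", "pos", "neg", "neg", "pos"]
kappa = cohens_kappa(a, b)
```

Kappa of 1 means perfect agreement; 0 means agreement no better than chance, a sign the task or guidelines need work.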
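A minimal sketch of such transformations for image-like arrays, using NumPy (the `augment` helper and its parameters are illustrative, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, noise_std=0.05):
    """Yield simple augmented variants of a (H, W) grayscale image."""
    yield np.fliplr(image)                                # horizontal flip
    yield np.flipud(image)                                # vertical flip
    yield image + rng.normal(0, noise_std, image.shape)   # additive noise

image = rng.random((4, 4))
variants = list(augment(image))   # three new training examples from one
```

For text, the analogous moves are paraphrasing, synonym swaps, or back-translation; the principle is the same: transformations that change the input without changing its label.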
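One standard construction in this framework is the Laplace mechanism: add noise scaled to the query’s sensitivity over epsilon. A small sketch for a counting query (function name and parameters are illustrative):

```python
import random

def laplace_count(true_count, epsilon, rng):
    """Differentially private count via the Laplace mechanism.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so the noise scale is 1/epsilon.
    """
    scale = 1.0 / epsilon
    # A Laplace(scale) sample is the difference of two exponentials.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise

rng = random.Random(0)
private_count = laplace_count(1000, epsilon=0.5, rng=rng)
```

Smaller epsilon means stronger privacy but noisier answers; the released value is close to the truth in expectation while masking any individual’s contribution.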
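The soft-target part of a typical distillation objective is cross-entropy between temperature-softened teacher and student distributions; a minimal sketch (the logits and temperature here are illustrative):

```python
import math

def softmax(logits, T=1.0):
    """Softmax with temperature T; higher T gives softer distributions."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """Cross-entropy between softened teacher and student distributions
    (the soft-target term of a distillation loss)."""
    p = softmax(teacher_logits, T)   # teacher soft targets
    q = softmax(student_logits, T)   # student predictions
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

loss = distillation_loss([4.0, 1.0, 0.2], [3.5, 1.2, 0.3])
```

In practice this term is combined with the ordinary hard-label loss; the softened targets carry the teacher’s relative confidence across classes, which the student could not learn from one-hot labels alone.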
Depth vs. width: Tradeoffs between networks with many layers versus networks with many neurons per layer.
Leak detection: Detecting unauthorized model outputs or leaks of training data.
Generative models: Models that learn to generate samples resembling their training data.
Image classification: Assigning category labels to images.
Vision-language alignment: Joint vision-language models (e.g., CLIP) that align images and text in a shared representation.
Wake word detection: Detecting trigger phrases in audio streams.
Trend: Persistent directional movement in a series over time.
Batch inference: Running predictions over large datasets periodically rather than per-request.
Training cost: The resources (compute, time, money) consumed by model training.
Open-weight models: Models whose weights are publicly available for download and use.
Overgeneralization: Applying learned patterns to inputs where they do not hold.
AlphaFold: Deep learning system for protein structure prediction.
Symbolic regression: Discovering mathematical equations that fit observed data.
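A toy sketch of the last idea: fit a few candidate expression templates by least squares and keep the one with the lowest error (the template set and helper names are made up for illustration; real systems search a far larger expression space):

```python
import math

# Candidate expression templates: name -> function of x and coefficient a
candidates = {
    "a*x":      lambda x, a: a * x,
    "a*x**2":   lambda x, a: a * x * x,
    "a*sin(x)": lambda x, a: a * math.sin(x),
}

def fit_best_template(xs, ys):
    """Return (name, coefficient, mse) of the best-fitting template."""
    best = None
    for name, f in candidates.items():
        # Closed-form least squares for y ~ a*g(x): a = sum(y*g) / sum(g*g).
        g = [f(x, 1.0) for x in xs]
        a = sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)
        mse = sum((y - a * gi) ** 2 for y, gi in zip(ys, g)) / len(xs)
        if best is None or mse < best[2]:
            best = (name, a, mse)
    return best

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [3 * x * x for x in xs]          # data generated from y = 3x^2
name, a, mse = fit_best_template(xs, ys)
```

Given data from y = 3x², the search recovers the quadratic template with coefficient 3, which is exactly the appeal of the approach: the output is a readable equation, not an opaque model.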