Results for "instruction design"
A high-priority instruction layer setting overarching behavior constraints for a chat model.
Fine-tuning on (prompt, response) pairs to align a model with instruction-following behaviors.
Automated detection and prevention of disallowed outputs (toxicity, self-harm, illegal instructions, etc.).
Task instruction without examples.
System design where humans validate or guide model outputs, especially for high-stakes decisions.
Tendency to trust automated suggestions even when they are incorrect; mitigated by UI design, training, and checks.
Competition arises without explicit design.
System-level design for general intelligence.
Controlling robots via language.
Designing systems where rational agents behave as desired.
Designing efficient marketplaces.