Results for "human values"
System-level design for general intelligence.
Ensuring AI allows shutdown.
Tendency for agents to pursue resources regardless of their final goal.
Ensuring learned behavior matches intended objective.
Learned subsystem that optimizes its own objective.
Maintaining alignment under new conditions.
Tradeoff between safety and performance.
Signals indicating dangerous behavior.
Isolating AI systems.
Tendency to seek control and resources.
Designing AI to cooperate with humans and each other.
The learned numeric values of a model, adjusted during training to minimize a loss function.
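A minimal sketch of what "adjusted during training" means in practice: gradient descent nudges a parameter to reduce the loss. The one-weight model and all values below are illustrative.

```python
import numpy as np

# Toy illustration: a single weight of y = w * x, adjusted by gradient
# descent to minimize mean squared error on the data below.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])   # underlying relationship: y = 2x
w = 0.0                         # the learned value, initialized arbitrarily
lr = 0.05                       # learning rate
for _ in range(100):
    grad = np.mean(2 * (w * x - y) * x)   # d(MSE)/dw
    w -= lr * grad                        # step downhill on the loss
print(w)                        # converges near 2.0
```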
A function measuring prediction error (and sometimes calibration), guiding gradient-based optimization.
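As a concrete instance, a sketch of cross-entropy, the standard prediction-error loss for classifiers; the probabilities and labels are illustrative.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true labels.

    probs:  (n, k) predicted class probabilities, rows summing to 1
    labels: (n,)   integer class indices
    """
    eps = 1e-12  # guard against log(0)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))

probs = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
labels = np.array([0, 1])
print(cross_entropy(probs, labels))  # small: the model is confident and right
```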
Scalar summary of the ROC curve; measures ranking ability, not calibration.
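The "ranking ability" reading has a direct construction: AUC is the probability that a randomly chosen positive is scored above a randomly chosen negative, so any monotone rescaling of the scores leaves it unchanged. A brute-force sketch with illustrative scores:

```python
import numpy as np

def auc(scores, labels):
    """ROC AUC as the probability that a random positive outranks
    a random negative (ties count as half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

scores = np.array([0.1, 0.4, 0.35, 0.8])
labels = np.array([0, 0, 1, 1])
print(auc(scores, labels))  # 0.75: 3 of 4 positive-negative pairs ranked correctly
```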
Average of squared residuals; common regression objective.
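Worked instance of the formula MSE = (1/n) Σ (y_i − ŷ_i)², with illustrative values:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5,  0.0, 2.0])
residuals = y_true - y_pred        # 0.5, -0.5, 0.0
mse = np.mean(residuals ** 2)      # (0.25 + 0.25 + 0.0) / 3
print(mse)                         # 0.1667
```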
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions.
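A single-head sketch of scaled dot-product self-attention, omitting masking, multiple heads, and the output projection; all weights are random placeholders.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Queries, keys, and values are all projections of the same
    sequence x of shape (seq, d), so every token attends to every token."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (seq, seq) interactions
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)      # softmax over keys
    return weights @ v                             # mix value vectors

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                        # 4 tokens, dim 8
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, wq, wk, wv).shape)         # (4, 8)
```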
Feature attribution method grounded in cooperative game theory for explaining predictions in tabular settings.
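The game-theoretic core can be shown directly: each feature's attribution is its weighted average marginal contribution over all coalitions of the other features. The shap library computes fast approximations; this exact brute-force version is only feasible for a few features, and the toy model and baseline below are illustrative.

```python
from itertools import combinations
from math import factorial
import numpy as np

def shapley_values(f, x, baseline):
    """Exact Shapley attributions for the prediction f(x).

    Features outside a coalition are set to `baseline`. Exponential
    cost, so only practical for a handful of features."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for s in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                idx = list(s)
                with_i = baseline.copy()
                with_i[idx + [i]] = x[idx + [i]]
                without = baseline.copy()
                without[idx] = x[idx]
                phi[i] += weight * (f(with_i) - f(without))
    return phi

f = lambda z: 2 * z[0] + z[1] * z[2]               # toy tabular model
x = np.array([1.0, 2.0, 3.0])
print(shapley_values(f, x, baseline=np.zeros(3)))  # [2, 3, 3]; sums to f(x) - f(0)
```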
Predicting future values from past observations.
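One minimal instance is an autoregressive model fit by least squares: the next value is predicted as a linear function of the previous p values. The series below is synthetic and illustrative.

```python
import numpy as np

def fit_ar(series, lags):
    """Least-squares fit of an AR(p) model: each row of X holds the
    previous `lags` values, y holds the value that followed."""
    X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
    y = series[lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
t = np.arange(200)
series = np.sin(0.3 * t) + 0.1 * rng.normal(size=200)
coef = fit_ar(series, lags=3)
forecast = series[-3:] @ coef    # one-step-ahead prediction
print(forecast)
```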
Expected return of taking an action in a state.
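A sketch of how such Q(s, a) estimates are learned in the tabular case, via the standard Q-learning update; the state/action sizes and constants are illustrative.

```python
import numpy as np

# Tabular Q-learning on a toy 2-state, 2-action problem.
n_states, n_actions = 2, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9   # learning rate, discount factor

def update(s, a, r, s_next):
    """One Q-learning step: move Q(s, a) toward the bootstrapped
    target r + gamma * max_a' Q(s', a')."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

update(s=0, a=1, r=1.0, s_next=1)
print(Q)
```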
Probability of data given parameters.
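A worked sketch for i.i.d. Gaussian data: holding the data fixed and varying the parameter μ shows the likelihood peaking near the sample mean. Values are illustrative.

```python
import numpy as np

def gaussian_log_likelihood(data, mu, sigma):
    """log p(data | mu, sigma) for i.i.d. Gaussian observations."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (data - mu) ** 2 / (2 * sigma**2))

data = np.array([1.8, 2.1, 2.4])
for mu in (0.0, 2.0, 4.0):
    print(mu, gaussian_log_likelihood(data, mu, sigma=1.0))
# Highest at mu = 2.0: the likelihood peaks near the sample mean, 2.1.
```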
Studying internal mechanisms or the influence of inputs on outputs (e.g., saliency maps, SHAP, attention analysis).
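A minimal input-influence sketch: finite-difference saliency approximates the gradient of the output with respect to each input feature. The model here is a toy stand-in; real saliency maps use autodiff.

```python
import numpy as np

def saliency(f, x, eps=1e-5):
    """Finite-difference saliency: sensitivity of the model output
    to each input feature at the point x."""
    grads = np.zeros_like(x)
    for i in range(len(x)):
        x_hi, x_lo = x.copy(), x.copy()
        x_hi[i] += eps
        x_lo[i] -= eps
        grads[i] = (f(x_hi) - f(x_lo)) / (2 * eps)
    return np.abs(grads)

f = lambda z: np.tanh(3 * z[0] + 0.1 * z[1])   # toy model
print(saliency(f, np.array([0.2, 0.5])))       # feature 0 dominates
```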
Ordering training samples from easier to harder to improve convergence or generalization.
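A sketch using token count as the difficulty proxy; in practice the proxy might instead be a weak model's loss on each sample. The data is illustrative.

```python
import numpy as np

def curriculum_order(samples, difficulty):
    """Order training samples easy-to-hard by a difficulty score."""
    return [samples[i] for i in np.argsort(difficulty)]

samples = ["a cat", "the quick brown fox jumps", "hi", "dogs bark loudly"]
difficulty = [len(s.split()) for s in samples]   # proxy: number of tokens
for s in curriculum_order(samples, difficulty):
    print(s)   # shortest (easiest) sentences first
```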
A model that assigns probabilities to sequences of tokens; often trained by next-token prediction.
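The smallest such model is a bigram counter: estimate next-token probabilities from counts and multiply them along a sequence. The corpus is a toy; real models replace the counts with a neural network trained on the same next-token objective.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1            # next-token counts per context

def next_token_prob(prev, nxt):
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

def sequence_prob(tokens):
    p = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        p *= next_token_prob(prev, nxt)   # chain rule over the sequence
    return p

print(next_token_prob("the", "cat"))          # 2/3
print(sequence_prob(["the", "cat", "sat"]))   # (2/3) * (1/2) = 1/3
```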
Tendency to trust automated suggestions even when incorrect; mitigated by UI design, training, and checks.
Converting spoken audio into text, often using encoder-decoder or transducer architectures.
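One piece small enough to sketch is the decoding step used by CTC-trained acoustic models, a common alternative to the encoder-decoder and transducer families named above: merge repeated frame labels, then drop blanks. The frame outputs and vocabulary below are hypothetical.

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse frame-level predictions CTC-style: merge repeats,
    then drop the blank symbol."""
    out, prev = [], None
    for i in frame_ids:
        if i != prev and i != blank:
            out.append(i)
        prev = i
    return out

vocab = {1: "h", 2: "i"}
frames = [0, 1, 1, 0, 2, 2, 2, 0]   # hypothetical per-frame argmax output
print("".join(vocab[i] for i in ctc_greedy_decode(frames)))  # "hi"
```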
Extending agents with long-term memory stores.
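A minimal sketch of such a store, assuming embeddings are supplied externally: keep (vector, text) pairs and retrieve the closest by cosine similarity. Real agent memories typically add recency and importance weighting; the entries below are illustrative.

```python
import numpy as np

class VectorMemory:
    """Long-term memory as (embedding, text) pairs with cosine retrieval."""
    def __init__(self):
        self.keys, self.texts = [], []

    def add(self, embedding, text):
        self.keys.append(embedding / np.linalg.norm(embedding))
        self.texts.append(text)

    def retrieve(self, query, k=1):
        q = query / np.linalg.norm(query)
        sims = np.array(self.keys) @ q            # cosine similarities
        return [self.texts[i] for i in np.argsort(sims)[::-1][:k]]

mem = VectorMemory()
mem.add(np.array([1.0, 0.0]), "user prefers metric units")
mem.add(np.array([0.0, 1.0]), "meeting moved to Friday")
print(mem.retrieve(np.array([0.9, 0.1])))  # recalls the closer memory
```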
Generating speech audio from text, with control over prosody, speaker identity, and style.
Legal or policy requirement to explain AI decisions.
Combining signals from multiple modalities.
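The simplest scheme is feature-level (early) fusion: concatenate per-modality embeddings and let a shared layer mix them. Features and weights below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
image_feat = rng.normal(size=16)   # e.g., from a vision encoder
audio_feat = rng.normal(size=8)    # e.g., from an audio encoder

# Concatenate modality embeddings, then mix with a shared linear layer.
fused = np.concatenate([image_feat, audio_feat])
W = rng.normal(size=(4, fused.size))
print(W @ fused)                   # joint representation for downstream tasks
```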
Generating audio waveforms from spectrograms.
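Modern vocoders are neural (e.g., HiFi-GAN), but the classical Griffin-Lim algorithm shows the task: iteratively estimate phase to invert a magnitude spectrogram back to a waveform. A sketch using librosa, with a synthetic tone standing in for speech:

```python
import numpy as np
import librosa

sr = 22050
t = np.linspace(0, 1.0, sr, endpoint=False)
y = 0.5 * np.sin(2 * np.pi * 440 * t)      # stand-in for real speech audio
S = np.abs(librosa.stft(y))                # magnitude spectrogram (phase discarded)
y_rec = librosa.griffinlim(S, n_iter=32)   # waveform recovered by phase iteration
print(y_rec.shape)
```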