Glossary

A small, evolving reference for terms that show up repeatedly across the blog. Inline glossary terms open a compact definition card, while each term here has its own entry page for longer notes and future expansion.

Attention

A neural mechanism that lets a model weigh which parts of the input matter most when processing or predicting.

Tags: Architecture, LLMs
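
A minimal sketch of the idea in plain Python, assuming the scaled dot-product variant used in transformers; the vectors here are illustrative:

```python
import math

def attention_weights(query, keys):
    """Score each key against the query, then softmax the scaled scores
    into weights that are non-negative and sum to 1."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# A query aligned with the first key puts most of its weight there.
w = attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The weights are then used to take a weighted average of value vectors, which is how the model "focuses" on the most relevant parts of the input.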

Autoencoder

A model that learns a compressed representation of its input and then tries to reconstruct the original input from it.

Tags: Architecture, Representation

Collaborative Filtering

A recommendation approach that predicts preferences from patterns in many users' interactions rather than item content alone.

Tags: Recommenders, Personalization
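
A minimal user-based sketch in plain Python; the ratings, names, and helper functions are illustrative, not from any real dataset:

```python
import math

# Toy user-item ratings; Alice has not rated "titanic" yet.
ratings = {
    "alice": {"matrix": 5, "inception": 4},
    "bob":   {"matrix": 4, "inception": 5, "titanic": 2},
    "carol": {"matrix": 1, "inception": 2, "titanic": 5},
}

def similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        s = similarity(ratings[user], theirs)
        num += s * theirs[item]
        den += abs(s)
    return num / den if den else None

pred = predict("alice", "titanic")  # pulled toward similar users' ratings
```

The prediction uses only the interaction patterns, never the movies' content, which is what distinguishes collaborative filtering from content-based approaches.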

Corpus

A structured collection of text or other examples that you analyze, search, or use to train and evaluate a model.

Tags: Data, NLP

Cosine Similarity

A measure of how aligned two vectors are, commonly used to compare embeddings by direction rather than raw magnitude.

Tags: Similarity, Retrieval
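
For two vectors a and b, cosine similarity is their dot product divided by the product of their magnitudes. A minimal sketch in plain Python:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction,
    0.0 orthogonal, -1.0 opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)  # undefined if either vector is all zeros

# Scaling a vector changes its magnitude but not its direction,
# so the similarity stays (approximately) 1.0.
cosine_similarity([1, 2, 3], [2, 4, 6])
```

Because only direction matters, it is a natural fit for comparing embeddings, where overall vector length often carries little meaning.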

Embedding

A dense numeric representation that places similar items close together in vector space.

Tags: Representation, NLP

Fine-Tuning

The process of continuing training on a pre-trained model so it becomes better suited to a narrower task or domain.

Tags: Training, LLMs

Inference

The stage where a trained model is used to make predictions or generate outputs on new inputs.

Tags: Deployment, Models

Large Language Model

A language model trained at very large scale, usually transformer-based, on broad text data, and adapted for many downstream tasks.

Tags: LLMs, Language

Learning Rate

A training hyperparameter that controls how large each parameter update is during optimization.

Tags: Training, Optimization
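
In gradient descent, the update rule is x ← x − η·∇f(x), where η is the learning rate. A minimal sketch, minimizing f(x) = x² (whose gradient is 2x):

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Repeatedly step against the gradient; the learning rate
    scales the size of each step."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# A moderate rate converges toward the minimum at 0 ...
x = gradient_descent(lambda x: 2 * x, x0=10.0, learning_rate=0.1, steps=100)
# ... while, for this function, a rate above 1.0 makes the iterates diverge.
```

Too small a rate means slow progress; too large a rate overshoots the minimum, which is why the learning rate is usually the first hyperparameter to tune.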

Overfitting

A failure mode where a model matches the training data too closely and performs worse on new, unseen examples.

Tags: Training, Generalization

Pre-Training

The initial large-scale training phase that teaches a model general patterns before narrower task-specific adaptation.

Tags: Training, LLMs

Prompt

The input text or instructions given to a generative model to shape the response it produces.

Tags: LLMs, Generative AI

Regularization

Techniques that limit model complexity or penalize certain behaviors so the model generalizes better to new data.

Tags: Training, Generalization
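
A minimal sketch of one common technique, L2 regularization (weight decay); the numbers here are illustrative:

```python
def l2_regularized_loss(data_loss, weights, lam):
    """Add an L2 penalty so large weights cost extra, nudging the
    optimizer toward simpler models."""
    return data_loss + lam * sum(w * w for w in weights)

# Same fit to the training data, but larger weights pay a bigger penalty.
small = l2_regularized_loss(0.5, [0.1, -0.2], lam=0.01)
large = l2_regularized_loss(0.5, [3.0, -4.0], lam=0.01)
```

The strength `lam` trades off fitting the data against keeping the model simple; dropout and early stopping are other regularizers in the same spirit.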

Token

The basic unit a language model reads and predicts, which may be a word, character, or subword fragment.

Tags: LLMs, Language
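
A minimal tokenize-then-encode sketch in plain Python. Real LLMs use learned subword schemes such as byte-pair encoding; the whitespace tokenizer and tiny vocabulary here are illustrative only:

```python
def tokenize(text):
    """Naive whitespace tokenizer (real tokenizers split into subwords)."""
    return text.lower().split()

def encode(tokens, vocab):
    """Map each token to an integer ID, reserving 0 for unknown tokens."""
    return [vocab.get(t, 0) for t in tokens]

vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}
ids = encode(tokenize("The cat sat"), vocab)  # [1, 2, 3]
```

The model only ever sees these integer IDs; subword schemes exist so that even rare or unseen words can be represented as a sequence of known fragments instead of mapping to `<unk>`.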

Transfer Learning

Reusing knowledge learned on one task or dataset to help solve a different but related task more efficiently.

Tags: Training, Reuse

Transformer

A neural network architecture built around attention mechanisms that became the foundation for many modern language models.

Tags: Architecture, LLMs