Glossary Entry

Reinforcement Learning

A family of methods in which an agent learns by taking actions in an environment, receiving rewards, and adjusting its behavior to earn more reward over time.

RL Decision Making

Seed source: Google ML Glossary

Reinforcement learning differs from standard supervised learning in that the system learns through interaction rather than from a fixed dataset of correctly labeled examples. The agent explores, receives rewards, and updates its behavior to maximize cumulative reward over the long run.
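
To make the explore, reward, update loop concrete, here is a minimal sketch using an epsilon-greedy multi-armed bandit, one of the simplest reinforcement learning settings. It is an illustration under stated assumptions, not a reference implementation: the function name `run_bandit`, its parameters (`true_means`, `steps`, `epsilon`, `seed`), and the Gaussian reward noise are all hypothetical choices made for this example.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy bandit (illustrative assumptions only).

    true_means: hidden average reward of each arm (the "environment").
    Returns the agent's reward estimates, pull counts, and total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # agent's running estimate of each arm's value
    counts = [0] * n_arms       # how many times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        # Action: explore a random arm with probability epsilon,
        # otherwise exploit the arm with the highest current estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_arms)
        else:
            action = max(range(n_arms), key=lambda a: estimates[a])

        # Reward: the environment responds with a noisy reward
        # (Gaussian noise with std 0.1 is an assumption of this sketch).
        reward = rng.gauss(true_means[action], 0.1)

        # Update: move the estimate toward the observed reward
        # using an incremental sample average.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward
```

Calling `run_bandit([0.1, 0.5, 0.9])` should leave the agent pulling the last arm most often. No labeled "correct arm" is ever provided; the preference emerges only from the rewards the agent observes while interacting.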

That framing underpins the bandits, policy-gradient, actor-critic, and RLHF-adjacent material across the blog. Once you can identify the agent, the environment, the actions, and the reward signal in a problem, those posts become much easier to follow.