Hierarchy
HMIA 2025

"Readings"
Video: x [3m21s]
Activity: TBD
PRE-CLASS
CLASS
TRIVIAL COOPERATION: shared goals and information - pick a goal and execute
Hobbes' observation: scarcity + similar agents → competition; life is brutish & short
Hobbes' fix: cede sovereignty to a boss with credible enforcement. Command → order
Command Failure Mode (preferences)
Agents retain autonomy → effort substitution & selective obedience
Command Failure Mode (information)
Orders are incomplete & ambiguous; environments shift.
Principals and Agents
From commands to contracts. Alignment by design: selection, monitoring, and incentives to align the agent's autonomy with the principal's goals.
Agent as RL learner.
Naked RL is a clean micro-model: the agent updates a policy to maximize rewards.
Goodhart risk: when m(·) omits what drives V(·), maximizing T(m(a)) reduces Us. Symptoms: gaming, reward hacking, short-termism.
Requires governance and guardrails: lagged, hard-to-game proxies; HITL overrides; team rewards; culture; the "alignment stack".
Incentives are transfers on signals.
As soon as behavior is driven by T(m(a)), the problem is no longer obedience; it's measurement.
But T(m(a)) is always a lossy compression of what matters.
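A minimal Python sketch of the measurement problem above, under stated assumptions. The two-task effort split, the functional forms of V, m, and T, and the names ACTIONS and train_agent are all hypothetical, chosen only to illustrate the slide's notation. The agent is a bare reward-maximizing learner: it is paid T(m(a)), the measure m sees only one of two tasks, and the principal's value V depends on both.

```python
import math

# Hypothetical setup: the agent splits one unit of effort between a measured
# task (share x) and an unmeasured task (share 1 - x).
ACTIONS = [i / 10 for i in range(11)]  # candidate values of x

def V(x):
    """Principal's true value: both tasks matter, with diminishing returns."""
    return math.sqrt(x) + math.sqrt(1 - x)

def m(x):
    """Measurement: only effort on the measured task is observed."""
    return x

def T(signal, rate=1.0):
    """Transfer: a linear bonus paid on the measured signal."""
    return rate * signal

def train_agent(steps=200, optimistic_init=10.0):
    """Greedy bandit learner with optimistic initial estimates.

    Rewards here are deterministic, so one sample per action pins down its
    estimate; the optimistic start just forces every action to be tried once.
    """
    q = [optimistic_init] * len(ACTIONS)  # estimated reward per action
    n = [0] * len(ACTIONS)                # visit counts
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[i])  # greedy choice
        reward = T(m(ACTIONS[a]))         # the agent only ever sees T(m(a))
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]    # incremental sample average
    return ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[i])]

learned_x = train_agent()
best_x = max(ACTIONS, key=V)
print(f"agent's learned split: x = {learned_x}, V = {V(learned_x):.3f}")
print(f"value-maximizing split: x = {best_x}, V = {V(best_x):.3f}")
```

In this toy setup the learner puts all effort on the measured task (x = 1.0, V = 1.000), while the principal's value peaks at an even split (x = 0.5, V ≈ 1.414). The agent obeys the incentive perfectly; the loss comes entirely from what m(·) leaves out.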
NEXT: Markets
HMIA 2025 Hierarchy
By Dan Ryan