Alignment by Incentive

HMIA 2025


"Readings"

Video: x [3m21s]

Activity: TBD

PRE-CLASS

CLASS

Outline

  1. Examples from Kerr: what's actually being rewarded here?
  2. Review our cards


PRE-CLASS

Video: Linked Title [3m21s]


If Everyone Has Their Price, Are We Good?


CLASS


Hayek and Smith: collective order CAN emerge from individual motivation

Schelling: non-nefarious motives can generate nefarious outcomes

 

Review our several approaches to incentives: what is different? What underlying model do they share? Is it natural? Cultural?

 

Kerr examples: what is actually being rewarded? How does that compare to the intended incentive? What does this tell us about specifying rewards?

Show Amodei's boat (the boat-racing agent that racks up points without finishing the race)? Return to the reward hacking question.
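Kerr's question (what is actually being rewarded?) can be made concrete with a toy sketch. This is not the boat-racing environment itself: the track, the point values, and the names `racer`, `looper`, and `run` are all hypothetical, chosen only so that the rewarded quantity (points) and the hoped-for outcome (finishing) come apart.

```python
from dataclasses import dataclass

# All values below are illustrative assumptions, not taken from any real game.
TRACK_END = 5        # position of the finish line
BONUS_AT = 2         # position of a bonus target that respawns every visit
BONUS_POINTS = 10    # points for each visit to the bonus target
FINISH_POINTS = 50   # points for crossing the finish line
HORIZON = 20         # number of steps in one episode


@dataclass
class Result:
    points: int      # what is actually rewarded (the proxy)
    finished: bool   # what the designer hoped for (the intent)


def run(policy) -> Result:
    """Play one episode; policy maps a position to a move of +1 or -1."""
    pos, points = 0, 0
    for _ in range(HORIZON):
        # Apply the move, keeping the agent on the track.
        pos = max(0, min(TRACK_END, pos + policy(pos)))
        if pos == BONUS_AT:
            points += BONUS_POINTS
        if pos == TRACK_END:
            return Result(points + FINISH_POINTS, True)
    return Result(points, False)


def racer(pos: int) -> int:
    """Does what the designer intended: always drive toward the finish."""
    return +1


def looper(pos: int) -> int:
    """Exploits the proxy: step off the bonus target and back on, forever."""
    return -1 if pos == BONUS_AT else +1


if __name__ == "__main__":
    for policy in (racer, looper):
        print(policy.__name__, run(policy))
```

Under these assumed numbers the looper collects more points than the racer yet never finishes. Nothing about the looper is malicious; it simply does what the point system pays for, which is the gap between rewarding A and hoping for B.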

 

Get at reward shaping? Education? My example (in Obsidian?)


Resources


Kerr, Steven. 1975. "On the Folly of Rewarding A, While Hoping for B"

Smith, Adam. 1776. The Wealth of Nations

Hayek, F. A. 1973. "Cosmos and Taxis"

Schelling, Thomas C. 1978. Micromotives and Macrobehavior

HMIA 2025 Alignment by Incentive

By Dan Ryan
