The Blind Machine

Persona vectors: Monitoring and controlling character traits in language models

A paper from Anthropic describing persona vectors and their applications to monitoring and controlling model behavior.

https://www.anthropic.com/research/persona-vectors
