COMPASS: Process Alignment for Safe Search Agents
COMPASS uses Monte Carlo Tree Search (MCTS) for the safety alignment of search agents, detecting malicious intent disguised as seemingly harmless sub-queries.
Sources arxiv.org