Gonçalo Teixeira

Index

Every scientific paper, person, empirical finding, or regulatory provision that appears in the essays has its own page here.

Papers 11

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
10 January 2024
The paper Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training (arXiv:2401.05566) was…
Alignment Faking in Large Language Models
18 December 2024
The paper Alignment Faking in Large Language Models (arXiv:2412.14093) was published on 18 December 2024…
Risks from Learned Optimization in Advanced Machine Learning Systems
5 June 2019
The paper Risks from Learned Optimization in Advanced Machine Learning Systems (arXiv:1906.01820) was…

People 16

Findings 5

Regulation 14