Researchers detail "subliminal learning", where LLMs learn traits from model-generated data that is semantically unrelated to those traits (Anthropic)
143d ago
Technology
Techmeme

Researchers at Anthropic have detailed a phenomenon they call "subliminal learning" in large language models (LLMs): a model picks up traits from model-generated training data even when that data is semantically unrelated to those traits. The research highlights a potential unintended consequence of training LLMs on model-generated data, since biases or behaviors could pass to the trained model without being visible in the content of the data. Further investigation is needed to understand and mitigate the effects of subliminal learning in AI systems.