Article

Learning to poison large language models for downstream manipulation

Xiangyu Zhou

• Feb 21, 2024 • 1 min read

Hijacking Large Language Models via Adversarial In-Context Learning

This work introduces a novel transferable attack on in-context learning that hijacks LLMs into generating a targeted response or into being jailbroken. We also propose a defense strategy …

Xiangyu Zhou

• Nov 16, 2023 • 1 min read

Large Language Models

An example preprint / working paper

Xiangyu Zhou

• Apr 7, 2019 • 1 min read
