Large Language Models

Hijacking Large Language Models via Adversarial In-Context Learning

This work introduces a novel transferable attack on in-context learning that hijacks LLMs into generating a target response or into being jailbroken. We also propose a defense strategy …

Xiangyu Zhou
An example preprint / working paper

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis posuere tellus ac convallis placerat. Proin tincidunt magna sed ex sollicitudin condimentum.

Xiangyu Zhou
An example conference paper

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis posuere tellus ac convallis placerat. Proin tincidunt magna sed ex sollicitudin condimentum.

Xiangyu Zhou