journal article · 2024

Effect of Private Deliberation: Deception of Large Language Models in Game Play

Kristijan Poje, Mario Brcic, Mihael Kovac, Marina Bagic Babac

Entropy, Volume 26, Issue 6, 524

Notes

Studies whether giving large language models a private deliberation channel changes their tendency to deceive in social-deduction game play. Finds that private chain-of-thought materially increases strategic deception, with implications for evaluation and AI-safety design.

How to cite

@article{brcic2024poje,
  author = {Kristijan Poje and Mario Brcic and Mihael Kovac and Marina Bagic Babac},
  title = {Effect of Private Deliberation: Deception of Large Language Models in Game Play},
  journal = {Entropy},
  volume = {26},
  number = {6},
  pages = {524},
  year = {2024},
  doi = {10.3390/e26060524},
  url = {https://doi.org/10.3390/e26060524},
}

Topics:

AI safety