Kiki or Bouba?
Sound Symbolism in Vision-and-Language Models

Morris Alper Hadar Averbuch-Elor

NeurIPS 2023 ✨SPOTLIGHT✨

[Figure: six text-to-image generations from prompts containing "kiki" or "bouba".]
Which images were generated with "kiki" and which with "bouba"?
Answers, in order: bouba, kiki, bouba, kiki, kiki, bouba.

TL;DR: Psychological experiments have shown that humans tend to associate certain speech sounds with certain visual shapes. We ask: What about AI models for tasks like text-to-image generation? By generating images from prompts containing pseudowords (nonsense words) and analyzing their shapes, we find that AI image generation models exhibit sound-shape associations too.

Abstract

Although the mapping between sound and meaning in human language is assumed to be largely arbitrary, research in cognitive science has shown that there are non-trivial correlations between particular sounds and meanings across languages and demographic groups, a phenomenon known as sound symbolism. Among the many dimensions of meaning, sound symbolism is particularly salient and well-demonstrated with regard to cross-modal associations between language and the visual domain. In this work, we address the question of whether sound symbolism is reflected in vision-and-language models such as CLIP and Stable Diffusion. Using zero-shot knowledge probing to investigate the inherent knowledge of these models, we find strong evidence that they do show this pattern, paralleling the well-known kiki–bouba effect in psycholinguistics. Our work provides a novel method for demonstrating sound symbolism and understanding its nature using computational tools.
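To make the idea of zero-shot probing concrete, here is a minimal sketch of one way a pseudoword could be scored against "sharp" versus "round" descriptors in an embedding space. It is an illustration under stated assumptions, not the pipeline used in the paper: the function names (`cosine`, `shape_score`, `toy_embed`), the descriptor lists, and the scoring rule are all hypothetical, and `toy_embed` returns hand-made placeholder vectors rather than outputs of CLIP or any other model. In practice, `embed` would be replaced by a real text encoder.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def shape_score(
    word: str,
    embed: Callable[[str], Vector],
    sharp_terms: Sequence[str] = ("sharp",),
    round_terms: Sequence[str] = ("round",),
) -> float:
    """Mean similarity to the sharp terms minus mean similarity to the round terms.

    A positive score means the word sits closer to the "sharp" descriptors,
    a negative score means it sits closer to the "round" ones.
    The descriptor lists are illustrative assumptions.
    """
    w = embed(word)
    sharp = [cosine(w, embed(t)) for t in sharp_terms]
    rounded = [cosine(w, embed(t)) for t in round_terms]
    return sum(sharp) / len(sharp) - sum(rounded) / len(rounded)


# Hand-made placeholder vectors so the sketch runs without any model.
# These are NOT model outputs and say nothing about real embeddings.
_TOY_VECTORS = {
    "sharp": (1.0, 0.0),
    "round": (0.0, 1.0),
    "kiki": (0.9, 0.1),
    "bouba": (0.1, 0.9),
}


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for a real text encoder."""
    return _TOY_VECTORS[text]


if __name__ == "__main__":
    for pseudoword in ("kiki", "bouba"):
        print(pseudoword, shape_score(pseudoword, toy_embed))
```

Keeping the encoder behind the `embed` argument means the same scoring function can be reused with any text encoder; only the sign and relative size of the score are meant to be interpreted.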



Acknowledgements

This work was partially supported by the Alon Fellowship. We thank Gal Fiebelman and Taelin Karidi for their helpful feedback.


Citation

@InProceedings{alper2023kiki-or-bouba,
  author    = {Morris Alper and Hadar Averbuch-Elor},
  title     = {Kiki or Bouba? Sound Symbolism in Vision-and-Language Models},
  booktitle = {Proceedings of Advances in Neural Information Processing Systems (NeurIPS)},
  year      = {2023}
}