Language-oriented Communication with Semantic Coding and Knowledge Distillation for T2I Generation
Last modified: October 17
By integrating recent advances in large language models (LLMs) and generative models into the emerging semantic communication (SC) paradigm, in this article we put forward a novel framework of language-oriented semantic communication (LSC). In LSC, machines communicate using human language messages that can be interpreted and manipulated via natural language processing (NLP) techniques to improve SC efficiency. To demonstrate LSC's potential, we introduce three innovative algorithms: 1) semantic source coding (SSC), which compresses a text prompt into its key head words, capturing the prompt's syntactic essence while maintaining their order of appearance to preserve the prompt's context; 2) semantic channel coding (SCC), which improves robustness against errors by substituting head words with their lengthier synonyms; and 3) semantic knowledge distillation (SKD), which produces listener-customized prompts via in-context learning of the listener's language style. In a communication task for progressive text-to-image generation, the proposed methods achieve higher perceptual similarities with fewer transmissions while enhancing robustness in noisy communication channels.
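To make the SSC and SCC ideas above concrete, here is a minimal Python sketch. It is a toy approximation, not the paper's implementation: head-word extraction is replaced by a simple stopword filter (finding true head words would presumably need a syntactic parser), and the synonym table is a hand-written, hypothetical dictionary. The function names, `STOPWORDS`, and `LONGER_SYNONYMS` are all assumptions introduced for illustration.

```python
from typing import List

# Assumption: a stopword filter stands in for real head-word extraction.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "with", "and", "is", "at"}

# Hypothetical synonym table; every value is longer than its key.
LONGER_SYNONYMS = {"dog": "canine", "big": "enormous", "road": "boulevard"}


def semantic_source_coding(prompt: str) -> List[str]:
    """Toy SSC: keep only content words, preserving their original order."""
    words = [w.strip(".,!?").lower() for w in prompt.split()]
    return [w for w in words if w and w not in STOPWORDS]


def semantic_channel_coding(head_words: List[str]) -> List[str]:
    """Toy SCC: swap each word for a longer synonym when the table has one."""
    return [LONGER_SYNONYMS.get(w, w) for w in head_words]


prompt = "A big dog is running on the road."
compressed = semantic_source_coding(prompt)
protected = semantic_channel_coding(compressed)
print(compressed)  # ['big', 'dog', 'running', 'road']
print(protected)   # ['enormous', 'canine', 'running', 'boulevard']
```

The order-preserving list comprehension mirrors the requirement that head words keep their order of appearance. Words without an entry in the table pass through SCC unchanged.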
1) Data-to-Language Translation: Text-based cross-modal models transform input data into language messages to be transmitted (e.g., via CLIP for image-to-text (I2T) or Whisper for speech-to-text translation).
2) Language Analysis & Manipulation: LLMs and other NLP algorithms (e.g., GPT-4, Llama 2, and CoreNLP) are utilized to analyze the syntax, semantics, and context of language messages and to manipulate these messages to improve communication efficiency.
3) Language-to-Data Generation: Text-conditioned generative models produce the intended data using the received message semantics (e.g., via Stable Diffusion for text-to-image (T2I) or Zeroscope for text-to-video generation).
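The three steps above can be sketched as a pipeline of pluggable stages. This is only a structural illustration under stated assumptions: the class `LSCPipeline`, its field names, and the lambda stand-ins are hypothetical, and no real captioning, language, or generative model is called. Placing the manipulation stage on the sender side is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LSCPipeline:
    """Hypothetical three-stage pipeline; each stage is a pluggable callable."""
    data_to_language: Callable[[bytes], str]  # step 1, e.g. an I2T model
    manipulate: Callable[[str], str]          # step 2, e.g. an LLM-based rewrite
    language_to_data: Callable[[str], bytes]  # step 3, e.g. a T2I model

    def sender(self, data: bytes) -> str:
        """Translate data into text, then manipulate the text before sending."""
        return self.manipulate(self.data_to_language(data))

    def receiver(self, message: str) -> bytes:
        """Generate data from the received text message."""
        return self.language_to_data(message)


# Placeholder stages so the sketch runs without any model.
DROP = {"a", "is", "on", "the"}
pipeline = LSCPipeline(
    data_to_language=lambda data: "a big dog is running on the road",
    manipulate=lambda text: " ".join(w for w in text.split() if w not in DROP),
    language_to_data=lambda text: f"<image generated from: {text}>".encode(),
)

message = pipeline.sender(b"raw image bytes")
output = pipeline.receiver(message)
print(message)  # big dog running road
print(output)   # b'<image generated from: big dog running road>'
```

Keeping each stage as a plain callable lets a stand-in be replaced by an actual model wrapper without changing the sender and receiver logic.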
Full Paper: H. Nam, J. Park, J. Choi, M. Bennis, S.-L. Kim, "Language-oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation," submitted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023, [Online]. Available: https://arxiv.org/abs/2309.11127