Language-oriented Communication with Semantic Coding and Knowledge Distillation for T2I Generation

RAMO
October 12, 2023

Last updated: October 17, 2023

By integrating recent advances in large language models (LLMs) and generative models into the emerging semantic communication (SC) paradigm, in this article we put forward a novel framework of language-oriented semantic communication (LSC). In LSC, machines communicate using human language messages that can be interpreted and manipulated via natural language processing (NLP) techniques for SC efficiency. To demonstrate LSC's potential, we introduce three innovative algorithms: 1) semantic source coding (SSC), which compresses a text prompt into its key head words that capture the prompt's syntactic essence, while maintaining their order of appearance to keep the prompt's context; 2) semantic channel coding (SCC), which improves robustness against errors by substituting head words with their lengthier synonyms; and 3) semantic knowledge distillation (SKD), which produces listener-customized prompts via in-context learning of the listener's language style. In a communication task for progressive text-to-image generation, the proposed methods achieve higher perceptual similarities with fewer transmissions while enhancing robustness in noisy communication channels.
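To make the three algorithms more concrete, here is a minimal Python sketch. It is only a toy illustration, not the paper's implementation: the head-word set, the synonym table, and the prompt template are all hypothetical stand-ins for what a real system would obtain from a dependency parser, a thesaurus, and an LLM, respectively.

```python
# Toy sketch of SSC, SCC, and SKD prompt construction.
# HEAD_WORDS, SYNONYMS, and the prompt template are hypothetical examples.

def semantic_source_coding(prompt, head_words):
    """SSC: keep only the head words, in their original order of appearance."""
    kept = []
    for token in prompt.split():
        word = token.strip(".,;:!?").lower()
        if word in head_words:
            kept.append(word)
    return kept


def semantic_channel_coding(words, synonyms):
    """SCC: replace each word with its longest synonym, if one is longer."""
    coded = []
    for word in words:
        # The original word comes first, so ties in length keep the original.
        candidates = [word] + synonyms.get(word, [])
        coded.append(max(candidates, key=len))
    return coded


def build_skd_prompt(examples, message):
    """SKD: assemble a few-shot prompt from (message, listener-style) pairs.

    The returned string would be sent to an LLM for in-context learning;
    no model is called here.
    """
    parts = ["Rewrite the message in the listener's style."]
    for original, styled in examples:
        parts.append(f"Message: {original}\nListener style: {styled}")
    parts.append(f"Message: {message}\nListener style:")
    return "\n\n".join(parts)


HEAD_WORDS = {"dog", "runs", "park"}                # assumed parser output
SYNONYMS = {"dog": ["canine", "hound"], "park": ["parkland"]}  # toy thesaurus

compressed = semantic_source_coding("A small dog runs across the green park.", HEAD_WORDS)
protected = semantic_channel_coding(compressed, SYNONYMS)
skd_prompt = build_skd_prompt([("a dog in a park", "a puppy at the playground")],
                              " ".join(compressed))

print(compressed)  # ['dog', 'runs', 'park']
print(protected)   # ['canine', 'runs', 'parkland']
```

Note that SSC shortens the message while SCC deliberately lengthens individual words, so the two trade off compression against error robustness.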

1) Data-to-Language Translation: Text-based cross-modal models transform input data into language messages to be transmitted (e.g., via CLIP for image-to-text (I2T) or Whisper for speech-to-text translation).

2) Language Analysis & Manipulation: Large language models (LLMs) and other NLP algorithms (e.g., GPT-4, Llama 2, and CoreNLP) analyze the syntax, semantics, and context of language messages and manipulate these messages to improve communication efficiency.

3) Language-to-Data Generation: Text-conditioned generative models produce the intended data from the semantics of the received message (e.g., via Stable Diffusion for text-to-image (T2I) or Zeroscope for text-to-video generation).
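The three steps above can be read as a pipeline wrapped around a text channel. The sketch below shows one way to wire them together in Python. Every function name here is hypothetical, and the toy stand-ins replace the actual models (an I2T captioner, an NLP-based compressor, a T2I generator), so the example runs without any model installed.

```python
from typing import Any, Callable


def run_lsc(
    data: Any,
    to_language: Callable[[Any], str],
    manipulate: Callable[[str], str],
    channel: Callable[[str], str],
    to_data: Callable[[str], Any],
) -> Any:
    """Chain the three LSC steps around a (possibly noisy) channel."""
    message = to_language(data)    # 1) data-to-language translation
    encoded = manipulate(message)  # 2) language analysis & manipulation
    received = channel(encoded)    # transmission of the text message
    return to_data(received)       # 3) language-to-data generation


# Toy stand-ins (all hypothetical) for the real models.
def toy_captioner(image: dict) -> str:
    return image["caption"]


def toy_compressor(text: str) -> str:
    # Drops a few function words; a real system would use an NLP parser.
    stop = {"a", "the", "on", "in"}
    return " ".join(w for w in text.split() if w.lower() not in stop)


def ideal_channel(text: str) -> str:
    return text  # error-free channel; a noisy one would corrupt characters


def toy_generator(text: str) -> dict:
    return {"generated_from": text}


result = run_lsc(
    {"caption": "a cat sleeping on the sofa"},
    toy_captioner, toy_compressor, ideal_channel, toy_generator,
)
print(result)  # {'generated_from': 'cat sleeping sofa'}
```

Passing each stage as a callable keeps the sketch model-agnostic: any captioner, LLM-based rewriter, or generative model with a matching signature could be swapped in.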

Full Paper: H. Nam, J. Park, J. Choi, M. Bennis, S.-L. Kim, "Language-oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation," submitted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023, [Online]. Available: https://arxiv.org/abs/2309.11127

Robotic and Mobile Networks Laboratory

School of Electrical & Electronic Engineering
