Abstract: A novel semantic communication architecture based on the Latent Diffusion Model (LDM) is proposed, which leverages pre-trained generative models to restore semantic features without retraining. By exploiting the inherent denoising capability of the LDM, the architecture maintains stable reconstruction performance under channel noise perturbations and out-of-distribution inputs. Furthermore, it supports the flexible integration of external large-model resources, significantly improving the ability of semantic communication systems to evolve. Experiments on large-scale image datasets show that the method retains strong image recovery and semantic fidelity even at low signal-to-noise ratios, and in particular substantially outperforms existing mainstream methods on semantic metrics such as learned perceptual similarity. This work offers a new approach to machine-oriented semantic communication for resource-constrained devices, with good practicality and potential for wide adoption.
Keywords: semantic communication; Latent Diffusion Model; machine-type communications; machine vision; channel robustness
Abstract: A novel semantic communication architecture based on the Latent Diffusion Model (LDM) is proposed. This architecture leverages pre-trained generative models to enable semantic feature reconstruction without task-specific retraining. By exploiting the intrinsic denoising capability of the LDM, the proposed system maintains stable reconstruction performance under channel noise perturbations and out-of-distribution inputs. Additionally, the architecture supports the flexible integration of external large-scale models, significantly enhancing the ability of semantic communication systems to evolve. Experimental results on large-scale image datasets demonstrate that the method achieves superior image recovery and semantic fidelity even under low signal-to-noise ratio (SNR) conditions; in particular, it significantly outperforms existing mainstream approaches on learning-based perceptual similarity metrics. This study offers a new perspective on deploying machine-type semantic communication on resource-constrained devices and shows strong potential for practical, large-scale deployment.
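The central idea above — treating channel noise as something an LDM's denoiser can remove — can be illustrated with a common trick: a latent corrupted by AWGN statistically resembles an intermediate state of the diffusion forward process, so the receiver can map the channel SNR to a matching diffusion timestep and start reverse denoising from there. The sketch below is illustrative only (the paper's exact mapping is not stated here); the function names and the linear beta schedule are assumptions borrowed from standard DDPM practice.

```python
import numpy as np

def alpha_bar_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    # Standard DDPM linear beta schedule (an assumption, not the paper's choice);
    # the cumulative product of (1 - beta_t) gives alpha-bar at each timestep.
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def snr_to_timestep(snr_db, alpha_bars):
    # AWGN channel: y = x + n, with SNR = signal power / noise power.
    # Diffusion forward process: x_t = sqrt(ab_t) * x0 + sqrt(1 - ab_t) * eps,
    # whose effective SNR for unit-variance x0 is ab_t / (1 - ab_t).
    # Matching the two gives ab_t = SNR / (1 + SNR); we pick the closest
    # schedule step, from which reverse denoising would start.
    snr = 10.0 ** (snr_db / 10.0)
    target = snr / (1.0 + snr)
    return int(np.argmin(np.abs(alpha_bars - target)))

# A noisier channel (lower SNR) maps to a later (larger) diffusion timestep,
# so the denoiser runs more reverse steps to clean the received latent.
ab = alpha_bar_schedule()
t_low_snr = snr_to_timestep(0.0, ab)    # 0 dB channel
t_high_snr = snr_to_timestep(20.0, ab)  # 20 dB channel
```

This SNR-to-timestep matching is what lets a pre-trained LDM act as a channel denoiser with no retraining: the channel's corruption is absorbed into the model's existing noise schedule.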
Keywords: semantic communication; Latent Diffusion Model; machine-type communications; machine vision; channel robustness