
A novel skip connection mechanism based on channel-wise cross transformer for speech enhancement

Jiang, Weiqi; Sun, Chengli*; Chen, Feilong; Leng, Yan; Guo, Qiaosheng
Science Citation Index Expanded
Guangzhou Maritime College; Nanchang Hangkong University

Abstract

The skip connection mechanism has been proven to be an effective approach for improving speech enhancement networks. By strengthening the information transfer between the encoder and the decoder, it facilitates the restoration of speech features during the up-sampling process. However, simple skip connection mechanisms that directly connect corresponding layers of the encoder and decoder have several issues. Firstly, they only force features of the same scale to be aggregated, ignoring the potential relationships between different scales. Secondly, the shallow encoder features contain a lot of redundant information. Studies have shown that coarse skip connections can even be detrimental to model performance in some cases. In this work, we propose a novel skip connection mechanism based on a channel-wise Transformer for speech enhancement, comprising two components: multi-scale channel-wise cross fusion and channel-wise cross attention. The proposed skip connection mechanism can fuse multi-scale speech features from different levels of the encoder and effectively connect the reconstructed features to the decoder. Building on this, we propose a lightweight U-shaped network (UNet) structure called UCTNet. Experimental results show that UCTNet is comparable to other competitive models on various objective speech quality metrics while using only a few parameters.
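The abstract does not give the exact formulation of the channel-wise cross attention, but the general idea can be sketched as follows: attention weights are computed over the channel axis rather than over time-frequency positions, with queries taken from the decoder features and keys/values from the encoder skip features. The shapes, function name, and single-head formulation below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_wise_cross_attention(decoder_feat, encoder_feat):
    """Single-head attention over the CHANNEL axis (a sketch, not the paper's code).

    Queries come from the decoder, keys/values from the encoder skip features.
    Shapes: (C, N), where C = channels and N = flattened time-frequency bins.
    """
    Q, K, V = decoder_feat, encoder_feat, encoder_feat
    # (C, C) channel-affinity matrix instead of the usual (N, N) token matrix
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    weights = softmax(scores, axis=-1)  # each decoder channel attends to all encoder channels
    return weights @ V                  # re-weighted skip features, shape (C, N)

# toy features: 8 channels, 32 time-frequency bins
rng = np.random.default_rng(0)
dec = rng.standard_normal((8, 32))
enc = rng.standard_normal((8, 32))
out = channel_wise_cross_attention(dec, enc)
print(out.shape)  # (8, 32)
```

Because the affinity matrix is C x C rather than N x N, this kind of attention stays cheap even for long spectrogram sequences, which is consistent with the lightweight design the abstract claims for UCTNet.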

Keywords

Channel-wise cross Transformer; Skip connection; Multi-scale speech features; Speech enhancement