Abstract
Automatic medical image segmentation plays an important role in computer-aided diagnosis systems. Although convolution-based networks have achieved strong performance in medical image segmentation, they are limited in modeling long-range contextual interactions and spatial dependencies. Vision Transformers, by contrast, are well suited to long-range information interaction and, when pre-trained with self-supervised learning, have achieved advanced performance on several downstream tasks. In this paper, motivated by the Swin Transformer, we propose BTSwin-Unet, a 3D U-shaped symmetrical Swin Transformer-based network for brain tumor segmentation. Moreover, we construct a self-supervised learning framework that pre-trains the model encoder through a reconstruction task. Extensive experiments on tumor segmentation tasks validate the performance of the proposed model, with consistently favorable results across benchmarks.
-
Affiliation: Nanchang Hangkong University