Abstract
Artificial intelligence has been widely used for digital pathology diagnosis. However, AI performance relies heavily on high-quality annotated datasets: pathological images must be labeled manually by experienced pathologists, which is time-consuming, laborious, and expensive. In addition, small lesion areas are often missed by the human eye, which directly degrades the performance of identification models trained on such data. This paper presents a new strategy for generating annotated pathological benchmark datasets from microscopic hyperspectral images of HE-CAM5.2 stained tissues. We design a Spatial-Spectral based Hyperspectral GAN (SSHGAN), which transforms hyperspectral images into standard histological images using networks trained with a cycle-consistent adversarial model. A gradient boosting decision tree integrated with the graph-cut method automatically generates the annotations by incorporating a spectral prior. Using the spatial and spectral information of the hyperspectral images, the proposed strategy obtains the standard H&E images and the corresponding annotation files simultaneously. The methods were tested on gastric cancer, lung adenocarcinoma, intrahepatic cholangiocarcinoma, and colorectal cancer tissues and evaluated with segmentation networks and by experienced pathologists. Experimental results show that the proposed methods perform well on small tumor targets and discrete regions, which makes them promising for automatically generating completely annotated pathology benchmark datasets.
Affiliation: Fudan University
