The invention discloses a semi-supervised learning method designed for cell nucleus segmentation in histopathological images stained with hematoxylin and eosin (H&E). According to the
cell nucleus segmentation method provided by the invention, and in view of the characteristics of histopathological images and of the cell nucleus segmentation task, the hematoxylin and eosin stains in a histopathological image are separated by non-negative matrix factorization with a sparsity constraint, and the eosin stain of the image is then replaced by the eosin stain of other histopathological images, which improves the efficiency of cell nucleus segmentation. In this way, a group of positive example samples can be prepared, all sharing the same hematoxylin
stain, which gives the positive example samples an interpretable invariance. The groups of positive example samples are input into an encoder, which outputs a corresponding embedding vector for each sample. The model is constrained with a contrastive learning loss function, so that it learns the invariant shared by the positive example samples, namely the hematoxylin stain. The
hematoxylin stain colors the cell nucleus and other nucleic acid-rich structures, such as ribosomes, so the hematoxylin stain is relatively highly correlated with the cell nucleus. When the model learns the characteristics of the
hematoxylin stain, those characteristics match what the cell nucleus segmentation task requires, which facilitates training on the downstream cell nucleus segmentation task. Because neither positive example construction nor pre-training needs labels, a large amount of unlabeled data can be used for training in this way. Finally, the pre-trained encoder is added to the segmentation model and fine-tuned on a very small amount of labeled data, which can achieve a better result than supervised learning on a small number of samples. The demand for annotated data is therefore reduced, and the labor cost is greatly reduced.
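
As a non-limiting illustration, the stain separation and eosin replacement described above might be sketched as follows. The text does not specify the optical density conversion, the factorization algorithm, or any parameter values, so everything here is an assumption: the Beer-Lambert conversion, the multiplicative-update form of sparse NMF, the sparsity weight of 0.1, the red-channel heuristic for identifying hematoxylin, and the hypothetical function names `rgb_to_od`, `od_to_rgb`, `sparse_nmf`, and `swap_eosin`. The sketch also assumes one reading of "replacing the eosin stain", namely that both the eosin color vector and the eosin concentration map are taken from a donor image of the same size.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero


def rgb_to_od(rgb):
    """Beer-Lambert conversion: uint8 RGB (H, W, 3) -> optical density (3, N)."""
    flat = rgb.reshape(-1, 3).astype(np.float64)
    return -np.log((flat + 1.0) / 256.0).T


def od_to_rgb(od, shape):
    """Inverse of rgb_to_od: optical density (3, N) -> uint8 image of `shape`."""
    rgb = 256.0 * np.exp(-od.T) - 1.0
    return np.clip(np.rint(rgb), 0, 255).astype(np.uint8).reshape(shape)


def sparse_nmf(od, n_stains=2, sparsity=0.1, n_iter=200, seed=0):
    """Approximate od (3, N) by W (3, k) @ H (k, N) with W, H >= 0.

    Multiplicative updates; `sparsity` is the weight of an L1 penalty on H
    (an assumed value, not taken from the source).
    """
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.1, 1.0, size=(od.shape[0], n_stains))
    h = rng.uniform(0.1, 1.0, size=(n_stains, od.shape[1]))
    for _ in range(n_iter):
        h *= (w.T @ od) / (w.T @ w @ h + sparsity + EPS)
        w *= (od @ h.T) / (w @ (h @ h.T) + EPS)
        # Keep stain vectors at unit length; rescale H so W @ H is unchanged.
        norms = np.linalg.norm(w, axis=0) + EPS
        w /= norms
        h *= norms[:, None]
    # Assumed heuristic: hematoxylin absorbs more red light than eosin,
    # so the column with the larger red component is placed first.
    if w[0, 0] < w[0, 1]:
        w, h = w[:, ::-1], h[::-1]
    return w, h


def swap_eosin(source_rgb, donor_rgb, **nmf_kwargs):
    """Build a positive sample: source hematoxylin plus donor eosin."""
    if source_rgb.shape != donor_rgb.shape:
        raise ValueError("source and donor must have the same shape")
    w_src, h_src = sparse_nmf(rgb_to_od(source_rgb), **nmf_kwargs)
    w_don, h_don = sparse_nmf(rgb_to_od(donor_rgb), **nmf_kwargs)
    od = np.outer(w_src[:, 0], h_src[0]) + np.outer(w_don[:, 1], h_don[1])
    return od_to_rgb(od, source_rgb.shape)
```

Calling `swap_eosin(source, donor)` with several different donor images would yield one group of positive example samples for the same source image. If only the eosin color should change, a variant could instead combine `w_don[:, 1]` with the source concentration row `h_src[1]`.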
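
As a further non-limiting illustration, the contrastive constraint on the encoder outputs might take the following form. The text names only "a contrastive learning loss function", so the NT-Xent (normalized temperature-scaled cross-entropy) loss used here is an assumed choice, as are the temperature of 0.5 and the hypothetical function name `nt_xent_loss`. For simplicity the sketch assumes each group contributes exactly two positive example samples, whose embedding vectors are stored in matching rows of `z_a` and `z_b`.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for N positive pairs.

    z_a[i] and z_b[i] are embedding vectors of two positive example samples
    from the same group; all other rows in the batch act as negatives.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0).astype(np.float64)
    z /= np.linalg.norm(z, axis=1, keepdims=True) + 1e-12  # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never compared with itself
    # Row i's positive partner sits n rows away in the stacked batch.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(
        np.exp(sim - row_max).sum(axis=1, keepdims=True)
    )
    return float(-log_prob[np.arange(2 * n), pos].mean())
```

The loss is small when the two embeddings in each pair are close to each other and far from the rest of the batch, which is the behavior that would push an encoder toward the shared hematoxylin information.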