Simple Summary: Bladder cancer is a common cancer of the urinary tract, characterized
by high metastatic potential and recurrence. The research applies a transfer learning
approach to CT images (frontal, axial, and sagittal planes) for the semantic
segmentation of areas affected by bladder cancer. The proposed system consists of an AlexNet
network for plane recognition and transfer learning-based U-net networks for the segmentation
task. The achieved results show that the proposed system performs well, suggesting
possible use in clinical practice.

Abstract: Urinary bladder cancer is one of the most
common cancers of the urinary tract. It is characterized by high metastatic
potential and a high recurrence rate, which makes correct and timely diagnosis
crucial for successful treatment and care. With
the aim of increasing diagnostic accuracy, artificial intelligence algorithms are being introduced
into clinical decision making and diagnostics. One of the standard procedures for bladder
cancer diagnosis is computed tomography (CT) scanning. In this research, a transfer
learning approach to the semantic segmentation of urinary bladder cancer masses from
CT images is presented. The initial data set is divided into three sub-sets according
to image planes: frontal (4413 images), axial (4993 images), and sagittal (996 images).
First, AlexNet is utilized for the design of a plane recognition system, which achieved
high classification and generalization performance, with $\overline{AUC}_{micro}$
of 0.9999 and $\sigma(AUC_{micro})$ of 0.0006. Furthermore, by applying the transfer learning
approach, significant improvements in both semantic segmentation and generalization
performance were achieved. For the frontal plane, the highest performance
was achieved when a pre-trained ResNet101 architecture was used as the backbone for U-net,
with $\overline{DSC}$ up to 0.9587 and $\sigma(DSC)$ of 0.0059. When U-net was used for
the semantic segmentation of urinary bladder cancer masses from images in the axial
plane, the best results were achieved when a pre-trained ResNet50 was used as the backbone,
with $\overline{DSC}$ up to 0.9372 and $\sigma(DSC)$ of 0.0147. Finally, for images in
the sagittal plane, the best results were achieved with VGG-16 as the backbone,
with $\overline{DSC}$ values up to 0.9660 and $\sigma(DSC)$ of 0.0486.
These results show that the proposed system achieves high semantic segmentation
and generalization performance in all three planes, indicating that it could
be used in clinical practice.
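
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a Dice similarity coefficient (DSC) and its summary statistics could be computed. It is not the authors' implementation. It assumes that the over-bar denotes the mean and that sigma denotes the (population) standard deviation over repeated evaluation runs; the function names `dice_coefficient` and `summarize`, the toy masks, and the per-run scores are hypothetical and are not data from the study.

```python
from statistics import mean, pstdev


def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Masks are equally sized 2D lists of 0/1 values.
    DSC = 2 * |A intersect B| / (|A| + |B|)
    """
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred_flat, target_flat) if p and t)
    total = sum(1 for p in pred_flat if p) + sum(1 for t in target_flat if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def summarize(scores):
    """Return (mean, population standard deviation) of per-run scores."""
    return mean(scores), pstdev(scores)


# Toy 3x3 masks (made up for illustration only).
predicted = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
reference = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
dsc = dice_coefficient(predicted, reference)

# Hypothetical per-run DSC values, summarized as mean and sigma.
dsc_mean, dsc_sigma = summarize([0.90, 0.92, 0.94])
print(f"DSC = {dsc:.4f}, mean = {dsc_mean:.4f}, sigma = {dsc_sigma:.4f}")
```

A high mean DSC indicates good overlap between predicted and reference masks, while a small sigma indicates that the overlap is stable across runs, which is the sense in which the abstract speaks of generalization performance.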