Title: Computational Analysis and Deep Learning for Medical Care
Author: Group of authors
Publisher: John Wiley & Sons Limited
Genre: Software
isbn: 9781119785736
Keywords: CNN, deep learning, intervertebral disc degeneration, MRI segmentation
1.1 Introduction
The concept of the Convolutional Neural Network (CNN) was introduced by Fukushima. The principle behind the CNN is that the human visual mechanism is hierarchical in structure. CNNs have been successfully applied in various image domains, such as image classification, object recognition, and scene classification. A CNN is defined as a series of convolution layers and pooling layers. In a convolution layer, the image is convolved with a filter, i.e., the filter slides over the image spatially and a dot product is computed at each position. A pooling layer then downsamples the result, yielding a smaller feature set.
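The convolution and pooling steps described above can be sketched in a few lines of plain Python. This is a minimal illustration, not an implementation taken from the chapter: the function names `convolve2d` and `average_pool`, the 4×4 toy image, and the 2×2 kernel are all assumptions chosen for readability. As in most deep learning libraries, the kernel is not flipped, so the operation is strictly a cross-correlation.

```python
# Minimal sketch of one convolution step followed by one pooling step.
# All names and the toy data are illustrative assumptions.

def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding); take a dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    return [
        [
            float(sum(
                image[i * stride + a][j * stride + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            ))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def average_pool(feature_map, size=2, stride=2):
    """Replace each `size` x `size` window by its mean, shrinking the feature map."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            sum(
                feature_map[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            ) / (size * size)
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
kernel = [
    [1, 0],
    [0, 1],
]

feature_map = convolve2d(image, kernel)   # 3 x 3 feature map
pooled = average_pool(feature_map)        # 1 x 1 after 2 x 2 pooling with stride 2
```

Here the 4×4 input shrinks to a 3×3 feature map after convolution and to a single value after pooling, which is the size reduction the text refers to.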
One major cause of low back pain is disc degeneration. Detecting lumbar abnormalities in clinical scans is a burden for radiologists, which motivates automated detection. Because such images are very large, researchers focus on automating the segmentation of large sets of MRI data. The success of CNNs in various fields of object detection has enabled researchers to apply various models to the detection of the Intervertebral Disc (IVD), which, in turn, helps in the diagnosis of diseases.
The remainder of the chapter is structured as follows. Section 1.2 reviews the various CNN models. Section 1.3 presents applications of CNN for the detection of the IVD. Section 1.4 compares state-of-the-art segmentation approaches for spine T2W images, and Section 1.5 concludes.
1.2 Various CNN Models
1.2.1 LeNet-5
The LeNet architecture was proposed by LeCun et al. [1], and it successfully classified the images in the MNIST dataset. LeNet takes a 32×32-pixel grayscale image as input. As a pre-processing step, the input pixel values are normalized so that a white (background) pixel corresponds to a value of −0.1 and a black (foreground) pixel to a value of 1.175, which, in turn, speeds up learning. The LeNet-5 architecture consists of an input layer, two sets of convolutional and average pooling layers, a flattening convolutional layer, two fully connected layers, and finally a softmax classifier.
The first convolutional layer filters the 32×32 input image with six filters. All filter kernels are of size 5×5 (the receptive field) with a stride of 1 pixel (the distance between the receptive field centers of neighboring neurons in a kernel map) and no padding. Given the 32×32 input image, applying six 5×5 convolutional kernels with stride 1 in C1 yields feature maps of size 28×28, which the first pooling layer reduces to 14×14. Figure 1.1 shows the architecture of LeNet-5, and Table 1.1 shows the various parameter details of LeNet-5. Let Wc be the number of weights in the layer, Bc the number of biases, Pc the number of parameters, K the size (width) of the kernels, N the number of kernels, and C the number of channels in the input image.
$$W_c = K^2 \times C \times N, \qquad B_c = N \tag{1.1}$$
$$P_c = W_c + B_c \tag{1.2}$$
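The parameter count for a convolutional layer can be checked with a short sketch. It assumes the usual convention for the symbols defined above: K × K × C weights per kernel, N kernels, and one bias per kernel. The function name `conv_layer_parameters` is hypothetical.

```python
# Minimal sketch, assuming every kernel sees all C input channels
# and carries a single bias term.

def conv_layer_parameters(k, n, c):
    """Return (weights, biases, total parameters) for a K x K conv layer."""
    weights = k * k * c * n   # Wc: K*K*C weights for each of the N kernels
    biases = n                # Bc: one bias per kernel
    return weights, biases, weights + biases  # Pc = Wc + Bc


# First convolutional layer of LeNet-5: 5 x 5 kernels, 6 kernels, 1 input channel.
w_c, b_c, p_c = conv_layer_parameters(k=5, n=6, c=1)
print(w_c, b_c, p_c)  # 150 6 156
```

The same function applied to C2 (K = 5, N = 16, C = 6) gives 2,416 parameters, more than the value listed for C2 in Table 1.1. The difference arises because, in LeNet-5 as described by LeCun et al. [1], each C2 feature map is connected to only a subset of the S1 feature maps, so the dense count is an upper bound there.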
In the first convolutional layer, the number of trainable parameters is (5 × 5 + 1) × 6 = 156, where 6 is the number of filters, 5 × 5 is the filter size, and 1 accounts for the bias; there are 28 × 28 × 156 = 122,304 connections. The feature map size is computed as follows:
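$$W_{out} = \frac{W - F_w + 2P}{S} + 1 \tag{1.3}$$
$$H_{out} = \frac{H - F_h + 2P}{S} + 1 \tag{1.4}$$
Here W and H are the input width and height, Fw and Fh are the filter width and height, P is the padding, and S is the stride. For C1, W = 32, H = 32, Fw = Fh = 5, P = 0, and S = 1, so each feature map is of size 28 × 28.
First pooling layer: W = 28; H = 28; Fw = Fh = 2; P = 0; S = 2
$$W_{out} = H_{out} = \frac{28 - 2 + 2 \times 0}{2} + 1 = 14 \tag{1.5}$$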
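The feature map sizes and connection counts quoted for C1 and S1 can be reproduced with a small sketch. It assumes the standard output size rule (input + 2 × padding − kernel) / stride + 1 and a 2 × 2 pooling window, as listed in Table 1.1. The helper name `output_size` is hypothetical.

```python
# Minimal sketch, assuming square inputs and kernels, so one
# dimension is enough to describe each layer.

def output_size(size, kernel, padding=0, stride=1):
    """Spatial size of a convolution or pooling output along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# C1: 32 x 32 input, 5 x 5 kernel, no padding, stride 1.
c1_size = output_size(32, kernel=5, padding=0, stride=1)

# S1: 28 x 28 input, 2 x 2 pooling window, no padding, stride 2.
s1_size = output_size(c1_size, kernel=2, padding=0, stride=2)

# Connections in C1: every output position uses all 156 parameters.
c1_params = (5 * 5 + 1) * 6
c1_connections = c1_size * c1_size * c1_params

# Connections in S1: each unit in 6 maps reads 2 x 2 inputs plus one bias.
s1_connections = s1_size * s1_size * 6 * (2 * 2 + 1)

print(c1_size, s1_size, c1_connections, s1_connections)  # 28 14 122304 5880
```

These values agree with the C1 and S1 rows of Table 1.1.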
Figure 1.1 Architecture of LeNet-5.
Table 1.1 Various parameters of the layers of LeNet.
Sl no. | Layer | Feature map | Feature map size | Kernel size | Stride | Activation | Trainable parameters | # Connections |
1 | Image | 1 | 32 × 32 | - | - | - | - | - |
2 | C1 | 6 | 28 × 28 | 5 × 5 | 1 | tanh | 156 | 122,304 |
3 | S1 | 6 | 14 × 14 | 2 × 2 | 2 | tanh | 12 | 5,880 |
4 | C2 | 16 | 10 × 10 | 5 × 5 | 1 | tanh | 1,516 | 151,600 |
5 | S2 | 16 | 5 × 5 | 2 × 2 |