Improving Quality of Watermarked Medical Images Using Symmetric Dilated Convolution Neural Networks

Rapid development of online medical technologies raises questions about the security of the patient’s medical data. When patient records are encrypted and labeled with a watermark, they may be exchanged securely online. To withstand geometrical attacks aimed at stealing the information, image quality must be maintained and patient data must be correctly extracted from the encoded image. To ensure that watermarked images are more resistant to attacks (e.g. additive noise or geometric attacks), different watermarking methods have been invented in the past. Additive noise causes visual distortion and renders potentially harmful diseases more difficult to diagnose and analyze. Consequently, denoising is an important pre-processing method that yields superior outcomes in terms of clarity and noise reduction and improves the quality of damaged medical images. Therefore, various publications have been studied to understand the denoising methods used to improve image quality. The findings indicate that deep learning and neural networks have recently contributed considerably to the advancement of image processing techniques. Consequently, a system has been created that makes use of machine learning to enhance the quality of damaged images and to facilitate the process of identifying specific diseases. Images damaged in the course of an attack are denoised using the suggested technique, which relies on a symmetric dilated convolution neural network. This improves the system’s resilience and establishes a secure environment for the exchange of data while maintaining secrecy.


Introduction
Advances in the field of Internet communication and telemedicine data sharing enable patients and medical professionals to communicate regardless of any location- or time-related constraints. Transfers of sensitive patient records via public networks create a risk of the patient's medical history being accessed by unauthorized persons. This means that such activities, lacking the proper degree of protection, may result in data theft. Consequently, information may be used for fraudulent purposes, and such situations may even lead to misdiagnosing the patient.
Watermarks can be used to secure data and provide copyright protection by identifying the owner. This method is also employed to identify authenticity and combat media piracy. An invisible watermark is the most popular solution of this type used currently, as it helps hide identity-related data. When the invisible watermarking technique is used, the information stored in the source image is not visible to the third party or the person trying to access it. In this case, the attacker tries to extract the data by performing different geometrical attacks which distort the image and add unwanted noise, affecting the sharpness or color of the captured images. Hence, improving the quality of the image is essential in image processing techniques and may be achieved by means of image denoising.
Denoising is a method that lowers the level of noise in an image while minimizing feature losses, i.e. it seeks to retain as much of the signal information that makes up the image as possible. Filters are commonly used to reduce image noise and work well in many cases. However, filtering blurs the image, and if the image is excessively noisy, the filtered image will be deformed to such an extent that most of its crucial components will be destroyed, causing artefacts and blur. In addition to being inefficient in decreasing noise levels, smoothing filters may also harm detailed elements, such as edges of the image. Although they are less effective in maintaining fine details, sharpening filters frequently reduce the distortion of edges and noise. The fundamental problem with image denoising techniques used in medical image analysis scenarios is that missing elements lead to incorrect image analysis results, potentially causing an improper diagnosis. The primary challenge with image denoising is that a substantial amount of data is lost during the degradation process, making it a particularly ill-posed inverse problem [1]-[5]. With those difficulties taken into consideration, effective techniques should offer the following features:
- image borders must be retained (no blurring),
- all texture information must be considered,
- no new artefacts should be created,
- the overall contrast should be preserved.

2/2023
In this paper, a system is proposed capable of enhancing the quality of corrupted medical images by relying on a new denoising strategy. The solution considers watermarked medical images to be the input data, with confidential information presented in an encrypted format. Watermarking and encryption techniques are used to secure the confidential data embedded in medical images and prevent unauthorized access. Finally, the proposed technique is examined and evaluated for selected medical image modalities.
The structure of the paper is described below. An overview of the literature is provided in Section 2. The proposed approach is outlined in Section 3. Section 4 presents the computational results for the suggested method. Conclusions are given in Section 5.

Related Works
Some information on image denoising and its importance in image processing is given by Gu et al. [2]. The authors describe the various methods used for denoising images, ranging from traditional approaches to more recent deep neural network-based methods. Numerous image denoising techniques and a summary of the image denoising problem are provided in [3] by Fan et al. The authors address the advantages and disadvantages of denoising approaches in the transform and spatial domains. Bhawna et al. in [4] and Sameera et al. in [5] compare, categorize, and assess various image denoising techniques. The advantages and disadvantages of various transform domain and spatial domain filtering techniques are covered in detail. Image denoising techniques used in order to achieve the best performance are evaluated by Ferzo et al. [6]. According to the survey, some of them used primary wavelet transforms, while others relied on complicated wavelet transforms with filters to get rid of mixed- or single-type noise.
Based on an enhanced wavelet threshold and median filtering, Qian et al. [7] propose a combined denoising method that, after wavelet decomposition of the image, denoises its high-frequency coefficients. Kaur et al. [8] focused their research on identifying a suitable machine learning approach for noise removal in a medical radiography application. In most situations, machine learning outperforms conventional image denoising techniques. In [9], Liu et al. investigated various machine learning strategies for image denoising in an effort to aid academics in understanding the advancements in this technology. Their study introduces three primary types of models that are often used for image denoising: the pulse-coupled neural network, the wavelet neural network, and the convolutional neural network (CNN).
The research presented in [10] by Ruikai et al. focuses primarily on the application of deep learning technologies in image denoising. It also examines the challenges deep learning encounters when it comes to denoising and offers potential solutions. In [11], Juneja et al. develop a hybrid method for reducing noise present in MRI images. Their approach combines a block-based autoencoder network and a Bayes shrinkage-based fused wavelet transform. The proposed method aids in reducing MRI noise associated with data loss and significant edge characteristics. In their study [12], Chen et al. explain the architecture of a deep residual CNN based on dilated convolution. The presented system consists of a set of convolution layers, dilated convolution layers, and normalized multi-scale convolution blocks which help avoid over-parameterization problems. Based on the traditional threshold function approach, Shen et al. in [13] created an optimized wavelet threshold function algorithm for processing ultrasound images. They determined that denoised images are characterized by higher quality than original images, and that more precise information could be extracted from them. According to a watermarking technique described by Rahim et al. in [14], the fast Fourier transform algorithm is used to embed a watermark in the frequency domain. In the paper, a second layer of protection is applied to protect watermarks using AES and error correcting coding. Additionally, the article examined the impact of several noise-elimination techniques on the watermark technique when additive white Gaussian noise was present. Mahto et al. in [15] presented a strategy for building a better watermarking algorithm and suggested a way to produce watermarked images characterized by excellent imperceptibility and robustness levels. Starting with a perfect scale factor, the method embeds a number of hidden markings in the cover material. Additionally, a better encryption technique is used to encrypt the tagged image, and a denoising CNN is used to boost robustness of the suggested algorithm.
A denoising system based on the quaternion discrete cosine transform (QDCT) is used by Hsu et al. in [16]. It integrates the denoising convolutional neural network (DnCNN) and the grey wolf optimizer (GWO). The derivation of binary embedding takes into account the specifics of each QDCT component as well as the different features of the different modulation schemes. Visual recognition of the retrieved binary watermark is handled by DnCNN, while speed optimization is handled by GWO.

Symmetric Dilated Convolution Neural Network
Here, digital watermarking and encryption methods are used to provide copyright protection to medical images being shared by different medical facilities. The patient's records are encrypted using chaos and Arnold encryption techniques. This ensures that no one else will have access to the patient's personal information, unless they are authorized to do so. To create a watermarked image, an encrypted image is inserted into the cover image with the use of the watermarking pixel color correlation (WPCC) approach [17]. In order to be available for future reference, the WPCC technique creates a list of the locations of pixel color values. It is possible that after an attack, some of the pixel colors will not match any other pixels. The closest color location is then set for that pixel's color, and it is then added to the list.  Working directly with pixel values makes it straightforward to select pixel information from the watermarked image, even if an assault is launched, and facilitates accurate recreation of the secret image at a later stage.
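The Arnold step mentioned above can be illustrated with the classical Arnold cat map, which permutes the pixel positions of a square image. The sketch below is only an assumption about the general idea: the exact variant, the number of iterations, and the way it is combined with the chaos-based step are not specified in the text, and the function names are hypothetical.

```python
def arnold_scramble(img, iterations=1):
    """Scramble a square image (list of rows) with the classical Arnold cat map.

    Assumption: the map (x, y) -> (x + y, x + 2y) mod n is used; the variant
    and the iteration count are illustrative, not taken from the paper.
    """
    n = len(img)
    for _ in range(iterations):
        nxt = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # The map is a bijection on the n x n grid (determinant 1).
                nxt[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = nxt
    return img


def arnold_unscramble(img, iterations=1):
    """Invert arnold_scramble by reading each pixel back from its mapped position."""
    n = len(img)
    for _ in range(iterations):
        prev = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                prev[x][y] = img[(x + y) % n][(x + 2 * y) % n]
        img = prev
    return img
```

Because the map only rearranges positions, the scrambled image contains exactly the same pixel values, and applying the inverse the same number of times restores the original.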
Additionally, numerous assaults are used to test the system's robustness. Visual quality suffers whenever an attack is launched, or even when an operation is carried out on a given image. Thus, in order to protect the substance and clarity of the image, watermarked images are subjected to denoising. Decryption is then used to extract the watermark image from the medical image that has been encrypted and watermarked. The patient's medical record inserted in the embedding phase represents the watermark that was extracted. The flowchart of the system is presented in Fig. 1, and more detailed information on the denoising technique relying on deep learning and capable of addressing the underlying issues is presented in the sections below.

Denoising Convolution Neural Network
A convolutional neural network (CNN) is a deep learning algorithm that is effective at processing and recognizing images. It is built from a number of layers, including convolutional, pooling, and fully connected layers. Convolutional layers constitute the central part of the CNN, as that is where filters are used to extract features such as edges, textures, and shapes from the input image. Pooling layers down-sample the feature maps output by the convolutional layers, which reduces the dimensions while preserving the most crucial data. Additionally, thanks to its deep design, a CNN displays strong prior modelling capabilities. Therefore, CNNs are often chosen for image denoising. Denoising approaches are commonly divided into model-based methods and methods based on discriminative learning. Model-based image denoising techniques used in computer vision applications include block-matching and 3D filtering (BM3D) and weighted nuclear norm minimization (WNNM).
In denoising convolutional neural networks (DnCNN), the model is trained to predict the residual image, i.e. the difference between the noisy input and the latent clean image, rather than to directly produce the clean image. The training of the DnCNN is further accelerated and stabilized by means of the batch normalization technique. DnCNN implicitly removes the latent clean image in its hidden layers. A single DnCNN model can be trained to perform a variety of generic image denoising tasks, such as JPEG image deblocking, single image super-resolution, and Gaussian denoising. The general architecture of a DnCNN is shown in Fig. 2.
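The residual formulation described above can be written compactly. The notation is an assumption introduced here for illustration: y is the noisy input, x the latent clean image, R(y; Θ) the network output with trainable parameters Θ, and N the number of training pairs. The loss shown is the averaged squared error commonly used for residual denoisers of this kind.

```latex
\hat{x} = y - \mathcal{R}(y;\Theta),
\qquad
\ell(\Theta) = \frac{1}{2N}\sum_{i=1}^{N}
\left\| \mathcal{R}(y_i;\Theta) - (y_i - x_i) \right\|_F^2
```

In other words, the network is penalized for the mismatch between its predicted noise and the true noise y_i - x_i, and the clean estimate is obtained by subtracting the prediction from the input.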

Dilated Convolution
Dilated convolution is well known for its ability to expand the receptive field while preserving the benefits of traditional 3×3 convolution. Convolutional layers are the fundamental components of CNNs. They allow the input image to be down-sampled in order to extract its salient features. A feature map, displaying the positions and strengths of features detected in the input image, is created by repeatedly applying the same filter to the input image. CNNs are novel in the sense that they are capable of automatically learning, in a concurrent manner, a large number of filters that are specific to a given training dataset, while complying with the demands of a given predictive modeling problem. The general definition of discrete convolution is:
$$(F * k)(p) = \sum_{s + t = p} F(s)\, k(t), \qquad (1)$$
where F(s) is the input image, k(t) is the filter applied for convolution, and * is the discrete convolution operator. The filter is reversed and moved along the horizontal axis to determine the intersection area between F and the reversed k for each location. The intersection region represents the value of the convolution. Dilated convolutions introduce a new parameter to convolutional layers, called the dilation rate l. Kernel values are separated by the dilation rate. A 3×3 kernel with a dilation rate of two has the same field of view as a 5×5 kernel while using just nine parameters. This offers a wider field of view at the same processing expense. Dilated convolutions can be used when a wide field of view is needed but a sufficient number of convolutions or larger kernels cannot be used. Equation (2) illustrates how dilated convolution is defined with a dilation factor l:
$$(F *_{l} k)(p) = \sum_{s + l t = p} F(s)\, k(t). \qquad (2)$$
Dilated convolutions "enhance" the kernel by introducing gaps between its components. Despite the fact that specific implementations may vary, l − 1 spaces are frequently added between kernel elements. When l = 1, it is a conventional convolution, and when l > 1, it is a dilated convolution.
In order to increase the receptive field and reduce network depth, dilated convolution has been commonly used in deep neural networks. It is praised for its capacity to expand the receptive area while preserving the advantages of the conventional 3×3 convolution.
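The effect of the dilation rate l can be made concrete with a small one-dimensional sketch. It assumes the usual definition in which output position p sums F(s) k(t) over all s + l·t = p, and it keeps only the positions where the dilated kernel fits entirely inside the signal. The function names are hypothetical and the code is illustrative rather than the implementation used in the paper.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D dilated convolution: out[p] = sum_t signal[p - dilation*t] * kernel[t]."""
    span = dilation * (len(kernel) - 1)  # distance covered by the dilated kernel
    out = []
    for p in range(span, len(signal)):
        out.append(sum(signal[p - dilation * t] * kernel[t]
                       for t in range(len(kernel))))
    return out


def effective_kernel_size(kernel_size, dilation):
    """Field of view of a kernel after inserting (dilation - 1) gaps between taps."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


ramp = [1, 2, 3, 4, 5, 6]
diff = [1, 0, -1]
print(dilated_conv1d(ramp, diff, dilation=1))  # [2, 2, 2, 2]
print(dilated_conv1d(ramp, diff, dilation=2))  # [4, 4]
print(effective_kernel_size(3, 2))             # 5
```

With dilation 2, the three-tap kernel spans five samples while still using three weights, which mirrors the 3×3 versus 5×5 comparison in the text.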

Symmetric Dilated Convolution Neural Network (SDCNN)
Model-based methods, such as WNNM and BM3D, are capable of addressing various noise issues, but their execution is time-consuming and requires the modeling of difficult priors. Meanwhile, in the case of DnCNN, instead of immediately producing the clean image, the model is trained to predict the residual image which corresponds to the difference between the noisy input and the latent clean image. Hence, the system is designed with the advantage of using a CNN-based denoiser and dilated convolution. The architecture for the proposed SDCNN approach is shown in Fig. 3.
Batch normalization is applied to each training mini-batch. Consequently, the problem of vanishing or exploding gradients is mitigated, and higher learning rates become feasible. When batch normalization is not used, training efficiency is lower than in a scenario in which both batch normalization and residual learning are used. As a result, the combination of residual learning and batch normalization improves and simplifies the training process, and also enhances denoising performance. ReLU is a piecewise linear function that outputs zero if the input is negative and returns the input itself if it is positive. Networks using ReLU are simpler to train and frequently achieve better performance. ReLU also helps keep the computing cost required to run the neural network low.
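The two operations discussed above can be sketched for a single feature over one mini-batch. This is a minimal illustration under stated assumptions: the scale `gamma`, the shift `beta`, and the stabilizer `eps` are the usual batch-normalization parameters, but the values used by the authors are not given, and running statistics used at inference time are omitted.

```python
import math


def relu(x):
    """Return x for positive inputs and zero otherwise."""
    return x if x > 0 else 0.0


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then apply scale and shift."""
    mean = sum(batch) / len(batch)
    var = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in batch]


activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(activations)      # approximately zero mean, unit variance
rectified = [relu(v) for v in normalized]  # negative values are clipped to zero
```

Keeping activations in a normalized range from layer to layer is what allows the larger learning rates mentioned in the text.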
Basically, CNN employs forward convolution techniques to gradually extend the receptive field in order to collect contextual data. The two basic techniques for enlarging CNN's receptive field are to increase the filter size and depth. However, enlarging the filter size would add more parameters and would increase computational complexity. The size of the receptive field and network depth are thereby balanced using a system with dilated convolution. Batch normalization and residual learning are complementary and work well with Gaussian denoising. To be more specific, such an approach tends to enable speedy and stable training in addition to higher denoising performance. In order to address artefacts brought about by denoising, the proposed system models the image boundary using zero padding. The system with dilated convolution aids in preserving the original arrangement of data and decreases memory usage thanks to relying on effective computations.
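The trade-off between depth and receptive field can be checked with a short calculation. For a stack of stride-1 convolutions, each layer with kernel size k and dilation l adds (k − 1)·l pixels to the receptive field. The symmetric dilation schedule below (1, 2, 3, 4, 3, 2, 1) is a hypothetical example chosen for illustration; the rates actually used in the SDCNN are not stated in the text.

```python
def receptive_field(dilations, kernel_size=3):
    """Receptive field of stacked stride-1 convolutions with the given dilation rates."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def same_padding(dilation, kernel_size=3):
    """Zero padding per side that keeps the output the same size as the input."""
    return dilation * (kernel_size - 1) // 2


plain_rates = [1] * 7                    # seven ordinary 3x3 layers
symmetric_rates = [1, 2, 3, 4, 3, 2, 1]  # hypothetical symmetric schedule

print(receptive_field(plain_rates))      # 15
print(receptive_field(symmetric_rates))  # 33
print([same_padding(d) for d in symmetric_rates])  # [1, 2, 3, 4, 3, 2, 1]
```

At the same depth and parameter count, the dilated stack in this example covers a 33×33 neighbourhood instead of 15×15, and the padding per layer simply equals the dilation rate for 3×3 kernels.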

Experimental and Result Analysis
The proposed method was examined with the use of a range of different medical image types, including ultrasound, CT scan, MRI, and X-ray, as well as common test images, such as Lena, Baboon, and Peppers. Specific 512×512 pixel images were extracted from open access medical databases [19], [20].
In the experiment, the patient's medical record is used as a watermark image with the size of 256×256 pixels. Some of the input images and the watermark image are shown in Fig. 4.
Fig. 4. Sample of input images and watermark image (panels: chest X-ray, head CT scan, patient record).
Attacks on the watermarked images, including the addition of noise, filtering, rotation, compression, and resizing, were simulated in order to evaluate the reliability of the suggested methodology. Denoising results obtained with the proposed method are compared with those achieved by existing systems in order to verify performance-related parameters. In the WPCC method [17], the watermark is embedded and extracted successfully, but for some attacks the clarity of the extracted image is poor. Hence, the denoising approach is proposed to improve the quality of the watermark images and to increase their readability. A comparison of the results before denoising, i.e. achieved with the WPCC method, and after denoising, i.e. with the proposed and existing [21] methods, is shown in Figs. 5 and 6.

Fig. 5. Extracted watermark images after performing different attacks on a chest X-ray image (attacks: average filter, median filter, motion blur; methods: WPCC method [17], existing method [21], proposed SDCNN).
Along with graphical results, SSIM, PSNR, MSE, and other measures are used to assess system performance. The fidelity of embedding algorithms is assessed using the structural similarity index measure (SSIM) and the peak signal-to-noise ratio (PSNR). SSIM and PSNR have been selected as quantitative measures for assessing the watermark image quality following attacks. SSIM is calculated by:
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}, \qquad (3)$$
where $\mu_x$, $\mu_y$ are the mean values and $\sigma_x$, $\sigma_y$ are the standard deviation values of the pixels in patches x and y, respectively, $\sigma_{xy}$ is the covariance of patches x and y, and $C_1 = (k_1 L)^2$ and $C_2 = (k_2 L)^2$ are small constants that avoid instability when the denominator is close to zero. L is the dynamic range of pixel values, $k_1 = 0.01$ and $k_2 = 0.03$. The higher the SSIM value, the smaller the distortion and the better the enhancement. PSNR is calculated by:
$$\mathrm{PSNR} = 10 \log_{10} \frac{(2^{v} - 1)^2}{\mathrm{MSE}}, \qquad (4)$$
where v is the minimum number of bits that can represent the potential maximum intensity in a given image, and the mean squared error (MSE) is calculated to quantify the quality of the system's performance.
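A minimal sketch of the MSE, PSNR, and SSIM measures described above is given here, operating on flat lists of pixel intensities. Two simplifications are assumptions of this sketch: SSIM is evaluated once over the whole input rather than averaged over local patches, and L is taken as 2^v − 1 for v-bit images. Function names are hypothetical.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def psnr(a, b, bits=8):
    """PSNR in dB, with the peak value taken as 2**bits - 1."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical images
    peak = 2 ** bits - 1
    return 10 * math.log10(peak ** 2 / err)


def ssim_global(x, y, bits=8, k1=0.01, k2=0.03):
    """Single-window SSIM computed over the whole input (no local patches)."""
    dynamic_range = 2 ** bits - 1
    c1, c2 = (k1 * dynamic_range) ** 2, (k2 * dynamic_range) ** 2
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((p - mx) ** 2 for p in x) / n
    vy = sum((q - my) ** 2 for q in y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

For identical inputs the SSIM sketch returns 1 and the PSNR is unbounded, which is why a zero MSE is handled as a special case.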
Some attacks purposefully degrade image quality by adding noise, therefore lowering readability. The watermarked image is subjected to filtering, producing a difference map composed of noise. These attacks aim to remove the watermark information often having the form of additive noise. Hence, this system will help find the precise watermark even after performing various attacks. The proposed method creates images of excellent quality, with a high PSNR of 38.06 dB, for a variety of medical imaging modalities. After being extracted from the attacked picture, the watermark's quality is evaluated and compared with the existing system [21]. The results of an experiment carried out with the use of various images are displayed for a chest X-ray and a CT head scan. Table 1 compares the performance of the proposed system with [21]. In a similar manner, Table 2 shows the results of a comparison in a scenario in which the head CT scan is used as input. Tables 1 and 2 present the results of denoising operations performed to improve the quality of the damaged image -a necessity in healthcare applications.
Several image quality metrics, including bit error rate (BER) and normalized cross-correlation (NCC), have been used to assess the reliability of the retrieved watermark and are calculated using Eqs. (5) and (6). The system's vulnerability to image processing techniques is determined by its BER and NCC values. The system is more resistant to attacks with higher NCC and lower BER figures. Therefore, results obtained with the use of the proposed system are compared with those typical of the previous system, in order to assess the robustness of the new method.
Here, w_o(i, j) is the actual embedded logo bit at coordinates (i, j) and w_x(i, j) is the extracted logo bit at the same coordinates, with m×n being the dimensions of the logo.
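The two robustness measures can be sketched for binary logos stored as lists of rows. The exact expressions of Eqs. (5) and (6) are not reproduced in this sketch; it assumes commonly used forms: BER as the fraction of mismatched bits over the m×n logo, and NCC as the correlation of the two logos normalized by the product of their energies. Other NCC normalizations exist, so the second function in particular is an assumption, and both function names are hypothetical.

```python
import math


def bit_error_rate(original, extracted):
    """Fraction of logo bits that differ between the embedded and extracted logos."""
    m, n = len(original), len(original[0])
    errors = sum(original[i][j] != extracted[i][j]
                 for i in range(m) for j in range(n))
    return errors / (m * n)


def normalized_cross_correlation(original, extracted):
    """Correlation of the two logos, normalized by the product of their energies."""
    pairs = [(o, e) for row_o, row_e in zip(original, extracted)
             for o, e in zip(row_o, row_e)]
    num = sum(o * e for o, e in pairs)
    den = math.sqrt(sum(o * o for o, _ in pairs)) * math.sqrt(sum(e * e for _, e in pairs))
    return num / den if den else 0.0


embedded = [[1, 0], [1, 1]]
recovered = [[1, 0], [0, 1]]  # one of four bits flipped by an attack
print(bit_error_rate(embedded, recovered))  # 0.25
```

A perfectly recovered logo gives a BER of 0 and an NCC of 1, which matches the statement that higher NCC and lower BER indicate greater robustness.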
Fig. 7 (axis label: Normalized cross correlation (NCC)).
Fig. 8. Evaluation of chest X-ray BER in relation to current method.
The existing system described in [17] relied on a pixel color correlation-based watermarking technique and employed encryption to guarantee security of medical images. To improve the interpretation of the retrieved watermark, paper [21] introduced a denoising autoencoder. Figure 7 demonstrates the quality of the watermark image retrieved after an attack.
The proposed system is able to achieve higher NCC values. Figure 8 shows the performance of the system in terms of bit error rate calculated. The proposed system achieved a lower BER as compared with the other approach. The proposed system outperforms the benchmark schemes, as shown in Figs. 7 and 8, in terms of NCC and BER, respectively.

Conclusion
The proposed method uses residual learning to separate the noise from noisy observations. Batch normalization and residual learning are integrated to accelerate training and enhance denoising performance. A detailed analysis of experimental results has demonstrated that the suggested approach achieves good denoising performance and produces images of good quality.