
  1. (School of Information and Intelligent Manufacturing, Chongqing City Vocational College, Chongqing, 402160, China)
  2. (Office of Academic Affairs, Chongqing City Vocational College, Chongqing, 402160, China)



Hybrid reality technology, Federated learning, Differential privacy, CNN

1. Introduction

With the rapid development of information technology and medicine, medical images play an important role in disease diagnosis and treatment. However, medical images contain private patient information, such as personal identity and disease details, which creates a risk of privacy leakage during data sharing and analysis [1]. Differential Privacy (DP) was developed to protect patient privacy and data security. It protects individual privacy by adding noise to data, which secures the data while preserving its availability and accuracy to a certain extent [2,3]. DP therefore plays a key role in protecting the privacy of medical image data. However, the noise that DP introduces inevitably affects model performance in practical applications, which may in turn adversely affect disease diagnosis. To address these issues, a Medical Image Classification (MIC) algorithm based on Federated Learning (FL) and DP with hybrid reality technology is proposed. Hybrid reality technology combines the advantages of virtual reality and augmented reality, integrating virtual objects with the real environment and allowing doctors to observe and analyze medical images more intuitively. FL is a distributed learning method that trains models on local devices and aggregates the model parameters to achieve cross-device training. FL keeps medical image data on local devices, which protects privacy, and improves the accuracy and generalization ability of models through model aggregation [4,5]. The improvement of FL and the integration of DP are the major innovations of this research. Privacy protection for medical image data is realized by introducing the Gaussian mechanism and by adding an encryption algorithm and a noise protection mechanism when the central server aggregates the parameter updates provided by each local model.
Compared with traditional methods, the research method not only protects the data privacy, but also ensures the accuracy and efficiency of MIC. The contribution of the research is to introduce DP mechanism to resist various background knowledge attacks to protect the security of medical images, prevent medical privacy disclosure, and provide solid data security for the development of the medical field. The research is mainly divided into four parts. Firstly, a summary and discussion are conducted on the MIC model. Secondly, research is conducted on MIC models based on FL and DP. Next, a results analysis is conducted on this model. Finally, the research results are summarized, indicating the feasibility and effectiveness of the research algorithm in MIC.

2. Related Works

Accurate MIC helps doctors develop more effective treatment plans, thereby improving treatment outcomes and patient survival rates. Therefore, many scholars have conducted research on MIC models. Jin et al. put forward a classification method for MIC that uses a simplified inception module and a Hadamard attention mechanism. The results confirmed that the method performed well in MIC tasks, providing a new approach for MIC [6]. Abirami et al. put forward a COVID-19 classification method based on medical image synthesis with a generative adversarial network. The results confirmed that the method performed well in COVID-19 MIC tasks [7]. Kurm et al. put forward a deep learning-based method for classifying magnetic resonance images in brain tumor detection. The results confirmed that the method had high accuracy and reliability in brain tumor detection and could provide a more accurate diagnostic basis for clinicians [8]. Priyanka et al. proposed a metaheuristic technique for training an Artificial Neural Network (ANN) for MIC. The results confirmed that metaheuristic techniques could find the optimal solution faster and improve training efficiency compared with traditional random search or grid search [9].

FL can train models without sharing raw data, thereby protecting user privacy and data security. This is particularly important for data processing in sensitive fields such as healthcare and finance, and researchers at home and abroad have made notable progress in FL. To improve the efficiency and accuracy of FL, Envelope et al. proposed a hybrid improved FL method combining a distributed strategy with heuristic enhancement. The results confirmed that the method improved FL efficiency and accuracy and could effectively address issues such as data heterogeneity and communication limitations [10]. Kweon et al. put forward an FL-based method to protect privacy in path selection behavior models. The results confirmed that the method achieved effective analysis and learning of path selection behavior while protecting privacy, and that it could be flexibly expanded and adjusted for different application scenarios and requirements [11]. To optimize the performance of cell edge users in large-scale MIMO wireless systems, Vu et al. proposed a large-scale MIMO technology based on wireless FL. The results confirmed that this technology significantly improved the performance of cell edge users in cell-free large-scale MIMO systems [12]. Shao et al. put forward a privacy-preserving palmprint recognition method with joint hash learning to address privacy protection and accuracy in palmprint recognition. The results confirmed that the method achieved high recognition accuracy while ensuring privacy. Compared with traditional centralized learning methods, it better balanced data privacy and recognition performance [13].

To sum up, many domestic and foreign scholars have carried out relatively rich research and analysis in the field of MIC. They have made great breakthroughs in the improvement and application of FL. However, there are few studies on the application of FL to MIC, so this research has strong potential application value.

3. Methods

FL is a machine learning technique that involves distributed model training across multiple data sources, without the need to store the dataset in a central location. In this study, the adaptive gradient descent-based FL is used to classify images, and then the privacy MIC model is constructed using FedAvg-based FL privacy protection algorithm.

3.1. Federated Learning Image Classification Algorithm Based on Adaptive Gradient Descent

Image classification is a processing method that distinguishes images of different categories based on their semantic information. Traditional methods mainly rely on Convolutional Neural Network (CNN), which is an ANN specifically designed for image processing and recognition [14]. In Fig. 1, CNN is mainly composed of a 5-layer structure.

Fig. 1. Convolutional neural network model structure.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig1.png

As an important image classification tool, a CNN can extract hierarchical features from original images. By processing the input through multiple convolutional layers, it gradually builds high-level semantic features from low-level pixel features. However, training a CNN requires a large amount of data, and when the data available at a single site are limited, the model's generalization ability may decrease. FL was proposed to address this issue. Its core idea is to train models in a distributed manner across multiple devices, rather than concentrating all data on one device [15]. In FL, the updating and aggregation of model parameters are represented by Eq. (1).

(1)
$ G_{t+1} = G_{t} + \frac{\eta}{n} \sum_{i=1}^{m} \left( L^{i}_{t+1} - G_{t} \right) . $

In Eq. (1), $G_{t+1}$ represents the model parameters for round $t+1$. $G_{t}$ refers to the model parameters for round $t$. $\eta$ is the learning rate. $n$ means the number of samples. $m$ represents the number of devices participating in FL. $L^{i}_{t+1}$ refers to the model parameters of the $i$th device in round $t+1$ [16]. FL can be classified based on the characteristics of the data and model of the participants. According to the distribution of the participants' data sources, FL can be divided into horizontal FL, vertical FL, and federated transfer learning [17]. Fig. 2 is a schematic diagram of horizontal and vertical FL.
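The server-side update in Eq. (1) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `aggregate_round`, the toy parameter vectors, and the choice of `eta` and `n` are assumptions made for the example and are not taken from the study.

```python
import numpy as np

def aggregate_round(global_params, local_params, eta, n):
    """One server update in the form of Eq. (1):
    G_{t+1} = G_t + (eta / n) * sum_i (L^i_{t+1} - G_t)."""
    global_params = np.asarray(global_params, dtype=float)
    # Difference between each device's new parameters and the current global model.
    deltas = [np.asarray(p, dtype=float) - global_params for p in local_params]
    return global_params + (eta / n) * np.sum(deltas, axis=0)

# Toy example with m = 2 devices (values are hypothetical).
g_t = np.array([0.0, 1.0])
local_models = [np.array([1.0, 1.0]), np.array([3.0, 3.0])]
g_next = aggregate_round(g_t, local_models, eta=1.0, n=2)
```

In this toy setting, choosing `eta = 1` and `n` equal to the number of devices makes the update reduce to the plain average of the local parameters.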

Fig. 2. Horizontal federated learning and vertical federated learning.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig2.png

Horizontal FL is suitable for scenarios where the participants' datasets share the same features but cover different users. Vertical FL is suitable for scenarios where the participants' datasets cover the same users but hold different features. Federated transfer learning is suitable for scenarios where both the users and the features of the participants' datasets differ [18]. Fig. 3 shows the framework of FL.
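The difference between horizontal and vertical partitioning can be illustrated on a small data matrix whose rows are users and whose columns are features. The matrix and the two-party split below are hypothetical and serve only to make the two definitions concrete.

```python
import numpy as np

# Toy table: 4 users (rows) x 3 features (columns); the values are arbitrary.
data = np.arange(12).reshape(4, 3)

# Horizontal FL: the parties hold different users but the same feature columns.
horizontal_a, horizontal_b = data[:2, :], data[2:, :]

# Vertical FL: the parties hold the same users but different feature columns.
vertical_a, vertical_b = data[:, :2], data[:, 2:]
```

Stacking the horizontal parts row-wise, or the vertical parts column-wise, recovers the original table, which is why the two settings call for different aggregation schemes.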

Fig. 3. Framework diagram of federated learning model.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig3.png

In Fig. 3, FL mainly contains two parts, namely FL server and multiple client devices. The FL server is the core of the entire FL system, responsible for coordinating and managing communication and data exchange between various client devices [19]. The client device is an important component of FL, responsible for local model training and data preprocessing. The objective function of the central server is represented by Eq. (2).

(2)
$ F(w)=\sum_{i=1}^{m} f_{i} (w) . $

In Eq. (2), $f_{i} (w)$ represents the loss function of parameter $w$ in the sample. $m$ refers to the total number of client devices in training, and $i$ is the summation index. With $k$ denoting a client, the objective function of the $k$th client is represented by Eq. (3).

(3)
$ F_{k} (w)=\frac{1}{n_{k} } \sum _{i\in d_{k} }f_{i} (w). $

In Eq. (3), $d_{k} $ represents the local dataset of the $k$th client and $n_{k} $ is the number of samples in $d_{k} $. The adaptive gradient descent algorithm is an optimization method mainly used for parameter optimization in machine learning and deep learning models. Common adaptive gradient descent algorithms include Adagrad, Adadelta, and Adam. Adagrad automatically adjusts the learning rate of each parameter based on the size of its historical gradients, which suits sparse or non-stationary data. The historical sum of the squared parameter gradients is represented by Eq. (4).

(4)
$ cache \leftarrow cache + gradient^{2} . $

In Eq. (4), $cache$ represents the accumulator. At each time step $t$, the gradient on the $i$th dimension of the parameter vector $theta$ is $gradient$. Eq. (5) is the update of the parameter vector.

(5)
$ theta \leftarrow theta - \frac{learning\_ rate \cdot gradient}{\sqrt{cache} + epsilon} . $

In Eq. (5), $learning\_ rate$ represents the learning rate. $epsilon$ refers to a small constant that prevents division by zero. Adadelta builds on Root Mean Square Propagation (RMSprop), which scales the learning rate by an exponential moving average of the squared gradient. In the Adadelta algorithm, the exponential decay moving average of the square of the parameter update difference $\Delta W$ is represented by Eq. (6).

(6)
$ \Delta X^{2}_{t-1} = \beta_{1} \Delta X^{2}_{t-2} + \left(1-\beta_{1}\right) \Delta W_{t-1} \odot \Delta W_{t-1} . $

In Eq. (6), $\Delta X^{2}_{t-2}$ represents the value of $\Delta X^{2}$ at the previous step. $\Delta W_{t-1}$ refers to the difference in parameter updates for the current step. $\odot $ means element-wise multiplication, which is the Hadamard product. The parameter update of Adadelta is represented by Eq. (7).

(7)
$ \Delta W_{t} = -\frac{\sqrt{\Delta X^{2}_{t-1} + \varepsilon}}{\sqrt{h_{t} + \varepsilon}} \frac{\partial L}{\partial W_{t-1}} . $

In Eq. (7), $h_{t}$ represents the exponential decay moving average of the square of the gradient $\frac{\partial L}{\partial W}$ for each iteration. $\varepsilon $ refers to a small constant [20]. Adam is a stochastic optimization method with adaptive momentum. The first-order moment estimation calculation of gradient information is represented by Eq. (8).

(8)
$ \hat{m}=\frac{1}{t} \cdot (\beta _{1} \cdot \hat{m}+(1-\beta _{1} )\cdot g) . $

In Eq. (8), $g$ represents the current gradient. $t$ refers to the quantity of iterations. $\beta _{1} $ is the first-order moment's exponential decay rate of the gradient. Adam combines the characteristics of Adagrad and RMSprop, while considering the exponential moving average of historical gradients and gradient squares of parameters. Therefore, the Adam algorithm is integrated into the FL image classification algorithm to improve the training efficiency and performance of the model. The study process does not screen the demographics of specific medical image data or the details of ethical considerations based on the subjects' medical data.
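To make the optimizers discussed in this subsection concrete, the following sketch implements one Adagrad step in the form of Eqs. (4) and (5) and one Adam step. The Adam step uses the commonly published form with the bias correction $1/(1-\beta_{1}^{t})$, which is an assumption here and may differ from the exact normalization written in Eq. (8). The function names, hyperparameter defaults, and the quadratic test objective are illustrative and are not taken from the study.

```python
import numpy as np

def adagrad_step(theta, cache, gradient, learning_rate=0.01, epsilon=1e-8):
    """Eqs. (4)-(5): accumulate squared gradients, then scale the step per dimension."""
    cache = cache + gradient ** 2
    theta = theta - learning_rate * gradient / (np.sqrt(cache) + epsilon)
    return theta, cache

def adam_step(theta, m, v, g, t, learning_rate=0.001,
              beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Commonly used Adam step with bias-corrected moments; t starts at 1."""
    m = beta1 * m + (1 - beta1) * g          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
    return theta, m, v

# Minimize f(theta) = theta^2 (gradient 2 * theta), starting from theta = 1.0.
theta_a, cache = np.array([1.0]), np.zeros(1)
theta_b, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 201):
    theta_a, cache = adagrad_step(theta_a, cache, 2 * theta_a, learning_rate=0.5)
    theta_b, m, v = adam_step(theta_b, m, v, 2 * theta_b, t, learning_rate=0.05)
```

Both optimizers drive the parameter toward the minimizer at zero on this toy objective. In an FL setting, a step of this kind would be applied by each client during local training before the parameters are sent for aggregation.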

3.2. Construction of Federated Learning Privacy Protection Model Based on Fedavg

Privacy protection occupies a core position in FL systems, mainly because such systems often involve a large amount of user data. In traditional model training, data are usually stored centrally and trained uniformly. However, this method carries the risk of data privacy leakage. Due to all data being stored in the same location, user privacy information may be stolen in the event of data leakage. To address this issue, homomorphic encryption technology is introduced into FL systems. Homomorphic encryption is based on the computational complexity theory of mathematical problems, allowing for the processing of encrypted data and obtaining an output [21]. After decrypting this output, the result is consistent with the output obtained after the same processing as the unencrypted original data. This technology provides strong privacy protection for data processing in FL systems, ensuring the security of user data while maintaining the accuracy of data processing and analysis. Fig. 4 shows homomorphic encryption technology.

Fig. 4. Flow chart of homomorphic encryption technology.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig4.png

Assuming that $E(m)$ represents the result of encrypting message $m$ and $F(m')$ represents the result of decrypting encrypted result $m'$, Eq. (9) needs to be satisfied in encryption.

(9)
$ E(m) = c. $

In Eq. (9), $m$ represents a plaintext message. $c$ refers to ciphertext. In decryption, Eq. (10) needs to be satisfied.

(10)
$ F(c) = m. $

The decrypted plaintext is equal to the original plaintext. For any function $f(x)$, if there are valid algorithms A and B, the homomorphic expression of the encryption method is represented by Eq. (11).

(11)
$ F(E(m_{1})+E(m_{2}))=f(m_{1})+f(m_{2}). $

In Eq. (11), $F$ represents the decryption function. $E$ refers to the encryption function. $f$ is a mapping function from plaintext messages to ciphertext. FedAvg is the first algorithm proposed by Google that fully defines the federated optimization process, allowing multiple users to train a machine learning model simultaneously. There is no need to upload any private data to the server during training. Local users are responsible for training local data to obtain local models. The central server is responsible for weighted aggregation of local models to obtain global models. After multiple iterations, a model that tends towards centralized machine learning results is finally obtained. This effectively reduces many privacy risks associated with traditional machine learning source data aggregation. Fig. 5 shows the FL structure based on FedAvg.
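The homomorphic property in Eq. (11) can be illustrated with a toy additively homomorphic scheme. The study does not name a specific cryptosystem, so the sketch below uses the Paillier scheme as an assumed stand-in, with deliberately tiny primes that are not secure. In Paillier, the ciphertext-side operation that corresponds to plaintext addition is multiplication modulo $n^{2}$. All function names are hypothetical.

```python
import math
import random

def mod_inverse(a, n):
    """Modular inverse of a modulo n via the extended Euclidean algorithm."""
    old_r, r = a % n, n
    old_s, s = 1, 0
    while r:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
    if old_r != 1:
        raise ValueError("a has no inverse modulo n")
    return old_s % n

def keygen(p=61, q=53):
    """Toy Paillier keys from two small primes (illustration only, not secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = mod_inverse(lam, n)  # valid for the generator g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """E(m) = (n + 1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    """F(c) = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

n, private_key = keygen()
c1, c2 = encrypt(n, 17), encrypt(n, 25)
combined = (c1 * c2) % (n * n)   # operate on ciphertexts only
total = decrypt(n, private_key, combined)
```

Decrypting `combined` yields the sum of the two plaintexts, so a server could aggregate encrypted values without seeing them individually. A production system would rely on a vetted cryptographic library and much larger keys.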

Fig. 5. Structure diagram of federated learning model based on FedAvg.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig5.png

In FedAvg's FL, the optimization objective function for each medical institution is represented by Eq. (12).

(12)
$ \arg \min L(\varpi ) = E[L(\hat{y},y)] . $

In Eq. (12), $L(\hat{y},y)$ refers to the loss function at each medical institution. The loss function at each medical institution is represented by Eq. (13).

(13)
$ L(\theta ) = \sum_{i} \left( h(x^{(i)}; \theta ) - y^{(i)} \right)^{2} . $

In Eq. (13), $\theta $ refers to the model parameters. $h(x^{(i)}; \theta )$ represents the predicted output of the model for the $i$th sample. $y^{(i)}$ refers to the true label of the $i$th sample. $\sum_{i}$ is the sum over all samples [22]. To enhance the protection of individual participating clients, the study scales each local parameter update and uses the scaled value in place of the original update. The scaled parameter update is represented by Eq. (14).

(14)
$ \Delta m_{k} ' =\frac{\Delta m_{k} }{\max \left(1,\frac{\left\| \Delta m_{k} \right\| _{2} }{S} \right)} . $

In Eq. (14), $m_{k} $ represents the local model updated by each client, $\Delta m_{k} $ is its update, $\Delta m_{k} '$ is the scaled update, and $S$ is the bound on the $L_{2}$ norm of an update. In FL, noise is added in each iteration update to blur the update of the local model. After the $i$th iteration, the update of the global model is represented by Eq. (15).

(15)
$ M_{i+1} =M_{i} +\frac{1}{K} \left(\sum _{k=0}^{K}\Delta m_{k} '+N_{GS} (\tau ^{2} S^{2} ) \right). $

In Eq. (15), $N_{GS} (\tau ^{2} S^{2} )$ represents Gaussian noise with variance $\tau ^{2} S^{2} $. Fig. 6 shows the construction of MIC based on FL.
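The clipping in Eq. (14) and the noisy aggregation in Eq. (15) can be sketched as follows. The function names, the toy updates, and the choice to divide by the number of participating clients are assumptions for illustration. The noise term is drawn from a zero-mean Gaussian with standard deviation $\tau S$, which corresponds to the variance $\tau ^{2} S^{2}$ appearing in Eq. (15).

```python
import numpy as np

def clip_update(delta, S):
    """Eq. (14): rescale delta so that its L2 norm is at most S."""
    delta = np.asarray(delta, dtype=float)
    return delta / max(1.0, np.linalg.norm(delta) / S)

def dp_aggregate(global_model, deltas, S, tau, rng):
    """Eq. (15): average the clipped updates plus Gaussian noise of variance tau^2 * S^2."""
    clipped = [clip_update(d, S) for d in deltas]
    noise = rng.normal(0.0, tau * S, size=np.shape(global_model))
    return global_model + (np.sum(clipped, axis=0) + noise) / len(deltas)

rng = np.random.default_rng(0)
M = np.zeros(2)
deltas = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]  # hypothetical client updates

# With tau = 0 no noise is added, which makes the clipping effect easy to inspect.
M_next = dp_aggregate(M, deltas, S=1.0, tau=0.0, rng=rng)

# With tau > 0 each round perturbs the aggregate.
M_noisy = dp_aggregate(M, deltas, S=1.0, tau=0.5, rng=rng)
```

The first update has norm 5 and is scaled down to norm 1, while the second already satisfies the bound and is left unchanged. Larger values of `tau` give stronger privacy protection at the cost of a noisier global model.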

In Fig. 6, firstly, based on hybrid reality technology, a secure Transport Layer Security (TLS) channel is established between various medical institutions and central servers [23]. To ensure the privacy and security of data in the entire FL system, a homomorphic encryption module is adopted. After starting the model training process, the central server and various medical institutions follow the FL process for model training. During this process, the transmission of data and model parameters is protected through homomorphic encryption modules. To ensure the privacy of data from various medical institutions, the classification model is evaluated locally using a test set. After the training is completed, each medical institution can deploy the obtained optimal model to their own platform for practical application. The accurate analysis of medical image data can be realized on the premise of ensuring data privacy through the above processes.

Fig. 6. Building flowchart of federated learning privacy protection model based on FedAvg.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig6.png

To verify the performance of the algorithm, the dataset is divided into a training set, a validation set, and a test set at a ratio of 8:1:1 in the data preprocessing stage. The training set contains 80% of the data, namely 152 images, and is used to train the algorithm. Of the remaining 20%, half (19 images) forms the validation set, which is used to adjust the hyperparameters and optimize the model structure during training. The other half (19 images) forms the test set, which is used to objectively evaluate the model's performance on previously unseen data. After training is completed, the test set is input into the trained model, and the predictions are compared with the true labels of the test set to calculate the performance indicators. Specificity and sensitivity are two important indicators of model performance. Specificity measures the model's ability to correctly identify negative cases: it is the number of negative test samples correctly judged negative divided by the total number of negative test samples, that is, those correctly judged negative plus those incorrectly judged positive. Sensitivity measures the model's ability to correctly identify positive cases: it is the number of positive test samples correctly judged positive divided by the total number of positive test samples, that is, those correctly judged positive plus those incorrectly judged negative.
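In terms of confusion-matrix counts, the two indicators are $\mathrm{Sensitivity} = TP/(TP+FN)$ and $\mathrm{Specificity} = TN/(TN+FP)$. The sketch below computes both for binary labels. The function name and the label vectors are hypothetical and do not reproduce the study's data; the example merely uses a test set of 19 items to match the size described above.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels: 1 = positive, 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # positives judged positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # positives judged negative
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # negatives judged negative
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # negatives judged positive
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity

# Hypothetical labels for a 19-image test set (not the study's data).
y_true = [1] * 10 + [0] * 9
y_pred = [1] * 8 + [0] * 2 + [0] * 8 + [1] * 1
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Here 8 of 10 positives and 8 of 9 negatives are classified correctly, so the sensitivity is 0.8 and the specificity is 8/9.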

4. Results and Discussion

To verify the applicability and superiority of the MIC model based on FL and DP, performance comparison experiments were conducted between the MIC model based on FL and DP, Support Vector Machine (SVM), and K-Nearest Neighbor (KNN) in different situations.

4.1. Performance Analysis of Federated Learning Image Classification Algorithm Based on Adaptive Gradient Descent

The experiments used the fixed 8:1:1 split into training, validation, and test sets described in Section 3.2 (152, 19, and 19 images, respectively). The traditional MIC algorithms SVM and KNN were selected as reference algorithms for the accuracy comparison. The programming environment was Python 3.7 on a Windows 11 64-bit operating system. The hardware was an Intel(R) Core(TM) i7-10510U CPU at 1.8 GHz with 16 GB of memory, which was sufficient for running the algorithms and processing the data. Fig. 7 shows the accuracy of different algorithms.
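A fixed 8:1:1 split of this kind can be sketched with the standard library alone. The total of 190 images is inferred from the stated subset sizes (152 + 19 + 19); the function name, the seed, and the use of index lists are assumptions made for the example.

```python
import random

def split_indices(n_items, ratios=(8, 1, 1), seed=0):
    """Shuffle the indices 0..n_items-1 once and split them by the given ratio."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    total = sum(ratios)
    n_train = n_items * ratios[0] // total
    n_val = n_items * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(190)
```

Fixing the seed keeps the same images in each subset across runs, which is what makes comparisons between algorithms on the test set meaningful.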

Fig. 7. Accuracy comparison of different algorithms.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig7.png

In Fig. 7, when the iteration was 3, the research algorithm's accuracy curve tended to stabilize, fluctuating around 95%. The traditional model KNN's accuracy curve tended to stabilize at an iteration of 25, fluctuating around 90%. The accuracy curve of SVM tended to stabilize at iteration 30, fluctuating around 85%. In summary, the research algorithm was significantly superior to these two traditional models in terms of iteration efficiency and accuracy. During training, the experiment delved into the impact of different numbers of client participation on FL training and compared the performance of FL with KNN and SVM with different client participation. The quantity of clients set ranged from 3 to 12 to comprehensively evaluate the algorithm performance in different scenarios. Fig. 8 shows the algorithm performance curves for different numbers of clients.

Fig. 8(a) shows a comparison of algorithm performance curves with 8-12 clients. In the case of 8-12 clients, as the clients increased, the Area Under the Curve (AUC) of these three algorithms showed a decreasing trend. However, compared to KNN and SVM, this research algorithm had shown superior performance. When the clients were 8, its AUC was the highest, reaching 92.23%. Fig. 8(b) shows a comparison graph of algorithm performance curves with clients of 3-7. When the clients were 3-7, the algorithm in this study showed better advantages compared to KNN and SVM. When the clients were 3, its AUC value was the highest, at 97.58%. To evaluate the performance trends of the research algorithm and its various local models during the increasing communication rounds, experimental verification was conducted on FL, KNN, and SVM, respectively, in Fig. 9.

Fig. 8. Performance comparison with different number of clients.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig8.png

Fig. 9. Performance comparison with different communication rounds.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig9.png

Fig. 9(a) compares the performance curves over communication rounds with 8 local clients. As the communication rounds increased, the accuracy curves of the three algorithms showed an upward trend. The accuracy of the research algorithm stabilized at 88.54% after 20 communication rounds. KNN stabilized at an accuracy of 86.95% after 25 communication rounds. When the communication rounds were 130, the accuracy of KNN reached 86.43%. Fig. 9(b) compares the performance curves over communication rounds with 12 local clients. As the communication rounds increased, the accuracy curves of the three algorithms again showed an upward trend. The research algorithm stabilized after 10 communication rounds, with an accuracy of 88.67%. KNN stabilized at an accuracy of 86.25% after 20 communication rounds. When the communication rounds were 60, KNN's accuracy reached 85.73%. According to Figs. 9(a) and 9(b), the research algorithm performed better with 3-7 clients than with 8-12 clients, possibly because the data distribution becomes more diverse and complex as the number of clients increases, which makes model learning more difficult. To verify the performance of different deep learning algorithms in MIC, the adaptive gradient descent-based FL image classification algorithm was compared with Convolutional Neural Networks (CNN), Residual Networks (ResNet), Transfer Learning, Generative Adversarial Networks (GANs), and Capsule Networks (CapsNet). The performance comparison of the different deep learning methods is shown in Table 1.

From Table 1, compared with other deep learning methods, the research algorithm had the highest accuracy of 95.4%, which indicated that it had high recognition accuracy in MIC tasks. In terms of overfitting value, the research algorithm had the lowest overfitting value, which was 0.015, which indicated that it effectively prevented the occurrence of overfitting in the training process. The performance of the research algorithm was better than other deep learning methods, and it effectively improved the accuracy of MIC.

Table 1. Performance comparison table of different deep learning methods.

| Method | Training time (hours) | Accuracy rate | Overfitting value |
| --- | --- | --- | --- |
| CNN | 10 | 86.7% | 0.040 |
| ResNet | 10 | 88.5% | 0.020 |
| Transfer Learning | 10 | 91.3% | 0.030 |
| CapsNet | 10 | 89.2% | 0.035 |
| Research algorithm | 10 | 95.4% | 0.015 |

4.2. Performance Analysis of Differential Privacy Medical Image Classification Model Based on Federated Learning

The learning rate directly affects the efficiency and final performance of model training. To comprehensively evaluate the research algorithm's performance under different learning rates, three representative learning rates of 0.1, 0.01, and 0.001 were selected for experiments. Meanwhile, robustness comparison experiments were conducted between the models under these learning rates and traditional KNN. Fig. 10 shows the impact of initial learning rate on model performance.

Fig. 10. Effect of initial learning rate on model performance.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig10.png

Fig. 10(a) presents KNN's accuracy line graph under different learning rates. When the learning rate was 0.1 and the iteration was 9, the accuracy of KNN reached its highest level, at 98%. Fig. 10(b) presents KNN's loss curve under different learning rates. When the iteration was 8 and the learning rate was 0.1, the loss curve of KNN tended to stabilize, with a loss value of 0.07, which was the lowest. Fig. 10(c) presents the research algorithm's accuracy line graph under different learning rates. When the learning rate was 0.1 and the iteration was 10, the accuracy of the research algorithm was the highest, at 99.82%. Fig. 10(d) presents the research algorithm's loss curve under different learning rates. As iterations increased, the loss curve of the research algorithm became smoother compared to KNN. In summary, under the same learning rate, the research algorithm had higher accuracy and smoother loss curve compared to KNN. To verify the impact of different subsets on the research algorithm, Task 1 was set to divide the dataset into 10 subsets, and Task 2 was set to divide the dataset into 5 subsets. Table 2 shows the performance comparison of different models under different conditions.

Table 2. Performance comparison table of different models.

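| Metric | Task 1: SVM | Task 1: KNN | Task 1: Research model | Task 2: SVM | Task 2: KNN | Task 2: Research model |
| --- | --- | --- | --- | --- | --- | --- |
| F1 | 72.61% | 75.82% | 78.93% | 85.48% | 88.96% | 89.96% |
| AUC | 87.37% | 89.28% | 90.38% | 95.73% | 96.43% | 98.05% |
| Accuracy | 87.52% | 89.35% | 90.11% | 90.26% | 93.31% | 95.27% |
| Specificity | 85.49% | 88.26% | 89.37% | 92.54% | 94.82% | 95.89% |
| Sensitivity | 75.31% | 76.49% | 77.25% | 85.28% | 88.37% | 89.35% |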

Fig. 11. Loss convergence curves for server training and local training.

../../Resources/ieie/IEIESPC.2025.14.5.579/fig11.png

In Table 2, the research algorithm outperformed KNN and SVM on all five indicators in Task 1. Its F1 was 78.93%, AUC was 90.38%, accuracy was 90.11%, and specificity was 89.37%; its sensitivity was 77.25%, higher than the 76.49% of KNN and the 75.31% of SVM. In Task 2, the research algorithm also performed well. Its F1 reached 89.96%, AUC was 98.05%, accuracy was 95.27%, specificity was 95.89%, and sensitivity was 89.35%. In summary, the research algorithm performed well and was applicable in FL. These data indicated that it had good generalization ability and stability and could adapt to different datasets and task scenarios. Fig. 11 shows the loss convergence curves of the study model for server-side training and local training under different subsets.

Fig. 11(a) shows the training loss curve of the server for Task 1. When the iterations reached 50, the loss convergence curve of the research algorithm tended to stabilize, and its loss value was 0.18. Fig. 11(b) shows the training loss curve of the server for Task 2. When the iterations reached 45, the loss convergence curve of the research algorithm tended to stabilize. Fig. 11(c) shows a comparison of the loss curves between the local training model and the research algorithm for Task 1. As the iterations increased, its loss curve's convergence trend was consistent with that of the local training model. Fig. 11(d) presents the comparison of the loss curves between the locally trained model and the research algorithm in Task 2. Their loss curves showed a consistent trend. The final loss value of the research algorithm was 0.38, indicating that it had good learning performance.

5. Discussion

With the continuous progress of medical technology and the arrival of the big data era, MIC plays an increasingly important role in disease diagnosis and treatment planning. However, concerns about the privacy and security of medical image data have become increasingly prominent. Therefore, a federated learning DP MIC algorithm based on mixed reality technology is proposed to improve the accuracy and efficiency of image classification while protecting patients' privacy.

The performance analysis of the adaptive gradient descent-based FL image classification algorithm showed that the proposed algorithm had clear advantages in several respects. Compared with the traditional KNN and SVM models, the research algorithm reached a stable accuracy of about 95% within a few iterations, which was higher than that of the traditional models. This result supported the effectiveness of combining DP, mixed reality technology, and FL in improving model training efficiency and classification performance, and it was similar to Liu et al.'s research on CSIT-free model aggregation for federated edge learning based on reconfigurable intelligent surfaces [24]. Furthermore, the research algorithm showed strong stability and robustness across different numbers of clients. With either 3-7 clients or 8-12 clients, the algorithm maintained a high AUC and performed better than the other algorithms. This showed that the research algorithm adapted flexibly to datasets of different sizes and distributions and had strong generalization ability. According to the analysis of the FL-based DP MIC model, combining mixed reality technology allows medical image data to be displayed more intuitively, which can improve doctors' diagnostic efficiency and accuracy. As the iterations increased, the loss convergence curve of the research algorithm stabilized, which indicated that the model learned well. In Task 1, the server training loss converged to 0.18 at 50 iterations. In Task 2, the server training loss converged to a lower value at 45 iterations, further verifying the validity and stability of the study model, which was consistent with the results obtained by Park and Ko in their study of practical FL for heterogeneous model deployment [25].

In summary, the proposed FL-based DP MIC algorithm built on mixed reality technology shows significant advantages in several respects. Future work can further explore the application prospects of mixed reality technology in MIC and continue to optimize the algorithm's performance to meet practical application needs.

6. Conclusion

To address high-dimensional data processing and model generalization in MIC while ensuring data privacy, this study proposed an FL DP MIC algorithm based on hybrid reality technology. The analysis confirmed that the accuracy of the research algorithm gradually improved as the learning rate increased. With a learning rate of 0.1 and 10 iterations, the accuracy peaked at 99.82%. When the dataset was divided into 10 subsets, the research algorithm outperformed KNN and SVM in F1, AUC, accuracy, and specificity, with an F1 of 78.93%, an AUC of 90.38%, an accuracy of 90.11%, a specificity of 89.37%, and a sensitivity of 77.25%. In summary, the research algorithm performs well, further verifying the feasibility and effectiveness of the FL DP MIC algorithm based on hybrid reality technology. Although significant results have been achieved, some limitations remain. Because of data privacy requirements, experimental validation was conducted only on a limited number of medical institutions and datasets. Further testing and validation on large-scale datasets are needed in the future.
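
For readers who wish to reproduce indicators of the kind reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and F1 can be computed from confusion-matrix counts. It assumes a binary classification setting, and the counts in the example are hypothetical rather than taken from the experiments. AUC is omitted because it requires the classifier's scores, not only the counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common indicators from binary confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / total
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Because F1 combines precision and sensitivity, it can be noticeably lower than accuracy when positive cases are harder to detect than negative ones.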

Funding

The research is supported by the Chongqing Natural Science Foundation Project "Research on Key Technologies and Applications of Interaction in Puncture Surgery Based on Hybrid Reality Technology" (No. CSTB2022NSCQ-MSX1256) and the 2022 Chongqing Education Commission Science and Technology Project "Research on Real time Monitoring Platform Technology for Robot System Status Based on Digital Twin Simulation Modeling" (No. KJZD-K202203901).

REFERENCES

[1] X. Fan, Y. Wang, Y. Huo, and Z. Tian, "1-Bit compressive sensing for efficient federated learning over the air," IEEE Transactions on Wireless Communications, vol. 22, no. 3, pp. 2139-2155, 2023.
[2] N. Shi and R. A. Kontar, "Personalized federated learning via domain adaptation with an application to distributed 3D printing," Technometrics, vol. 65, no. 3, pp. 328-339, 2023.
[3] M. F. Pervej, R. Jin, and H. Dai, "Resource constrained vehicular edge federated learning with highly mobile connected vehicles," IEEE Journal on Selected Areas in Communications, vol. 41, no. 6, pp. 1825-1844, 2023.
[4] Y. Zou, Z. Wang, and X. Chen, "Knowledge-guided learning for transceiver design in over-the-air federated learning," IEEE Transactions on Wireless Communications, vol. 22, no. 1, pp. 270-285, 2023.
[5] X. Deng, J. Li, and C. Ma, "Low-latency federated learning with DNN partition in distributed industrial IoT networks," IEEE Journal on Selected Areas in Communications, vol. 41, no. 3, pp. 755-775, 2023.
[6] Y. Jin, Z. You, and N. Cai, "Simplified inception module based Hadamard attention mechanism for medical image classification," Computer and Communication, vol. 11, no. 6, pp. 1-18, 2023.
[7] R. N. Abirami, P. M. D. R. Vincent, and V. Rajinikanth, "COVID-19 classification using medical image synthesis by generative adversarial networks," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems (IJUFKS), vol. 30, no. 3, pp. 385-401, 2022.
[8] Y. Kurmi and V. Chaurasia, "Classification of Magnetic Resonance Images for Brain Tumor Detection," IET Image Processing, vol. 14, no. 12, pp. 2808-2818, 2020.
[9] Priyanka and D. Kumar, "Meta-heuristic Techniques to Train Artificial Neural Networks for Medical Image Classification: A Review," Recent Advances in Computer Science and Communications, vol. 15, no. 4, pp. 513-530, 2022.
[10] D. Połap and M. Woźniak, "A hybridization of distributed policy and heuristic augmentation for improving federated learning approach," Neural Networks, vol. 146, no. 7, pp. 130-140, 2022.
[11] Y. Kweon, B. Sun, and B. B. Park, "Preserving privacy with federated learning in route choice behavior modeling," Transportation Research Record, vol. 2675, no. 10, pp. 268-276, 2021.
[12] T. T. Vu, D. T. Ngo, and N. H. Tran, "Cell-Free Massive MIMO for Wireless Federated Learning," IEEE Transactions on Wireless Communications, vol. 19, no. 10, pp. 6377-6392, 2020.
[13] H. Shao and D. Zhong, "Towards privacy palmprint recognition via federated hash learning," Electronics Letters, vol. 56, no. 25, pp. 1418-1420, 2020.
[14] E. Elyan, P. Vuttipittayamongkol, and P. Johnston, "Computer vision and machine learning for medical image analysis: recent advances, challenges, and way forward," Artificial Intelligence Surgery, vol. 2, no. 1, pp. 24-45, 2022.
[15] S. Guo, K. Zhang, and B. Gong, "Sandbox computing: A data privacy trusted sharing paradigm via blockchain and federated learning," IEEE Transactions on Computers, vol. 72, no. 3, pp. 800-810, 2023.
[16] N. K. Ray, D. Puthal, and D. Ghai, "Federated learning," IEEE Consumer Electronics Magazine, vol. 10, no. 6, pp. 106-107, 2021.
[17] J. Zhao, X. Chang, and Y. Feng, "Participant selection for federated learning with heterogeneous data in intelligent transport system," IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 1, pp. 1106-1115, 2023.
[18] Y. M. Saputra, D. T. Hoang, and D. N. Nguyen, "Dynamic federated learning-based economic framework for Internet-of-Vehicles," IEEE Transactions on Mobile Computing, vol. 22, no. 4, pp. 2100-2115, 2023.
[19] H. Zhu, J. Kuang, and M. Yang, "Client selection with staleness compensation in asynchronous federated learning," IEEE Transactions on Vehicular Technology, vol. 72, no. 3, pp. 4124-4129, 2023.
[20] Z. Liu, S. Chen, and J. Ye, "DHSA: efficient doubly homomorphic secure aggregation for cross-silo federated learning," Journal of Supercomputing, vol. 79, no. 3, pp. 2819-2849, 2023.
[21] M. Wasilewska, H. Bogucka, and H. V. Poor, "Secure federated learning for cognitive radio sensing," IEEE Communications Magazine, vol. 61, no. 3, pp. 68-73, 2023.
[22] J. D. Fernandez, S. Potenciano, and C. M. Lee, "Privacy-preserving federated learning for residential short-term load forecasting," Applied Energy, vol. 326, no. 15, pp. 174-187, 2022.
[23] S. N. Amin, P. Shivakumara, and T. X. Jun, "An augmented reality-based approach for designing interactive food menu of restaurant using Android," Artificial Intelligence and Applications, vol. 1, no. 1, pp. 26-34, 2023.
[24] H. Liu, X. Yuan, and Y. J. A. Zhang, "CSIT-free model aggregation for federated edge learning," IEEE Wireless Communications Letters, vol. 10, no. 11, pp. 2440-2444, 2021.
[25] J. Y. Park and J. G. Ko, "FedHM: Practical federated learning for heterogeneous model deployments," ICT Express, vol. 10, no. 2, pp. 387-392, 2024.

Author

Qun Luo

Qun Luo obtained her master's degree in computer application technology from Chongqing University in 2013. Presently, she is working as an associate professor and director of the Information Technology Teaching and Research Office at Chongqing City Vocational College. She has published multiple articles in the field of computer applications. Her areas of interest include machine learning, virtual technology, and artificial intelligence.

Zhendong Liu

Zhendong Liu graduated from Jishou University with a major in computer application technology in 2008. Presently, he is working as a professor and director of the academic affairs office at Chongqing City Vocational College. He has been invited to serve as a think tank expert, technical advisor, and master's supervisor, and has published multiple articles. His areas of interest include big data application technology, artificial intelligence, and virtual technology.