


  1. (College of Arts and Design, Jining Polytechnic, Jining, 272000, China)
  2. (Ministry of Organization, Jining Polytechnic, Jining, 272000, China)
  3. (College of Electronic and Information Engineering, Jining Polytechnic, Jining, 272000, China)



Keywords: Fisheye image, Panoramic stitching, Scenic area roaming, Rotation matrix, Image fusion

1. Introduction

Tourism, as a non-settled form of travel undertaken for spiritual pleasure, can alleviate the pressures of work and life and thoroughly relax the mind, while also broadening one's horizons and expanding one's knowledge. However, many people, owing to busy work and other reasons, have little time to visit scenic spots. Virtual reality technology can simulate a real environment and, with its multi-perception and interactive characteristics, provides a good form of leisure for people who have no time to travel. A traditional panoramic roaming system requires actual measurement of the environment to construct a 3D simulation scene, resulting in high time costs [1, 2]. Panoramic image stitching technology greatly reduces the time required for measuring the environment and achieves rapid modeling of scenic spots. So-called panoramic image stitching refers to piecing several overlapping images together into a seamless panoramic or high-resolution image. Current image stitching technologies include spatial surface projection, planar projection, content-preserving deformation, and stitching algorithms [3, 4]. Spatial surface projection suffers from difficult matching and complex reprojection. Planar projection is only suitable for stitching planar scenes and introduces perspective distortion. Although content-preserving deformation offers a high degree of freedom, it has certain problems in image-stitching processing, and stitching algorithms have certain disadvantages in image registration.

Image registration, a key technology in image stitching, is mainly responsible for extracting and matching feature points from different images. Zhang et al. proposed an image registration method with Bayesian integration and an asymmetric Gaussian mixture model for retinal image registration. This method achieved a uniform distribution of features in image space and scale space through hierarchical matching, and used dynamic thresholds to classify the matching results. The experimental outcomes showed that this method was robust, with stability and accuracy better than existing methods [5]. Yan and other scholars proposed a multi-modal image registration method based on composite features of information distribution for the automatic registration of multi-modal remote sensing data. This method used an adaptive information entropy graph to describe the information and contour feature distribution of an image, extracted feature points using information-distribution composite features, and described the composite features with the maximum information index graph and the information trend graph. The experimental results showed that this method outperformed advanced algorithms such as Radiation Invariant Feature Transform (RIFT), Particle Swarm Optimization Scale Invariant Feature Transform (PSO-SIFT), and Optical Synthesis Scale Invariant Feature Transform (OS-SIFT) in matching performance, with strong robustness [6]. Han and his team proposed a 3D diffeomorphic image registration model based on West Riemann constraints and lower-bound deformation divergence to address the mesh-folding problem in 3D image registration. The experimental outcomes showed that the model could preserve the local shape of the image while avoiding grid folding [7]. Felix et al. proposed a multi-image registration method based on extended images for X-ray fluorescence spectroscopy, which provides an adjustable compromise between registration quality and runtime. The experimental outcomes showed that this method could effectively synthesize higher-resolution X-ray fluorescence spectra [8]. Mohammadi et al. proposed an image registration method based on unified speeded-up robust features for the registration of multi-sensor remote sensing images with multiple time differences. This method extracted image features through a unified speeded-up robust features algorithm, removed outliers with a simple improvement based on graph transformation matching, and achieved shape correction through thin-plate splines and bilinear interpolation. The experimental results showed that the number of matching points and the registration accuracy of this method were 4940 and 1.8 pixels, respectively, better than other algorithms [9].

Image fusion, another key technology for image stitching, mainly extracts favorable information from image data of the same target collected through multiple channels using image processing and computer technology, and finally synthesizes it into a high-quality image. Wang and Cheng proposed an image fusion method with joint bilateral filtering and multi-level local region energy for the fusion of multi-modal medical images. This method used multi-level local region energy to fuse the image energy layers, and fused the structural layers by taking the maximum local region norm. The experimental outcomes indicated that this method outperformed others in image fusion performance, visual evaluation, and computational efficiency [10]. Shibu and Priyadharsini proposed an image fusion method with non-downsampling contourlet transformation and sparse representation for multi-modal medical image fusion. This method used an L0 gradient smoothing filter to decompose the image into low-frequency and high-frequency layers, and then achieved fusion through non-downsampling contourlet transformation and sparse representation. The experimental outcomes indicated that the visual consistency and quantitative analysis performance of this method were superior to other methods [11]. Gao and other scholars proposed an image fusion method with densely connected disentangled-representation generative adversarial networks for the fusion of infrared and visible light images. This method used adaptive instance normalization to reconstruct the infrared and visible light images, and achieved multi-scale fusion through densely connected generative adversarial networks. The experimental outcomes indicated that this method outperformed other methods in visual effects [12]. Vasu and Palanisamy proposed an image fusion method based on anisotropic diffusion filtering for multi-modal medical image fusion. This method used edge preservation to construct weight layers and anisotropic diffusion filtering to optimize the weights, achieving the fusion of the weight layers and decision layers. The experimental outcomes indicated that this method had advantages in visual effects and detail preservation compared with other methods [13]. Soroush and Baleghi proposed an image fusion method with improved visual saliency points for the fusion of near-infrared and visible light images. This method used an error function to search the fusion parameter space for the optimal fusion, and measured the distance between the RGB channels and the fused image through gradient differences. The experimental outcomes indicated that the fusion effect of this method for near-infrared and visible light images was superior to other methods [14].

In summary, there are currently many studies on local registration schemes based on planar projection, which mainly address the ghosting caused by complex layers in scenes that are not on the same plane, as well as the perspective distortion caused by large-field-of-view stitching. Although the planar projection method can achieve image stitching with a large field of view through mixed mapping, it lacks a closed spatial constraint relationship, making re-rendering and display difficult; moreover, planar projection cannot complete the stitching of panoramic images. For panoramic stitching over all angles, spatial surface projection with a single-viewpoint projection model is mainly used. The main difficulty in panoramic-stitching image registration lies in overcoming the ghosting caused by disparity; owing to the complex projection relationship and the spatial closure constraint, current registration schemes for panoramic stitching have significant limitations. Meanwhile, although fisheye lenses have a larger field of view and facilitate panoramic stitching at the hardware level, their distortion also poses great challenges to image registration. Panoramic stitching algorithms applicable to fisheye images are still relatively weak, and some existing panoramic stitching schemes for fisheye images have significant limitations. Therefore, to improve the quality of fisheye image stitching and the experience of panoramic roaming scenes, an improved Random Sample Consensus (RANSAC) image stitching method based on ray vectors and a rotation matrix is proposed. To achieve natural transitions in the stitched images, the study also improves the optimal stitching line using a nonlinear weighted fusion algorithm.

The innovation of this study lies in introducing ray vectors into RANSAC, which decouples feature points from lens distortion and solves the feature point selection problem in the panoramic stitching of fisheye images. A nonlinear weighted fusion algorithm is then introduced into the optimal stitching algorithm to eliminate stitching seams in the panoramic stitching of fisheye images. The contributions of the research are as follows: first, it provides an effective method for scenic spot roaming and improves the virtual roaming experience; second, it improves the quality of panoramic image stitching and effectively promotes its application.

2. Methods and Materials

Compared with other images, fisheye images have a broader field of view, so to achieve remote scenic-spot roaming, the panoramic stitching of fisheye images is studied. To ensure accurate stitching of fisheye images, image correction, registration, spherical projection mapping, and fusion are studied so that the stitched images conform to normal vision.

2.1. Fisheye Image Correction and Registration

When stitching fisheye images, the images must first be corrected, that is, the nonlinear storage of the fisheye image must be converted into linear storage. Because the pixels in the same column of a fisheye image behave like lines of longitude on the Earth, when converting a fisheye image into a normal image, the horizontal axis of a given pixel stays the same while its vertical axis changes. The calculation for mapping the pixel points of the fisheye image to the camera plane is shown in Eq. (1).

(1)
$ \left\{\begin{aligned} x' = x-x_0,\\ y' = y-y_0. \end{aligned}\right. $

In Eq. (1), $(x', y')$ represents the mapped pixel plane coordinates. $(x, y)$ means the pixel coordinates of the fisheye image; $(x_0, y_0)$ represents the center point coordinates. The method for converting spherical longitude and latitude coordinates of fisheye images is shown in Eq. (2).

Fig. 1. The RANSAC algorithm.

(2)
$ \left\{\begin{aligned} X = \arcsin \frac{y'}{r_0},\\ Y = \arcsin \frac{x'}{\sqrt{r_0^2 - y'^2}}. \end{aligned}\right. $

In Eq. (2), $(X,Y)$ represent the spherical latitude and longitude coordinates of the pixel point. $r_0$ represents the distance between the distortion point and the center point. The correction calculation for fisheye images can be obtained by combining the above equation with Eq. (1), as shown in Eq. (3).

(3)
$ \left\{\begin{aligned} X = \arcsin \frac{y-y_0}{r_0},\\ Y = \arcsin \frac{x-x_0}{\sqrt{r_0^2 - (y-y_0)^2}}. \end{aligned}\right. $

After extracting the effective region from the fisheye image, the center and radius of the circular region are calculated and a correction model is established, so that the pixel coordinates of the fisheye image can be converted. After the coordinate conversion, grayscale interpolation is applied to the pixel points to complete the conversion of the fisheye image into a normal image. After image correction, the image must also be registered, which directly affects the final stitching result. Image registration first uses the RANSAC algorithm to match and filter the feature points of the image, and then uses the SIFT algorithm for registration. The RANSAC algorithm is shown in Fig. 1.
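As a concrete illustration, Eqs. (1)-(3) can be combined into a single correction routine. This is a minimal sketch assuming the correction model has already supplied the image centre $(x_0, y_0)$ and the radius $r_0$; the function and variable names are illustrative, not the paper's implementation.

```python
import math

def correct_pixel(x, y, x0, y0, r0):
    """Map a fisheye pixel (x, y) to spherical latitude/longitude
    (X, Y) per Eq. (3). (x0, y0) is the image centre and r0 the
    distance from the distortion point to the centre."""
    dx = x - x0                                    # x' in Eq. (1)
    dy = y - y0                                    # y' in Eq. (1)
    X = math.asin(dy / r0)                         # latitude,  Eq. (2)
    Y = math.asin(dx / math.sqrt(r0**2 - dy**2))   # longitude, Eq. (2)
    return X, Y
```

Interpolating the pixel values over the converted coordinates then yields the corrected image.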

As shown in Fig. 1, for a set of noisy sample points, the RANSAC algorithm calculates model parameters through random sampling and then finds the correct model among a large number of noisy samples. The main relationship models of the RANSAC algorithm are the fundamental matrix and the homography matrix: the homography matrix characterizes transformations between planes, while the fundamental matrix characterizes the epipolar geometric relationships between matching points [15]. The mathematical model of the homography matrix is given in Eq. (4).
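The sampling loop of Fig. 1 can be sketched in a few lines. In this minimal sketch a 2D line model stands in for the homography and fundamental-matrix models used in registration; all names are illustrative, not the paper's implementation.

```python
import random

def ransac_line(points, iters=200, tol=0.1, seed=0):
    """Generic RANSAC loop (Fig. 1): repeatedly fit a model to a
    random minimal sample and keep the fit with the most inliers.
    Model here: a 2D line y = m*x + c."""
    rng = random.Random(seed)
    best, best_inliers = None, []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)   # minimal sample
        if x1 == x2:                                  # degenerate sample
            continue
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        inliers = [p for p in points if abs(p[1] - (m * p[0] + c)) < tol]
        if len(inliers) > len(best_inliers):
            best, best_inliers = (m, c), inliers
    return best, best_inliers
```

The same loop carries over to homography or fundamental-matrix estimation by swapping the minimal sample size and the residual test.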

Fig. 2. Principles of the epipolar geometry.

(4)
$ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}. $

In Eq. (4), $(x', y')$ represents the pixel coordinates after the plane transformation, and $H$ represents the homography matrix. The mathematical model of the fundamental matrix is shown in Eq. (5).

(5)
$ \left\{\begin{aligned} p_2^T F p_1 = 0,\\ F = K_1^{-T} t^\wedge R K_2^{-1}. \end{aligned}\right. $

In Eq. (5), $p_1$ and $p_2$ represent the matching feature points of different images, respectively. $F$ represents the fundamental matrix. $K_1$ and $K_2$ represent the intrinsic matrices of the different cameras, respectively. $t$ and $R$ represent the relative external poses of the different cameras. The principle of epipolar geometry is shown in Fig. 2.

As shown in Fig. 2, the line connecting the two cameras (the baseline) intersects the two images at their epipoles. Any plane containing the baseline is an epipolar plane, which intersects the two image planes in two straight lines. When the 3D position of a point changes, the epipolar plane in effect rotates around the baseline, and the resulting family of planes is called the epipolar pencil. All epipolar lines in an image plane intersect at the epipole. However, because the homography matrix produces a large number of false rejections and the fundamental matrix filters poorly, the reliability of the feature filtering results of the RANSAC algorithm is low [16, 17]. Therefore, the study introduces ray vectors and rotation matrices to improve the RANSAC algorithm. The ray vector is shown in Fig. 3.

Fig. 3. Schematic diagram of the ray vector.


From Fig. 3, the camera intrinsic parameters can be used to recover the incident ray of each pixel in the image, and the camera model can be used to map the ray vector of a feature point into the camera coordinate system. Because the ray vector is independent of the lens itself, using it for feature point filtering effectively avoids false positives and missed filtering. The ray vector is calculated as in Eq. (6).

(6)
$ v = \begin{bmatrix} \frac{\sin r}{r \cdot f} & 0 & 0 \\ 0 & \frac{\sin r}{r \cdot f} & 0 \\ 0 & 0 & \cos r \end{bmatrix} \begin{bmatrix} u - u_t \\ \upsilon - \upsilon_t \\ 1 \end{bmatrix}. $

In Eq. (6), $v$ represents the ray vector; $r$ represents the projection radius of the image feature point on the normalized focal-length plane; $f$ is the camera focal length; $[u,\upsilon]^T$ represents the coordinates of the feature point; and $[u_t,\upsilon_t]^T$ is the image center. The ray vectors are screened through a rotation matrix, which is solved as shown in Eq. (7).
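A sketch of Eq. (6), assuming the radial angle $r$ is derived from the centred pixel offset and the focal length (an illustrative assumption; the actual camera model supplies $r$). The output is normalised so that the angle test of Eq. (8) can be applied directly.

```python
import numpy as np

def ray_vector(u, v, ut, vt, f):
    """Ray vector of a feature point per Eq. (6). (ut, vt) is the
    image centre and f the focal length."""
    du, dv = u - ut, v - vt
    r = np.hypot(du, dv) / f            # assumed: radial angle from offset
    if r == 0:
        return np.array([0.0, 0.0, 1.0])    # limit at the image centre
    M = np.diag([np.sin(r) / (r * f), np.sin(r) / (r * f), np.cos(r)])
    vec = M @ np.array([du, dv, 1.0])
    return vec / np.linalg.norm(vec)    # unit length for angle tests
```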

(7)
$ M v_a - v_b = 0. $

In Eq. (7), $M$, the solution of the equation, is the rotation matrix between the cameras, and $v_a$ and $v_b$ represent ray vectors in different images, respectively. If the diagonal values of $MM^T$ are 1, singular value decomposition is performed on the $M$ matrix. If the singular values of the matrix are close to 1, the left and right singular matrices are multiplied to obtain the normalized unit orthogonal solution, which is used for inlier screening. The inlier screening criterion is shown in Eq. (8).

(8)
$ M v_a \cdot v_b = |M v_a| |v_b| \cos(\theta) = \cos(\theta) > \cos(\theta_t). $

In Eq. (8), $\theta$ represents the angle between the vectors and $\theta_t$ is the angular threshold for inlier screening. Because the final model of the improved RANSAC algorithm is calculated from only three pairs of matching vectors, it is not statistically meaningful and requires further optimization. The optimization objective function is shown in Eq. (9).

(9)
$ \sum_{i=1}^n \| v_i - R(r, v'_i) \|^2. $

In Eq. (9), $n$ represents the number of matched ray vector pairs. $v_i$ and $v'_i$ both represent matching ray vectors. $R(\cdot)$ represents rotation transformation. $r$ represents a rotation vector.
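Eqs. (7) and (9) amount to finding the rotation that best aligns the matched ray vectors, and Eq. (8) then screens inliers by the angle between $Mv_a$ and $v_b$. The following is a minimal sketch using a standard SVD-based (Kabsch-style) least-squares solution as a stand-in for the paper's normalised unit-orthogonal solution; the names are illustrative.

```python
import numpy as np

def rotation_from_rays(Va, Vb):
    """Least-squares rotation M with M @ Va[i] ~ Vb[i] (Eqs. (7), (9)),
    solved via SVD of the cross-covariance. Va, Vb: (n, 3) unit rays."""
    H = Va.T @ Vb                                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflection
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

def inlier_mask(R, Va, Vb, theta_t):
    """Inlier test of Eq. (8): angle between R@va and vb below theta_t."""
    cos = np.einsum('ij,ij->i', Va @ R.T, Vb)      # row-wise dot products
    return cos > np.cos(theta_t)
```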

2.2. Spherical Projection Mapping and Fusion of Fisheye Images

Because the fisheye images are collected from four different directions, direct stitching would lead to severe image distortion. Therefore, the images must first be projected onto a sphere so that they all lie on the same projection surface before stitching. The spherical projection mapping of the image is shown in Fig. 4.

Fig. 4. Schematic diagram of spherical projection mapping.


In Fig. 4, A and B represent the horizontal tilt angle and the up and down tilt angle of the camera, respectively. As shown in Fig. 4, with the same point as the origin, the world coordinate system and camera coordinate system are constructed separately, connecting the coordinate origin with the pixel points of the image. The intersection point with the sphere is the projection mapping of the image on the sphere. The coordinate calculation method for spherical projection mapping is shown in Eq. (10).

(10)
$ \left\{\begin{aligned} x_1 = \frac{x_2 r}{\sqrt{x_2^2 + y_2^2 + z_2^2}},\\ y_1 = \frac{y_2 r}{\sqrt{x_2^2 + y_2^2 + z_2^2}},\\ z_1 = \frac{z_2 r}{\sqrt{x_2^2 + y_2^2 + z_2^2}}. \end{aligned}\right. $

In Eq. (10), $(x_2, y_2, z_2)$ and $(x_1, y_1, z_1)$ represent the pixel coordinates before and after projection, respectively, and $r$ represents the radius of the projection sphere. By unfolding the projected image on the sphere, an image lying in a single coordinate plane can be obtained. The coordinates after plane unfolding are calculated as in Eq. (11).

Fig. 5. Stitching line search.

(11)
$ \left\{\begin{aligned} x_3 = \frac{\arctan\left(\frac{y_1}{x_1}\right) \times \frac{1}{2}}{\pi},\\ y_3 = \frac{\arcsin\left(\frac{z_1}{r}\right)}{\pi} + \frac{1}{2}. \end{aligned}\right. $

In Eq. (11), $(x_3, y_3)$ represents the coordinates in the unfolded plane. After the above operations, the plane-unfolded version of the fisheye image is obtained, and image stitching can proceed. The study uses the optimal stitching line method, but factors such as brightness differences between images can lead to obvious stitching marks and ghosting of moving objects. Therefore, to solve this problem, a nonlinear weighted fusion algorithm is used to achieve smooth transitions in the stitched image. The optimal stitching line search is shown in Fig. 5.
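Eqs. (10) and (11) together map a camera-frame point onto the sphere and then unfold it to plane coordinates. A minimal sketch follows; `arctan2` is used to avoid division by zero when $x_1 = 0$, an implementation choice not spelled out in the paper.

```python
import numpy as np

def project_and_unfold(p, r=1.0):
    """Project a camera-frame point onto a sphere of radius r (Eq. (10))
    and unfold it to normalised plane coordinates (Eq. (11))."""
    x2, y2, z2 = p
    s = r / np.sqrt(x2**2 + y2**2 + z2**2)
    x1, y1, z1 = x2 * s, y2 * s, z2 * s          # Eq. (10)
    x3 = np.arctan2(y1, x1) / (2 * np.pi)        # Eq. (11), longitude term
    y3 = np.arcsin(z1 / r) / np.pi + 0.5         # Eq. (11), latitude term
    return (x1, y1, z1), (x3, y3)
```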

From Fig. 5, pixel values are searched in four different directions along the dotted line. If a corresponding feature point is found, it is added to the search queue; if multiple feature points are found, the one with the smallest color difference is added and the search continues [18]. The pixel values in the queue are set to 1 and the remaining pixel values to 0. After weighted optimization of all feature points in the queue, multiple stitching lines can be obtained. The optimal stitching algorithm is shown in Fig. 6.

Fig. 6. Optimal stitching line algorithm.


From Fig. 6, the algorithm first searches for the overlapping areas of the images. This is done with a dynamic programming algorithm that subtracts the pixel grayscale values of the two images; where the difference changes significantly relative to the original grayscale values, the region is an overlapping area. Next, superpixel segmentation is performed on the overlapping areas using a simple linear iterative clustering algorithm. A stitching line search is then conducted in the overlapping area to find the optimal stitching line. The pixel values are then readjusted and reallocated to further optimize the optimal stitching line [19, 20]. Finally, the optimal fusion region is selected and processed with a nonlinear weighted fusion algorithm to make the transition across the stitching area natural. Superpixel segmentation is computed as in Eq. (12).

(12)
$ \left\{\begin{aligned} d_c = \sqrt{(l_j - l_i)^2 + (a_j - a_i)^2 + (b_j - b_i)^2},\\ d_s = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2},\\ D' = \sqrt{\left(\frac{d_c}{p}\right)^2 + \left(\frac{d_s}{Q}\right)^2}. \end{aligned}\right. $

In Eq. (12), $d_c$ denotes the color distance, where $l$, $a$, and $b$ with subscripts $i$ and $j$ denote the CIELAB color components of pixels $i$ and $j$; $d_s$ is the spatial distance; $D'$ represents the distance between a pixel and the center of a pixel block; $p$ represents the maximum spatial distance within a class; and $Q$ represents the side length of the pixel blocks. The nonlinear weighted fusion expression is shown in Eq. (13).
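Eq. (12) can be evaluated directly for a pair of pixels. This minimal sketch assumes each pixel carries CIELAB color components plus image coordinates; the argument layout is illustrative.

```python
import math

def slic_distance(pix_i, pix_j, p, Q):
    """Combined superpixel distance of Eq. (12). Each pixel is a tuple
    (l, a, b, x, y): CIELAB color plus image coordinates. p and Q are
    the color and spatial normalisers."""
    li, ai, bi, xi, yi = pix_i
    lj, aj, bj, xj, yj = pix_j
    d_c = math.sqrt((lj - li)**2 + (aj - ai)**2 + (bj - bi)**2)  # color
    d_s = math.sqrt((xj - xi)**2 + (yj - yi)**2)                 # spatial
    return math.sqrt((d_c / p)**2 + (d_s / Q)**2)                # D'
```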

Fig. 7. Fisheye image stitching method.

(13)
$ f(x) = \left\{\begin{aligned} &1, & 0 \le x < a,\\ &1 - \frac{1}{2} \left(\frac{2(x-a)}{b-a}\right)^t, & a \le x < k,\\ &\frac{1}{2} \left(\frac{2(b-x)}{b-a}\right)^t, & k \le x < b,\\ &0, & b \le x. \end{aligned}\right. $

In Eq. (13), $f(x)$ represents the nonlinear fusion function. $[a,b]$ represents the optimal fusion area range. $k$ represents the best fusion line. $t$ is a constant. The proposed fisheye image stitching method is shown in Fig. 7.
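Eq. (13) can be implemented directly. This minimal sketch assumes the fusion line $k$ lies at the midpoint of $[a, b]$, which makes the two halves of the curve meet at 1/2; this placement is an assumption, not stated in the paper.

```python
def fusion_weight(x, a, b, t=2):
    """Nonlinear fusion weight f(x) of Eq. (13), with the fusion
    line k taken at the midpoint of [a, b] (assumed)."""
    k = (a + b) / 2
    if x < a:
        return 1.0                                       # fully first image
    if x < k:
        return 1.0 - 0.5 * (2 * (x - a) / (b - a)) ** t  # falling half
    if x < b:
        return 0.5 * (2 * (b - x) / (b - a)) ** t        # tail half
    return 0.0                                           # fully second image
```

The weight of the second image at the same position is the complement $1 - f(x)$, so the two weights always sum to 1 across the fusion region.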

As shown in Fig. 7, after the fisheye image is input, it is first standardized and mapped to obtain an equidistant fisheye image. Feature point matching and spherical projection are then performed. Based on the feature point matching results, the rotation matrix is obtained through the improved RANSAC, and the projection parameters are calculated. Next, based on the matching inliers, the adjacency table is determined and combined with the projection parameters for global and local optimal adjustment. Spherical projection is then performed to obtain the deformed images, and the panoramic image is obtained by image fusion.

3. Results

To verify the performance of the proposed panoramic image stitching technology, the study tested its feature matching and screening and image stitching effects, and compared the proposed algorithm with other feature matching and image stitching algorithms.

3.1. Analysis of Feature Matching and Screening Results

The test dataset for feature matching and filtering was the Mikolajczyk dataset, which includes image compression, blur, viewpoint, and lighting changes. The experiments ran on Windows 10 with an Intel Core i5-4210U 2.4 GHz CPU and 8 GB of memory, with Python as the programming language. The images used for inlier filtering were taken by fisheye cameras with a baseline of 0.2 m. The fisheye images included two types of equidistant images: 140° and 180°. The image matching repetition rates of different algorithms are shown in Fig. 8.

Fig. 8. Repeat rate of image matching.


From Fig. 8(a), when facing blurred images, the channel attention and feature slicing description (CAFSD) network, the Fast Library for Approximate Nearest Neighbors RANSAC (FLANN-RANSAC) algorithm, and the RANSAC algorithm exhibited significant changes in feature point repetition rate, while the proposed method consistently maintained a repetition rate above 90%, much higher than the other algorithms. As shown in Fig. 8(b), on compressed images, the repetition rates of FLANN-RANSAC, CAFSD, and RANSAC decreased by more than 2.1%, whereas the repetition rate of the proposed algorithm decreased by no more than 1.5% and was always higher than that of the other algorithms.

Fig. 9. Characteristic point matching accuracy for different algorithms.


Fig. 9 shows the feature point matching accuracy of different algorithms. According to Fig. 9(a), in blurred images, FLANN-RANSAC, CAFSD, and RANSAC reached maximum feature point matching accuracies of 91.2%, 93.4%, and 88.7%, respectively, with average accuracies of 90.5%, 92.7%, and 88.1%. The proposed method's matching accuracy was the highest at 98.5%, with an average accuracy of 98.0%, both higher than the other algorithms. According to Fig. 9(b), in compressed images, FLANN-RANSAC, CAFSD, and RANSAC reached maximum accuracies of 92.4%, 95.1%, and 90.9%, respectively, with average accuracies of 91.8%, 94.9%, and 90.3%. The proposed method again achieved the highest maximum accuracy, 99.3%, with an average of 98.9%. These outcomes indicate that the proposed method performs better in feature matching.

Fig. 10. Number of inner points and the rate of different algorithms.


Fig. 10 shows the number of inliers and the inlier rates of different algorithms. As shown in Fig. 10(a), in the 140° fisheye image, the inlier counts of FLANN-RANSAC, CAFSD, and RANSAC were 1348, 1763, and 1101, respectively, with inlier rates of 9.6%, 10.3%, and 9.1%. The inlier count and inlier rate of the proposed algorithm were 2802 and 25.2%, respectively, higher than the other algorithms. As shown in Fig. 10(b), in the 180° fisheye image, the inlier counts of FLANN-RANSAC, CAFSD, and RANSAC did not exceed 1300, and their inlier rates were all below 5.5%. The inlier count and inlier rate of the proposed algorithm were 1913 and 12.3%, respectively, higher than the other algorithms. These results indicate that the proposed algorithm has better inlier filtering performance. To further verify the feature filtering and matching performance, ablation experiments were performed; their results are shown in Fig. 11.

Fig. 11. Results of ablation experiment.


From Fig. 11, filtering with the fundamental matrix identified 26 pairs of matching points, including 2 incorrectly filtered pairs, with a recall rate of 52.17% and an accuracy rate of 92.31%. Lens distortion reduced the recall of the fundamental matrix, while the small baseline reduced its accuracy; this problem was more pronounced when there were many matching points. Screening with the homography matrix model yielded 7 matching pairs with 0 incorrect pairs, a recall rate of 15.22%, and an accuracy rate of 100%; lens distortion and the presence of multiple planes made the recall of the homography matrix model extremely low. Screening with the rotation matrix model yielded 46 matching pairs with 0 incorrect pairs, a recall rate of 100%, and an accuracy rate of 100%. The use of ray vectors effectively eliminated the influence of lens distortion and correctly screened all 46 matching pairs, indicating that the improved RANSAC can accurately screen matching point pairs in fisheye images in small-baseline scenes.

3.2. Analysis of Panoramic Image Stitching Effect

The panoramic stitching test of fisheye images was completed using a 140° fisheye image with a camera baseline of 0.2 m. The operating system was Windows 10, the CPU was Intel Core i5-4210U 2.4 GHz, the memory was 8GB, and the programming language was Python. The stitching effect of fisheye images is denoted in Fig. 12.

Fig. 12. Panoramic stitching effect of the fisheye image.


From Fig. 12, the fisheye image panoramic stitching technique proposed in the study produced images with no obvious stitching lines, and the stitching transition was natural. Additionally, the structure, contrast, and brightness on both sides of the stitching area were very similar. The panoramic image stitching technology proposed by the research had a good stitching effect. To assess the quality of stitched images, the study took the stitching results of six fisheye images as an example to calculate their root mean square error (RMSE), natural image quality evaluator (NIQE), peak signal to noise ratio (PSNR), and structural similarity index (SSIM). The RMSE and NIQE of image stitching using different algorithms are shown in Fig. 13.

Fig. 13. RMSE and NIQE of stitching images with different algorithms.


According to Fig. 13(a), the maximum RMSE of FLANN-RANSAC, CAFSD, and RANSAC were 13.7, 12.1, and 18.9, respectively, with an average RMSE of 12.2, 11.3, and 17.9. The maximum RMSE and average RMSE of the method proposed by the research were 10.6 and 10.0, respectively, which were lower than other algorithms. According to Fig. 13(b), the maximum NIQE of FLANN-RANSAC, CAFSD, and RANSAC were 8.9, 7.6, and 9.4, respectively, with an average NIQE of 7.9, 7.0, and 8.4. The maximum and average NIQE of the method proposed by the research were 7.8 and 6.9, respectively, which were much lower than other algorithms. It can be seen that the panoramic image stitching technology proposed in the study can achieve high-quality stitching of images.

Fig. 14. A plot for PSNR and SSIM for the different algorithms.


Fig. 14 shows the PSNR and SSIM of different algorithms. According to Fig. 14(a), the maximum PSNR of FLANN-RANSAC, CAFSD, and RANSAC were 22.6, 25.4, and 19.5, respectively, with average PSNR of 21.8, 24.6, and 17.8. The maximum and average PSNR of the proposed method were 29.4 and 27.6, respectively, higher than the other algorithms. As shown in Fig. 14(b), the maximum SSIM values for FLANN-RANSAC, CAFSD, and RANSAC were 0.69, 0.76, and 0.49, respectively, and the average SSIM values were 0.65, 0.71, and 0.45. The maximum and average SSIM of the proposed method were 0.92 and 0.87, respectively, much higher than the other algorithms. These outcomes show that with the proposed method, the noise in the image is low and the pixel blocks at both ends of the stitching area are highly similar. To further analyze the performance of the proposed panoramic image stitching method, the study compared it with current advanced image stitching methods: the Improved Weighted RANSAC (IWRANSAC) algorithm and the Improved SURF-K Nearest Neighbor RANSAC (ISURF-KNN-RANSAC) algorithm. The experimental results are shown in Fig. 15.

Fig. 15. A plot for PSNR and SSIM for the different algorithms.

../../Resources/ieie/IEIESPC.2026.15.1.83/fig15.png

As shown in Fig. 15(a), compared with IWRANSAC and ISURF-KNN-RANSAC, the proposed panoramic image stitching method achieved a higher PSNR, with a maximum of 29.4. As shown in Fig. 15(b), the SSIM of IWRANSAC and ISURF-KNN-RANSAC did not exceed 0.8, while the average SSIM of the proposed method reached 0.87. These results indicate that, compared with existing advanced panoramic image stitching methods, the proposed fisheye image panoramic stitching method achieves higher stitching quality.
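For readers reproducing these comparisons, PSNR has a simple closed form, PSNR = 10·log10(peak² / MSE). A minimal sketch (the function name and 8-bit peak value of 255 are assumptions, not specifics from the paper):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image
    and a test image of the same shape."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")    # identical images: no noise
    return 10.0 * np.log10(peak ** 2 / mse)
```

SSIM is more involved (windowed luminance, contrast, and structure terms); in practice it is usually taken from a library such as scikit-image's `structural_similarity` rather than reimplemented.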

4. Discussion

Based on the above research outcomes, the proposed panoramic image stitching technique achieves a higher inlier rate and higher feature matching accuracy than the other algorithms. This is because the proposed algorithm introduces lens-independent ray vectors on the basis of RANSAC: mapping feature points onto ray vectors decouples them from lens distortion and thus avoids interference from the camera lens. In addition, compared with the other image stitching techniques, the proposed method shows clear advantages in stitching quality. This is because the study introduced a non-linear weighted fusion algorithm into the optimal stitching line algorithm, which computes each new pixel as a weighted sum of the corresponding pixels of the two source images, producing a natural transition across the stitched region, whereas the other algorithms simply average the pixel values in the overlapping areas.
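The paper does not give the exact weighting function of its non-linear fusion, so the sketch below uses an assumed power-law ramp (`gamma`) purely to illustrate the idea: a weight that rises from 0 to 1 across the overlap strip, applied non-linearly rather than as a plain average. All names here are illustrative.

```python
import numpy as np

def blend_overlap(left, right, gamma=2.0):
    """Blend two equally sized overlap strips with a non-linear weight.

    left, right: arrays of shape (H, W) or (H, W, C) covering the
    same overlap region from the two source images.
    gamma: exponent of the assumed power-law ramp; gamma=1 would
    reduce this to ordinary linear (average-style) blending.
    """
    h, w = left.shape[:2]
    t = np.linspace(0.0, 1.0, w)                     # 0 at left edge, 1 at right edge
    wgt = t ** gamma                                 # non-linear ramp (assumption)
    wgt = wgt.reshape(1, w, *([1] * (left.ndim - 2)))  # broadcast over rows/channels
    return (1.0 - wgt) * left + wgt * right          # per-pixel weighted sum
```

With `gamma > 1` the transition is concentrated toward one side of the seam instead of being spread uniformly, which is one way a non-linear weight can make the junction look more natural than a straight average.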

5. Conclusion

Panoramic roaming of scenic spots provides users with an immersive interactive experience: they can intuitively perceive the information of a location as if they were in the real scene, achieving the purpose of visiting scenic spots without leaving home. However, the realism of panoramic roaming depends directly on the image stitching quality, and traditional stitching techniques struggle to stitch fisheye images with high quality. In view of this, the study proposed an image stitching technique based on a rotation matrix and ray vectors, and solved the distortion problem of fisheye images using spherical projection. The experimental results indicated that on blurred images, the matching accuracy of the proposed method reached a maximum of 98.5% and an average of 98.0%, both higher than those of the other algorithms. Moreover, on the 140° fisheye image, the number of inliers and the inlier rate of the proposed algorithm were 2802 and 25.2%, respectively, higher than those of the other algorithms. In addition, the average RMSE and NIQE of the proposed algorithm were 10.0 and 6.9, respectively, lower than those of the other algorithms, while the average PSNR and SSIM were 27.6 and 0.87, respectively, higher than those of the other algorithms. These results indicate that the proposed image stitching technique can effectively achieve high-quality stitching of fisheye images. However, because the proposed algorithm only considers small-baseline situations, its stitching performance on large-baseline images remains unverified. Future research will therefore focus on improving the method's generality to such scenarios.

References

1. Maltais L. G., Gosselin L., 2022, Visiting central heating plant and mechanical rooms in buildings: a case study of virtual tours to foster students' learning in a distance course, International Journal of Mechanical Engineering Education, Vol. 50, No. 4, pp. 1007-1024.
2. Hasanvand M., Nooshyar M., Moharamkhani E., Selyari A., 2023, Machine learning methodology for identifying vehicles using image processing, Artificial Intelligence and Applications, Vol. 1, No. 3, pp. 170-178.
3. Yu J., He Y., Zhang F., Sun G., Hou Y., Liu H., 2023, An infrared image stitching method for wind turbine blade using UAV flight data and U-Net, IEEE Sensors Journal, Vol. 23, No. 8, pp. 8727-8736.
4. Flaman G. T., Boyle N. D., Vermelle C., Morhart T. A., Ramaswami B., Read S., 2023, Chemical imaging of mass transport near the no-slip interface of a microfluidic device using attenuated total reflection-Fourier transform infrared spectroscopy, Analytical Chemistry, Vol. 95, No. 11, pp. 4940-4949.
5. Zhang H., Jia N., Zhuo K., Zhao W., 2023, Retinal fundus image registration framework using Bayesian integration and asymmetric Gaussian mixture model, International Journal of Imaging Systems and Technology, Vol. 33, No. 1, pp. 403-418.
6. Yan X., Shi Z., Li P., Zhang Y., 2023, IDCF: information distribution composite feature for multi-modal image registration, International Journal of Remote Sensing, Vol. 44, No. 5-6, pp. 1939-1975.
7. Han H., Wang Z., 2023, 3D diffeomorphic image registration with Cauchy-Riemann constraint and lower bounded deformation divergence, ESAIM: Mathematical Modelling and Numerical Analysis, Vol. 57, No. 1, pp. 299-328.
8. Felix B., Andreas G., Kerstin L., Henning B., 2023, Efficient and robust image registration for two-dimensional micro-X-ray fluorescence measurements, Journal of Analytical Atomic Spectrometry, Vol. 38, No. 5, pp. 1021-1030.
9. Mohammadi N., Sedaghat A., Rad M. J., 2022, Rotation-invariant self-similarity descriptor for multi-temporal remote sensing image registration, The Photogrammetric Record, Vol. 37, No. 177, pp. 6-34.
10. Wang F., Cheng Y., 2023, Image fusion method based on JBF and multi-order local region energy, Journal of Northwestern Polytechnical University, Vol. 40, No. 6, pp. 1414-1421.
11. Shibu D. S., Priyadharsini S. S., 2021, Multimodal medical image fusion using L0 gradient smoothing with sparse representation, International Journal of Imaging Systems and Technology, Vol. 31, No. 4, pp. 2249-2266.
12. Gao Y., Ma S., Liu J., 2023, DCDR-GAN: a densely connected disentangled representation generative adversarial network for infrared and visible image fusion, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 33, No. 2, pp. 549-561.
13. Vasu G. T., Palanisamy P., 2023, CT and MRI multimodal medical image fusion using weight-optimized anisotropic diffusion filtering, Soft Computing, Vol. 27, No. 13, pp. 9105-9117.
14. Soroush R., Baleghi Y., 2023, NIR/RGB image fusion for scene classification using deep neural networks, The Visual Computer, Vol. 39, No. 7, pp. 2725-2739.
15. Fan J., Yang X., Lu R., Li W., Huang Y., 2023, Long-term visual tracking algorithm for UAVs based on kernel correlation filtering and SURF features, The Visual Computer, Vol. 39, No. 1, pp. 319-333.
16. Xu M., Wang L., 2023, Left ventricular myocardial motion tracking in cardiac cine magnetic resonance images based on a biomechanical model, Journal of X-Ray Science and Technology, Vol. 31, No. 3, pp. 525-543.
17. Pai N. S., Huang W. Z., Chen P. Y., Chen S. A., 2022, Optimization and path planning of simultaneous localization and mapping construction based on binocular stereo vision, Sensors and Materials, Vol. 34, No. 3, pp. 1091-1104.
18. Pan W., Li A., Wu Y., Deng Z., Liu X., 2023, Research on seamless image stitching based on fast marching method, IET Image Processing, Vol. 17, No. 14, pp. 4159-4175.
19. Cheng H., Xu C., Wang J., Zhao L., 2022, Quad-fisheye image stitching for monoscopic panorama reconstruction, Computer Graphics Forum, Vol. 41, No. 6, pp. 94-109.
20. Lu J., Huo G., Cheng J., 2022, Research on image stitching method based on fuzzy inference, Multimedia Tools and Applications, Vol. 81, No. 17, pp. 23991-24002.
Li Wei
../../Resources/ieie/IEIESPC.2026.15.1.83/au1.png

Li Wei is a professor, and she received her bachelor's degree in computer science and technology from Naval Aeronautical and Astronautical University in 2003. In 2009, she graduated from Shandong University of Science and Technology with a master's degree in control theory and control engineering. Currently, she serves as the Director of the Teaching and Research Section at the School of Art and Design, Jining Polytechnic. She has published more than 20 academic papers in core journals, EI source journals and national-level journals, and has published two personal monographs. Her research interests include computer image processing, virtual reality technology and art design.

Jingtao Man
../../Resources/ieie/IEIESPC.2026.15.1.83/au2.png

Jingtao Man obtained his bachelor's degree in literature from Wuhan Textile University in Wuhan, China in 2008. Currently, he serves as the Deputy Director of the Organization (Personnel) Department and concurrently heads the Cadre and Personnel Section at Jining Polytechnic. In his spare time, he actively engages in scientific research and has achieved notable results. He has participated in numerous provincial-level and above research projects, won related awards, and published several academic papers in international journals. His research interests span a wide range of fields, including computer science, art design, and innovation in higher vocational education teaching.

Xiufang Wang
../../Resources/ieie/IEIESPC.2026.15.1.83/au3.png

Xiufang Wang is a professor, and she holds a master's degree from Liaocheng University. She is a key member of the national modern apprenticeship program, head of the big data technology professional group (a high-level professional group in Shandong provincial higher vocational education), a member of the Shandong Provincial Vocational Education Steering Committee for Computer Science, and a member of the CCF VC Software Technology Working Group. She teaches courses such as Big Data Visualization, Data Analysis, and Introduction to Artificial Intelligence. Her main research directions are big data technology and software technology.