Multi-focus image fusion

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Mostafa Amin-Naji (talk | contribs) at 14:25, 13 September 2019 (I update this page with more details of multi-focus image fusion and new reference). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Overview

In recent years, image fusion has been used in a wide variety of applications, such as remote sensing, surveillance, medical diagnosis, and photography. The two major examples of image fusion in photography are the fusion of multi-focus images and of multi-exposure images. The main idea of image fusion is to gather the important and essential information from the input images into a single image that ideally contains all of that information[1][2][3]. Research on image fusion goes back more than 30 years and has produced many scientific papers[4].

A Sample of Multi-Focus Image Fusion

Multi-focus image fusion is a technique that combines input images captured at different focus depths into a single output image that preserves the in-focus information of all of them. In a visual sensor network (VSN), the sensors are cameras that record images and video sequences. In many VSN applications, a single camera cannot give a complete view of the scene: because of the limited depth of field of its optical lens, only objects near the focal distance appear sharp, while the other parts of the image are blurred. A VSN can capture images of the scene with different depths of focus using several cameras. Because cameras generate far more data than sensors such as pressure and temperature sensors, and because of constraints such as limited bandwidth, energy consumption, and processing time, it is essential to process the local input images to reduce the amount of transmitted data. These considerations motivate multi-focus image fusion: the process of combining the input multi-focus images into a single image that contains all the important information of the inputs and describes the scene more accurately than any single input image[3].
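The block-wise, spatial-domain idea above can be sketched in a few lines. This is a minimal illustrative example, not the method of any of the cited papers: for each block, a focus measure (here the variance of a simple 4-neighbour Laplacian) decides which input image is sharper, and the sharper block is copied to the output. All function names are hypothetical.

```python
import numpy as np

def laplacian_variance(block):
    """Focus measure: variance of a simple 4-neighbour Laplacian.
    Blurred (smooth) blocks give small values; sharp blocks give large ones."""
    lap = (-4 * block[1:-1, 1:-1]
           + block[:-2, 1:-1] + block[2:, 1:-1]
           + block[1:-1, :-2] + block[1:-1, 2:])
    return lap.var()

def fuse_blockwise(img_a, img_b, bs=8):
    """For each bs x bs block, keep the block from whichever input is sharper."""
    h, w = img_a.shape
    fused = np.empty_like(img_a)
    for i in range(0, h, bs):
        for j in range(0, w, bs):
            a = img_a[i:i + bs, j:j + bs]
            b = img_b[i:i + bs, j:j + bs]
            fused[i:i + bs, j:j + bs] = (
                a if laplacian_variance(a) >= laplacian_variance(b) else b)
    return fused
```

Real methods refine this with consistency checks on the resulting decision map, since a hard per-block choice produces blocking artifacts near focus boundaries.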

Extensive research on multi-focus image fusion has been carried out in recent years; the methods can be classified into two categories, operating in the transform domain or in the spatial domain. Commonly used transforms for image fusion are the discrete cosine transform (DCT) and multi-scale transforms (MST). Recently, deep learning (DL) has thrived in several image processing and computer vision applications, and DL-based methods have accordingly become an active topic in image fusion research[1][2][3][4].
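As an illustration of why the DCT domain is convenient for this task (a sketch, not the algorithm of the cited papers): the orthonormal 2-D DCT preserves energy (Parseval's theorem), so the variance of an image block, a common sharpness measure, can be read directly off its AC coefficients without returning to the spatial domain. This is attractive in VSNs, where images may already be held as DCT blocks by JPEG-style coding. The helper names below are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows = frequencies)."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(n)          # DC row scaling
    C[1:] *= np.sqrt(2.0 / n)   # AC row scaling
    return C

def ac_energy(block, C):
    """Sum of squared AC coefficients of the 2-D DCT of a block.
    By Parseval's theorem this equals (number of pixels) * variance of the
    block, so it serves as a sharpness measure directly in the DCT domain."""
    coeffs = C @ block @ C.T
    return (coeffs ** 2).sum() - coeffs[0, 0] ** 2
```

Two candidate blocks can then be compared for focus by comparing their `ac_energy` values, exactly as the spatial-domain variance would be compared.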

Multi-focus image fusion using deep learning

Deep learning is now used in image fusion applications such as multi-focus image fusion. Amin-Naji et al. proposed three CNN-based multi-focus image fusion methods: ECNN[1], HCNN[2], and FCN[4].

ECNN: Ensemble of CNN for Multi-Focus Image Fusion[1]

Schematic diagram of generating the three datasets with the proposed patch feeding used in the training procedure of ECNN[1]

Multi-focus image fusion methods based on convolutional neural networks (CNNs) have recently attracted considerable attention. They greatly improve the constructed decision map compared with earlier spatial- and transform-domain methods. Nevertheless, these methods do not produce a satisfactory initial decision map and must rely on extensive post-processing to obtain an acceptable final one. ECNN addresses this with a CNN-based method built on ensemble learning. Using several models and datasets rather than a single one is well motivated: ensemble methods pursue diversity among models and datasets in order to reduce overfitting to the training data, and an ensemble of CNNs generally performs better than a single CNN. The method also introduces a new, simple type of multi-focus image dataset: it merely changes the arrangement of the patches of existing multi-focus datasets, which proves very useful for improving accuracy. With this arrangement, three datasets (the original patches and their gradients in the vertical and horizontal directions) are generated from the COCO dataset. The proposed network then combines three CNN models, each trained on one of the three datasets, to construct the initial segmented decision map. These ideas greatly improve the initial segmented decision map, making it comparable to, or better than, the final decision maps that other CNN-based methods obtain only after extensive post-processing. Many real multi-focus test images were used in the experiments, and the results were compared using quantitative and qualitative criteria. The experimental results indicate that the proposed network is more accurate, and yields a better decision map without post-processing, than existing state-of-the-art multi-focus fusion methods that depend on many post-processing algorithms.
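Abstracted away from the trained networks, the ensemble step can be sketched as a majority vote over the per-pixel decisions of several models. This is a schematic illustration under assumed conventions, not code from the paper: the binary maps stand in for the outputs of the three CNNs, and the function names are hypothetical.

```python
import numpy as np

def ensemble_decision(maps):
    """Majority vote over binary decision maps from several models.
    Each map holds 1 where input A is judged in focus, 0 where input B is."""
    stack = np.stack(maps)
    return (stack.mean(axis=0) >= 0.5).astype(np.uint8)

def fuse_with_map(img_a, img_b, decision):
    """Pixel-wise fusion driven by the (voted) decision map."""
    return np.where(decision == 1, img_a, img_b)
```

In the actual method the voted map is a segmented region map rather than independent per-pixel votes, but the diversity argument is the same: disagreements of a single model are outvoted by the other two.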

Schematic of the proposed ECNN architecture with details of the CNN models[1]

This method introduces a new network for achieving a cleaner initial segmented decision map than the others. The proposed architecture uses an ensemble of three convolutional neural networks (CNNs) trained on three different datasets. The method also prepares a new simple type of multi-focus image dataset that achieves better fusion performance than other popular multi-focus image datasets. These ideas help produce an initial segmented decision map that matches or surpasses the decision maps that other methods obtain only through extensive post-processing. The source code of ECNN is available at http://amin-naji.com/publications/ and https://github.com/mostafaaminnaji/ECNN

Flowchart of the proposed ECNN method for obtaining the initial segmented decision map of multi-focus image fusion[1]


References

  1. ^ a b c d e f g Amin-Naji, Mostafa; Aghagolzadeh, Ali; Ezoji, Mehdi (2019). "Ensemble of CNN for multi-focus image fusion". Information Fusion. 51: 201–214. doi:10.1016/j.inffus.2019.02.003. ISSN 1566-2535.
  2. ^ a b c Amin-Naji, Mostafa; Aghagolzadeh, Ali; Ezoji, Mehdi (2019). "CNNs hard voting for multi-focus image fusion". Journal of Ambient Intelligence and Humanized Computing: 1–21. doi:10.1007/s12652-019-01199-0. ISSN 1868-5145.
  3. ^ a b c Amin-Naji, Mostafa; Aghagolzadeh, Ali (2018). "Multi-Focus Image Fusion in DCT Domain using Variance and Energy of Laplacian and Correlation Coefficient for Visual Sensor Networks". Journal of AI and Data Mining. 6 (2): 233–250. doi:10.22044/jadm.2017.5169.1624. ISSN 2322-5211.
  4. ^ a b c Amin-Naji, Mostafa; Aghagolzadeh, Ali; Ezoji, Mehdi (2018). "Fully Convolutional Networks for Multi-Focus Image Fusion". 2018 9th International Symposium on Telecommunications (IST): 553–558. doi:10.1109/ISTEL.2018.8660989.