Fig 2 - uploaded by Wonkook Kim
(a) Example of a 3 × 3 grid for the two-parameter case. (b) “Shift” is performed when the (red) optimal point occurs on the boundary of the current grid. (c) “Shrinking” is performed when the (red) optimal point does not occur on the boundary.

Source publication
Article
Full-text available
Localized training data typically utilized to develop a classifier may not be fully representative of class signatures over large areas but could potentially provide useful information which can be updated to reflect local conditions in other areas. An adaptive classification framework is proposed for this purpose, whereby a kernel machine is first...

Context in source publication

Context 1
... the association is determined, the partitioning is repeated until a single class remains at a leaf node.

2) Parameter Tuning Via IGS: After the binary tree is constructed, parameter tuning is performed at each node of the tree using IGS. We propose this simple but effective tuning method to facilitate the selection of an appropriate interval between candidate parameter values. The proposed recursive grid search method performs well with a small grid size (i.e., 3 × 3 or 3 × 3 × 3 for two parameters and three parameters, respectively) as it intelligently adjusts the interval and the location of the grid based on the tuning results of the previous grid. Here, the grid size is related to the number of rows and columns of the grid (i.e., 3 × 3), and interval refers to the difference in parameter values between adjacent points. Once the tuning accuracies are obtained for the initial grid, IGS uses one of the following two procedures, depending on where the maximum tuning accuracy occurs in the grid.

- Shift: When the entry with the maximum tuning accuracy is located on the boundary of the grid, the maximum point is likely to be outside the current grid, so the location of the grid is shifted. The current grid is moved toward the maximum point so that the center of the grid is placed on the maximum point.
- Shrinking: When the entry with maximum tuning accuracy does not occur on the boundary, the interval of the grid is halved to determine the optimal parameter value more precisely.

The concept is shown in Fig. 2. The two operations are repeated until the interval of the grid is reduced to a user-defined value. To obtain reliable estimates, m-fold cross-validation is performed for each grid point. Once a validation set is selected for parameter tuning, the samples in the set are divided into m subsets; the classifier is trained with the parameters of the grid point on (m − 1) subsets and tested on the remaining subset.
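The shift and shrink rules described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the `score` callback (standing in for the m-fold cross-validated tuning accuracy), the tie-breaking in favor of the center, and the `max_steps` safety cap are all assumptions. Note that with three points per axis, every grid point except the center lies on the boundary, so "maximum on the boundary" is equivalent to "maximum not at the center".

```python
from itertools import product


def iterative_grid_search(score, center, interval, min_interval, max_steps=200):
    """Maximize score(point) with a 3-points-per-axis grid that shifts or shrinks.

    score        : callable taking a tuple of parameter values and returning a
                   number, e.g. a cross-validated accuracy (hypothetical hook).
    center       : initial grid center, one value per parameter.
    interval     : initial spacing between adjacent grid points, per parameter.
    min_interval : user-defined spacing at which the search stops.
    max_steps    : safety cap on iterations (an assumption, not from the text).
    """
    center = tuple(center)
    interval = tuple(interval)
    cache = {}  # grid points shared by consecutive grids are scored only once

    def evaluate(point):
        if point not in cache:
            cache[point] = score(point)
        return cache[point]

    for _ in range(max_steps):
        if max(interval) <= min_interval:
            break
        # Build the 3 x 3 (or 3 x 3 x 3, ...) grid around the current center.
        offsets = product((-1, 0, 1), repeat=len(center))
        grid = [
            tuple(c + o * h for c, o, h in zip(center, offset, interval))
            for offset in offsets
        ]
        best = max(grid, key=evaluate)
        if evaluate(best) > evaluate(center):
            # Shift: the maximum is on the boundary, so re-center the grid on it.
            center = best
        else:
            # Shrinking: the center is (tied for) the maximum, so halve the interval.
            interval = tuple(h / 2 for h in interval)
    return center, evaluate(center)


# Toy stand-in for a tuning-accuracy surface with its maximum at (2.3, -1.1).
def toy_accuracy(point):
    x, y = point
    return -((x - 2.3) ** 2 + (y + 1.1) ** 2)


best_point, best_score = iterative_grid_search(
    toy_accuracy, center=(0.0, 0.0), interval=(1.0, 1.0), min_interval=0.01
)
```

Caching scores matters in practice because each evaluation would involve a full cross-validation run, and a shifted grid shares several points with its predecessor.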
Whereas the validation set can be constructed with only the labeled samples for supervised classification methods, inclusion of unlabeled samples is important for the semisupervised classifier to perform well for new data. For example, in tuning of MRC parameters (γ_A, γ_I, k), we have obtained very small values for γ_I when no unlabeled samples are used for the validation set, if the number of labeled samples was large enough that unlabeled samples were not required to achieve adequate classification accuracies. For this reason, we remove a portion of labels from the currently labeled samples at a given rate R and then use both the labeled and unlabeled samples for the training procedure of the validation process.

3) Adaptive MRC for Binary Problems: If the hierarchical structure and the tuned parameters at each node are given, the proposed adaptive classification method is applied to new data at each node. The overall framework is shown in Fig. 3. The key operations of the method are outlined in the remainder of this section.

- Initial classification: For the initial classification, MRC is trained with both the original data (X̃, y) and new data X and is applied to the new data X to provide semilabels y^(0) for the test ...
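The label-removal step described above for building the validation set (hiding labels at rate R so that unlabeled samples take part in training) can be sketched as below. This is only an illustration under stated assumptions: the excerpt does not say how the samples are chosen, so uniform random selection, the `hide_labels` name, and the use of `None` to mark an unlabeled sample are all hypothetical.

```python
import random


def hide_labels(labels, rate, seed=0):
    """Return a copy of `labels` with a fraction `rate` of entries set to None.

    None marks a sample treated as unlabeled during validation training.
    Uniform random selection is an assumption; the source only gives the rate R.
    """
    rng = random.Random(seed)
    n_hide = round(rate * len(labels))
    hidden = set(rng.sample(range(len(labels)), n_hide))
    return [None if i in hidden else y for i, y in enumerate(labels)]


labels = [0, 1] * 10                        # 20 labeled samples, two classes
semi_labels = hide_labels(labels, rate=0.25)
```

The features of the hidden samples stay in the training set; only their labels are withheld, which is what lets a semisupervised regularization term contribute during tuning.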

Citations

... Gopalan et al. proposed a sampling geodesic flow (SGF) method for DA [12], which learns intermediate representations of source and target samples via Grassmann manifolds to describe domain shift. However, the SGF approach has several limitations, such as the difficulty in its sampling strategy [7], RHM [8], IWATL [9], CSSPL [10], ELM-based transfer [11]
Feature-based
(1) Subspace-based Adaptation: SGF [12], GFK [13], coclustering [14], SA [15]- [17], TSSA [18], TA [19], TPCA [20], SCA [21], SDA [22], DCA [23], CS-DDA [24], KSA [25], IRDMKSA [26], HFAA [27], IRHTL [28], invariant-feature-based [29]- [32], dictionary learning-based [33], PCDA-NSD [34], DSTL [35], SS-DDNMF [36], GFP [37]
(2) Transformation-based Adaptation: TCA and SSTCA [38], [39], TJM [40], DTJM [41], JDA [42], JGSA [43], DATL [44], [45], LPJT [46], GEDA [47], DADFL [48], CORAL [49], [50], CCCA [51], [52], CCA-based [53]- [56], MA-based [57]- [62], MRDA [63], DABL [64], SSWK-MEDA [65], NFNalign [66], OT [67], 3-D Gabor transformation [68]
Classifier-based: ML-based [69], multiple cascade-classifier [70]- [72], invariant SVM [73], SCT-SVM [74], multiple-kernel learning [75]- [78], ELM-based [79]- [83], MDAF and MBCF [84], open set DA [85], EasyTL [86], BHC [87], DASVM [88], MRC [89], AL-based [90]- [97]
Deep DA
Discrepancy-based: DAN [98], JAN [99], MRAN [100], DSAN [101], DeepCORAL [102], DNN with class centroid alignment [103], TCANet [104], class-wise distribution alignment based deep DA [105], DDA-Net [106], TDDA [107], TSTnet [108], GNN [109], AMF-FSL [110], MSCN [111], AMRAN [112], DJDANs [113]
Adversarial-based: GAN [114], [115], adversarial CNN [116], MADA [117], DAAN [118], MCD [119], DWL [120], GAN with VAE-based generator [121], [122], content-wise alignment [123], class reconstruction driven adversarial [124], class-wise adversarial [125], ADADL [126], DABAN [127], UDAD [123], deep metric learning [128], DCFSL [129], CDADA [130], DFENet [131], ...
... A typical semi-supervised SVM is the domain adaptation SVM (DASVM) [88], which built a standard SVM on a source domain and then iteratively adjusted the SVM model using unlabeled target samples. Kim et al. proposed an adaptive manifold classifier (MRC) in a semisupervised setting, where a kernel machine was first trained with labeled data and then iteratively adapted to new data using manifold regularization [89]. The AL technique was also adopted to update existing classifiers [90]- [97]. ...
Article
Full-text available
Traditional remote sensing (RS) image classification methods rely heavily on labeled samples for model training. When labeled samples are unavailable, or when their distribution differs from that of the samples to be classified, the classification model may fail. Cross-domain (or cross-scene) remote sensing image classification addresses this case, in which an existing image is used for training and an unknown image from a different scene or domain is to be classified. The distribution inconsistency may be caused by differences in acquisition environment conditions, acquisition scene, acquisition time, and/or sensors. To cope with the cross-domain remote sensing image classification problem, many domain adaptation (DA) techniques have been developed. In this article, we review DA methods in the field of RS, especially hyperspectral image classification, and survey them in two groups: traditional shallow DA methods (e.g., instance-based, feature-based, and classifier-based adaptations) and recently developed deep DA methods (e.g., discrepancy-based and adversarial-based adaptations).
... The second information is the spatial information for which numerous methods have been used, such as manifold regularization [20,21], kernel methods [22,23], and morphological analysis [13,24]. ...
Article
Full-text available
A new methodology, the hybrid learning system (HLS), based upon semi-supervised learning is proposed. HLS categorizes hyperspectral images into segmented regions with discriminative features using reduced training size. The technique utilizes the modified breaking ties (MBT) algorithm for active learning and unsupervised learning-based regressors, viz. multinomial logistic regression, for hyperspectral image categorization. The probabilities estimated by multinomial logistic regression for each sample help towards improved segregation. High dimensionality leads to the curse of dimensionality, which ultimately deteriorates the performance of remote sensing data classification, and the problem is aggravated further if labeled training samples are limited. Many studies have tried to address the problem and have employed different methodologies for remote sensing data classification, such as kernelized methods, because of insensitiveness towards the utilization of large dataset information, and active learning (AL) approaches (breaking ties as a representative) to choose only prominent samples for training data. The HLS methodology proposed in the current study is a combination of supervised and unsupervised training with generalized composite kernels generating posterior class probabilities for classification. In order to retrieve the best segmentation labels, we employed Markov random fields, which make use of prior labels from the output of the multinomial logistic regression. HLS was compared with known methodologies using benchmark hyperspectral imaging (HI) datasets, namely “Indian Pines” and “Pavia University”. Findings of this study show that HLS yields overall accuracies of 99.93% and 99.98% on Indian Pines and 99.14% and 99.42% on Pavia University for classification and segmentation, respectively.
... These alignment methods first identify the direct relations between elements from different data sets, then map the different data sets into a common subspace to ensure that the mapped data have the same distribution before use. Manifold alignment [7][8][9][10][11], kernel alignment [12][13] and subspace alignment [14][15] have been widely used to align different remote sensing data sets. However, there are some disadvantages to these alignment methods in multi-temporal analysis. ...
... Because it is difficult to invert too many unknown variables in a model at the same time, in order to reduce the inversion difficulty of the above model, we simplify the solution method of the above model (13), that is, take out RU, S1U, S2U and M to solve separately. The three variables RU, S1U, S2U can be obtained directly from the MIID method if M is known. ...
Article
Full-text available
Due to interference with remote imaging by some natural factors, the multi-temporal analysis ability is limited by the spectral drift between images. In this paper, a new approach to optimize the existing multi-temporal analysis system is proposed: multi-temporal intrinsic image decomposition (MIID). The MIID method is designed to extract common spectral reflectance from multi-temporal images. With MIID, multi-temporal classification, change detection and index extraction will become extremely easy and more accurate. Firstly, without considering land cover change, the general MIID framework is proposed by adding local temporal-spatial energy constraints to traditional intrinsic image decomposition. On this basis, an improved MIID method with change detection capability (CD-MIID for short) is proposed to adapt the model to situations with land cover change. Finally, specific steps for using MIID methods in multi-temporal analysis are given. Multi-temporal multispectral/hyperspectral remote sensing images from GF-1, GF-2, GF-5, Landsat TM, and 2 groups of captured datasets with reflectance truth maps are used to evaluate the performance. The experimental results show the following two points: first, the MIID methods achieve better extraction results of spectral reflectance. Second, the proposed MIID methods have better performance on both multi-temporal classification and change detection.
... It exploits an additional regularization term on the geometry of both the labeled and the unlabeled samples by using the graph Laplacian. In [63], authors also use a manifold-regularized classifier in a semisupervised setting, where the adaptation is performed by adding semilabeled examples from the target domain. ...
Preprint
Full-text available
The success of supervised classification of remotely sensed images acquired over large geographical areas or at short time intervals strongly depends on the representativity of the samples used to train the classification algorithm and to define the model. When training samples are collected from an image (or a spatial region) different from the one used for mapping, spectral shifts between the two distributions are likely to make the model fail. Such shifts are generally due to differences in acquisition and atmospheric conditions or to changes in the nature of the object observed. In order to design classification methods that are robust to data-set shifts, recent remote sensing literature has considered solutions based on domain adaptation (DA) approaches. Inspired by machine learning literature, several DA methods have been proposed to solve specific problems in remote sensing data classification. This paper provides a critical review of the recent advances in DA for remote sensing and presents an overview of methods divided into four categories: i) invariant feature selection; ii) representation matching; iii) adaptation of classifiers and iv) selective sampling. We provide an overview of recent methodologies, as well as examples of application of the considered techniques to real remote sensing images characterized by very high spatial and spectral resolution. Finally, we propose guidelines to the selection of the method to use in real application scenarios.
... Regarding the adaptation of the classifier, most strategies are issued from semi-supervised learning: the adaptation is usually performed either by modifying the weights of the classifier using unlabeled data coming from the PDF of the image to be classified [7]- [9], by spatial regularization [10], or by adding few informative labeled examples carefully chosen from the new image [11], [12]. This family of approaches often comes with several free parameters, requires expertise in machine learning and statistics, and involves high computational costs. ...
... SS-MA is effective for large deformations, since it is not based on inter-graphs distances, as the methods in [17], [34]. It does not require co-registration of the image sources as [16], [35], and is naturally multisource, as it can align images of different dimensionality, unlike standard domain adaptation algorithms [8], [9]. Since the eigenvectors are defined with a discriminative term and are sorted by their eigenvalues, the classifier can then be learned using only the first dimensions: this makes SS-MA an interesting solution also for dimensionality reduction. ...
Preprint
Full-text available
We introduce a method for manifold alignment of different modalities (or domains) of remote sensing images. The problem is recurrent when a set of multitemporal, multisource, multisensor and multiangular images is available. In these situations, images should ideally be spatially coregistered, corrected and compensated for differences in the image domains. Such procedures require the interaction of the user, involve tuning of many parameters and heuristics, and are usually applied separately. Changes of sensors and acquisition conditions translate into shifts, twists, warps and foldings of the image distributions (or manifolds). The proposed semisupervised manifold alignment (SS-MA) method aligns the images working directly on their manifolds, and is thus not restricted to images of the same resolutions, either spectral or spatial. SS-MA pulls close together samples of the same class while pushing those of different classes apart. At the same time, it preserves the geometry of each manifold along the transformation. The method builds a linear invertible transformation to a latent space where all images are alike, and reduces to solving a generalized eigenproblem of moderate size. We study the performance of SS-MA in toy examples and in real multiangular, multitemporal, and multisource image classification problems. The method performs well for strong deformations and leads to accurate classification for all domains.
... Manifold learning has been previously utilized in the literature for the classification of temporal images, but their use in change detection is still severely limited. The clustering assumption on the data manifold is combined with kernel machines in the form of manifold regularization to classify multitemporal hyperspectral images in [11]. As the spectral shifts between multitemporal hyperspectral images pose difficulties, similar local manifolds of temporal images are aligned in a common latent space for multitemporal classification in [12]. ...
... Among these filtering approaches, one can mention, the morphological profiles (MPs) [7], extended MPs (EMPs) [8], and extended multiattribute profiles (EMAP) [9]. Then, the features extracted by those approaches were combined with the results of spectral methods; among them, the kernel-based methods have been widely used to classify HSI with limited training samples [10], [11]. Moreover, other low-rank representation methods were introduced to utilize a subspace learning technique, which aims to study the underlying low-dimensional subspace structures, and, hence, removed the redundancy of the image [12], [13]. ...
... feed I_i into CNN with learned parameters w and b. 10. ...
... Here, the proposed DGCN-D can be defined by (10). A ∈ ℝ^(M×M) represents the adjacency matrix of the graph; A_{i,j} is equal to the weight of the edge e_{i,j} between nodes i and j if v_i and v_j are connected, and A_{i,j} = 0 otherwise. ...
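As a small illustration of the adjacency-matrix definition quoted above, the sketch below fills an M × M matrix from a list of weighted edges. The function name and the edge-list input are hypothetical, and treating the graph as undirected (symmetric A) is an assumption that the excerpt does not state.

```python
def adjacency_matrix(num_nodes, weighted_edges):
    """Build A (num_nodes x num_nodes) with A[i][j] = edge weight, else 0.

    weighted_edges is an iterable of (i, j, weight) triples.
    """
    A = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j, weight in weighted_edges:
        A[i][j] = weight
        A[j][i] = weight  # assumption: undirected graph, so A is symmetric
    return A


# Three nodes; v_0 and v_1 are connected, as are v_1 and v_2.
A = adjacency_matrix(3, [(0, 1, 0.5), (1, 2, 2.0)])
```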
Article
Full-text available
Due to powerful feature extraction capability, convolutional neural networks (CNNs) have been widely used for hyperspectral image (HSI) classification. However, because of a large number of parameters that need to be trained, sufficient training samples are usually required for deep CNN-based methods. Unfortunately, limited training samples are a common issue in the remote sensing community. In this study, a dual graph convolutional network (DGCN) is proposed for the supervised classification of HSI with limited training samples. The first GCN fully extracts features existing in and among HSI samples, while the second GCN utilizes label distribution learning, and thus, it potentially reduces the number of required training samples. The two GCNs are integrated through several iterations to decrease interclass distances, which leads to a more accurate classification step. Moreover, a new idea entitled multiscale feature cutout is proposed as a regularization technique for HSI classification (DGCN-M). Different from the regularization methods (e.g., dropout and DropBlock), the proposed multiscale feature cutout could randomly mask out multiscale region sizes in a feature map, which further reduces the overfitting problem and yields consistent improvement. Experimental results on the four popular hyperspectral data sets (i.e., Salinas, Indian Pines, Pavia, and Houston) indicate that the proposed method obtains good classification performance compared to state-of-the-art methods, which shows the potential of GCN for HSI classification.
... For example, in [112], a feature-level domain adaptation technique based on dictionary learning was proposed, which maps the spectral features of the source and target HSIs into a common low-dimensional embedded space through multi-task dictionary learning, so as to align the spectral feature distribution between the bi-temporal hyperspectral data. A Laplacian support vector machine (LapSVM) method was proposed in [113] for handling spectral drift, in which the LapSVM classifier was adapted to the new data set via iterative application of the classifier using the clustering condition on the data manifold. In addition, in order to realize the classification of MultiTemp-HSIs, Yang and Crawford [114] used spatial information of HSIs to regularize the solutions of manifold alignment (MA) and ...
Article
Since its advent in the 1980s, hyperspectral remote sensing has made important achievements in the aerospace and aviation fields and has been applied in many other fields. A conventional hyperspectral imaging spectrometer extends the number of spectral bands to dozens or hundreds and, at the same time, provides the spatial distribution of the solar radiation reflected from the observed scene. Nowadays, with the fast development of new technology in information and photoelectric sensing, and the popularity of unmanned aerial vehicles, hyperspectral remote sensing imaging shows a new trend toward multimodality, acquiring integrated information while keeping high or very high spectral resolution, in particular high-temporal (even real-time) sensing and stereo sensing. Three important modes of hyperspectral imaging have therefore come into existence: (1) multitemporal hyperspectral imaging, which refers to the observation of the same region on different dates; (2) hyperspectral video imaging, which captures full-frame spectral images in real time; (3) hyperspectral stereo imaging, which obtains the full-dimension information (including 2D image, elevation, and spectra) of the observed scene. From this perspective, current research on hyperspectral remote sensing and image processing is first briefly reviewed, and then the three hyperspectral imaging modes are described comprehensively from four aspects: the fundamental principle of the new imaging mode, the corresponding scientific data acquisition, data processing and application, and potential challenges in data representation, feature learning and interpretation. Through this analysis of the development trend of hyperspectral imaging and the current research situation, we hope to provide a direction for future research on multimodal hyperspectral remote sensing.
... However, the classification results can be seriously deteriorated as the distribution disparity between images is completely neglected. In fact, spectrums of the same ground object can vary in distributions considerably due to many affecting factors during the imaging process [2]. Therefore, the significant issue of cross-scene transfer learning is to reduce the distribution differences between source and target datasets, i.e. the domain adaptation problem. ...
... The MSSN model is implemented using the TensorFlow open source library. During the network training process, all convolutional kernels and the weight matrices of the FC layers are initialized through the initialize function of the library, while the bias values are initialized to 0. The learning rate is fixed at 0.001 and remains unchanged during the whole procedure, and the batch size is set to 128. β1, β2 and ε of Adam are all set to the default values. The maximal number of training iterations is 1 × 10^3. ...
Article
Full-text available
The small size of labeled samples has always been one of the great challenges in hyperspectral image (HSI) classification. Recently, cross-scene transfer learning has been developed to solve this problem by utilizing auxiliary samples of a relevant scene. However, the disparity between hyperspectral datasets acquired by different sensors is a tricky problem which is hard to overcome. In this paper, we put forward a cross-scene deep transfer learning method with spectral feature adaptation (SFA) for HSI classification, which transfers the effective contents from source scene to target scene. The proposed framework contains two parts. First, the distribution differences of spectral dimension between source domain and target domain are reduced through a joint probability distribution adaptation approach. Then, a multiscale spectral-spatial unified network (MSSN) with two-branch architecture and a multiscale bank is designed to extract discriminating features of HSI adequately. Finally, classification of the target image is achieved by applying a model-based deep transfer learning strategy. Experiments conducted on several real hyperspectral datasets demonstrate that the proposed approach can explicitly narrow the disparity between HSIs captured by different sensors and yield ideal classification results of the target HSI.
... Kim et al. [21] first trained a kernel machine with labeled data, which was then adapted to new data with manifold regularization. Persello et al. ...
Article
In this letter, we propose a multitask deep learning method for the classification of multiple hyperspectral data in a single training. Deep learning models have achieved promising results on hyperspectral image classification, but their performance highly relies on sufficient labeled samples that are scarce on hyperspectral images. However, samples from multiple data sets might be sufficient to train one deep learning model, thereby improving its performance. To do so, we trained an identical feature extractor for all data, and the extracted features were fed into corresponding softmax classifiers. Spectral knowledge was introduced to ensure that the shared features were similar across domains. Four hyperspectral data sets were used in the experiments. We achieved higher classification accuracies on three data sets (Pavia University, Pavia Center, and Indian Pines) and competitive results on the Salinas Valley data compared with the baseline. Spectral knowledge was useful to prevent the deep network from overfitting when the data shared similar spectral response. The proposed method tested on two deep CNNs successfully shows its ability to utilize samples from multiple data sets and to enhance networks' performance.