Study on Vegetation Classification Based on Spectral Knowledge Base

Abstract. A framework for spectral-based vegetation classification is proposed, which serves as a core methodology of the vegetation spectral knowledge base. The hyperspectral reflectance of 13 types of plants was measured with an ASD FieldSpec 4 spectroradiometer. Two forms of spectral features were used to represent the key spectral characteristics of plants: vegetation indices (VIs) and spectral shape features. Based on these spectral features, a sensitivity analysis was performed to identify the most important features for establishing the classifier. Analysis of variance (ANOVA) and cross-correlation analysis were applied to assess the sensitivity of the features and to remove highly correlated features. A classification method for differentiating plants was then established by coupling spectral similarity measures (e.g., ED) with classification methods (e.g., BPANN and SVM). The discrimination analysis showed that the highest accuracy was produced by SVM, with an overall accuracy (OAA) above 99% when using 7 sensitive VIs. The results suggest that the framework for spectral-based vegetation classification can form a basis for a spectral knowledge base and its application technology, and can further support wide-ranging plant classification based on remote sensing.


Introduction
Recent advances in hyperspectral remote sensing provide opportunities to map plant species and vegetation at various scales and resolutions. Establishing a spectral knowledge base provides an effective tool for managing such data.
Within such a knowledge base, the study of vegetation classification methods based on hyperspectral data is an important part of the plant spectrum library. Over the last decade, vegetation canopy spectral reflectance has been successfully used to discriminate plant species (Schmidt and Skidmore 2003; Pu 2009; Allard et al. 2011; Peñuelas 1995). Subtle changes in the spectral curves of hyperspectral data can be detected by spectral feature selection and extraction methods such as continuum removal or derivative analysis (Schmidt and Skidmore 2003; Abdel-Rahman et al. 2010). Gong et al. (1997) and Pu (2009) found that the first derivative of tree spectra can significantly improve the accuracy of identifying six common conifer species in northern California and 11 urban tree species in Tampa, Florida, respectively. Kurt et al. (2014) used 47 spectral variables to classify 46 plant species in a tropical wetland; a series of methods were used to select and extract features, and a set of algorithms was then used to build the classification model. Pu (2011) utilized a stepwise masking system to process high-resolution IKONOS images and to identify and map urban forest tree species/groups. Brian et al. (2015) proposed a method combining Multiple Endmember Spectral Mixture Analysis (MESMA) and Multiclass Discriminant Analysis (MDA) for classifying spectrally similar species. In a case study on mapping urban trees, Pu and Landry (2012) found that WV2 imagery produced a higher accuracy than IKONOS imagery according to an independent validation. Zeng et al. (2017) chose the Poyang Lake wetland as the study area and, using an ASD FieldSpec 4, obtained the reflectance curves of six dominant vegetation types. After preprocessing the original spectral data, they analyzed and compared the spectral curves using a series of methods, including derivative analysis, Log(1/R) and continuum removal, and then used the spectral characteristic parameters to classify and extract the plants. Yu et al. (2017) used the spectral derivative method and the vegetation index method to construct spectral features; an artificial neural network and factor analysis were then used to classify and extract typical vegetation.
Spectral-based vegetation classification is one of the core technologies for establishing a plant spectrum library. Based on hyperspectral measurements of a number of plant species, this paper focuses on the extraction and selection of spectral features for vegetation classification. In addition, by combining these features with pattern recognition algorithms, a framework for spectral-based vegetation classification was established and evaluated.

Experimental Design and Data Acquisition
In this study, plant hyperspectral measurements were made in May 2017 in Hangzhou (Lon 120.34°, Lat 30.31°), where a variety of plants grow. Plant canopy spectra were measured using an ASD Fieldspec FR2500 spectrometer with a sampling interval of 1.4 nm (350 to 1000 nm) and 2 nm (1000 to 2500 nm). All spectral measurements were made in windless weather between 10:00 and 14:00. The sensor was placed 60 cm above the plant canopy. The spectral data were calibrated against a white reference panel to convert the radiance signal to spectral reflectance. A total of 20 recordings were made and then averaged to obtain one canopy spectrum.
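The calibration and averaging steps described above can be sketched as follows. This is only an illustration in plain Python: the function names, the toy three-band radiance values and the assumption of an ideal panel (`panel_reflectance=1.0`) are hypothetical and are not taken from the measurement protocol.

```python
def to_reflectance(target_radiance, panel_radiance, panel_reflectance=1.0):
    """Convert a radiance spectrum to reflectance using a white reference panel.

    panel_reflectance=1.0 assumes an ideal panel (an assumption of this sketch).
    """
    return [
        panel_reflectance * t / p if p > 0 else float("nan")
        for t, p in zip(target_radiance, panel_radiance)
    ]


def mean_spectrum(recordings):
    """Average repeated recordings band by band into one canopy spectrum."""
    n = len(recordings)
    return [sum(band) / n for band in zip(*recordings)]


# Toy data: three bands, two repeated recordings (the study averaged 20).
panel = [2.0, 4.0, 8.0]
raw_recordings = [[1.0, 1.0, 4.0], [1.0, 3.0, 4.0]]

recordings = [to_reflectance(r, panel) for r in raw_recordings]
canopy = mean_spectrum(recordings)
print(canopy)
```

Averaging after conversion to reflectance, rather than before, is a design choice of this sketch; with a single panel reading both orders give the same result.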

Features Extraction
Vegetation Indices. In this paper, 13 classic vegetation indices (VIs) were selected for plant classification (Table 1). Some of the VIs are sensitive to pigment variation, some to variation in leaf area or biomass, and others to plant water content. The different mechanisms of these VIs are important for indicating the spectral differences among plants. A sensitivity analysis by analysis of variance (ANOVA) was performed to identify the most important VIs for establishing the classification model. In addition, a cross-correlation analysis was applied to eliminate features with relatively high information redundancy.

In the first-derivative formula, $\rho(\lambda_i)$ is the reflectance at wavelength (band) $i$, and $\Delta\lambda$ is the wavelength interval between $\lambda_{i+1}$ and $\lambda_{i-1}$, which equals twice the bandwidth in this case. The continuum removal technique provides a quantitative measure of absorption features in plant spectra (Fig. 2).
Thereby, a total of 12 spectral shape features were extracted from the original spectra (Table 2): 9 first-order derivative features of the red edge, blue edge and yellow edge, and the depth, width and area of the CRM absorption feature in the 530 nm to 770 nm range.
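To make the continuum removal features more concrete, the sketch below fits an upper convex hull to a spectrum, divides the reflectance by it, and derives depth, width and area. The exact feature definitions used in the study are not given in the text, so the definitions here are assumptions: depth is the maximum of 1 minus the continuum-removed reflectance, width is the wavelength span where the band depth is at least half of that maximum, and area is the trapezoidal integral of the band depth. All function names and the toy spectrum are hypothetical.

```python
def upper_hull(points):
    """Upper convex hull of (x, y) points already sorted by increasing x."""
    hull = []
    for p in points:
        # Drop the last vertex while it lies on or below the line hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def continuum_removed(wavelengths, reflectance):
    """Divide each reflectance by the hull (continuum) value at that band."""
    hull = upper_hull(list(zip(wavelengths, reflectance)))
    removed, k = [], 0
    for w, r in zip(wavelengths, reflectance):
        while hull[k + 1][0] < w:
            k += 1
        (x1, y1), (x2, y2) = hull[k], hull[k + 1]
        continuum = y1 + (y2 - y1) * (w - x1) / (x2 - x1)
        removed.append(r / continuum)
    return removed


def absorption_features(wavelengths, reflectance):
    """Return (depth, width, area) under the assumed definitions above."""
    band_depth = [1.0 - v for v in continuum_removed(wavelengths, reflectance)]
    depth = max(band_depth)
    area = sum(
        0.5 * (band_depth[i] + band_depth[i + 1]) * (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(band_depth) - 1)
    )
    inside = [w for w, d in zip(wavelengths, band_depth) if d >= 0.5 * depth]
    width = inside[-1] - inside[0]
    return depth, width, area


# Toy spectrum over the 530-770 nm range with one symmetric absorption feature.
wl = [530, 590, 650, 710, 770]
refl = [0.2, 0.1, 0.05, 0.1, 0.2]
depth, width, area = absorption_features(wl, refl)
print(depth, width, area)
```

The sketch expects positive reflectance values and at least two bands in increasing wavelength order.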

Classification Algorithm
To achieve spectral classification among different plants, three classification algorithms were tested: Euclidean distance (ED), support vector machines (SVM) and artificial neural network (ANN). The ED provides a measure of the distance between two pixels, or between a reference spectrum and a test spectrum, in the L-dimensional feature space:

$$ED(\mathbf{r}, \mathbf{t}) = \sqrt{\sum_{i=1}^{L} (r_i - t_i)^2}$$

where $\mathbf{r} = (r_1, \dots, r_L)$ and $\mathbf{t} = (t_1, \dots, t_L)$ are the reference spectrum (a laboratory or pixel spectrum known to characterize a target of interest) and the test spectrum, respectively. L represents the spectral dimensionality and equals the number of bands of the hyperspectral data (Kong et al. 2010). The SVM is a new type of classifier that can efficiently overcome the Hughes phenomenon by directly seeking a separating surface (hyperplane) through an optimization procedure. The ANN has been shown to work well even with a small training sample size. The general principle of the ANN is illustrated in Fig. 3.
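As a small illustration of how the ED can act as a classifier, the sketch below assigns a test spectrum to the class of its nearest reference spectrum. The nearest-reference rule, the function names and the toy feature vectors are assumptions made for this example; the text does not specify how the reference spectra were formed.

```python
import math


def euclidean_distance(r, t):
    """Euclidean distance between a reference and a test feature vector."""
    if len(r) != len(t):
        raise ValueError("spectra must have the same number of bands")
    return math.sqrt(sum((ri - ti) ** 2 for ri, ti in zip(r, t)))


def classify_by_ed(test, references):
    """Return the label of the reference spectrum closest to `test`.

    `references` maps a class label to one reference vector (hypothetical layout).
    """
    return min(references, key=lambda label: euclidean_distance(references[label], test))


# Toy reference vectors for two hypothetical plant classes.
references = {
    "plant_a": [0.10, 0.50, 0.40],
    "plant_b": [0.30, 0.20, 0.60],
}
label = classify_by_ed([0.12, 0.45, 0.42], references)
print(label)
```

One plausible choice is to use the class mean of the training spectra as the reference vector, but that too is an assumption rather than something stated here.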

Classification with VIs
According to the sensitivity analysis, all 13 VIs passed the ANOVA with p-value < 0.01. After the cross-correlation check, 7 VIs were retained: GI, NRI, ACI, RVSI, NDVI, NPQI and TVI. The means of the 7 VIs are compared among all plant types in Fig. 5. The classification accuracies of ED, SVM and ANN are shown in Table 3.
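The two-stage screening (ANOVA, then a cross-correlation check) might look like the sketch below. Two simplifications are worth stating plainly: the sketch thresholds the one-way ANOVA F statistic instead of the p-value (with SciPy installed, `scipy.stats.f_oneway` returns both), and the correlation cut-off `r_max` is a hypothetical value, since the threshold used in the study is not reported. Feature names and data are toy values.

```python
import math


def anova_f(groups):
    """One-way ANOVA F statistic for a list of per-class value lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def pearson(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_features(features, labels, f_min, r_max):
    """Keep sensitive features, then drop those correlated with a better one."""
    scores = {}
    for name, values in features.items():
        groups = {}
        for value, lab in zip(values, labels):
            groups.setdefault(lab, []).append(value)
        scores[name] = anova_f(list(groups.values()))
    ranked = sorted((n for n in scores if scores[n] >= f_min), key=scores.get, reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(features[name], features[k])) < r_max for k in kept):
            kept.append(name)
    return kept


labels = ["a", "a", "b", "b", "c", "c"]
features = {
    "vi_1": [1.0, 1.2, 1.0, 1.2, 3.0, 3.2],  # separates class c
    "vi_2": [2.0, 2.6, 1.9, 2.7, 5.8, 6.6],  # noisy near-copy of vi_1
    "vi_3": [0.5, 0.9, 0.1, 1.3, 0.8, 0.6],  # same mean in every class
    "vi_4": [1.0, 1.2, 3.0, 3.2, 2.0, 2.2],  # separates all classes
}
kept = select_features(features, labels, f_min=10.0, r_max=0.9)
print(kept)
```

Ranking by F before pruning means that, of two redundant features, the more discriminative one survives.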

Discussions
In this study, which used hyperspectral data to classify 13 plant species, it is encouraging that satisfactory accuracy was achieved by spectral classification. Based on the sensitive VIs, the OAA produced by all three classification algorithms was over 84%.
However, the OAA of the classification models based on spectral shape features was below 55% in all cases, which is not acceptable for application. Compared with the spectral shape features, the significant differences in VIs among plants may account for their higher classification accuracy (Fig. 5, Fig. 6). Among the three classification algorithms, the highest accuracy was produced by SVM, with an OAA over 99%, whereas the lowest was produced by ANN. The classification ability of the spectral shape features is weak, and the selection and extraction of more sensitive features remain to be explored.

Concluding Remarks
In this study, hyperspectral data were used to classify plant species. (1) A total of 7 VIs and 7 spectral shape features were identified as the most suitable for plant classification in our case.

Figure 1. Example of hyperspectral curves of different plants and their canopy pictures

Figure 2. Illustration of CRM

Figure 3. Neural network structure

Figure 4. Flowchart of the entire analysis

Based on the two types of spectral features, plant classification models were established using ED, SVM and ANN, respectively. The models were trained with 60% of the data (n=195) and validated against the remaining 40% (n=129). Classification accuracy was evaluated using the overall accuracy (OAA) and the kappa coefficient (Congalton and Mead 1983; Story and Congalton 1986). A flowchart of the entire analysis is shown in Fig. 4.
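For reference, the two accuracy measures can be computed from a confusion matrix as in the sketch below. The overall accuracy is the share of correctly classified validation samples, and the kappa coefficient corrects that share for chance agreement, kappa = (p_o - p_e) / (1 - p_e). The function names and the six toy labels are hypothetical; they are not the validation data of this study.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def kappa_coefficient(matrix):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    total = sum(sum(row) for row in matrix)
    p_observed = overall_accuracy(matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    p_expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Toy validation set with two classes and one misclassified sample.
true_labels = ["a", "a", "a", "b", "b", "b"]
predicted = ["a", "a", "b", "b", "b", "b"]
matrix = confusion_matrix(true_labels, predicted, classes=["a", "b"])
oaa = overall_accuracy(matrix)
kappa = kappa_coefficient(matrix)
print(matrix, oaa, kappa)
```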

Figure 5. Statistics of the VIs

(2) Based on the VIs and the SVM algorithm, the highest classification accuracy was achieved, with OAA = 0.99. The results suggest the great potential of hyperspectral data for plant classification. The framework for spectral-based vegetation classification can form a basis for a spectral knowledge base and its application technology, and can further support wide-ranging plant classification based on remote sensing.

Table 1. The 13 vegetation indices extracted from hyperspectral data

Besides the VIs, spectral shape features, including first-order derivative features and continuum removal (CRM) features, were also used. The derivative spectrum is the reflectance difference between two adjacent narrow bands, normalized by their wavelength interval. Spectral derivative analysis can partially eliminate the effects of the atmosphere and of the vegetation and environmental background (shadow, soil, etc.), thereby emphasizing some essential characteristics of the plant (Demetriades-Shah et al. 1990; Tsai and Philpot 1998). The first derivative spectrum can be calculated by:

$$\rho'(\lambda_i) \approx \frac{\rho(\lambda_{i+1}) - \rho(\lambda_{i-1})}{\Delta\lambda}$$
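A minimal sketch of the first derivative as a central difference, (rho[i+1] - rho[i-1]) / (lambda[i+1] - lambda[i-1]), is given below. The `edge_peak` helper, which returns the maximum derivative and its wavelength within a window, is only one plausible way to derive an edge feature; the specific derivative features used in the study are not listed in the text, and the toy values and window limits are hypothetical.

```python
def first_derivative(wavelengths, reflectance):
    """Central-difference derivative at every interior band.

    Returns (wavelength, derivative) pairs; the two end bands are skipped
    because they lack a neighbour on one side.
    """
    return [
        (
            wavelengths[i],
            (reflectance[i + 1] - reflectance[i - 1])
            / (wavelengths[i + 1] - wavelengths[i - 1]),
        )
        for i in range(1, len(reflectance) - 1)
    ]


def edge_peak(derivative, lower_nm, upper_nm):
    """Largest derivative and its wavelength inside [lower_nm, upper_nm]."""
    window = [(d, w) for w, d in derivative if lower_nm <= w <= upper_nm]
    peak_value, peak_wavelength = max(window)
    return peak_wavelength, peak_value


# Toy reflectance rising across a red-edge-like region, sampled every 10 nm.
wl = [680, 690, 700, 710]
refl = [0.05, 0.10, 0.25, 0.40]

deriv = first_derivative(wl, refl)
position, slope = edge_peak(deriv, 680, 710)
print(deriv, position, slope)
```

With evenly spaced bands the denominator equals twice the band spacing, which matches the description of the wavelength interval as twice the bandwidth.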