A Simple Baseline Correction Method for Raman Spectra

The following manuscript was published in
Yung-Sheng Chen and Yu-Ching Hsu, A Simple Baseline Correction Method for
Raman Spectra, IAENG Transactions on Engineering Sciences - Special Issue for the
International Association of Engineers Conferences 2019 (Edited By: Sio-Iong Ao,
Haeng Kon Kim, Oscar Castillo, Alan Hoi-shou Chan and Hideki Katagiri), World
Scientific Publishing Co Pte Ltd, pp. 21-32, 2020.
September 13, 2019 14:13 WSPC Proceedings - 9in x 6in ICEE-70-baseline page 1
A Simple Baseline Correction Method for Raman Spectra
Yung-Sheng Chen and Yu-Ching Hsu
Department of Electrical Engineering, Yuan Ze University,
Chungli, Taoyuan 320, Taiwan, ROC
E-mail: eeyschen@saturn.yzu.edu.tw
Baseline shifts, which arise from a variety of factors depending on the type
of spectroscopy, are a common degradation problem in Raman spectra. An
effective and efficient method is presented for the baseline correction of
Raman spectra. A kernel smoothing function is developed to estimate the
baseline, and a negative signal filter is designed to remove negative impulse
responses caused by the spectrometer. Results from the computer and
instrument implementations confirm the feasibility of the proposed method.
Keywords: Raman spectra; Spectrometers; Spectroscopic instrumentation;
Spectroscopy.
1. Introduction
In the applied sciences, many natural phenomena are observed and measured
with an appropriate instrument for further analysis and study. For example,
a color-band resistor (a commonly used electronic component) can be read
not only by multimeter measurement or by human observation and calculation,
but also by a computer vision based method.1–3 The multimeter is itself an
important instrument for sensing or measuring electronic parameters (e.g.,
voltage, current, resistance) and is indispensable in science and
technology. However, reading an analog multimeter usually relies on human
eyes, which is inefficient and fatiguing when the meter must be read over a
long period. This led us to develop a computer vision approach for
automatically reading the analog multimeter.4,5 Following this research
trend in instrumentation, a signal processing method plays the same role for
a Raman spectrometer as a reading technique (such as the computer vision
method) does for an analog multimeter. An effective and efficient baseline
correction algorithm for Raman spectra is thus addressed in this paper.
Raman spectroscopy is a well-established and important technique for
revealing molecular fingerprints based on vibrational information. Owing to
its label-free and non-invasive nature as well as its convenient
application, Raman spectroscopy has been widely applied to material
analysis,6 biological investigation,7 medical diagnosis,8 and so on. A
baseline is a common degradation problem in Raman spectra and usually
results from a spurious background signal or instrument fluctuation. It
severely hampers the quantitative or qualitative analysis of Raman spectra.
Baseline correction is therefore a necessary step before Raman spectra
analysis. The methods are usually classified into three main types, i.e.,
smoothing spline fitting, wavelet decomposition and integration, and least
squares error modelling (or divided into physically and mathematically
motivated approaches). The merits and disadvantages of those methods have
been investigated in the related literature.6,7,9 A well-known commercial
software tool, Origin/OriginPro developed by OriginLab Corporation,10 has
been widely used in the related application fields; peak analysis is one of
its major functions and includes the baseline correction function that is
the focus of our study. Recently, with the progress of micro Raman
spectroscopy, e.g., the Micro Raman Identify (MRI) spectrometer,11 users
usually need a friendly interface to correct the baseline quickly with only
a few controllable parameters. To achieve this goal, an effective and
efficient baseline correction algorithm is presented in this paper. Part of
this work was presented at IMECS 2019.12
The rest of this paper is organized as follows. Section 2 illustrates
baseline estimation and correction and first introduces the baseline model.
A kernel smoothing function and a so-called negative signal removal are then
developed to construct our baseline correction algorithm. Section 3 shows
the experimental results and discusses the behaviour of the parameters. The
conclusion and future work are given in Section 4.
2. Proposed Approach
All the samples of Raman spectra used in this study are from the MRI
spectrometer.11 Two samples are illustrated in Fig. 1, where the baseline
(red colour) is estimated by the proposed method. Note that the negative
impulse responses resulting from the MRI spectrometer can also be filtered
out by the proposed method, as illustrated in Fig. 1(b). The final baseline
correction results are shown in Fig. 2. The details of the presented
algorithm are described as follows.
Fig. 1. Illustrations of the measured Raman spectrum and the baseline (red
colour) estimated by the proposed algorithm. (a) Sample-1: Measured Raman
spectrum including the baseline. (b) Sample-2: Measured Raman spectrum
including not only the baseline but also the negative impulse responses.
2.1. Baseline removal modeling
Let $f_{raw}$ be the raw observed Raman spectrum having $N$ data points.
Generally, the raw data $f_{raw}$ is composed of the true signal ($s$), the
baseline ($b$), and the noise ($n$). It can thus be modelled as
$$ f_{raw} = s + b + n \tag{1} $$
With the quality of present-day instruments, the noise term $n$ can be
ignored since it has little effect compared to the baseline. Therefore, the
signal $s$ can be estimated by subtracting the baseline $b$ from the raw
data $f_{raw}$. In other words, estimating the baseline is the main task in
extracting the signal $s$. Let $f_{baseline}$ be the estimated baseline.
Then the corrected signal $f_{correct}$ can be expressed as
$$ f_{correct} = f_{raw} - f_{baseline} \tag{2} $$
Fig. 2. Results of baseline correction by the proposed algorithm. (a)
Baseline correction result for the Raman spectrum given in Fig. 1(a). (b)
Baseline correction result for the Raman spectrum given in Fig. 1(b).
2.2. Smooth function
From the characteristics of Raman spectra, the baseline can be regarded, by
visual inspection, as a stable component that carries the signal. Therefore,
the key function for estimating the baseline is developed first in this
study: it finds the local minima of the raw data and smooths them, as the
SMOOTH function depicted in Fig. 3(a). Here $f_{in}$ is the input signal,
whereas $f_{min}$ and $f_{avg}$ are the resultant smoothed local minima and
averaged information. Assuming the $i$-th data point is being processed, the
sliding window is defined as $[i-W, i+W]$, where $W$ is a positive integer
that controls the window size. The functional blocks (MIN, MEAN, and AVG)
shown in Fig. 3(a) are formulated respectively as follows.
$$ \mathrm{MIN}_W(i) = \min_{j \in [i-W,\, i+W]} f_{in}(j) \tag{3} $$
$$ \mathrm{MEAN}_W(i) = \Big( \sum_{j \in [i-W,\, i+W]} \mathrm{MIN}_W(j) \Big) / (2W+1) \tag{4} $$
$$ \mathrm{MEAN}_{4W}(i) = \Big( \sum_{j \in [i-4W,\, i+4W]} \mathrm{MEAN}_W(j) \Big) / (8W+1) \tag{5} $$
$$ \mathrm{AVG}_W(i) = \big( \mathrm{MEAN}_W(i) + \mathrm{MEAN}_{4W}(i) \big) / 2 \tag{6} $$
Fig. 3. Signal flow of the proposed method. (a) SMOOTH function block
diagram with MIN, MEAN and AVG operations for finding useful local
stationary information. (b) Flowchart of removing the negative impulse
responses, extracting the baseline, and obtaining the baseline correction
result.
Here $\mathrm{MIN}_W$ computes the fundamental local minima of the input
signal, and after the processing of $\mathrm{MEAN}_W$ the smoothed local
minima $f_{min}$ are obtained. However, in our study on the MRI
spectrometer,11 some negative impulse responses (possibly resulting from the
sensor device or the integrated mechanism) may occur during signal
acquisition, as in Sample-2 shown in Fig. 1(b). To filter out such an
unwanted negative signal, it is reasonable to provide a smoother local
minimum as a reference for comparison. A wider $\mathrm{MEAN}_{4W}$ function
is thus applied to the output of $\mathrm{MEAN}_W$ and then combined with
$\mathrm{MEAN}_W$ by $\mathrm{AVG}_W$ to obtain the useful reference signal
$f_{avg}$. Wherever the mean input signal is less than the reference, the
input is replaced by the reference signal, as formulated in Eq. (7);
otherwise the original input signal is retained.
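The SMOOTH block of Eqs. (3)-(6) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' C++ implementation. The function names `window_reduce` and `smooth`, the synthetic test spectrum, and the boundary handling are all assumptions: windows are clipped at the ends of the array, so near the edges the mean divides by the actual number of samples rather than by the fixed 2W+1 or 8W+1 of Eqs. (4)-(5).

```python
import numpy as np


def window_reduce(x, half_width, reduce_fn):
    """Apply reduce_fn to x over [i - half_width, i + half_width] for each i.

    Assumption: windows are clipped at the array ends, so edge windows hold
    fewer than 2 * half_width + 1 samples.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    out = np.empty(n)
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out[i] = reduce_fn(x[lo:hi])
    return out


def smooth(f_in, w):
    """Return (f_min, f_avg) for input f_in and window parameter w."""
    min_w = window_reduce(f_in, w, np.min)            # Eq. (3)
    mean_w = window_reduce(min_w, w, np.mean)         # Eq. (4)
    mean_4w = window_reduce(mean_w, 4 * w, np.mean)   # Eq. (5)
    avg_w = 0.5 * (mean_w + mean_4w)                  # Eq. (6)
    return mean_w, avg_w


# Synthetic spectrum (illustrative only): sloped baseline plus one narrow peak.
x = np.arange(500)
f_raw = 100.0 + 0.1 * x + 50.0 * np.exp(-((x - 250) / 5.0) ** 2)
f_min, f_avg = smooth(f_raw, w=15)
```

Because every window used in Eq. (4) around index i contains i itself, each averaged minimum is no larger than the input at i, so `f_min` never exceeds the input. This is why it can serve as a baseline that passes underneath the peaks.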
2.3. Negative signal removal
The function block (NegSignal Filter)$_W$ in Fig. 3(b) performs this
negative signal removal, which is formulated as
$$ f_{out}(i) = \begin{cases} f_{ref}(i), & \bar{f}_{in}(i) < \bar{f}_{ref}(i) \\ f_{in}(i), & \text{otherwise} \end{cases} \tag{7} $$
Here $\bar{f}_{in}$ represents the mean signal obtained from the input
signal $f_{in}$, computed by the formula in Eq. (4) with window size $W$;
$\bar{f}_{ref}$ is obtained in the same way. Since the negative impulse
responses may be large or small, as illustrated in Fig. 1(b),
(NegSignal Filter)$_W$ cascaded with SMOOTH and (NegSignal Filter)$_{W/3}$
cascaded with SMOOTH are adopted for the removal of large and small negative
signals, respectively, as depicted in Fig. 3(b). Together they can be
regarded as a two-stage negative signal filter. Here SMOOTH successively
extracts the useful $f_{min}$ and $f_{avg}$ from the filtered signal
produced by the NegSignal Filter. In this case, the final baseline estimate
is the output $f_{min}$ located at the middle right of the signal flow in
Fig. 3(b). Note that if there are no negative impulse responses, the
two-stage negative signal filter need not be performed, and the baseline can
be output directly from the $f_{min}$ of the first SMOOTH function block.
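One possible reading of Eq. (7) is sketched below in Python. The text states that the mean signals are computed by the formula in Eq. (4); the sketch takes this literally as a windowed minimum followed by a windowed mean, which is an interpretation and not something the paper spells out. The names `mean_of_minima` and `neg_signal_filter`, the clipped windows at the array ends, and the flat toy reference are assumptions for illustration only. In the actual signal flow the reference would be the $f_{avg}$ output of SMOOTH.

```python
import numpy as np


def mean_of_minima(x, w):
    """Windowed minimum followed by windowed mean, in the form of Eqs. (3)-(4).

    Assumption: windows [i - w, i + w] are clipped at the array ends.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    mins = np.array([x[max(0, i - w):i + w + 1].min() for i in range(n)])
    return np.array([mins[max(0, i - w):i + w + 1].mean() for i in range(n)])


def neg_signal_filter(f_in, f_ref, w):
    """Eq. (7): output f_ref where the mean of f_in is below the mean of f_ref."""
    f_in = np.asarray(f_in, dtype=float)
    f_ref = np.asarray(f_ref, dtype=float)
    below = mean_of_minima(f_in, w) < mean_of_minima(f_ref, w)
    return np.where(below, f_ref, f_in)


# Toy input: flat signal with one negative impulse and a flat reference.
f_in = np.full(101, 100.0)
f_in[50] = 20.0
f_ref = np.full(101, 90.0)  # hypothetical reference, not the paper's f_avg
f_out = neg_signal_filter(f_in, f_ref, w=5)
```

In this toy case the impulse at index 50 is lifted to the reference level. Because the comparison uses windowed means, the neighbouring samples whose mean falls below the reference (indices 41 to 59 here) are replaced as well.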
2.4. Baseline correction algorithm
The whole signal flow of the proposed algorithm is depicted in Fig. 3(b).
Let $P$ be a binary parameter controlling a multiplexer (MUX), which outputs
the estimated baseline $f_{baseline}$ from either the two-stage negative
signal filter ($P = 1$) or the first SMOOTH function block ($P = 0$). The
baseline correction result $f_{correct}$ is then obtained by the Signal
Subtraction function block. For the Raman spectra shown in Fig. 1, the
estimated baselines (red colour) and the baseline correction results shown
in Fig. 2 demonstrate the feasibility of the proposed algorithm.
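The signal flow described above can be put together as the following Python sketch. It is a hedged illustration, not the instrument software. The helper names, the clipped windows at the array ends, the use of the same W in every SMOOTH block, and rounding W/3 down to an integer of at least 1 are assumptions. The filter follows one literal reading of Eq. (7), with mean signals computed in the form of Eq. (4). The subtraction follows Eq. (2) and therefore uses the raw spectrum.

```python
import numpy as np


def _window(x, w, fn):
    """Apply fn over the clipped window [i - w, i + w] for every index i."""
    return np.array([fn(x[max(0, i - w):i + w + 1]) for i in range(len(x))])


def smooth(f_in, w):
    """SMOOTH block: returns (f_min, f_avg) per Eqs. (3)-(6)."""
    mean_w = _window(_window(f_in, w, np.min), w, np.mean)
    mean_4w = _window(mean_w, 4 * w, np.mean)
    return mean_w, 0.5 * (mean_w + mean_4w)


def neg_signal_filter(f_in, f_ref, w):
    """NegSignal Filter block, one reading of Eq. (7)."""
    mean_in = _window(_window(f_in, w, np.min), w, np.mean)
    mean_ref = _window(_window(f_ref, w, np.min), w, np.mean)
    return np.where(mean_in < mean_ref, f_ref, f_in)


def baseline_correct(f_raw, w, p):
    """Return (f_baseline, f_correct) for window parameter w and switch p."""
    f_raw = np.asarray(f_raw, dtype=float)
    f_min, f_avg = smooth(f_raw, w)
    if p == 1:
        w_small = max(1, w // 3)  # assumption: W/3 rounded down, at least 1
        stage1 = neg_signal_filter(f_raw, f_avg, w)
        f_min, f_avg = smooth(stage1, w)
        stage2 = neg_signal_filter(stage1, f_avg, w_small)
        f_min, _ = smooth(stage2, w)
    f_baseline = f_min                     # MUX output selected by p
    return f_baseline, f_raw - f_baseline  # Eq. (2)


# Toy spectrum: flat background, one peak, one negative impulse.
f_raw = np.full(201, 100.0)
f_raw[60] += 50.0
f_raw[100] = 20.0
base_p0, corr_p0 = baseline_correct(f_raw, w=5, p=0)
base_p1, corr_p1 = baseline_correct(f_raw, w=5, p=1)
```

For this toy input, P = 0 lets the impulse pull the baseline down to the impulse value at index 100, whereas P = 1 keeps the baseline above it. The peak at index 60 is recovered with its full height in the P = 0 result.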
Table 1. Performance of the proposed algorithm on the five samples from the
Micro Raman Identify (MRI) spectrometer.11
Sample     Data size   W    P   Execution time (ms)
Sample-1   1559        19   0   5
Sample-2   1979        15   1   15
Sample-3   1010        15   0   3
Sample-4   1014        15   0   3
Sample-5   1973        31   1   23
3. Result and Discussion
The algorithm is implemented in Microsoft® Visual Studio® C++ 2013 and run
on a laptop with an Intel® Core™ 2 Duo 1.8 GHz CPU and 4 GB of RAM. In
addition to the baseline correction results of Sample-1 and Sample-2 shown
in Fig. 2, the results of three other samples are given in Fig. 4 for
further evaluation. Fig. 4(a) shows that our method can effectively estimate
the baseline of a background spectrum (Sample-3). Fig. 4(b) shows the result
for a material measured under that background (Sample-4). Fig. 4(c) shows
the baseline estimation result for a spectrum corrupted by some negative
impulse responses, and its baseline correction result is shown in Fig. 4(d)
(Sample-5). The performance reported in Table 1 is evaluated in terms of
data size, W, P, and execution time (in milliseconds, ms), where the
parameters W and P can be selected by the user. It can be observed that the
execution time of the presented method is positively related to the data
size and the window size W.
Before concluding this paper, the behaviours of W and P are briefly
discussed. For Sample-1, the baseline estimation results using (a) W = 7,
(b) W = 9, and (c) W = 31 are shown in Fig. 5. Compared with the result
using W = 19 shown in Fig. 1(a), a small W (see Fig. 5(a) and 5(b)) results
in under-fitting, whereas a large W (see Fig. 5(c)) results in over-fitting.
Since Raman spectra vary widely, it is reasonable to select a proper W
heuristically. For Sample-2, with the same W = 15 but P = 0, the baseline
estimation is influenced by the negative impulse signal, as shown in Fig. 6;
compare the result in Fig. 1(b), where the negative signal is removed by
setting P = 1.
Fig. 4. Results of the other samples for baseline correction. (a) The
estimated baseline fits well for a background spectrum. (b) Result of a
material measured under such a background in (a). (c) The estimated baseline
for a spectrum corrupted by some negative impulse responses. (d) The final
baseline correction result for the spectrum in (c).
Our experiments confirm the effectiveness and efficiency of the proposed
baseline correction algorithm for Raman spectra. As a result, for the
intended application of micro Raman spectroscopy, e.g., the Micro Raman
Identify (MRI) spectrometer,11 the presented approach achieves the goal of
giving users a friendly interface to correct the baseline quickly with only
a few controllable parameters.
4. Conclusion and Future Work
In Raman spectrometry, it is of great importance to correct the baseline
before performing quantitative and qualitative analysis of a raw Raman
spectrum. The presented algorithm can effectively not only correct the
baseline but also remove the negative impulse responses. Execution times on
the order of milliseconds demonstrate the efficiency of the proposed method.
The algorithm has been implemented in the instrument software of the Micro
Raman Identify (MRI) spectrometer,11 which confirms its feasibility in
engineering and laboratory applications. Because peak fitting or peak
decomposition is also an important function in peak analysis, it is a
natural topic in this direction and is regarded as our future work.
Acknowledgment
This work was supported in part by Protrustech (Taiwan), a professional
instrument company working on Raman spectroscopy, photoluminescence,
photoreflectance, and so on.
Fig. 5. Results for Sample-1 with the parameter: (a) W = 7 (severe
under-fitting), (b) W = 11 (little under-fitting), (c) W = 31
(over-fitting).
Fig. 6. Result for Sample-2 with the parameter P = 0, where the negative
signal removal is not performed.
References
1. Y.-S. Chen and J.-Y. Wang, Reading resistor based on image processing,
in Proc. IEEE Int. Conference on Machine Learning and Cybernetics, ICMLC
2015, (Guangzhou, China, 2015, pp. 566–571).
2. Y.-S. Chen and J.-Y. Wang, Implementation of cost-effective diffuse light
source mechanism to reduce specular reflection and halo effects for
resistor-image processing, in Proc. of SPIE Optical Engineering +
Applications: Applications of Digital Image Processing XXXVIII, (San Diego,
CA, USA, 2015, pp. 959928-1-14).
3. Y.-S. Chen and J.-Y. Wang, Computer vision on color-band resistor and its
cost-effective diffuse light source design, Journal of Electronic Imaging
25, 061409 (2016).
4. Y.-S. Chen and J.-Y. Wang, A novel approach of reading analog multimeter
based on computer vision, in Proc. IEEE Int. Conference on Applied System
Innovation, ICASI 2018, (Chiba, Tokyo, Japan, 2018, pp. 758–761).
5. Y.-S. Chen and J.-Y. Wang, Computer vision-based approach for reading
analog multimeter, Applied Sciences 8, 1268 (2018).
6. M. Koch, C. Suhr, B. Roth and M. Meinhardt-Wollweber, Iterative
morphological and mollifier-based baseline correction for Raman spectra,
Journal of Raman Spectroscopy 48, 336 (2017).
7. S. Guo, T. Bocklitz and J. Popp, Optimization of Raman-spectrum baseline
correction in biological application, Analyst 141, 2396 (2016).
8. P. Kumar, T. Bhattacharjee, M. Pandey, A. Hole, A. Ingle and C. M.
Krishna, Raman spectroscopy in experimental oral carcinogenesis:
investigation of abnormal changes in control tissues, Journal of Raman
Spectroscopy 47, 1318 (2016).
9. J. Liu, J. Sun, X. Huang, G. Li and B. Liu, Goldindec: a novel algorithm
for Raman spectrum baseline correction, Applied Spectroscopy 69, 834 (2015).
10. Peak Analysis in the software of Origin or OriginPro, OriginLab
Corporation, USA, https://www.originlab.com/.
11. Raman Spectroscopy, Protrustech CO., LTD, Taiwan,
https://www.protrustech.com/mri.html.
12. Y.-S. Chen and Y.-C. Hsu, Effective and efficient baseline correction
algorithm for Raman spectra, in Lecture Notes in Engineering and Computer
Science: Proceedings of The International MultiConference of Engineers and
Computer Scientists 2019, IMECS 2019, (Hong Kong, 13-15 March, 2019,
pp. 295–298).