Astronomy & Astrophysics manuscript no. output ©ESO 2023
October 10, 2023
A versatile classification tool for galactic activity using optical and
infrared colors
C. Daoutis1,2, E. Kyritsis1,2, K. Kouroumpatzakis3,2,1, and A. Zezas1,2,4
1Physics Department, and Institute of Theoretical and Computational Physics, University of Crete, 71003 Heraklion, Greece
e-mail: cdaoutis@physics.uoc.gr
2Institute of Astrophysics, Foundation for Research and Technology-Hellas, 71110 Heraklion, Greece
3Astronomical Institute, Academy of Sciences, Boční II 1401, CZ-14131 Prague, Czech Republic
4Center for Astrophysics | Harvard & Smithsonian, 60 Garden St., Cambridge, MA 02138, USA
Received May 26, 2023; Accepted October 2, 2023
ABSTRACT
Context. The overwhelming majority of diagnostic tools for galactic activity are focused mainly on the classes of active galaxies.
Passive or dormant galaxies are often excluded from these diagnostics, which usually employ emission-line features (e.g., forbidden
emission lines). Thus, most of them focus on specific types of activity or only on one activity class, for example active galactic nucleus
(AGN) galaxies.
Aims. In this work we used infrared and optical colors to build an all-inclusive galactic activity diagnostic tool that can discriminate
between star-forming, AGN, low-ionization nuclear emission-line region, composite, and passive galaxies, and which can be used in
local and low-redshift galaxies.
Methods. We used the random forest algorithm to define a new activity diagnostic tool. As the ground truth for the training of the
algorithm, we considered galaxies that have been classified based on their optical spectral lines. We explored classification criteria
based on infrared colors from the first three WISE bands (bands 1, 2, and 3) supplemented with optical colors from the u, g, and r
SDSS bands. From them, we sought the combination with the minimum number of colors that provides optimal results. Furthermore,
to mitigate biases related to aperture effects, we introduced a new WISE photometric scheme that combines apertures of different
sizes.
Results. Using machine learning methods, we developed a diagnostic tool that accommodates both active and passive galaxies under
one unified classification scheme using just three colors. We find that the combination of W1-W2, W2-W3, and g-r colors offers
good performance, while the broad availability of these colors for a large number of galaxies ensures it can be applied to large galaxy
samples. The overall accuracy is ~81%, and the achieved completeness for each class is ~81% for star-forming, ~56% for AGN,
~68% for LINER, ~65% for composite, and ~85% for passive galaxies.
Conclusions. Our diagnostic represents a significant improvement over existing infrared diagnostics because it includes all types
of active galaxies, as well as passive galaxies, extending their application to the local Universe. The inclusion of the optical colors
improves its ability to identify low-luminosity AGN galaxies, which are generally confused with star-forming galaxies, and helps
us identify cases of starbursts with extreme mid-infrared colors that mimic obscured AGN galaxies, a well-known problem for most
infrared diagnostics.
Key words. galaxies: active – galaxies: star formation – galaxies: starburst – galaxies: Seyfert – infrared: galaxies – methods:
statistical
1. Introduction
Galaxies can be classified into different categories based on their
activity. Some form new stars (i.e., star-forming galaxies, also
referred to as H ii galaxies due to their H ii-region-like spectra),
while others present intense nuclear activity fueled by the su-
permassive black hole (SMBH) in their active galactic nucleus
(AGN). Some galaxies simultaneously exhibit both of these be-
haviors. They are known as composite galaxies or transition ob-
jects (e.g., Ho et al. 1997). In another galactic category, we find
galaxies that host old stellar populations, contain small amounts
of gas or dust, and do not exhibit any star formation or nu-
clear activity. These are the passive galaxies. Finally, there are
also the low-ionization nuclear emission-line region (LINER)
galaxies (Heckman 1980). These galaxies can be separated into
two distinct categories: those powered by a SMBH (type 1; Ho
et al. 1997) and those for which the source of excitation is UV
emission from post-asymptotic-giant-branch stars (Binette et al.
1994; Stasińska et al. 2008; Papaderos et al. 2013).
Before now, the best way to discriminate between these four
classes of active galaxies (star-forming, AGN, LINER, and com-
posite) had been via the use of the emission-line ratio diagrams
introduced by Baldwin et al. (1981), hereafter BPT diagrams.
These are two-dimensional diagrams that separate galaxies into
H ii regions (star-forming), AGN (Seyfert), LINER, and composite
classes using characteristic emission-line flux ratios.
The most commonly used version of this diagram is a plot of
[O iii]λ5007/Hβ against [N ii]λ6584/Hα, [S ii]λλ6716,6731/Hα,
or [O i]λ6300/Hα (Kewley et al. 2001; Kauffmann et al. 2003;
Schawinski et al. 2007). The classification of a galaxy depends
on its location on the diagram. Although it has been a highly ac-
curate and reliable method for galactic activity classification for
many years, it presents some disadvantages. One is that in or-
der to classify a galaxy, one needs to obtain an optical spectrum,
Article published by EDP Sciences, to be cited as https://doi.org/10.1051/0004-6361/202347016
which can be challenging for very large samples of galaxies. A
second reason is absorption by the interstellar medium, which
may obscure the AGN emission. Additionally, some emission
lines are weak, hampering the application of these diagnostics to
faint objects. In order to overcome these difficulties, new methods
for classifying galaxies that use infrared photometry, specifically
in the mid-infrared (3-24 µm) part of the spectrum, have
emerged. The use of photometry allows the diagnostic to be ap-
plied to large samples of galaxies, and the use of infrared data
allows the identification of obscured AGNs.
Observations with the Spitzer Space Telescope (Werner et al.
2004) led to the development of the first versatile activity diag-
nostics in the near- to mid-infrared by Stern et al. (2005) and
Donley et al. (2012). Subsequently, the launch of the Wide-field
Infrared Survey Explorer (WISE) satellite (Wright et al. 2010)
enabled systematic studies of large populations of galaxies by
providing sensitive all-sky photometry in the 3-24 µm range; its
four bands, W1, W2, W3, and W4, have effective wavelengths
of 3.4 µm, 4.6 µm, 12.0 µm, and 22.0 µm, respectively. This led
to the development of a new family of diagnostic tools.
One widely used diagnostic for AGN identification based
on WISE infrared photometry is the criterion of W1-W2 ≥ 0.8
(Stern et al. 2012). Another diagnostic is based on the W1-W2
color against the W2-W3 color (Mateos et al. 2012).
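The single-color criterion above amounts to a one-line threshold test. The following is a minimal sketch, assuming the W1 and W2 magnitudes are given in the photometric system in which the cut is defined; the function name and the example magnitudes are hypothetical and serve only as illustration.

```python
def is_stern_agn_candidate(w1_mag: float, w2_mag: float, threshold: float = 0.8) -> bool:
    """Return True when the W1-W2 color satisfies the single-color AGN cut."""
    return (w1_mag - w2_mag) >= threshold


# Made-up magnitudes, for illustration only.
print(is_stern_agn_candidate(14.0, 13.0))  # red W1-W2 color, selected
print(is_stern_agn_candidate(14.0, 13.7))  # bluer W1-W2 color, not selected
```

A cut of this kind selects only sources with red W1-W2 colors, so any AGN whose mid-infrared emission is diluted by its host galaxy falls below the threshold.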
Even though these two infrared AGN selection methods have
had great success in identifying high-redshift galaxies in several
surveys (e.g., CANDELS; Koekemoer et al. 2011), they are tai-
lored toward higher-redshift, more luminous, or obscured AGNs.
In fact, the diagnostic of Mateos et al. (2012) was built based on
an X-ray-selected sample of AGNs. However, the application of
such diagnostics to other samples of galaxies shows that they fail
to identify a large population of AGNs, especially in the local
Universe. In a sample of galaxies taken from the Sloan Digital
Sky Survey (SDSS), most of the AGN galaxies are located below
the W1-W2 = 0.8 AGN selection line of Stern et al. (2012)
or are located outside the AGN wedge of Mateos et al. (2012;
see Sect. 5.4).
In order to overcome this limitation, we have developed
a new mid-infrared-optical color activity diagnostic using ad-
vanced methods, including machine learning algorithms, to sup-
plement and enhance the performance of the existing diagnostic
methods. The main reason we considered these algorithms as
the basis of our diagnostic tool is that there is strong mixing between
the mid-infrared colors of the different types of galaxies.
Sensitive all-sky surveys provide photometric data for millions
of galaxies, and machine learning methods allow us to efficiently
exploit these rich databases and capture their complexity in mul-
tidimensional parameter spaces.
Since in this work we do not use emission lines, we are able
to include the class of passive galaxies, which is often excluded
in standard diagnostic tools. Therefore, we embarked on the de-
velopment of a new activity diagnostic based on infrared (WISE)
and optical (SDSS) photometry and machine learning methods.
More specifically, this new diagnostic utilizes three colors in order
to classify galaxies into five different activity classes: star-forming
(SF), AGN, LINER, composite, and passive.
The paper is organized as follows. In Sect. 2 we describe the
data, introduce the photometry scheme, and describe the methods
used for the selection of each galactic activity class. In Sect. 3
we introduce the classification method. In Sect. 4 we present
the results of the training of our diagnostic tool and investigate
its performance. In Sect. 5 we discuss the achieved results and
the limitations of the tool, and we explore the reliability of the
classifier. We also compare our results with other widely used infrared
classification methods for AGNs. In Sect. 6 we summarize
our conclusions.
2. Data accumulation
2.1. The Sloan Digital Sky Survey
Our main galaxy sample is drawn from the SDSS, a northern
sky survey that provides homogeneous and high-quality pho-
tometric and spectral data. For the activity classification of the
galaxies in our sample (see Sect. 2.3) we used the spectroscopic
information provided by the SDSS-MPA-JHU catalog (Kauffmann
et al. 2003; Brinchmann et al. 2004; Tremonti et al. 2004).
This catalog includes spectroscopic line and redshift measurements
for more than one million galaxies within the SDSS footprint.
In order to obtain photometric data for these galaxies we
cross-matched the galaxies with reliable measurements
(RELIABLE ≠ 0) with the SDSS DR16 photometric catalog based
on their specObjID. The SDSS DR16 provides measurements
for a number of surface brightness profiles and aperture sizes in
five filters: u, g, r, i, and z. For our purposes, we opted to use the
fiberMag (the flux within a 3′′ aperture, matched to the SDSS
spectrograph fibers) and cModelMag photometry. The cModelMag
photometry is based on a radial profile that is a linear combination
of the best-fit exponential and de Vaucouleurs profiles.
The fiberMag is a good approximation of the flux in a galaxy’s
nucleus (especially for the nearby galaxies) and the cModelMag
profile gives the total flux of a galaxy in a given band.
2.2. Wide-field Infrared Survey Explorer photometry
The WISE satellite (Wright et al. 2010) mapped almost the entire
sky. The WISE All-Sky Release Source Catalog covers 42,195
deg2, or 99.86 % of the entire sky in four broad bands in the
3-25 µm range. Its bands W1, W2, W3, and W4 have effective
wavelengths at 3.4 µm, 4.6 µm, 12 µm, and 22 µm, respectively.
Their angular resolution was 6.1, 6.4, 6.5, and 12 arcseconds, re-
spectively. The WISE survey provides several advantages for the
classification of large populations of galaxies: it is more sensi-
tive than previous broadband infrared surveys; it covers the 3-25
µm range, which includes several important diagnostic features,
for example the polycyclic aromatic hydrocarbon (PAH) emis-
sion features, primarily found in SF galaxies; and the 3-20 µm
continuum probes the transition from the stellar continuum to
dust emission of a galaxy that hosts an AGN.
The WISE survey offers different photometry profiles and
apertures. In this project, we used the w?mag_2 and w?gmag
photometry (the question mark stands for the band number: 1, 2,
3, and 4 for W1, W2, W3, and W4, respectively). The w?mag_2
photometry is the calibrated source brightness measured within
a circular aperture of 8.25 arcseconds radius centered on the
source position for every WISE band, with no curve-of-growth
correction applied. The background sky was measured
from an annulus with an inner and outer radius of 50 and 70 arcseconds,
respectively. The w?gmag photometry is based on elliptical
aperture photometry for every WISE band. The parameters of
the elliptical apertures (semimajor axis and position angle) are
based on the 2MASS survey (Skrutskie et al. 2006). In addition,
the WISE survey provides extended source photometry, which,
however, is subject to significant photometric uncertainties due
to the low signal-to-noise ratio in the lower-surface-brightness
regions of the galaxies.
As this project aims to study galaxies in the local Uni-
verse, we started our analysis by considering photometry from
the w?gmag photometric aperture as these galaxies will appear
extended in the WISE apertures. The use of a fixed photomet-
ric aperture means some of the galaxy emission for the nearest
galaxies will be missed and, most importantly, an increasingly
large galactic region will be included for more distant galaxies.
Although this may dilute some of the nuclear (AGN) emission,
it allows the application of the diagnostic to a wide range of dis-
tances, from local galaxies to more distant unresolved ones. We
find that 20% of the galaxies in our sample appear extended in
the WISE apertures (ext_flg ≠ 0). This reduces aperture effects
and allows the application of the diagnostic even to very
local galaxies (z ∼ 0).
For more distant objects that are unresolved by WISE, we
used the w?mag_2 photometry aperture. The reason for choos-
ing the w?mag_2 over other similar WISE photometry apertures
(e.g., w?mag_1) was that the former has an aperture radius sim-
ilar to that of the PSF. For each of the four individual WISE
bands, the w?gmag photometry is kept for all galaxies that have
measurements in that aperture, and the w?mag_2 photometry
is used for all galaxies that did not have measurements on the
w?gmag aperture. The consideration of the integrated photome-
try eliminates any aperture-related bias resulting from the large
distance range of our galaxies, since galaxies that belong to the
same activity class but have different distances will now have
the same colors. Given the different photometry apertures available
in the WISE catalog, in order to overcome this photometric
bias, our photometry consists of the extended apertures for
the resolved and of the point-like apertures for the unresolved
sources. In addition, spiral galaxies tend to have H ii regions scat-
tered across the galaxy disk and a bulge region dominated by old
stellar populations in the center (e.g., Feltzing & Gilmore 2000;
Ortolani et al. 2001). This hybrid photometry scheme is ideal
for accounting for the infrared emission of these gas regions and
also avoids confusion in the classification process due to aperture
effects.
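The per-band selection rule of the hybrid scheme can be sketched as follows. This is only an illustration under stated assumptions: each source is represented as a dictionary keyed by the w?gmag and w?mag_2 names used above, a missing measurement is represented by None, and the function name and example values are hypothetical.

```python
from typing import Dict, Optional

WISE_BANDS = (1, 2, 3, 4)


def hybrid_wise_magnitudes(row: Dict[str, Optional[float]]) -> Dict[str, Optional[float]]:
    """Pick, band by band, the elliptical-aperture magnitude when it exists
    and fall back to the fixed circular aperture otherwise."""
    hybrid = {}
    for band in WISE_BANDS:
        extended = row.get(f"w{band}gmag")     # elliptical aperture (resolved sources)
        point_like = row.get(f"w{band}mag_2")  # 8.25 arcsec circular aperture
        hybrid[f"W{band}"] = extended if extended is not None else point_like
    return hybrid


# Made-up source: W1 has both measurements, W2 and W3 only the circular one.
example = {"w1gmag": 12.0, "w1mag_2": 12.4, "w2gmag": None, "w2mag_2": 12.1, "w3mag_2": 9.3}
print(hybrid_wise_magnitudes(example))
```

In this sketch a band with neither measurement is returned as None, so that the caller can decide how to treat incomplete sources.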
Lang et al. (2016) obtained WISE integrated photometry by
using higher-resolution WISE maps together with apertures from
the SDSS data, that is, WISE-forced (WF) photometry. This in-
formation is only available for the galaxies in the SDSS footprint
and therefore not appropriate for an all-sky sample. Nonetheless,
in Fig. 1 we compare our hybrid photometry with the WF photometry.
Since our diagnostic uses WISE colors, the plot shows a
one-to-one comparison of the two WISE colors, W1-W2 and W2-W3,
calculated with WF photometry and with our hybrid scheme.
The two methods show good agreement apart from a small systematic
offset of 0.1 mag in the W1-W2 color and 0.5 mag in the
W2-W3 color. We also color-coded the sources based on their
ext_flg value. A source with ext_flg = 0 has a shape
consistent with a point-source profile in WISE. The colors
measured with our hybrid scheme show no dependence on the
source extent.
2.3. Activity classes and passive galaxies
For the activity classification of galaxies with emission-line
spectra, we used the diagnostic tool defined by Stampoulis et al.
(2019). This is an extension of the generally used diagnostics
of Baldwin et al. (1981), Kewley et al. (2001), Kauffmann
et al. (2003), and Schawinski et al. (2007) that allows the simultaneous
use of all available diagnostic line ratios, avoiding
contradictory classifications and providing more robust results.
This scheme is based on fitting multivariate Gaussian distributions
to the four-dimensional emission-line ratio distributions
of log10([N ii]/Hα), log10([S ii]/Hα), log10([O i]/Hα), and
log10([O iii]/Hβ). For this reason, if we project these objects on
the two-dimensional BPT diagram, the tails of the distributions
of the different activity classes may not be confined within the
demarcation lines that separate the different activity classes defined
by Kewley et al. (2001), Kauffmann et al. (2003), and
Schawinski et al. (2007). The emission-line measurements were
obtained from the SDSS-MPA-JHU catalog (Kauffmann et al.
2003; Brinchmann et al. 2004; Tremonti et al. 2004). The classes
of galaxies considered in that classification scheme were: SF,
Seyfert, LINER, and composite. This diagnostic is based on a
probabilistic classifier. That means that based on the location
of an object in this four-dimensional space, one can also determine
the probability that it belongs to each of the classes that the
classifier has been designed to discriminate. In our analysis, we
used their Soft Data-Driven Analysis (SoDDA) classifier, adopting
the class with the highest probability.
Fig. 1. Comparison between the forced photometry, WF, and the hybrid
photometry scheme introduced in this work. Top: W1-W2 color
calculated with the hybrid scheme against the same color calculated
with the WF photometry. Bottom: Same but for the W2-W3 color.
The galaxies have been color-coded according to their extension in the
WISE data (ext_flg = 0 for point-like sources and ext_flg ≠ 0 for
extended sources). The black solid line is y = x.
So far, most galactic emission-line diagnostic tools do not
include the class of passive galaxies. Inactive or passive galax-
ies are defined as galaxies that do not show any evidence of ac-
tivity (i.e., SF or AGN) based on the lack of optical emission
lines. Since in this work we also considered the class of passive
galaxies, we needed to define the corresponding sample. Thus,
since we were seeking inactive galaxies, the sample of passive
galaxies was selected using the following criteria: the emission lines
Hα, Hβ, [O iii]λ5007, [O i]λ6300, [N ii]λ6584, and [S ii]
λλ6717,6731 must have a signal-to-noise ratio below 3, and
the signal-to-noise ratio of the continuum at the location of each
emission line must be above 3. This ensures that the lack of emission
lines is not the result of the difficulty in measuring them
in poor-quality spectra. The location of these galaxies on the
color-magnitude diagram (e.g., Bell et al. 2004) confirms that this
method of selecting passive galaxies is effective. Figure 2 shows
the g-r color against the absolute r-band magnitude (Mr). The
galaxies selected spectroscopically as passive are located on the
upper part of the diagram in the so-called red sequence region,
where early-type galaxies are found.
Fig. 2. Color-magnitude diagram showing the g-r color (y-axis) against
the absolute magnitude in the SDSS r band, Mr. The
red points represent the sample of passive galaxies and the gray points
the whole sample of galaxies (all classes).
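The spectroscopic selection of passive galaxies described in Sect. 2.3 can be written as a simple boolean test. The sketch below assumes that the line and continuum signal-to-noise ratios are available as dictionaries; the key names and the function name are hypothetical placeholders and do not correspond to actual catalog columns.

```python
from typing import Mapping

# Hypothetical keys for the six emission-line features used in the selection.
EMISSION_LINES = ("Halpha", "Hbeta", "OIII_5007", "OI_6300", "NII_6584", "SII_6717_6731")


def is_passive(line_snr: Mapping[str, float],
               continuum_snr: Mapping[str, float],
               limit: float = 3.0) -> bool:
    """True when every line is undetected (S/N below the limit) while the
    continuum under each line is well measured (S/N above the limit)."""
    return all(
        line_snr[line] < limit and continuum_snr[line] > limit
        for line in EMISSION_LINES
    )


# Made-up values: weak lines on top of a well-detected continuum.
weak_lines = {line: 1.2 for line in EMISSION_LINES}
good_continuum = {line: 15.0 for line in EMISSION_LINES}
print(is_passive(weak_lines, good_continuum))
```

Requiring a well-measured continuum is what separates a genuine absence of emission lines from a noisy spectrum in which lines could not have been detected.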
2.4. Final sample
After defining the criteria for the selection of each galaxy class,
we filtered the galaxies that will constitute the final training sam-
ple based on the quality of the photometric data and the activity
classification.
As our goal here was to train a machine learning algorithm,
the filtering had to be done in two stages. The first one ensured
that the true labels (i.e., activity classes) are well defined, as a
poor true label definition based on insecure classification can
lead to an algorithm with significant uncertainty in its predictions.
The second stage concerned the features (photometric measurements)
that were used for the discrimination between the different
galaxy classes by the new activity diagnostic tool.
For the first step, we only selected active galaxies that have
S/N above 5 for all optical emission lines that were used for
the characterization of the true class of each galaxy (Sect. 2.3),
namely, Hα, Hβ, [O iii]λ5007, [O i]λ6300, [N ii]λ6584, and
[S ii]λλ6717,6731. As stated earlier, the classification of each
galaxy was based on the class with the highest probability in the
diagnostic of Stampoulis et al. (2019). As the classifier also pro-
vides the probabilities of each galaxy belonging to the other con-
sidered classes, we chose galaxies that have been classified with
high confidence based on the difference between the highest
and second-highest probabilities assigned
by the classifier to each galaxy. Classifications with a large difference
between the first- and second-ranking classes are considered
highly reliable. In this respect, we retained only galaxies
with a difference in their predicted probabilities of at least 25%.
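The confidence cut on the true labels can be illustrated with a short sketch. It assumes that the class probabilities of each galaxy are stored in a dictionary and expressed as fractions, so that 25% corresponds to 0.25; the function names and example probabilities are hypothetical.

```python
from typing import Dict


def adopted_class(class_probs: Dict[str, float]) -> str:
    """Class with the highest probability."""
    return max(class_probs, key=class_probs.get)


def is_confident(class_probs: Dict[str, float], min_margin: float = 0.25) -> bool:
    """True when the top class leads the runner-up by at least min_margin."""
    ranked = sorted(class_probs.values(), reverse=True)
    return (ranked[0] - ranked[1]) >= min_margin


# Made-up probabilities for two galaxies.
secure = {"SF": 0.70, "Seyfert": 0.20, "LINER": 0.05, "Composite": 0.05}
ambiguous = {"SF": 0.50, "Seyfert": 0.05, "LINER": 0.05, "Composite": 0.40}
print(adopted_class(secure), is_confident(secure))
print(adopted_class(ambiguous), is_confident(ambiguous))
```

The second galaxy would still be labeled SF by the highest-probability rule, but it would be excluded from the training sample because composite is almost as likely.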
We also required objects to have reliable photometric measurements
based on the WISE quality flags. To identify and
remove problematic cases, we consulted the AllWISE
Source Catalog and Reject Table (see footnote 1), where we found
the quality flags for the photometry of a galaxy in the first three WISE
bands. We considered as unreliable every detection
that, in at least one of the three WISE bands (1, 2, and 3), has
been flagged in the above-mentioned catalog as having a measurement
error of 9.999, as this indicates that even though a
measurement exists it should be considered highly suspicious.
Another flag concerning the quality of the photometry is
w?flg = 32. For a galaxy with this flag, the photometric
measurement is a 95% upper limit and should not be considered
a reliable detection. Other important factors that had to be
accounted for in the quality of the photometric measurements are
source contamination and confusion. If a source has been flagged
in cc_flags with a value of D, P, H, O, d, p, h, or
o, it means that the source may be contaminated due to its proximity
to an image artifact, and thus we removed any galaxy that
has one of these flags in any of the W1, W2, and W3 bands.
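The three photometric quality checks can be combined in a single filter. The following sketch rests on several assumptions that should be checked against the catalog documentation: the magnitude errors and w?flg values are passed as sequences ordered W1, W2, W3; cc_flags is a string with one character per band, ordered W1 to W4; and the w?flg value is compared by equality, as in the text, rather than as a bit mask. The function name is hypothetical.

```python
from typing import Sequence

CONTAMINATION_CODES = set("DPHOdpho")
SUSPICIOUS_ERROR = 9.999
UPPER_LIMIT_FLAG = 32


def has_reliable_wise_photometry(mag_errors: Sequence[float],
                                 aperture_flags: Sequence[int],
                                 cc_flags: str) -> bool:
    """Apply the three quality cuts to the W1, W2, and W3 bands."""
    # Cut 1: a magnitude error of 9.999 marks a highly suspicious measurement.
    if any(abs(err - SUSPICIOUS_ERROR) < 1e-6 for err in mag_errors[:3]):
        return False
    # Cut 2: flag value 32 marks a 95% upper limit, not a detection.
    if any(flag == UPPER_LIMIT_FLAG for flag in aperture_flags[:3]):
        return False
    # Cut 3: contamination codes in any of the first three bands (assumed order W1..W4).
    if any(code in CONTAMINATION_CODES for code in cc_flags[:3]):
        return False
    return True


# Made-up sources, for illustration only.
print(has_reliable_wise_photometry([0.02, 0.03, 0.10], [0, 0, 0], "0000"))
print(has_reliable_wise_photometry([0.02, 0.03, 0.10], [0, 0, 0], "00d0"))
```

Only the first three characters of cc_flags are inspected, so an artifact affecting W4 alone does not remove a source in this sketch.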
Concerning the second stage of filtering, we chose active and
passive galaxies that have photometric data with S/N>5 in the
two WISE bands, W1 and W2, as well as for the two SDSS filters
(g and r). A more relaxed lower limit of S/N=3 for the W3
WISE band was selected. The reasoning behind these choices is
that the W3 WISE band has lower sensitivity than the W1 and
W2 bands and a strict S/N selection criterion will result in a sig-
nificant reduction in the number of galaxies in our sample.
Other important factors that had to be taken into consideration
are survey selection and galaxy evolution effects. We found that
in our sample the number of AGN and passive galaxies tended
to increase sharply with redshift. In order to create a uniform
distribution of activity classes of galaxies across the whole redshift
range, we split the sample into four equal redshift bins. Our
sample of galaxies spans the redshift range from z = 0.02 to
z = 0.08. We set the lower redshift cutoff to z = 0.02 as
this is compatible with the definition range of the BPT diagrams
(i.e., from z ∼ 0.02 to z ∼ 0.06) and thus avoids strong aperture
effects during the training of the algorithm. We proceeded
by finding a "reference" redshift bin, which was used as the basis
for selecting the number of objects to sample from each class
in each redshift bin. We find that the 0.033 < z < 0.047 bin is
ideal as a reference bin as it is close to the middle of the redshift
range. Based on this bin, we randomly selected the same number
of objects for each class individually from the other three
redshift bins (namely 0.02 < z < 0.033, 0.047 < z < 0.063, and
0.063 < z < 0.08).
1 https://wise2.ipac.caltech.edu/docs/release/allwise/expsup/sec2_1a.html
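The redshift balancing can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: galaxies are represented as (redshift, class) tuples, bin edges are treated as half-open intervals, and when a bin holds fewer objects of a class than the reference bin, all of them are kept. The function names and the toy sample are hypothetical.

```python
import random
from collections import defaultdict
from typing import List, Optional, Tuple

BIN_EDGES = (0.02, 0.033, 0.047, 0.063, 0.08)
REFERENCE_BIN = 1  # index of the 0.033 < z < 0.047 bin

Galaxy = Tuple[float, str]  # (redshift, activity class)


def redshift_bin(z: float) -> Optional[int]:
    """Index of the redshift bin containing z, or None if z is out of range."""
    for index in range(len(BIN_EDGES) - 1):
        if BIN_EDGES[index] <= z < BIN_EDGES[index + 1]:
            return index
    return None


def balance_by_redshift(galaxies: List[Galaxy], seed: int = 0) -> List[Galaxy]:
    """For each class, draw from every bin as many objects as the reference bin holds."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for galaxy in galaxies:
        index = redshift_bin(galaxy[0])
        if index is not None:
            groups[(galaxy[1], index)].append(galaxy)
    balanced = []
    for (label, index), members in groups.items():
        n_reference = len(groups.get((label, REFERENCE_BIN), []))
        if index == REFERENCE_BIN:
            balanced.extend(members)
        else:
            # Assumption: keep everything when a bin is smaller than the reference.
            balanced.extend(rng.sample(members, min(n_reference, len(members))))
    return balanced


# Toy sample: five SF galaxies in the first bin, two in the reference bin, one in the third.
toy = [(0.025, "SF")] * 5 + [(0.040, "SF")] * 2 + [(0.050, "SF")]
print(len(balance_by_redshift(toy)))
```

Sampling each class separately keeps the relative class fractions of the reference bin in the other bins, which is the stated purpose of the balancing.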
After the implementation of the two stages of filtering, the
redshift balancing, and the removal of unreliable detections in
the WISE photometry we obtained the final sample that contains
all the eligible galaxies for the training process of our diagnos-
tic tool. In that sample, there are 40954 galaxies in total, with
redshifts between z=0.02 and z=0.08. The composition of
the training sample per galactic activity class is given in Table
1. We note that although our classifier was trained on a sample
with high quality optical spectroscopy classifications, it can be
used on any sample of galaxies with available photometry in the
WISE and SDSS bands. The only limitation is that the infrared
photometry should encompass the extent of the galaxy, and the
optical photometry the central 3′′ of the galaxy (to match the
SDSS fiberMag).
3. The diagnostic tool
3.1. The random forest algorithm
For the development of our diagnostic we opted to use the ran-
dom forest algorithm (Louppe 2014), which is based on the con-
cept of decision trees. A decision tree starts with a root node that
contains all the training data, then it will use the considered fea-
tures to progressively create more homogeneous groups of data
(nodes). Ideally, at the end of the process, the final nodes (leaves)
will only contain data of the same kind (class). The problem with
a single decision tree is that, in most cases, the tree tends to adapt
too well to the training data, and as a result, its performance is
poor when it is applied to new data (overfitting). To avoid over-
fitting, we can combine many decision trees in parallel to build a
random forest. Each decision tree of the random forest is trained
on a subsample of the training data. Every such subsample of the
full data set that is used for the training of the trees is selected
by randomly shuffling the full training set.
During the classification process, each tree takes as input an
object and gives as output (or vote) the class to which this individual
object belongs. This process continues until that object
has been through every tree of the ensemble. In the end, the decisions
made by every tree of the ensemble for the object in
question are summed and the object is assigned to the class that collected
the most votes. The algorithm also allows us to calculate
the probability of that object belonging to each of the classes.
This probability is given by the ratio of the number of votes the
object received to belong in a particular class to the total number
of trees considered in the algorithm.
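The voting scheme described above can be reproduced in a few lines. The sketch below takes the list of per-tree votes for a single object as given and is independent of any particular library; some implementations instead average per-tree class probabilities, which can give slightly different numbers. The function names and the example votes are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def vote_probabilities(tree_votes: List[str]) -> Dict[str, float]:
    """Fraction of trees that voted for each class."""
    counts = Counter(tree_votes)
    n_trees = len(tree_votes)
    return {label: count / n_trees for label, count in counts.items()}


def majority_class(tree_votes: List[str]) -> str:
    """Class that collected the most votes."""
    return Counter(tree_votes).most_common(1)[0][0]


# Made-up votes from a ten-tree ensemble for one galaxy.
votes = ["SF"] * 6 + ["AGN"] * 3 + ["LINER"]
print(majority_class(votes))
print(vote_probabilities(votes))
```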
It is called random because, during the training process of
the algorithm, the features used to make the split of the data into
the new nodes are selected randomly. The random forest offers
several advantages: it is intuitive, probabilistic and it is easily
adaptable to many problems. We used the implementation of the
random forest algorithm provided by
sklearn.ensemble.RandomForestClassifier()
from the scikit-learn Python 3 package, version 1.1.2.
Fig. 3. Distributions of colors considered as potential features for the
definition of our diagnostic tool. Starting from top to bottom, we see the
distributions of colors W1-W2, W2-W3, g-r, and u-g for each galactic
activity class. Combinations of these four colors are used as potential
feature schemes for defining our diagnostic. Due to the high imbalance
of the sample, the number of galaxies is normalized based on the fre-
quency of occurrence in our data sample. Blue histograms correspond
to the SF, green to the AGN, yellow to LINER, purple to composite, and
red to passive galaxies.
Table 1. Composition of the final sample per galactic activity class.
Class          Number of objects   Percentage (%)
Star forming   35878               87.6
Seyfert        1337                3.3
LINER          1322                3.2
Composite      1673                4.1
Passive        744                 1.8
3.2. Performance metrics
To evaluate the performance of our diagnostic tool, we adopted
standard metrics such as the accuracy, the precision, the recall,
and the F1-score. The exact definition of each performance metric
we used for the evaluation is presented in Table 2. Evaluating
the performance of an algorithm based on the accuracy alone
may lead to misleading results, especially in cases with skewed
data sets like the one we are dealing with here.
The metrics that we considered more appropriate are the
recall and the precision for each class. Recall is a metric that
quantifies completeness (how many objects of each class have
been correctly selected), while precision quantifies the level of
contamination (through the fraction of correctly selected objects
within the population of all selected objects). To properly evaluate
the performance, we plotted the confusion matrix and calculated the
precision and the recall scores per class.
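To illustrate why the accuracy can mislead on a skewed sample, the short sketch below uses the class counts of Table 1 and a hypothetical trivial classifier that labels every galaxy as star forming. The trivial classifier is an assumption made purely for illustration; it is not one of the models considered in this work.

```python
# Class counts taken from Table 1.
counts = {
    "Star forming": 35878,
    "Seyfert": 1337,
    "LINER": 1322,
    "Composite": 1673,
    "Passive": 744,
}
total = sum(counts.values())

# Hypothetical trivial classifier: every galaxy is labeled "Star forming".
predicted = "Star forming"

# Accuracy: correct predictions over all predictions.
accuracy = counts[predicted] / total

# Recall per class: 1 for the predicted class, 0 for every other class.
recall = {name: (1.0 if name == predicted else 0.0) for name in counts}

print(f"accuracy = {accuracy:.3f}")
for name, value in recall.items():
    print(f"recall({name}) = {value:.1f}")
```

The accuracy of such a classifier equals the fraction of star-forming galaxies in the sample, even though it recovers none of the objects in the other four classes.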
3.3. Feature selection
There are many characteristics that one can use to classify galaxies
based on their activity. Our main goal here is to define a
diagnostic tool that is capable of discriminating efficiently
between the galaxy classes by utilizing observables that can easily
be acquired for a large number of galaxies. For this reason,
in this work we used only infrared and optical colors
that are available from all-sky or wide-area surveys. Initially, we
considered colors that are combinations of three WISE bands and
three SDSS filters. More specifically, using the WISE bands 1, 2,
and 3 along with the u, g, and r SDSS filters (fiberMag photometry),
we calculated the colors W1-W2, W2-W3, g-r, and u-g. In Fig. 3 we see
the distributions of each color we considered as a potential feature,
for the different classes, normalized by the total number of objects
in each activity class. In our analysis, we also considered
including the W3-W4 color as a potential feature. However, due to the
low sensitivity of the detector at the 24 µm band, combined with
the weak emission from passive galaxies in the mid-infrared, we
found that almost none of the passive galaxies have reliable
detections in the W4 WISE band. Therefore, this band (and the colors
involving it) was not considered further in our study.
These particular features were chosen based on the broadband
spectral shape of these five different galaxy classes. Previous
diagnostics have demonstrated the diagnostic power of
WISE. Star-forming galaxies have H II regions that are rich in
dust and gas heated up by hot young stars, producing strong
emission in the infrared WISE bands (in particular in W3 due
to the PAHs and dust). The AGN-hosting galaxies have a rising
red continuum due to dust heated by the power-law UV spectrum,
while at the same time this extreme UV radiation environment
results in the dissociation of the large molecules, leading to
suppressed PAH emission (Alonso-Herrero et al. 2014). However,
the two top plots of Fig. 3 show that there is an overlap between
the classes in the WISE colors, especially in the case of the
W1-W2 color. In the two bottom plots of the same figure, we see
that the optical colors help break the degeneracy observed in the
mid-infrared colors.
Even though there is astrophysical reasoning behind our initial
feature selection, we cannot know a priori which combinations
give optimal results and which ones do not provide any
improvement in the performance of this multi-class classification
problem. In order to determine which combination can
yield optimum results with the minimum number of features, we
trained the algorithm with different combinations of features and
recorded the performance of each training scheme. This method
helped us compare the performance of the different models and
identify highly informative or redundant features. The total
number of models examined was six, and they are presented in
Table 3. In Fig. 4 we plot the recall score of each activity class
for each model (i.e., feature scheme) presented in Table 3. We
started from the simplest case by testing one infrared and one
optical color. For these first two simple models, of the two
available infrared colors we decided to test only W2-W3,
combined with a different optical color each time, as W2-W3
provides greater dynamic range and discrimination between the
different classes compared to W1-W2. The selection of a feature
scheme (model) for this new diagnostic was made using two
basic criteria: (1) it had to offer high recall scores for each class,
and (2) it had to use the minimum number of features. We chose to use
the recall as a performance metric since it also offers information
about completeness.
In our analysis we used the average value of the recall scores
calculated with the cross-validation (CV) method. In more detail,
we used galaxies from the final sample (see Sect. 2.4) by
imposing the additional criterion that all galaxies must have S/N
> 5 in the u SDSS band. Then, we split those galaxies into k folds
and used k-1 folds for training and one for the performance
evaluation of the model (testing fold). Every time, we replaced
the testing fold with one of the training folds and repeated this
step until each of the k folds had served as the testing fold.
We kept the k recall scores and calculated their average. The
error bars on the average recall are the standard deviation of the
k recall scores. We selected the number of folds to be 10 (k =
10), which is the maximum number of folds that offered a balance
between a satisfactory number of objects in each fold for the
under-represented classes and good statistics for evaluating the
performance metrics.
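The cross-validation procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the columns merely play the role of the four colors, the folds are stratified (the text does not specify how the folds were drawn), and the helper name cv_recall and the forest settings are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in data: four columns playing the role of the four colors.
colors = ["W1-W2", "W2-W3", "g-r", "u-g"]
X, y = make_classification(
    n_samples=1000, n_features=4, n_informative=4, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# Three of the feature schemes listed in Table 3.
schemes = {
    "Model 3": ["W1-W2", "W2-W3"],
    "Model 4": ["W1-W2", "W2-W3", "g-r"],
    "Model 6": ["W1-W2", "W2-W3", "g-r", "u-g"],
}


def cv_recall(X, y, k=10, seed=0):
    """Mean and standard deviation of the per-class recall over k folds."""
    classes = np.unique(y)
    folds = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in folds.split(X, y):
        clf = RandomForestClassifier(n_estimators=30, random_state=seed)
        clf.fit(X[train_idx], y[train_idx])
        y_pred = clf.predict(X[test_idx])
        scores.append(
            recall_score(y[test_idx], y_pred, labels=classes, average=None)
        )
    scores = np.asarray(scores)  # shape (k, n_classes)
    return scores.mean(axis=0), scores.std(axis=0)


results = {}
for name, used in schemes.items():
    columns = [colors.index(c) for c in used]
    results[name] = cv_recall(X[:, columns], y)
    print(name, np.round(results[name][0], 2))
```

Each entry of results holds the average recall per class and its standard deviation over the k folds, which correspond to the points and error bars of a plot such as Fig. 4.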
After evaluating each possible model, we conclude that the
model that includes all the available colors (Model 6) does not
improve the performance when compared to the models that use
two infrared colors and one of the two optical colors (Models 4
and 5). The performance of Model 3 is generally the same as that of
Models 4, 5, and 6, but the lack of an optical color results in
dramatically lower recall for the SF galaxies. This can be attributed
to the fact that SF galaxies have bluer g-r colors, which separate
them clearly from the other activity classes (Fig. 3). For the
other two models (Models 1 and 2), we notice that the classifier
does not have enough information to separate the classes
effectively. In particular, we notice that there is a significant
performance drop for the AGN and composite galaxy classes. This can
probably be explained by the nature of the AGN and composite
galaxies. The W2-W3 color records the infrared emission from
the circumnuclear dust heated by the SMBH that is found in every
AGN galaxy but can also be present in a composite galaxy.
Although AGN-heated dust can lead to stronger emission in the
W2 band, the majority of the AGN-hosting galaxies have W1-W2
colors similar to those of SF and, of course, composite galaxies.
This creates confusion in a classifier that is defined based only
C. Daoutis et al.: A versatile classification tool for galactic activity using optical and infrared colors
Table 2. Definition of each performance metric used for the evaluation of the performance of our diagnostic tool.
Term | Description | Equation
True Positive (TP) | An object that has been correctly predicted to be in a specific galaxy class. | -
True Negative (TN) | An object that is correctly predicted not to belong to a class. | -
False Positive (FP) | An object that is falsely predicted to belong to a galaxy class. | -
False Negative (FN) | An object that is falsely predicted not to belong to a galaxy class. | -
Performance metric | Description | Equation
Accuracy | The ratio of the correct predictions to the total predictions made by the classifier. | $\frac{TP + TN}{TP + TN + FN + FP}$
Precision | The number of objects correctly predicted to belong to a class divided by the total objects that the classifier predicted to belong to that class. | $\frac{TP}{TP + FP}$
Recall | The objects correctly predicted to belong to a class divided by all the objects that belong to that specific class. | $\frac{TP}{TP + FN}$
F1-score | The harmonic mean of precision and recall. | $\frac{2TP}{2TP + FP + FN}$
Table 3. Different combinations of features (colors) that were tested as
potential models for the definition of the diagnostic.
Scheme | Features (colors)
Model 1 | W2-W3, g-r
Model 2 | W2-W3, u-g
Model 3 | W1-W2, W2-W3
Model 4 | W1-W2, W2-W3, g-r
Model 5 | W1-W2, W2-W3, u-g
Model 6 | W1-W2, W2-W3, g-r, u-g
Fig. 4. Recall scores of the different models (feature schemes)
considered for our diagnostic tool. The description of the models is presented
in Table 3. The error is the standard deviation of the k recall scores (here
k=10), calculated using the CV method. The error on the recall scores
for the SF galaxies is too small to be depicted here. Blue points
correspond to the SF, yellow to AGN, green to LINER, red to composite, and
purple to passive galaxies.
on the feature schemes of Models 1 and 2, leading to extensive
mixing between these two classes. Based on these results,
we decided to adopt Model 4 as our basic model, which has the
following features: two infrared colors (WISE), W1-W2 and W2-W3,
and one optical color (SDSS), g-r. In Fig. 4, we see that
Model 5 has a performance similar to that of Model 4. However,
Model 5 relies on u-band photometry, which often has lower
signal-to-noise ratio than the r band, limiting the applicability of
the classifier to larger data samples.
After the optimal combination of features was determined,
we proceeded with the optimization of the algorithm. In this
process we searched for the values of the algorithm's
hyperparameters that offer the best performance in a particular problem.
Upon investigation, we find that tweaking nearly half of the
algorithm's hyperparameters does not improve the scores
significantly, and thus they are left at their default values, as
imported from scikit-learn. The hyperparameters that have a
significant impact and hence are worth optimizing are the following:
max_depth, max_leaf_nodes, max_samples,
min_samples_leaf, min_samples_split, and
n_estimators. The exact procedure of the optimization
as well as a table (Table A.1) with the best values for each
important hyperparameter are presented in Appendix A.1.
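A search over the six hyperparameters named above could look like the sketch below. The actual optimization procedure and the adopted values are given in the appendix and are not reproduced here, so everything in this example is an assumption made for illustration: the use of a randomized search, the candidate values, the macro-averaged recall as the score, the number of folds, and the synthetic data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for three colors and five activity classes.
X, y = make_classification(
    n_samples=600, n_features=3, n_informative=3, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# Candidate values are placeholders, not the values of Table A.1.
param_distributions = {
    "max_depth": [5, 10, 20, None],
    "max_leaf_nodes": [50, 200, None],
    "max_samples": [0.5, 0.8, None],
    "min_samples_leaf": [1, 5, 10],
    "min_samples_split": [2, 10, 20],
    "n_estimators": [50, 100],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=5,                # number of random combinations to try
    scoring="recall_macro",  # unweighted mean of the per-class recall
    cv=3,
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

The remaining hyperparameters are not listed in param_distributions and therefore keep their scikit-learn defaults, in line with the approach described in the text.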
3.4. Implementation
The standard procedure for this step is to separate the sample
of all available data into three random subsets. One of them
contains the largest share of the objects (50% of the total or 20476
galaxies) and is used for the training. The data from that
subset are used to adapt the algorithm to the individual problem.
The rest of the data form the test and the validation sets.
The validation set is used for the calibration of the classifier,
while the test set is only used for the evaluation of its performance
after the training and calibration processes (see Sect. 3.5). For this
project, we performed a training-validation-test split with
proportions of 50%-25%-25% or 20476-10239-10239 galaxies,
respectively. The split was stratified, which ensures that each subset
has the same percentage of objects in each class as the original
sample.
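A stratified training-validation-test split with the subset sizes quoted above can be sketched with two calls to train_test_split from scikit-learn. The labels are rebuilt from the class counts of Table 1; the integer indices stand in for the rows of the galaxy catalog, and the random seed is arbitrary, so the exact per-class composition of each subset is not expected to reproduce the one used in this work.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Class counts from Table 1, expanded into one label per galaxy.
counts = {
    "Star forming": 35878,
    "Seyfert": 1337,
    "LINER": 1322,
    "Composite": 1673,
    "Passive": 744,
}
labels = np.repeat(list(counts), list(counts.values()))
indices = np.arange(labels.size)  # stand-in for the catalog rows

# First split: training set versus everything else.
train_idx, rest_idx = train_test_split(
    indices, train_size=20476, stratify=labels, random_state=0
)

# Second split: divide the remainder into validation and test sets.
val_idx, test_idx = train_test_split(
    rest_idx, test_size=10239, stratify=labels[rest_idx], random_state=0
)

for name, idx in [("train", train_idx), ("validation", val_idx), ("test", test_idx)]:
    sf_fraction = np.mean(labels[idx] == "Star forming")
    print(f"{name}: {idx.size} galaxies, SF fraction = {sf_fraction:.3f}")
```

Passing stratify keeps the class fractions of each subset close to those of the full sample, which matters most for the least populated classes.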
The high imbalance in the number of objects between the
five classes in the training sample cannot be ignored, since
it can lead to biases in the classification in favor of the class that
has the highest frequency of appearance. To avoid such an effect,
we have two options: one is to reduce the sample in such
a way that every individual class has the same number of objects,
and the second is to assign weights to each class, that is,
to adjust the impact each object will have during the training
of the algorithm. In the implementation of the random forest we
used here, these weights are calculated internally by setting the
class_weight parameter to "balanced_subsample". We selected
the second option, as it makes use of all available data,
leading to a more robust classification tool as the training will
contain a broader range of examples.
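To give a sense of the size of these weights, the sketch below applies the "balanced" heuristic documented by scikit-learn, n_samples / (n_classes * n_k), to the class counts of Table 1. This is an approximation made for illustration: with "balanced_subsample" the same formula is evaluated on the bootstrap sample of each tree, and the training subset contains only part of the full sample, so the weights used internally differ slightly from tree to tree.

```python
# Class counts from Table 1 (full sample, used here as an approximation).
counts = {
    "Star forming": 35878,
    "Seyfert": 1337,
    "LINER": 1322,
    "Composite": 1673,
    "Passive": 744,
}
n_samples = sum(counts.values())
n_classes = len(counts)

# "balanced" heuristic: weight_k = n_samples / (n_classes * n_k).
weights = {name: n_samples / (n_classes * n) for name, n in counts.items()}

for name, weight in weights.items():
    # weight * count is the same for every class: each class carries
    # the same total weight during training.
    print(f"{name:12s} weight = {weight:7.3f}  total = {weight * counts[name]:.1f}")
```

Under this scheme a passive galaxy weighs roughly 50 times more than a star-forming one, so that every class contributes equally to the training.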
3.5. Classifier calibration
Every probabilistic classifier provides us with not only the class
of the object under investigation but also the probability that this
object belongs to each of the classes that the classifier was
designed to discriminate. When the classes are very well separated
in the feature space, these probabilities represent the actual
likelihood of an object belonging to a certain class given its
location in the feature space. However, in classification problems
where there is some mixing between the distributions of the features
for the considered classes, these probabilities do not necessarily
reflect the actual probability of the true class. To correct
for this effect in our classification problem, we calibrated our
classifier. This way the output of the classifier represents the
actual probability of finding an object of a particular class within a
given region of the feature space.
In the case of a multi-class classification problem, the process
of calibration is performed in a one-versus-rest fashion, that
is, following the same process as in the binary classification for
each class individually and splitting the multi-class problem into
multiple binary-class problems. The calibration in a binary
classification is achieved by applying a regression algorithm
(calibrator) that rescales the raw predicted probabilities so that these
probabilities match the expected distribution of the actual
probabilities. This is based on the frequency of an object of a given
class appearing among the sample in the feature space. The subsample
that is used for the calibration must contain objects that
have not been used during the training of the classifier to avoid
any bias.
The data we used for the calibration of our diagnostic
were from the validation subset (see Sect. 3.4), a held-out
subset of the data that was not a part of the training process.
The algorithm we used to perform the calibration is
the CalibratedClassifierCV, which is provided by the
scikit-learn package. We opted to use the "sigmoid" over
the "isotonic regression" method as the latter is prone to
overfitting, especially in problems with severely under-represented
classes. Due to the fact that SF galaxies are the overwhelming
majority of the objects in our sample, a feature that is preserved
in the stratified split of the data among the three subsets of data
(i.e., training, validation, and test set), the validation set has an
excess number of SF galaxies.
This imbalance in the number of objects in each class can
lead to biases in the calibration in favor of the class with the
highest frequency of appearance. To avoid such effects, only for
the validation set, we manually balanced the sample by keeping
the same number of objects from each class.
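The calibration step can be sketched as follows, under several assumptions that are not taken from the text: the data are synthetic with three imbalanced classes, the validation set is balanced by randomly undersampling every class to the size of the smallest one, and the forest settings are placeholders. How an already trained classifier is passed to CalibratedClassifierCV depends on the scikit-learn version: recent releases provide sklearn.frozen.FrozenEstimator, while older ones (including the 1.1 series) use cv="prefit", so the sketch tries the former and falls back to the latter.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (one dominant class).
X, y = make_classification(
    n_samples=3000, n_features=3, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, weights=[0.8, 0.1, 0.1],
    random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)

rf = RandomForestClassifier(
    n_estimators=50, class_weight="balanced_subsample", random_state=0
).fit(X_train, y_train)

# Balance the validation set: keep the same number of objects per class
# (assumption: random undersampling to the size of the smallest class).
rng = np.random.default_rng(0)
n_min = np.bincount(y_val).min()
keep = np.concatenate([
    rng.choice(np.flatnonzero(y_val == k), size=n_min, replace=False)
    for k in np.unique(y_val)
])
X_cal, y_cal = X_val[keep], y_val[keep]

# Calibrate the already trained forest with the sigmoid method.
try:
    from sklearn.frozen import FrozenEstimator  # recent scikit-learn releases
    calibrated = CalibratedClassifierCV(FrozenEstimator(rf), method="sigmoid")
except ImportError:
    calibrated = CalibratedClassifierCV(rf, method="sigmoid", cv="prefit")
calibrated.fit(X_cal, y_cal)

proba = calibrated.predict_proba(X_val)
print(proba[:3].round(3))
```

In the multi-class case scikit-learn calibrates each class in a one-versus-rest fashion and then normalizes the result, so the calibrated probabilities of each object still sum to one.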
4. Results
4.1. Performance
We started by calculating the overall accuracy of our diagnostic.
As before, we used the k-fold CV method. However, here
we used a reduced number of folds because, in this case, we
had to keep an additional subset of data separate from the final
sample, which we later used for the probability calibration of
the algorithm. We find that the maximum number of folds we
can split our data into while maintaining enough objects to
adequately represent the minority classes and still have good
statistics is five. The overall accuracy achieved is 81% ± 1%. We
acknowledge that the accuracy alone is not enough to describe
the performance. Thus, we considered additional performance
evaluation metrics, such as the confusion matrix. This is a
two-way table in which the rows (y-axis) represent the true labels and
the columns (x-axis) are the labels predicted by the classifier for
each data point. Each cell gives the fraction of the objects
from a given true class that is classified in each of the considered
classes (columns). Therefore, the confusion matrix is the summary
of the results made by the algorithm when evaluated on the
test subset on a class-by-class basis. In a perfect classifier, all
objects populate only the primary diagonal (y=x). The confusion
matrix provides information not only about the number of
correct predictions but also about the objects that were
misclassified, showing which classes the classifier mixes. In Fig. 5 we
present the confusion matrix calculated on the test subset of the
final data sample defined in Sect. 2.4. Inspecting the confusion
matrix we conclude that the overall performance is good, with
the highest scores achieved for the classes of the SF and passive
galaxies.
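The normalization of the confusion matrix described above can be illustrated with a small worked example. The counts below are invented toy numbers (100 objects per true class) chosen only to make the arithmetic easy to follow; they are not the values shown in Fig. 5.

```python
classes = ["SF", "AGN", "LINER", "Comp", "Pas"]

# Toy counts (NOT the values of Fig. 5): rows are true classes,
# columns are predicted classes, in the order given by `classes`.
counts = [
    [80, 2, 1, 16, 1],
    [5, 55, 15, 24, 1],
    [2, 12, 68, 10, 8],
    [20, 10, 5, 64, 1],
    [1, 1, 12, 1, 85],
]

# Normalize each row by the number of true objects of that class.
fractions = [[c / sum(row) for c in row] for row in counts]

# The diagonal of the row-normalized matrix is the recall of each class.
recall = {name: fractions[i][i] for i, name in enumerate(classes)}

# Precision instead normalizes by the column totals (all objects
# predicted to belong to a class).
column_totals = [sum(row[j] for row in counts) for j in range(len(classes))]
precision = {
    name: counts[j][j] / column_totals[j] for j, name in enumerate(classes)
}

for name in classes:
    print(f"{name:5s} recall = {recall[name]:.2f}  precision = {precision[name]:.2f}")
```

Because every row is divided by its own total, each row of the normalized matrix sums to one, and the off-diagonal cells show directly into which classes the missed objects were placed.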
Fig. 5. Confusion matrix for the test subset of galaxies. The numbers
(and the color-code) in this plot represent the percentage of the classified
objects in each class with respect to the total true population in each
class. The labels on the x- and y-axis represent the predicted and the
true class of a galaxy, respectively. Here "Comp" stands for composite
galaxies and "Pas" for passive galaxies.
Based on these results we could also calculate additional
metrics that can give us a more detailed view of the performance
per class. In Table 4 we present the values for the metrics
defined in Table 2 calculated for each class. In that table we see
that the SF and passive galaxies have excellent scores while the
rest of the classes (AGN, LINER, and composite) have good to
moderate performances.
Table 4. Performance metrics calculated for each class.
Class | Precision | Recall | F1-score | Number of galaxies
Star forming | 0.99 | 0.81 | 0.89 | 8960
Seyfert | 0.45 | 0.56 | 0.50 | 329
LINER | 0.66 | 0.68 | 0.67 | 320
Composite | 0.14 | 0.65 | 0.24 | 434
Passive | 0.76 | 0.85 | 0.80 | 196
4.2. Feature importance
A useful output of the random forest algorithm is the feature
importance, which describes the relevance of each feature during
the training of the classifier. Therefore, it provides a measure of
how much a given feature contributes to the ability of the random
forest to discriminate between the different classes. So, a
feature that clearly characterizes a class will have high relevance
(or importance). Furthermore, it provides insights into the
physical parameters that drive the performance of the classifier.
In Fig. 6 we present a bar plot of the feature importance
scores. As the feature importance is calculated in each node, we
can calculate the average and its standard deviation. From Fig. 6
we see that the chosen feature scheme is well defined, as all
features play a similar role in the classification of these five activity
classes. Thus, the feature importance helps us better understand
the operation of this algorithm.
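As a minimal sketch of how such importance scores and their scatter can be obtained with scikit-learn, the snippet below reads the impurity-based feature_importances_ attribute of a fitted forest and estimates its spread from the individual trees. The data are synthetic and the column names are only labels, so the resulting numbers carry no physical meaning; taking the standard deviation across trees is an assumption about how the scatter is measured.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Labels for the three synthetic columns (no physical meaning here).
feature_names = ["W1-W2", "W2-W3", "g-r"]
X, y = make_classification(
    n_samples=600, n_features=3, n_informative=3, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Forest-level importance: the average over the trees, normalized to sum to 1.
mean_importance = rf.feature_importances_

# Spread of the importance across the individual trees.
per_tree = np.array([tree.feature_importances_ for tree in rf.estimators_])
std_importance = per_tree.std(axis=0)

for name, mean, std in zip(feature_names, mean_importance, std_importance):
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Impurity-based importances can be biased toward features with many distinct values; permutation importance on held-out data is a common cross-check.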
4.3. Application to the different redshift subsamples
After we trained and optimized the diagnostic tool, we proceeded
by applying it to two different subsets of the test set (Sect.
3.5), spanning two different redshift ranges: 0.02 < z < 0.05 and
0.05 < z < 0.08. Since the initial training of the algorithm was
performed on the full redshift range of z=0.02 to z=0.08, this
exercise shows whether the performance of our diagnostic has a
redshift dependence. The number of true objects per class in the
0.02 < z < 0.05 redshift range is as follows: 3982 SF, 164 AGN,
150 LINER, 216 composite, and 93 passive galaxies. In the
redshift range of 0.05 < z < 0.08 the number of objects is as
follows: 4978 SF, 165 AGN, 170 LINER, 218 composite, and 103
passive galaxies. In Fig. 7 we present the recall score for each
class in these two redshift bins when we apply our diagnostic to
each one separately. For reference, we also show the scores of the
diagnostic in the overall redshift range (0.02 < z < 0.08). In this
figure we see that the diagnostic has similar behavior (similar
recall scores) for SF galaxies over the whole redshift range of our
sample of galaxies. However, we notice that AGN galaxies have
slightly reduced performance in the lower-redshift bin compared to
the higher-redshift bin (20% lower recall compared to the bin
with the higher-redshift galaxies). A similar discrepancy is seen in
the case of LINER and passive galaxies. The reason for this
behavior is discussed in Sect. 5.3.
5. Discussion
In this work we have defined a new all-inclusive (i.e., including
active and passive galaxies) diagnostic tool based on the
combination of commonly available mid-infrared and optical colors.
Fig. 6. Importance (relevance) of the three features used for the
definition of the diagnostic as calculated during the training of the random
forest. The W2-W3 color is the most important feature, and all other
features are of comparable relevance.
Fig. 7. Fraction of correctly classified objects to the total true objects for
each class, for three redshift ranges. The first redshift range is z=0.02
to z=0.05 (galaxies in the HECATE catalog; Kovlakas et al. 2021),
indicated by blue disks in the plot; the second redshift bin is z=0.05
to z=0.08, indicated by red squares. The third redshift range includes
galaxies from the whole redshift range that the diagnostic was trained
on, 0.02 < z < 0.08, which is indicated by black x marks. Here "Comp"
stands for composite galaxies and "Pas" for passive galaxies.
In the following sections, we further discuss the behavior and
robustness of the diagnostic and we compare it with other
commonly used diagnostics.
5.1. Physical interpretation of the results
So far we have seen that our diagnostic achieves very good
performance despite the limited information it uses. The use of the
optical color of the galactic nucleus seems to have an important
role in the efficient classification of galaxies (see Figs. 4 and 6).
In Fig. 3 we see that the five activity classes present different
behavior in terms of the distribution of the colors considered in our
diagnostic.
In particular, in the case of SF and composite galaxies, we
see higher values of the W2-W3 color attributed to significant
emission from PAHs in the W3 band. On the other hand, passive
galaxies are poor in dust and their stellar populations are older,
resulting in a declining emission at redder wavelengths. In
contrast, AGN galaxies show rising emission in the mid-infrared in
all three WISE bands. This can be explained by emission from
the dusty torus around the accretion disk. However, in contrast to
SF galaxies they have weak PAH emission since these sensitive
molecules are destroyed by the strong UV radiation from the
accretion disk (e.g., Alonso-Herrero et al. 2014) or their emission
is diluted by the AGN continuum (e.g., Genzel 1998).
Composite galaxies have weaker continua in the 3-12 µm
range than AGN, but stronger PAH emission, which is nevertheless
weaker than that of SF galaxies. They also show strong
silicate absorption. This is reflected in their W1-W2 and W2-W3
colors, which are in between those of AGN and SF galaxies. On
the other hand, passive galaxies have W1-W2 and W2-W3 colors
close to 0.
In the paragraphs above we have analyzed the discriminating
power that the mid-infrared colors can have in the activity
classification of a galaxy. However, the mid-infrared color
diagnostic tools that have been developed so far often suffer from dust
obscuration effects. A well-known example is that a starburst
galaxy can mimic an AGN galaxy (e.g., Hainline et al. 2016).
The introduction of an optical color makes it possible to identify
these cases and breaks the degeneracy present in the mid-infrared
color space. In Fig. 3 we see that the g-r color distribution of
the SF galaxies is clearly separated from those of the other
activity classes.
Our results (Sect. 4) show that the random forest diagnostic
achieves an overall accuracy of 81%. As this score was calculated
on an independent sample (galaxies that were not used for
its training) it is a good estimation of its general performance.
In addition, the low standard deviation of the accuracy indicates
that our diagnostic has stable performance.
Now, considering the above-mentioned trend, we can look
back at the feature importance (Sect. 4.2) to see why some of
the features were more important than others for the training of
the algorithm. Observing Fig. 6 we see that the feature with the
highest impact is the W2-W3 color. This can be explained by
the strong PAH emission of SF galaxies, which dominates the
emission in the W3 (centered at 12 µm) band.
5.2. Probability distributions
Besides the classification of each galaxy, the random forest
algorithm can also give an estimation of the probability of an object
belonging to each of the classes individually. A significant
difference between the probability of the first-ranking and the
probability of the second-ranking class indicates a highly confident
classification. The probabilities we examine in this section are
the calibrated predicted probabilities (see Sect. 3.5).
In order to evaluate the confidence of the classifications
performed for each galaxy, we compared the probability of the
highest and the second-highest ranking class. For this reason, in Fig.
8, we show the probability difference between the highest and
the second-highest probabilities, Δp, and we plot it against the
maximum predicted probability for each of the five classes.
Objects appearing in the top right corner of that plot have a high
probability of belonging to the first-rank class (close to 1) while the
probability difference from the second class is also high. These
objects have been classified with very high confidence and thus
have high reliability. Another test we performed to check the
reliability of the predicted probabilities was plotting the recall
and the precision for each class as a function of their predicted
probability. The process to calculate these curves is the following:
after we took objects that belong to only one class, we split
them into bins based on their predicted probability. Then, we
calculated the ratio of the objects that were correctly predicted to
belong to the class under examination to the total predictions made
in that bin for that class (i.e., a measure of precision).
Also, we calculated the ratio of the galaxies in each probability
bin that were identified correctly as belonging to a class by the
new diagnostic to the total number of objects that truly belong to
that class (i.e., a measure of recall). These plots are also shown in
Fig. 8. The error bars displayed on the above-mentioned
fractions (recall and precision) are based on uncertainties
proportional to the square root of the number of instances found in
each bin. So, through error propagation, we find the error for the
recall to be $\mathrm{error} = \sqrt{\frac{n^2}{N^3} + \frac{n}{N^2}}$, where
n is the number of true positives and N is the sum of the true
positive and false negative examples in each predicted probability
bin, while for the precision we used the same equation, but with n
being the number of true positives and N the sum of the true positive
and false positive examples in each predicted probability bin. By
inspecting Fig. 8 we see that Δp is high for almost all classes,
indicating a highly confident classification. Also, we see that as the
predicted probability increases the recall and precision increase,
indicating the reliability of the classifications as a function of the
maximum predicted probability.
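The two quantities used in this reliability analysis can be written compactly as below. The sketch assumes that the counts n and N carry independent uncertainties equal to their square roots, which is the assumption under which the quoted expression follows from standard error propagation; the function names and the example numbers are hypothetical.

```python
import math


def fraction_error(n, N):
    """Error on the fraction n/N, assuming uncertainties sqrt(n) and sqrt(N).

    Propagating both terms gives sqrt(n**2 / N**3 + n / N**2).
    """
    return math.sqrt(n**2 / N**3 + n / N**2)


def probability_margin(probabilities):
    """Difference between the highest and second-highest class probability."""
    top, second = sorted(probabilities, reverse=True)[:2]
    return top - second


# Hypothetical probability bin: 45 true positives out of N = 50 objects.
n, N = 45, 50
print(f"fraction = {n / N:.2f} +/- {fraction_error(n, N):.3f}")

# Hypothetical calibrated probabilities of one galaxy for the five classes.
print(f"margin = {probability_margin([0.70, 0.20, 0.05, 0.03, 0.02]):.2f}")
```

The same expression can be rewritten as (n/N) * sqrt(1/n + 1/N), which makes explicit that the relative error of the fraction is the quadrature sum of the relative errors of the two counts.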
By inspecting the probabilities for each class individually,
we deduce that, especially for the classes of SF and passive
galaxies, the combination of high recall and high classification
probabilities makes it a highly confident classifier. However,
despite the excellent results for these two classes, in the case of
AGNs we see moderate performance scores. As shown in Sect.
4, regarding the predictions on the true AGN galaxies (ground
truth), there was considerable confusion with the classes of
composite and LINER galaxies. This is because the AGN galaxies
share common properties with the LINER but also with the
composite galaxies. For example, composite galaxies are the result of
AGN activity superimposed on a SF component (Kewley et al.
2001) or a SF component with photoionization by old stellar
populations (Cid Fernandes et al. 2010). Similarly, old stellar
populations (Stasińska et al. 2008) or AGNs could be the source of
excitation in LINER galaxies (González-Martín et al. 2009). This is
reflected in the feature distributions we present in Fig. 3.
Another reason for this is the fact that the spectroscopic
classifications that we considered as true (ground truth) are subject to
aperture effects. Maragkoudakis et al. (2014) studied the effect
of a changing aperture on the classification of an AGN galaxy,
finding that AGN features change as a function of the physical
size of the region within the spectral aperture and hence
as a function of the observed distance. For example, an AGN
galaxy observed with an increasing aperture (starting from the
core) tends to move toward the H II region locus in a BPT diagram.
Two techniques to mitigate this problem are the definition of
diagnostics in a specific redshift range and starlight subtraction,
but neither fully removes this behavior. Another source of
aperture effects is the difference between the apertures of the
optical spectra and of the infrared photometry (WISE) we used for
the definition of our diagnostic. This is discussed in detail in
the next section.
In an attempt to explore the role of aperture effects for the
optical colors, we explored two distinct diagnostic schemes based
on the redshift. We created two separate classifiers, each
specialized in a specific redshift range. One classifier was trained in
the range 0.02 < z < 0.05 and the other in the range 0.05 < z < 0.08.
Also, we tried a unified scheme that contained galaxies across the
entire redshift range (0.02 < z < 0.08) with the addition of the
redshift as an extra discriminating feature to the three originally
considered. Both attempts failed to improve the performance.
Fig. 8. Plots for the reliability analysis of the predicted probabilities of each galaxy activity class. For each class, we plot two diagrams. First, on the left under each class label, we plot the probability
difference of the highest and second-highest predicted probability (Δp) against the maximum predicted probability. In every such plot, each black dot represents a galaxy, while the red line is the
normalized cumulative histogram with respect to Δp (top x-axis tick marks represent the fraction of the total objects). Second, in the right plot under each class label, we also plot the recall and
precision scores as a function of the maximum predicted probability for each activity class.
5.3. Mixing between classes
In Sect. 4, we measured the performance of the diagnostic based
on its predictions on the test subset of galaxies. We find that
the classifier has excellent performance for the SF and passive
galaxies and good performance on the rest of the classes. But,
despite its high performance for SF galaxies, we observe that there
is a non-negligible fraction of these galaxies (confusion matrix;
Fig. 5) that are predicted as composites. Composite galaxies
have some common characteristics with SF galaxies, so this is a
somewhat expected behavior. Further analysis of these
misclassified SF galaxies shows that they tend to have redder g-r colors
than a typical SF galaxy.
Fig. 9. Two plots of the recall of SF galaxies as a function of the g-r
(SDSS) color. The blue line shows the fraction of predicted SF galaxies
to the total number of true SF galaxies per bin of increasing g-r color.
The rest of the lines describe the fraction of SF galaxies that change
classification (orange line, AGN; green line, LINER; red line, composite;
purple, passive) to the total number of SF galaxies in a particular
bin. On the top, we use the g-r color from fiberMag, while on the bottom
plot we use the g-r color from the cModelMag photometry. Again,
"Comp" stands for composite galaxies and "Pas" for passive galaxies.
Figure 9 shows the fraction of the correctly identified SF
galaxies (i.e., recall) and the fraction of true SF galaxies
(spectroscopic classification) that are predicted as different galaxy
classes as a function of the fiberMag g-r color (top panel), which
was used for the classification (Sect. 3.3), and the cModelMag g-r
color (bottom panel), which reflects the overall light of a galaxy.
We estimated the errors as described in Sect. 5.2. It is clear that
for SF galaxies with bluer g-r colors,
the classifier has a recall rate close to 1. On the other hand, as
the optical colors of the SF galaxies become redder, their recall
drops, and the galaxies are predicted almost exclusively as
composites. This result indicates that as these galaxies have gradually
older stellar populations their infrared colors resemble those of
a composite galaxy. Another interesting fact is that the decline in
the recall of the SF galaxies is more gradual if we use the
integrated photometry to calculate the g-r color, suggesting that SF
galaxies with a prominent bulge are more likely to be classified as
composite. However, the value of the g-r color at which the fraction of
the SF galaxies predicted as SF becomes equal to the fraction
of SF galaxies predicted as composites remains almost the same
for both photometries (g-r ≈ 0.65). This suggests that the
classification of a SF galaxy is insensitive to aperture effects, as star
formation is a galaxy-wide phenomenon and is not concentrated
in one area, since the H II regions are scattered across the galaxy
disk.
To further analyze the performance of the classifier, we
need to understand the underlying activity of each class in
more detail. Starting with AGN galaxies, their emission comes
from accretion of circumnuclear material onto the SMBH at
their cores. The energy source of a LINER galaxy can be either
a SMBH or old stellar populations, including post-asymptotic-giant-branch
stars (Binette et al. 1994; Stasińska et al. 2008;
Papaderos et al. 2013). Composite galaxies may have some
star-formation activity but can also harbor an accreting SMBH.
Lately, it has been suggested that these galaxies may host old
stellar populations that can actively contribute to their
emission-line spectrum (Stasińska et al. 2008). Finally, a passive
galaxy is a system without any star formation or AGN activity, with
low to no reservoirs of dust and cold gas, and with old stellar
populations as its main component.
The class of passive galaxies is very well characterized by
this diagnostic tool. However, the confusion matrix (Fig. 5)
shows that some passive galaxies are misclassified by the
diagnostic as LINER galaxies. This is consistent with studies
claiming that LINER-like activity originates from old stellar
populations like those of a passive galaxy (Byler et al. 2019). For
the class of LINER galaxies, we see that most of the misclassified
galaxies are predicted as passive (in agreement with the connection
between the LINER-like spectra and old stellar populations),
followed by the composite, and finally the AGN galaxies. Composite
galaxies often have large bulges (Feltzing & Gilmore 2000;
Ortolani et al. 2001) and they contain old stellar populations
similar to passive galaxies. Therefore, composite galaxies could be
excited by old stellar populations as discussed earlier, and their
photometric colors can resemble those of passive galaxies (Fig. 3).
Finally, LINER activity can sometimes be attributed to an active
nucleus (e.g., Ho 1999). In this case, the optical and infrared
colors of the galaxy would be consistent with those of AGNs.
For the class of composite galaxies, we see (Fig. 5) that the
misclassified galaxies are mainly predicted as AGNs, which is an
expected result considering the points made above in this section.
During the evaluation of our model, we found that the recall
of the AGN galaxies tends to increase with increasing distance
(Fig. 7). An explanation of this effect is that, due to aperture
effects (as well as volume and sensitivity effects), the AGNs
identified at larger distances tend to be more luminous. This is
seen in Fig. 10, which shows the histograms of the Hα luminosity of
the AGN galaxies for two redshift bins, one for the very nearby
galaxies (0.02 < z < 0.05) and one for galaxies further away
(0.05 < z < 0.08). More luminous AGNs are more likely to
dominate the mid-infrared colors of a galaxy. Further supporting
evidence is provided by Fig. 11, which shows the fraction of
spectroscopically selected AGNs classified in different classes
by the classifier as a function of their Hα luminosity. To produce
this plot we split the AGNs (regardless of redshift) in our
sample into bins of increasing Hα luminosity. Then, we applied
our diagnostic and calculated the fraction of objects predicted
to belong to each class with respect to the spectroscopic AGNs
in each bin. We can see that the fraction of correctly identified
AGNs (i.e., the recall) increases as their Hα luminosity increases.
It is also clear that the diagnostic confuses low-luminosity
AGNs (LLAGNs) with composites, which is reasonable
if we also consider the aperture effects. Based on this we estimate
that AGN galaxies with Hα luminosities below 5 × 10⁴⁰ erg s⁻¹
are increasingly misclassified as composite galaxies.
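The binning procedure described above (and in the caption of Fig. 11, which uses bins containing the same number of galaxies) can be sketched as follows. This is a minimal illustration, not the code used in this work: the function name `fractions_per_bin` and the toy luminosities and labels are made up for the example. Because every input object is a true AGN, the "AGN" fraction in each bin is the recall in that bin.

```python
from collections import Counter

def fractions_per_bin(values, predicted, n_bins):
    """Split objects into n_bins equal-count bins of increasing `values`
    and return, for each bin, the fraction predicted in each class."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    bins = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        counts = Counter(predicted[i] for i in members)
        bins.append({
            "min": values[members[0]],
            "max": values[members[-1]],
            "fractions": {lab: c / len(members) for lab, c in counts.items()},
        })
    return bins

# Toy data (hypothetical): log10 of the H-alpha luminosity of six true AGNs
# and the class that a classifier predicted for each of them.
log_lum = [39.5, 41.0, 40.2, 39.8, 41.5, 40.6]
pred = ["Comp", "AGN", "AGN", "Comp", "AGN", "Comp"]

bins = fractions_per_bin(log_lum, pred, n_bins=3)
for b in bins:
    print(b["min"], b["max"], b["fractions"])
```

Equal-count bins keep the statistical uncertainty of each fraction comparable across the luminosity range, at the cost of uneven bin widths.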
Fig. 10. Histogram of the Hα luminosity for our sample of spectroscopically
classified AGN galaxies (considered as the ground truth). We split
them into two redshift bins. The first bin is from z = 0.02 to z = 0.05
(HECATE catalog; Kovlakas et al. 2021), plotted with the blue line, and
the second is from z = 0.05 to z = 0.08, plotted with the green line.
Another interesting fact about the class of AGN galaxies
comes from the misclassifications that our diagnostic tool makes.
In Fig. 12 we plot the emission-line ratio [O iii]/Hβ against
[N ii]/Hα, [S ii]/Hα, and [O i]/Hα. The location of spectroscopic
AGNs classified in different classes on the [O iii]/Hβ against
[N ii]/Hα diagram shows that the AGN galaxies that have been
misclassified as SF are located primarily close to the line of
maximum starburst defined by Kewley et al. (2001), which is the line
that separates AGN and SF galaxies. In addition, in the plot of
[O iii]/Hβ against [S ii]/Hα, we see that AGN galaxies that have
been predicted as LINER galaxies are located very close to the line
of Schawinski et al. (2007) that separates AGN and LINER galaxies,
and are systematically located in the upper right area of the AGN
locus, having higher values of [S ii]/Hα. AGN galaxies misclassified
as LINER galaxies show a similar trend in the plot of [O iii]/Hβ
against [O i]/Hα. However, we notice that there is significant
mixing between the composite and AGN galaxies. The misclassified
AGNs predicted as composites show no specific trend, as they are
scattered across the AGN locus of these plots, an effect that can be
attributed to the use of the galaxy-wide infrared colors.
Nonetheless, this is acceptable, especially considering the
flexibility provided by using the integrated colors of the galaxies
and the excellent performance in the case of the other classes.
Fig. 11. Fraction of the correctly identified AGN galaxies (true positives)
to the total true AGN galaxies (i.e., a measure of recall or
"completeness") as a function of the AGN Hα luminosity for all
spectroscopically selected AGN galaxies in our sample (orange line). All other
colored lines represent the fraction of true AGN galaxies that the
diagnostic predicted to belong to a class other than AGN (false negatives)
to the total true AGN (blue, SF; green, LINER; red, composite; purple,
passive). All fractions were calculated after the galaxies were split
into bins of increasing Hα luminosity that contain the same number of
galaxies.
5.4. Comparison with other methods
In order to determine if this new diagnostic, which is based on
infrared and optical colors, provides any advantage over the
already established ones, we compared their performances against
the performance of our diagnostic. Taking into account the fact
that they are based on different criteria and parent samples, we
discuss their advantages and disadvantages.
A widely used infrared diagnostic is the W1-W2 ≥ 0.8 criterion
(Stern et al. 2012; Assef et al. 2013), where a demarcation
line based on two WISE bands (bands 1 and 2) separates
AGN galaxies from the rest of the galaxies. Other similar
diagnostic tools have been introduced by Jarrett et al. (2011),
Mateos et al. (2012), and Coziol et al. (2014). The first two
are two-dimensional diagnostics defined on the W1-W2
against W2-W3 plane, while the third is based on a plot of W3-W4
against W2-W3 colors. In the case of the Mateos et al.
(2012) diagnostic tool, which focuses on high-luminosity
AGNs, the authors define an AGN selection wedge in the upper
right corner of the plot, bounded by the lines
(W1-W2) = 0.315 (W2-W3) + 0.796,
(W1-W2) = 0.315 (W2-W3) - 0.222, and
(W1-W2) = -3.172 (W2-W3) + 7.624.
To test the applicability of this diagnostic to the wider
population of (non-X-ray selected) AGN galaxies, in Fig. 13 we plot
the W1-W2 against the W2-W3 colors of the galaxies in our full
sample (Sect. 2.4), color-coded according to their spectroscopic
classification (left) and the classification based on our diagnostic
(right). The galaxies presented in Fig. 13 originate from the SDSS
sample in the redshift range of z = 0.02 to z = 0.08. We see that a
significant number of AGN galaxies are located outside of the AGN
locus of Mateos et al. (2012) and below the demarcation line of
Stern et al. (2012). This behavior holds even with the diagnostic of
Jarrett et al. (2011), since all three diagnostics (Jarrett et al.
2011; Mateos et al. 2012; Assef et al. 2013) are based on luminous
AGN samples for their definition. Furthermore, we observe that
there are SF galaxies that the two methods of AGN identification
(Mateos et al. 2012; Assef et al. 2013) classify wrongly as AGN
galaxies, which we discuss further in Sect. 5.5.
Fig. 12. Diagrams of [O iii] λ5007/Hβ against [N ii] λ6584/Hα (left), [S ii] λλ6716,6731/Hα (middle), and [O i] λ6300/Hα (right), showing the
location of an optically selected sample of AGN galaxies from our test sample (Sect. 4.1). The points are color-coded depending on their
classification based on our diagnostic: AGN in green, SF in red, LINER in blue, and composite in yellow. We see that, since these are two-dimensional
projections of the four-dimensional space used for the optical line-ratio classification, some AGN galaxies may fall outside the AGN demarcation
line. The solid black curve is the extreme starburst line defined by Kewley et al. (2001). The straight black line is the separating line between
AGN and LINER galaxies as defined by Schawinski et al. (2007). The dashed black curve is the Kauffmann et al. (2003) line separating SF from
composite galaxies.
Fig. 13. Color-color plots of W1-W2 against W2-W3 for our sample of galaxies. Left: Galaxies based on their spectroscopic classification (true
class). Right: Same but for galaxies whose class labels were assigned by the new diagnostic tool. The solid black line is the locus of AGN galaxies
as defined by Mateos et al. (2012), while the dashed black line is the demarcation line between an AGN and a non-AGN galaxy as defined by
Stern et al. (2012). The dashed blue lines define the AGN galaxy selection box as defined by Jarrett et al. (2011). SF galaxies are shown in blue,
LINER in yellow, passive in red, composite in purple, and AGN in green. Labels in the legend are the same as in Fig. 9. We see that there is
a significant population of spectroscopic AGN galaxies located below the existing infrared diagnostics, which are correctly identified with our
diagnostic. There is also a population of extreme SF galaxies located in the AGN locus of the existing AGN diagnostics that are also correctly
classified by our diagnostic.
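The two simplest selection criteria discussed above can be written down directly. The sketch below is only an illustration, not code from this work: the function names and the example colors are hypothetical, and the wedge is taken as the region between the two parallel lines of Mateos et al. (2012) and to the right of the third line, with a negative second intercept (-0.222) and a negative third slope (-3.172).

```python
def stern_agn(w1_w2, threshold=0.8):
    """W1-W2 >= 0.8 color cut (Stern et al. 2012)."""
    return w1_w2 >= threshold

def mateos_wedge_agn(w1_w2, w2_w3):
    """AGN wedge in the W1-W2 vs. W2-W3 plane (Mateos et al. 2012)."""
    below_top = w1_w2 < 0.315 * w2_w3 + 0.796
    above_bottom = w1_w2 > 0.315 * w2_w3 - 0.222
    right_of_left = w1_w2 > -3.172 * w2_w3 + 7.624
    return below_top and above_bottom and right_of_left

# Hypothetical example colors as (W2-W3, W1-W2) pairs, not measurements.
examples = {"red source": (3.0, 1.0), "blue source": (2.5, 0.3)}
for name, (w2_w3, w1_w2) in examples.items():
    print(name, stern_agn(w1_w2), mateos_wedge_agn(w1_w2, w2_w3))
```

Both functions return a binary AGN or non-AGN answer, which is why a multi-class classification has to be collapsed to the same two labels before it can be compared with them.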
For the AGN galaxies, we included one more mid-infrared
diagnostic and performed a quantitative comparison between
our diagnostic tool and the other widely used infrared
diagnostic tools that were mentioned earlier. The additional
diagnostic included in our comparison was defined by Coziol et al.
(2014). This particular diagnostic focuses on spectroscopically
selected LLAGNs. Its selection criteria consist of a
two-dimensional plot of W3-W4 against W2-W3 that is separated
into four parts by two crossing lines: (W3-W4) =
1.6 (W2-W3) + 3.2 and (W3-W4) = 2.0 (W2-W3) + 8.0.
To obtain a quantitative comparison between our diagnostic
and the four aforementioned tools, we used the test sample (Sect.
4.1) and chose only the AGN galaxies (329 in total), which are
considered as the ground truth for the comparison. We focused
on AGN galaxies since these tools are tailored for the
classification of AGN galaxies, the class in which our diagnostic
shows weaker performance. Then we applied our new diagnostic and
the diagnostic tools of Mateos et al. (2012), Assef et al. (2018),
Coziol et al. (2014), and Jarrett et al. (2011) to that sample of
optically selected AGN galaxies. Since the other diagnostics offer
only AGN or non-AGN classification, we adapted our results
accordingly. Any galaxy that receives a classification by our
diagnostic other than AGN (SF, LINER, composite, or passive) is
characterized as non-AGN. In Table 5 we present the results of
this comparison between the classifications made by our diagnostic
and the other diagnostic methods. We see that despite our
moderate scores for the class of AGN galaxies, our diagnostic
tool identifies more AGN galaxies than any other method tested
here.
Table 5. Comparison of the classification results between our diagnostic
and other widely used mid-infrared diagnostics.
Other diagnostics           This work
                            AGN            Non-AGN
JR11         AGN            25 (7.6%)      1 (0.3%)
             Non-AGN        158 (48.0%)    145 (44.1%)
MT12         AGN            22 (6.7%)      1 (0.3%)
             Non-AGN        161 (48.9%)    145 (44.1%)
CZ14         AGN            85 (31.4%)     49 (18.0%)
             Non-AGN        73 (26.9%)     64 (23.6%)
AS18 (R75)   AGN            55 (16.7%)     2 (0.6%)
             Non-AGN        128 (38.9%)    144 (43.8%)
AS18 (R90)   AGN            32 (9.7%)      1 (0.3%)
             Non-AGN        151 (45.9%)    145 (44.1%)
Notes. The comparison is performed on an optically selected sample of
329 AGN galaxies. Columns represent our classification results while
rows are the classification obtained by the other diagnostics. The R75
and R90 refer to the two available schemes in the work of Assef et al.
(2018) that give 75% and 90% reliability for AGN selection, respectively.
References: JR11, Jarrett et al. (2011); MT12, Mateos et al.
(2012); CZ14, Coziol et al. (2014); AS18, Assef et al. (2018).
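The statement that our diagnostic recovers more AGN galaxies than the other methods can be checked by summing the cells of Table 5. The short sketch below does only this arithmetic; the counts are copied from the table, while the variable and function names are ours. Note that the CZ14 cells sum to 271 objects rather than 329, so the fractions for that row are computed on 271 (this section does not state the reason for the smaller number).

```python
# Cell counts from Table 5, ordered as:
# (other AGN & ours AGN, other AGN & ours non-AGN,
#  other non-AGN & ours AGN, other non-AGN & ours non-AGN)
TABLE5 = {
    "JR11": (25, 1, 158, 145),
    "MT12": (22, 1, 161, 145),
    "CZ14": (85, 49, 73, 64),
    "AS18 (R75)": (55, 2, 128, 144),
    "AS18 (R90)": (32, 1, 151, 145),
}

def recovered_fractions(counts):
    """Fraction of the true AGNs selected by each of the two methods."""
    both, other_only, ours_only, neither = counts
    total = both + other_only + ours_only + neither
    return {
        "total": total,
        "other": (both + other_only) / total,
        "ours": (both + ours_only) / total,
    }

summary = {name: recovered_fractions(c) for name, c in TABLE5.items()}
for name, row in summary.items():
    print(f"{name:11s} N={row['total']:3d} "
          f"other={row['other']:.1%} this work={row['ours']:.1%}")
```

For the rows based on all 329 objects, the "this work" fraction is 183/329, about 56%, which agrees with the AGN completeness quoted in the abstract.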
While the existing diagnostics are very effective in identifying
reliable samples of AGNs (albeit with some contamination
by extreme starbursts; see the next section), they are biased
toward the more obscured and more luminous AGNs, missing the
bulk of the AGN population. In Fig. 14 we plot the W1-W2 against
the W2-W3 color for all the spectroscopically classified AGNs in our
sample, color-coded by their Hα luminosity. In that figure we
see that only the more luminous AGNs will be selected by the
diagnostics of Jarrett et al. (2011), Mateos et al. (2012), and
Stern et al. (2012). In contrast, our diagnostic also provides
samples of LLAGNs, with the unavoidable mixing with composite and
LINER galaxies. Furthermore, since the existing diagnostics are
driven by the classification of one activity class, they are not
efficient in identifying reliable samples of SF galaxies (which
have strong contamination by LLAGNs) or other types of galaxies
(composite, passive, and LINER).
5.5. Star-forming galaxies with extremely red mid-infrared
colors
By inspecting Fig. 13 (left panel) more closely, we see that there
is a significant number of spectroscopically classified SF galaxies
with mid-infrared colors of W1-W2 ≥ 0.8 (Stern et al. 2012).
Normally these galaxies would have been classified as AGN
galaxies by most mid-infrared selection methods (Jarrett et al.
2011; Mateos et al. 2012; Assef et al. 2013). In contrast to these
AGN selection methods, our diagnostic tool is able to separate
these cases: it retrieves 82% of the
spectroscopically classified SF galaxies that are located above
the Stern et al. (2012) line. This means that our diagnostic has
the ability to correctly identify a case where a starburst galaxy
looks like an AGN (e.g., Hainline et al. 2016). This ability is the
result of the use of optical colors, which carry a tell-tale
signature of extreme starburst galaxies.
By further investigating the optical spectra and SDSS images
of these peculiar SF galaxies (W1-W2 ≥ 0.8), we find that the
majority of them have spectra indicative of an H ii region and
appear as blue compact spherical objects in the SDSS images.
There is also a population with redder SDSS colors, indicating
the presence of dust. These SF galaxies appear to have W2-W3
colors that are systematically redder than the AGNs. An interesting
fact is that the "dusty" SF galaxies are mainly located in
the area of W2-W3 ≲ 3.5, while above that value almost all SF
galaxies seem to be blue and compact.
Fig. 14. W1-W2 color against W2-W3 color for the spectroscopically
classified AGN galaxies in our sample, color-coded by Hα luminosity.
The solid black line is the locus of AGN galaxies as defined by Mateos
et al. (2012), while the dashed black line is the demarcation line between
an AGN and a non-AGN galaxy as defined by Stern et al. (2012). The
dashed blue lines define the AGN selection box as defined by Jarrett
et al. (2011).
Additional evidence for a population of SF galaxies contaminating
the mid-infrared AGN diagnostics is provided by
Hainline et al. (2016). The focus of that work is the properties
of a spectroscopically selected sample of dwarf galaxies including
AGN and SF galaxies. They find that a significant fraction
of the optically selected AGN galaxies is not selected as AGNs
by the mid-infrared diagnostics of Jarrett et al. (2011) and Stern
et al. (2012). They also find that the mid-infrared selected
samples of AGNs from these two diagnostics are significantly
contaminated, a result that is supported by our analysis.
These galaxies with AGN-like mid-infrared colors are consistent
with the "blueberry" galaxies, which are characterized by
compact sizes, extreme blue colors, and H ii region-like spectra
(Yang et al. 2017). Also, these galaxies tend to be metal-poor
with extreme mid-infrared colors (Hainline et al. 2016). In our
analysis, we find that galaxies with extreme mid-infrared colors
(W2-W3 ≳ 3.5) that are spectroscopically identified as SF have
g-r color < 0 (median -0.35). In fact, this is the main
differentiating factor between the AGNs with extreme mid-infrared
colors and the blueberry galaxies for our diagnostic. Star-forming
galaxies above W1-W2 = 0.8 and with W2-W3 ≲ 3.5 seem
to have optical g-r colors redder than the ones with W2-W3 ≳
3.5 (median 0.2).
Yang et al. (2017) provide a catalog of 41 spectroscopically
selected blueberry galaxies that appear to be compact and blue in
the SDSS gri (blue-green-red) images. After cross-matching these
galaxies with the HECATE catalog (Kovlakas et al. 2021), a
value-added catalog for the local Universe (distances up to 200 Mpc),
and applying quality cuts to the photometry (as described in Sect.
2.4), we applied our diagnostic. Seven out of 41 objects satisfied
our quality criteria. All seven of them were classified correctly
by the diagnostic as being SF, while four out of the seven are above
the AGN demarcation line of Stern et al. (2012). Also, we find that
the Jarrett et al. (2011) wedge classifies two out of the seven
objects as AGNs, while Mateos et al. (2012) classifies three out of
seven as AGNs. Of course, these results must be taken with caution
as the number of objects is limited.
6. Conclusions
In this work we have combined mid-infrared and optical photometry
to define a new galactic activity diagnostic that includes
all activity classes under one unified scheme while offering
improved performance. Our results are summarized as follows:
1. We have created a new machine learning activity diagnostic
tool for galaxy classification that can discriminate between
five different classes of galaxies. This is the first
machine-learning-based tool for galaxy classification that includes
not only active but also passive galaxies. The code, with
application instructions, is available via the GitHub repository.²
2. This diagnostic extends the existing infrared diagnostics to
additional types of activity, is more sensitive to LLAGNs,
and can be used for local galaxies (z ∼ 0) up to z ∼ 0.08.
3. The addition of the optical color, g-r, to the two infrared
WISE colors that were used in previous works means our
tool can go beyond the binary classification (AGN or non-AGN)
of most diagnostic methods, allowing their extension
to four different activity classes as well as passive galaxies.
4. There is some mixing between the classifications of some
objects, but this is observed mainly in classes that share
common properties (e.g., composite and AGN galaxies). Although
this results in reduced performance scores for the
class of AGN galaxies, the performance we achieve is still
superior to that achieved by previous works. The
addition of extra features (e.g., UV colors) may help improve
the performance, though it will limit the applicability of the
diagnostic to objects with photometry from the UV to infrared
wavelengths.
5. By including optical information (the g-r color), our
diagnostic is able to distinguish a starburst galaxy with extreme
mid-infrared colors (e.g., blueberries) from an obscured
AGN galaxy; such a starburst would have been classified as an
AGN by traditional infrared diagnostic methods.
There are several directions that can be taken to improve the
performance of the diagnostic, especially for classifying AGN
galaxies and reducing the mixing between the AGN and composite
galaxies. These include adding features that are characteristic
of AGN activity (e.g., UV colors), luminosity, or even
spectral information, but this would come at the cost of reducing
the applicability of the diagnostic to a larger sample of data.
² https://github.com/BabisDaoutis/GalActivityClassifier
Acknowledgements. We thank the anonymous referee for their constructive com-
ments and suggestions that helped to improve this work and the clarity of this
manuscript. We thank Paolo Bonfini for very useful discussions on machine-learning
methods that helped to improve the performance of the classifier. We
also thank Sotiria Fotopoulou for discussions on the classification of the AGN.
CD and EK acknowledge support from the Public Investments Program through
a Matching Funds grant to the IA-FORTH. The research leading to these re-
sults has received funding from the European Research Council under the Eu-
ropean Union’s Seventh Framework Programme (FP/2007-2013) /ERC Grant
Agreement n. 617001, and the European Union’s Horizon 2020 research and
innovation programme under the Marie Skłodowska-Curie RISE action, Grant
Agreement n. 873089 (ASTROSTAT-II). KK is supported by the project "Support
of the international collaboration in astronomy (Asu mobility)" with the
number: CZ 02.2.69/0.0/0.0/18_053/0016972. Funding for SDSS-III has been
provided by the Alfred P. Sloan Foundation, the Participating Institutions, the
National Science Foundation, and the U.S. Department of Energy Office of Science.
The SDSS-III web site is http://www.sdss3.org/. SDSS-III is managed by
the Astrophysical Research Consortium for the Participating Institutions of the
SDSS-III Collaboration including the University of Arizona, the Brazilian Par-
ticipation Group, Brookhaven National Laboratory, Carnegie Mellon University,
University of Florida, the French Participation Group, the German Participa-
tion Group, Harvard University, the Instituto de Astrofisica de Canarias, the
Michigan State/Notre Dame/JINA Participation Group, Johns Hopkins Univer-
sity, Lawrence Berkeley National Laboratory, Max Planck Institute for Astro-
physics, Max Planck Institute for Extraterrestrial Physics, New Mexico State
University, New York University, Ohio State University, Pennsylvania State Uni-
versity, University of Portsmouth, Princeton University, the Spanish Participation
Group, University of Tokyo, University of Utah, Vanderbilt University, Univer-
sity of Virginia, University of Washington, and Yale University.
References
Alonso-Herrero, A., Ramos Almeida, C., Esquej, P., et al. 2014, MNRAS, 443,
2766
Assef, R. J., Stern, D., Kochanek, C. S., et al. 2013, ApJ, 772, 26
Assef, R. J., Stern, D., Noirot, G., et al. 2018, ApJS, 234, 23
Baldwin, J. A., Phillips, M. M., & Terlevich, R. 1981, PASP, 93, 5
Bell, E. F., Wolf, C., Meisenheimer, K., et al. 2004, ApJ, 608, 752
Binette, L., Magris, C. G., Stasińska, G., & Bruzual, A. G. 1994, A&A, 292, 13
Brinchmann, J., Charlot, S., White, S. D. M., et al. 2004, MNRAS, 351, 1151
Byler, N., Dalcanton, J. J., Conroy, C., et al. 2019, AJ, 158, 2
Cid Fernandes, R., Stasińska, G., Schlickmann, M. S., et al. 2010, MNRAS, 403,
1036
Coziol, R., Torres-Papaqui, J. P., Plauchu-Frayn, I., et al. 2014, Rev. Mexicana
Astron. Astrofis., 50, 255
Donley, J. L., Koekemoer, A. M., Brusa, M., et al. 2012, ApJ, 748, 142
Feltzing, S. & Gilmore, G. 2000, A&A, 355, 949
Genzel, R. 1998, Nature, 391, 17
González-Martín, O., Masegosa, J., Márquez, I., Guainazzi, M., & Jiménez-
Bailón, E. 2009, A&A, 506, 1107
Hainline, K. N., Reines, A. E., Greene, J. E., & Stern, D. 2016, ApJ, 832, 119
Heckman, T. M. 1980, A&A, 500, 187
Ho, L. C. 1999, Advances in Space Research, 23, 813
Ho, L. C., Filippenko, A. V., & Sargent, W. L. W. 1997, ApJ, 487, 568
Jarrett, T. H., Cohen, M., Masci, F., et al. 2011, ApJ, 735, 112
Kauffmann, G., Heckman, T. M., Tremonti, C., et al. 2003, MNRAS, 346, 1055
Kewley, L. J., Dopita, M. A., Sutherland, R. S., Heisler, C. A., & Trevena, J.
2001, ApJ, 556, 121
Koekemoer, A. M., Donley, J., Grogin, N., et al. 2011, in American Astronomical
Society Meeting Abstracts, Vol. 218, American Astronomical Society
Meeting Abstracts #218, 328.03
Kovlakas, K., Zezas, A., Andrews, J. J., et al. 2021, MNRAS, 506, 1896
Lang, D., Hogg, D. W., & Schlegel, D. J. 2016, AJ, 151, 36
Louppe, G. 2014, arXiv e-prints, arXiv:1407.7502
Maragkoudakis, A., Zezas, A., Ashby, M. L. N., & Willner, S. P. 2014, MNRAS,
441, 2296
Mateos, S., Alonso-Herrero, A., Carrera, F. J., et al. 2012, MNRAS, 426, 3271
Ortolani, S., Barbuy, B., Bica, E., et al. 2001, A&A, 376, 878
Papaderos, P., Gomes, J. M., Vílchez, J. M., et al. 2013, A&A, 555, L1
Schawinski, K., Thomas, D., Sarzi, M., et al. 2007, MNRAS, 382, 1415
Skrutskie, M. F., Cutri, R. M., Stiening, R., et al. 2006, AJ, 131, 1163
Stampoulis, V., van Dyk, D. A., Kashyap, V. L., & Zezas, A. 2019, MNRAS,
485, 1085
Stasińska, G., Vale Asari, N., Cid Fernandes, R., et al. 2008, MNRAS, 391, L29
Stern, D., Assef, R. J., Benford, D. J., et al. 2012, ApJ, 753, 30
Stern, D., Eisenhardt, P., Gorjian, V., et al. 2005, ApJ, 631, 163
Tremonti, C. A., Heckman, T. M., Kauffmann, G., et al. 2004, ApJ, 613, 898
Werner, M. W., Roellig, T. L., Low, F. J., et al. 2004, ApJS, 154, 1
Wright, E. L., Eisenhardt, P. R. M., Mainzer, A. K., et al. 2010, AJ, 140, 1868
Yang, H., Malhotra, S., Rhoads, J. E., & Wang, J. 2017, ApJ, 847, 38
Appendix A: Optimization of significant
hyperparameters
The optimization of this activity diagnostic is performed
with the GridSearchCV algorithm, which is provided
by the scikit-learn Python 3 package, version 1.1.2.
The optimal values of the hyperparameters are usually determined
by training the algorithm several times with different
choices of the hyperparameter values each time, typically
by means of a grid search. The performance of the algorithm is
evaluated at each point of the grid, and the set of parameters
that maximizes the performance is chosen. Since only
some of the available hyperparameters have a significant impact
on the performance of our diagnostic, it is inefficient to perform
a grid search including all of them. We find that only the
following seven hyperparameters are important:
n_estimators: The number of decision trees.
max_depth: The maximum depth of a decision tree.
min_samples_split: The minimum required number of samples to split a node.
min_samples_leaf: The minimum number of samples in a leaf node.
max_leaf_nodes: The maximum allowable number of terminal nodes in a tree.
max_samples: The number of samples from the training set to build each tree.
criterion: The function to measure the quality of a split.
However, since a broad grid search in a seven-dimensional
space can still be computationally intensive, we first narrowed
the range of these parameters by calculating the performance of the
algorithm for different values of each hyperparameter separately.
To do this, we calculated the validation curves that show different
performance metrics as a function of each hyperparameter value (see
Fig. A.1). More specifically, each validation curve has the
performance score of a metric (e.g., accuracy) on the y-axis and a
range of the possible values of one hyperparameter (e.g.,
n_estimators) on the x-axis, while all the other hyperparameters are
kept constant. The performance reported on each plot is the
performance on the testing fold calculated with the cross-validation
(CV) method, which is performed by splitting the data into k folds
and using k-1 folds for training and one for testing. The algorithm
is trained k times in total for each combination of hyperparameters
by cycling the folds so that all k folds have been in the position
of the testing fold once. The reported performance score is the
average of these k scores.
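The fold cycling described above can be sketched in a few lines of plain Python. This is an illustration only, not the scikit-learn implementation used in this work: the helper names are hypothetical, no shuffling or class stratification is applied, and the total of 40950 objects is inferred from the five folds of 8190 objects quoted later in this appendix.

```python
from statistics import mean, stdev

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each fold is the test fold once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def cross_validate(fit_and_score, n_samples, k=5):
    """Average a user-supplied score over the k train/test splits.

    `fit_and_score(train_idx, test_idx)` is a hypothetical callback that
    trains on the first index list and returns a score on the second.
    """
    scores = [fit_and_score(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return mean(scores), stdev(scores)

folds = list(k_fold_indices(40950, k=5))
print([len(test) for _, test in folds])  # five test folds of 8190 objects
```

The spread of the k scores returned by `cross_validate` is the quantity shown as the shaded band in a validation curve.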
By inspecting Fig. A.1 we see that the ranges of values for
the best hyperparameters are found in the areas where the scores
start to converge to the best achievable score (the curves start to
run parallel to the x-axis). When this happens it means that
overfitting starts to occur. In this way we significantly reduced
the ranges of the hyperparameters that we have to test to find the
optimal ones. From these validation curves we also determined
the sensitivity of the algorithm to the different hyperparameters
and found the ranges of the parameters that significantly
affect its performance, in order to perform a grid search around
these ranges. The range for each hyperparameter was found by
inspecting the behavior of the accuracy score.
The next step is to use the best value ranges for each
hyperparameter, extracted from the plots, as input to the grid search
algorithm. The algorithm then forms combinations of the
hyperparameters from these ranges and fits each
resulting model in order to find which model has the best
performance scores. In Table A.1 we present the hyperparameters that
have been considered for optimization along with their search
ranges and their best values.
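A grid search of this kind can be sketched with scikit-learn as follows. This is a minimal example under stated assumptions, not the configuration script of this work: the data are a synthetic three-feature stand-in generated with `make_classification` (not the galaxy sample), the grid is deliberately much smaller than the search ranges of Table A.1 so that it runs quickly, and the fixed values are the best values listed in that table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for three colors per object; NOT the data of this work.
X, y = make_classification(
    n_samples=300, n_features=3, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Hyperparameters held fixed at the best values listed in Table A.1.
base = RandomForestClassifier(
    max_depth=17, max_leaf_nodes=30, criterion="gini",
    class_weight="balanced_subsample", random_state=0,
)

# Reduced illustrative grid (assumption): two values for each parameter.
param_grid = {"n_estimators": [20, 40], "max_samples": [0.6, 0.9]}

# Every grid point is scored by 5-fold cross-validated accuracy.
search = GridSearchCV(base, param_grid, scoring="accuracy", cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

After fitting, `search.cv_results_` holds the mean and standard deviation of the fold scores for each grid point, which is the information needed to judge both the best combination and its stability.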
Fig. A.1. Validation curves for the determination of the best hyperpa-
rameter ranges for reducing the possible values of each hyperparameter.
The CV scores refer to accuracy as a function of each hyperparameter.
The red line is the average accuracy calculated using the k-fold CV
method (k =5), while the shaded gray area is the standard deviation
(1σ) of the k-scores for each value of the hyperparameter under exami-
nation.
After we obtained the set of the best hyperparameters, we
tested the stability of the algorithm when it is trained and
tested on different subsets of the training and test data.
One such performance stability test is to perform k-fold CV and
observe the change in the accuracy score. A stable algorithm
should have a low standard deviation in its accuracy, which
means that its performance does not fluctuate significantly
between successive applications to similar data, although small
fluctuations are unavoidable as a result of the stochastic nature
of the algorithm. Even though the sample is fairly large, some
minority classes contain few objects and would be
underrepresented in each fold if a higher number of folds were
chosen. For that reason, the number of folds chosen here is 5
(each fold consists of 8190 objects). The overall accuracy score
of the model when calculated with the k-fold CV method is
81% ± 1%. The fluctuations of the performance scores are
therefore low, which suggests a stable algorithm.
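The stability check can be illustrated with a minimal k-fold sketch. The sample size of 5 x 8190 = 40950 is inferred from the fold size quoted above, the helper names are hypothetical, and the per-fold accuracies are made-up numbers used only to show how a mean and standard deviation are reported; they are not the fold scores of this work. The sample standard deviation is used here, which is an assumption.

```python
import random
from statistics import mean, stdev


def kfold_indices(n_samples, k=5, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k folds
    whose sizes differ by at most one."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    fold_size, remainder = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        stop = start + fold_size + (1 if i < remainder else 0)
        folds.append(indices[start:stop])
        start = stop
    return folds


def summarize(scores):
    """Return the mean and the sample standard deviation of the fold scores."""
    return mean(scores), stdev(scores)


folds = kfold_indices(5 * 8190, k=5)
fold_sizes = [len(fold) for fold in folds]
print(fold_sizes)

# Made-up accuracies, one per fold (illustration only).
fold_accuracies = [0.80, 0.82, 0.81, 0.80, 0.82]
avg, spread = summarize(fold_accuracies)
print(f"accuracy = {avg:.2f} +/- {spread:.2f}")
```

This split is purely random. Because some classes are rare, a stratified split that preserves the class proportions in every fold is a common alternative; the text does not state which variant was used.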
Table A.1. Important hyperparameters that were optimized for the
adaptation of the random forest algorithm to our sample of galaxies.
Parameter            Search range   Best value
n_estimators         10-250         120
max_depth            15-20          17
min_samples_split    -              "default"
min_samples_leaf     -              "default"
max_leaf_nodes       25-70          30
max_samples          0.1-0.9        0.6
class_weight         -              "balanced_subsample"
criterion            -              "Gini"²
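For convenience, the best values of Table A.1 can be collected into a set of keyword arguments. The sketch below assumes the scikit-learn `RandomForestClassifier` interface, whose parameter names match those in the table; the library is an assumption here, as is the lowercase spelling `"gini"` that scikit-learn expects. Parameters listed as "default" are left out so that the library defaults apply.

```python
# Best values from Table A.1, written as keyword arguments.
# Assumption: the scikit-learn RandomForestClassifier parameter names.
best_params = {
    "n_estimators": 120,
    "max_depth": 17,
    "max_leaf_nodes": 30,
    "max_samples": 0.6,
    "class_weight": "balanced_subsample",
    "criterion": "gini",
}

# With scikit-learn installed, these could be passed as:
#   from sklearn.ensemble import RandomForestClassifier
#   model = RandomForestClassifier(**best_params)
for name, value in best_params.items():
    print(f"{name} = {value!r}")
```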