Mining for Computing Jobs

A Web content mining approach identified 20 job categories and the associated skills needs prevalent in the computing professions. Using a Web content data mining application, we extracted almost a quarter million unique IT job descriptions from various job search engines and distilled each to its required skill sets. We statistically examined these, revealing 20 clusters of similar skill sets that map to specific job definitions. The results allow software engineering professionals to tune their skills portfolio to match those in demand from real computing jobs across the US to attain more lucrative salaries and more mobility in a chaotic environment.
78 IEEE Software Published by the IEEE Computer Society 0740-7459/10/$26.00 © 2010 IEEE
Software engineering professionals can use these
job types to appraise their own skills portfolio and
better understand the skills similar jobs demand in
other organizations.
Data Collection and Analysis
The first task was to find job advertisements and extract the job skills from
the text. This component of Web content mining employed information extraction
from semistructured documents.2 We developed software that systematically
searched Monster.com, HotJobs.com, and SimplyHired.com daily between July 2007
and April 2008 for jobs requiring a degree in computer science, management
information systems, computer information systems, and other computing
programs. We extracted 244,460 unique job advertisements.
The software then parsed the ad texts to identify and extract job skill terms,
basing the initial set of terms on prior research.3,4 These skills included
various technical, programming, business, and soft skills. We excluded single
words used in common language (such as basic) from the search so that the
frequency of similar skills wouldn’t be exaggerated (although we included the
complete skill name Visual Basic). The final collection of skills included
239 terms and synonyms.
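The term matching described above could be sketched roughly as follows. This is only an illustration: the article does not publish its parser or its list of 239 terms, so the synonym table, the function name extract_skills, and the sample ad are all assumptions.

```python
import re

# Hypothetical excerpt of a skill dictionary: canonical skill -> synonyms.
# The study's actual list of 239 terms and synonyms is not reproduced here.
SKILL_TERMS = {
    "Visual Basic": ["visual basic", "vb.net"],
    "SQL": ["sql"],
    "Java": ["java", "j2ee"],
    "Leadership": ["leadership", "team lead"],
}

def extract_skills(ad_text):
    """Return the set of canonical skills whose terms appear in the ad."""
    text = ad_text.lower()
    found = set()
    for skill, synonyms in SKILL_TERMS.items():
        for term in synonyms:
            # Match whole terms only, so "java" does not fire on "javascript".
            pattern = r"(?<![a-z0-9])" + re.escape(term) + r"(?![a-z0-9])"
            if re.search(pattern, text):
                found.add(skill)
                break
    return found

# The lone word "basic" is not a search term, but "Visual Basic" is.
ad = "Seeking developer with basic SQL knowledge and Visual Basic experience."
print(sorted(extract_skills(ad)))  # ['SQL', 'Visual Basic']
```

Keeping the common word "basic" out of the dictionary while listing the full name "Visual Basic" mirrors the exclusion rule in the text: the generic word never inflates a skill count, but the complete skill name is still found.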
After eliminating ads mentioning too few skills (the organization was looking
for one or two specific skills) or too many skills (the organization was most
likely a headhunter looking for anyone), 209,655 ads remained. Additionally, we
eliminated skills that appeared in fewer than two percent of the remaining
jobs, leaving 69 skills for this analysis. Table 1 lists the skills occurring
in 10 percent or more of the job ads and the frequencies with which they
appeared. This article’s companion Web site, the Degree-Oriented Guide to
Skills in Information Technology (www.dogs-it.org/data), contains a complete
list of all the skills, their frequencies, and other data.
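The two filtering steps above can be sketched as follows. Only the two percent skill cutoff comes from the article; the limits for "too few" and "too many" skills (MIN_SKILLS, MAX_SKILLS) are placeholder assumptions, as are the function name and the toy data.

```python
from collections import Counter

# Assumed cutoffs for illustration only; the article does not state the
# exact "too few" / "too many" limits it applied.
MIN_SKILLS = 3
MAX_SKILLS = 30
MIN_SKILL_SHARE = 0.02  # from the article: drop skills in fewer than 2% of ads

def filter_ads(ads):
    """ads: list of sets of skills. Returns (kept_ads, kept_skills)."""
    # Step 1: drop ads that mention too few or too many skills.
    kept = [a for a in ads if MIN_SKILLS <= len(a) <= MAX_SKILLS]
    # Step 2: drop skills that are rare among the remaining ads.
    counts = Counter(skill for a in kept for skill in a)
    threshold = MIN_SKILL_SHARE * len(kept)
    kept_skills = {s for s, n in counts.items() if n >= threshold}
    # Restrict every remaining ad to the retained skills.
    return [a & kept_skills for a in kept], kept_skills

# Toy data: 100 ads with three or four skills, plus one single-skill ad.
ads = [{"SQL", "Java", "C#"}] * 99 + [{"SQL", "Java", "C#", "COBOL"}, {"SQL"}]
kept_ads, kept_skills = filter_ads(ads)
print(len(kept_ads), sorted(kept_skills))
```

In the toy data the single-skill ad is dropped in the first step, and COBOL, which appears in only 1 of the 100 remaining ads, is dropped in the second.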
We analyzed the data using cluster analysis, a statistical technique for
classifying cases in groups that maximizes differences between groups and
minimizes those within a group.5 This technique allows classification using
quantitatively derived measures while still allowing for some control in
guiding clusters. This lets researchers ensure that
Using a Web content data mining application, we extracted almost a quarter
million unique IT job descriptions from various job search engines and
distilled each to its required skill sets. We statistically examined these,
revealing 20 clusters of similar skill sets that map to specific job
definitions. The results allow software engineering professionals to tune their
skills portfolio to match those in demand from real computing jobs across the
US (see the sidebar “Computing Jobs in the US”) to attain more lucrative
salaries and more mobility in a chaotic environment.1
Chuck Litecky, Andrew Aken, Altaf Ahmad, and H. James Nelson,
Southern Illinois University, Carbondale
Authorized licensed use limited to: Charles Munk. Downloaded on February 1, 2010 at 17:28 from IEEE Xplore. Restrictions apply.
January/February 2010 IEEE Software 79
resulting clusters are meaningful and not the result of arbitrary statistical
incidence alone. Researchers can use Web content data mining utilizing cluster
analysis to classify data or discover new resources.6,7 Consequently, each
cluster we derived contained ads closely related to other ads in the cluster on
the basis of the skills they specified, while minimizing the relationship to
ads in other clusters.
Cluster analysis revealed 20 job definitions, which we then verified using a
manual review of 100 random job ads, with an overall successful classification
rate of 91 percent (see the sidebar “Cluster Analysis” for a description of the
process we used). Table 2 describes each cluster; Figure 1 shows the relative
numbers of jobs from all job ads analyzed that were placed into each cluster.
Job Definitions
Table 2 shows the major details for each of the 20 job types we identified. Job
types that have skills with lower frequencies require a more diverse range of
skills. For example, a Microsoft Web developer job might require C# or ASP, but
not both, resulting in lower overall frequencies for both skills. Nevertheless,
it still indicates a strong preference for Microsoft-oriented technologies.
The clusters show that database skills are utilized in several different types
of jobs. A database administrator is typically responsible for efficient
hardware maintenance, logical database implementation, and security. Database
developers can focus on database design or focus only on query writing,
whereas database application programmers are
Although the demand for computing professionals in the US has recovered to
pre-“dotbomb” levels,1–5 the current economic downturn and threat of recession
mean that professionals must focus their recruiting and job searches more
sharply. Search tools connect available jobs to talent, ranging from
professional “headhunters” to Web sites such as Monster.com, HotJobs.com, and
SimplyHired.com. However, even a brief examination of these tools shows that
US job titles vary substantially and that job definitions are often misleading.
There’s also an underlying discrepancy between the required skills and those
that job hunters can infer from different employers’ titles for similar
positions. Many human relations departments develop advertisements for
computing professionals starting with a fixed job title and then have the IT
department add the list of skills that position requires. In our examination of
job listings, we’ve found considerable mismatch and inconsistencies among the
resulting skills and titles.
Consistent job titles and definitions composed of consistent skill sets can
help software engineering personnel, educational institutions, students, and
individual career planning. One of the more comprehensive attempts to provide
such skill-set-based job definitions is the US Department of Labor’s
Occupational Information Network (O*NET) project (http://online.onetcenter.org).
However, the report seems to lack systematic methods for determining which
skill sets are required for which job definition. For example, for the position
of database administrator, O*NET lists “knowledge of circuit boards” but not
the dominant database products.2
Starting in the late ’80s, various researchers began study-
ing advertisements to determine what skills most IT jobs de-
mand.6 In the early ’90s, researchers such as Chuck Litecky
and Kirk Arnett expanded previous work to include system-
atic nationwide sampling of US newspaper ads.7 In the late
’90s and early 2000s, various studies began mapping the
migration of IT job skills ads to the Internet, generally show-
ing that most ads were migrating and that demanded skills
were changing with the adoption of new technologies. Other
researchers’ methods included interviews and surveys, with
many focusing on the importance of managerial and techni-
cal skills in computing jobs.8–10 Occupation analysts initially
created the O*NET database and supplemented it with annu-
al surveys. However, it’s often unrepresentative of actual jobs
and the complete skill sets that employers might require. Ad-
ditionally, research methods often don’t attempt to determine
which combinations of skills businesses frequently desire.
References
1. D. Callahan and B. Pedigo, “Educating Experienced IT Professionals by Addressing Industry’s Needs,” IEEE Software, vol. 19, no. 5, 2002, pp. 57–62.
2. B. Prabhakar, C.R. Litecky, and K. Arnett, “IT Skills in a Tough Job Market,” Comm. ACM, vol. 48, no. 10, 2005, pp. 91–94.
3. F. Niederman, “IT Employment Prospects in 2004: A Mixed Bag,” IEEE Computer, vol. 37, no. 1, 2004, pp. 69–77.
4. C.R. Litecky, B. Prabhakar, and K. Arnett, “The Size of the IT Job Market,” Comm. ACM, vol. 51, no. 4, 2008, pp. 107–109.
5. R.R. Panko, “IT Employment Prospects: Beyond the Dotcom Bubble,” European J. Information Systems, vol. 17, no. 3, 2008, pp. 182–197.
6. S. Athey and W.J. Plotnicki, “A Comparison of Information Systems Job Requirements in Major Metropolitan Areas,” Interface, vol. 13, no. 4, 1988, pp. 47–53.
7. C.R. Litecky and K. Arnett, “Job Skill Advertisements and the MIS Curriculum: A Market-Oriented Approach,” Interface, vol. 14, no. 4, 1992, p. 45.
8. A.J. Aken and M.D. Michalisin, “The Impact of the Skills Gap on the Recruitment of MIS Graduates,” Proc. 2007 ACM Special Interest Group Management Information Systems Computer Personnel Research Conf. (CPR 07), ACM Press, 2007, pp. 105–111.
9. T. Goles, S. Hawk, and K.M. Kaiser, “Information Technology Workforce Skills: The Software and IT Services Provider Perspective,” Information Systems Frontiers, vol. 10, no. 2, 2008, pp. 179–194.
10. C.L. Noll and M. Wilkins, “Critical Skills of IS Professionals: A Model for Curriculum Development,” J. Information Technology Education, vol. 1, no. 3, 2002, pp. 143–154.
Computing Jobs in the US
80 IEEE Software www.computer.org/software
expected to work on larger-scale applications supported by extensive database
engines. The context of database skills significantly affects the nature of the
job; thus, many different job types emphasize database skills. The job type
titles reflect the specific emphasis on the database skills required.
Similarities among the skills required in each of the clusters suggest that we
can abstract the 20 job definitions into five larger general job
classifications: Web developers, software developers, database developers,
managers, and analysts, shown in Figure 2.
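The share of ads that falls into each of the five groups can be recomputed from the cluster counts in Table 2. The sketch below assumes the cluster-to-group assignment as read from Figure 2 (including security specialists under managers); the variable names are illustrative.

```python
# Ad counts per cluster, taken from Table 2; the grouping follows Figure 2
# as read here (an assumption, since the figure survives only as text).
GROUPS = {
    "Web developers": [6187, 4297, 3843, 6766, 3185],
    "Software developers": [12380, 7018, 8258, 12919, 6485],
    "Database developers": [14580, 7210, 6849, 6975],
    "Managers": [26656, 15248, 7982, 8062, 22813],
    "Analysts": [21942],
}

total = sum(sum(counts) for counts in GROUPS.values())
shares = {g: round(100 * sum(c) / total) for g, c in GROUPS.items()}

print(total)  # 209655
for group, pct in shares.items():
    print(f"{group}: {pct}%")
```

The counts add up to the 209,655 ads retained for analysis, and the rounded shares (12, 22, 17, 39, and 10 percent) match the percentages shown in Figure 2.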
For the Web developers group, job types focus on different Web technologies and
emphasize required skills differently. For example, Java database Web developer
jobs require Java and Java server skills along with database programming
skills. The Microsoft Web application project analyst job requires skills for
technologies such as Active Server Pages and C#.
The software developers group consists of five clusters of traditional
non-Web-based development, with moderate demands for programming in general,
software development, and object-oriented programming skills, plus specific
language skills such as C/C++, Java, or C#. For example, two clusters focus on
C/C++ and generic programming skills. The two clusters are distinguished
through the supplementary skills required for those jobs. C/C++ programmer jobs
focus primarily on programming-language skills, whereas the system-level C/C++
programmer jobs also require skills in general programming, software
development, operating systems, security, and Perl. This indicates that the
latter cluster undertakes work at the operating systems level as well as
supporting traditional Perl-based work.
The database developers group consists of four
clusters that demand a moderately high degree of
skills in SQL and Oracle or other databases along
with moderate degrees of programming skills. For
example, the more generic database developer jobs
focus primarily on SQL programming. Other da-
tabase developer jobs have supplementary focuses
on Java or Microsoft skills.
The managers group consists of both personnel and technology managers. The
first cluster, IT manager jobs, requires leadership, business strategy, and
business skills such as marketing, finance, accounting, and business process
design. This is a large group that includes diverse staff with some management.
However, on the basis of the included skills, the IT managers cluster doesn’t
specify a purely personnel management role, which would render it quite
different from the other clusters in the managers category. IT managers manage
various types of systems as well as personnel and have a range of technical
skill requirements. At the same time, they aren’t exactly like network or
system administrators, who have a much more focused role in specifically
managing networks or operating systems. So, these clusters are somewhat similar
and belong under the same parent category. Conventional technology
administrators such as database and system administrators are in the second
subgroup, which has a lower requirement for leadership.
The technology managers subgroup includes
two job types that sometimes overlap: network
and system administrators. On the basis of the
skills in this analysis, network administrator
jobs focus largely on network administration and
internetworking various operating systems to that
Table 1. The most frequently mentioned skills in computing job ads
Skill Frequency (%)
Security 33.29
C/C++ 28.69
SQL 27.57
Programming 26.08
Microsoft operating systems 23.18
Java/Java 2 Enterprise Edition/Java to Python 21.09
Leadership 20.10
Project management/planning/budgeting/scheduling 18.86
Software development 18.01
Oracle databases 17.19
Unix 17.15
Business strategy 17.06
Certification 14.88
Finance 13.98
XML 13.56
Generic databases 13.43
HTML/XHTML/DHTML 12.80
Open source operating systems 12.50
Marketing 12.47
JavaScript 12.10
Accounting 11.70
Microsoft databases 11.37
Object-oriented programming 11.16
.NET 10.55
Table 2. Job definitions
Each entry gives the job title (number of ads / share of all ads), the job description, and the major skills required.

Web programmers (6,187 / 3.0%)
Job description: Generic Web development using a variety of development platforms.
Major skills required: HTML*, JavaScript*, Java, XML, AJAX

MS Web developers (4,297 / 2.0%)
Job description: Web development specializing in Microsoft technologies.
Major skills required: C/C++†, ASP*, C#*, SQL*, HTML*, JavaScript*, XML, .NET, VB

MS Web application project analysts (3,843 / 1.8%)
Job description: Application development using primarily Microsoft technologies, including some system analysis.
Major skills required: C/C++†, SQL†, C#*, XML, MS Databases, .NET*, ASP, VB, OOP, SDLC

Java database Web developers (6,766 / 3.2%)
Job description: Web-based database application development using Java.
Major skills required: Java†, JSP, SQL*, MS, XML*, HTML, JavaScript, Oracle

Open source Web application developers (3,185 / 1.5%)
Job description: Web application development using open source technologies, GUIs, and back-end development.
Major skills required: HTML, open source/Unix operating systems, PHP, Java, JavaScript, Perl, SQL, open source databases

Java programmers (12,380 / 5.9%)
Job description: Programming position specializing in Java and Java-related programming.
Major skills required: Java†, programming, software development, OOP

MS developers (7,018 / 3.3%)
Job description: Traditional development specializing in Microsoft languages with high C# requirement.
Major skills required: C/C++, C#, .NET, object-oriented programming

Open source developers (8,258 / 3.9%)
Job description: Primarily a programming position working with many languages associated with the open source community.
Major skills required: C/C++, Java, open source/Unix operating systems

C/C++ programmers (12,919 / 6.2%)
Job description: Programming specializing in C/C++. Few other major skill requirements.
Major skills required: C/C++, programming skills

System-level C/C++ programmers (6,485 / 3.1%)
Job description: Specialized C/C++ programming, developing applications that interface at the operating system level.
Major skills required: C/C++, programming skills*, security, operating systems*, TCP/IP, Perl, Java

Database developers (14,580 / 7.0%)
Job description: Working with SQL and different database systems. Moderate amounts of programming and system analysis.
Major skills required: SQL, Oracle, MS and generic databases, programming skills

Java database application developers (7,210 / 3.4%)
Job description: Development of database applications in Java. Primarily focused on using Oracle and Unix.
Major skills required: Java†, Oracle*, SQL*, Unix*, Perl, XML

MS Visual Basic (VB) database application developers (6,849 / 3.3%)
Job description: Development of database applications primarily using VB, .NET, and ASP.
Major skills required: SQL†, Microsoft databases, Visual Basic, .NET, ASP

MS database application developers (6,975 / 3.3%)
Job description: Development of database applications using Microsoft technologies. Distinguished from MS VB Database Application Developer by requirement of C#, C/C++, SQL, and ERP skills.
Major skills required: C#, C/C++, SQL, Microsoft databases, .NET, ASP

IT managers (26,656 / 12.7%)
Job description: Includes a variety of jobs, most of which include a leadership component as well as a high frequency of non-IT-oriented business skills.
Major skills required: Leadership, strategy, finance, marketing, accounting, telecom, CASE tools, SCM, BPR, ERP

System administrators (15,248 / 7.3%)
Job description: Administration of end-user computing systems and workstations (primarily MS operating systems) as well as networking and telecommunications.
Major skills required: MS operating systems, security, certification, networking

Network administrators (7,982 / 3.8%)
Job description: Similar to system administrators but heavier emphasis on Unix, open source, Sun, and IBM operating systems. Special focus on networking multiple technologies.
Major skills required: Open source/Microsoft/Unix/IBM operating systems, security, TCP/IP, Cisco, Perl

Database administrators (8,062 / 3.8%)
Job description: Works with the administrative component of databases. Oracle stands out as the dominant database management system (DBMS).
Major skills required: Oracle, Unix*, SQL, databases, ERP, data warehousing, security

Security specialists (22,813 / 10.9%)
Job description: These positions all include some security aspect but are otherwise wide-ranging.
Major skills required: Security, certification, leadership

Project analysts/managers (21,942 / 10.5%)
Job description: Project management, often including a leadership or strategy component.
Major skills required: Project management planning, budgeting, scheduling, leadership, strategy, certification, finance, ERP, responsibility

Note: Dagger (†) indicates 90 percent+ frequency; asterisk (*) indicates 80 percent+.
end. On the other hand, although system admin-
istrator jobs might incorporate some network ad-
ministrator roles, they focus primarily on the ad-
ministration of end-user computing systems and
workstations.
Analysts are the final group in Figure 2. This group consists of the project
analyst/managers cluster, which requires skills such as project management,
planning, budgeting and scheduling, and leadership. This group requires project
management skills uniformly through all of its job ads in contrast to the IT
managers cluster, which relies more consistently on leadership skills.
As Figure 1 shows, the IT managers job type has the largest number of available
jobs. Security specialists and project analysts/managers follow closely. Like
the IT manager cluster, this cluster’s size results from the inclusion of
diverse staff with some management function. The size also indicates high
demand for management skills in the marketplace. Each job in the security
specialists cluster has security mentioned in its ad. Over 31 percent also
listed industry certifications as requirements. The high number of ads, along
with the demand for security in this cluster, indicates high demand for this
skill. Considering that security is important even for common software
development projects,8 security awareness among
We performed a cluster analysis on the ads in two phases. Because we had no
preconceived notions about the actual number of job types that the ads
represented,1 we used hierarchical agglomerative clustering in the first step
to identify 20 unique skill set clusters. We used this number as input to the
second step: a k-means cluster analysis. This technique produced the
easiest-to-interpret set of clusters, had stable clusters with split samples,2
and was unaffected by randomized input.3
We validated the classification of 209,655 ads into 20 clusters by again
performing k-means cluster analysis with inputs from 5 to 50 clusters. We then
calculated the mean differences to the cluster centroids for each cluster.4
Twenty clusters produced the lowest average mean difference to the centroid.
This indicated that 20 clusters had significant cohesion among the cases in
those clusters, meaning that the jobs (clusters) with those skill sets were the
most consistently defined. In addition, with 20 clusters, the job definitions
with the smallest number of ads were already less than two percent of the
total. Analyses with larger cluster numbers led to job definitions that were
too small (in other words, that were too specialized) to be of practical use
for most career planning.
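The sweep over cluster counts described above might look roughly like the following sketch. It uses a plain Lloyd's k-means written from scratch and scores each k by the average, over clusters, of the mean member-to-centroid distance; that scoring is one reading of "average mean difference to the centroid" and is an assumption, as are the function names and the toy skill vectors. The study's own software and distance measure are not described in enough detail to reproduce.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, assign(points, centroids)

def mean_distance_to_centroid(points, centroids, labels):
    """Average, over non-empty clusters, of the mean member-to-centroid distance."""
    per_cluster = []
    for c, centre in enumerate(centroids):
        d = [dist(p, centre) for p, l in zip(points, labels) if l == c]
        if d:
            per_cluster.append(sum(d) / len(d))
    return sum(per_cluster) / len(per_cluster)

# Toy binary skill vectors (1 = skill mentioned); columns are hypothetical skills.
ads = [(1, 1, 0, 0), (1, 1, 1, 0), (1, 0, 0, 0),
       (0, 0, 1, 1), (0, 1, 1, 1), (0, 0, 0, 1)]
for k in range(1, 4):  # the study swept k from 5 to 50
    centroids, labels = kmeans(ads, k)
    print(k, round(mean_distance_to_centroid(ads, centroids, labels), 3))
```

Because k-means depends on its starting centroids, a sweep like this is normally repeated with several seeds before comparing scores across values of k.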
Other researchers have criticized cluster analyses in prior
studies for small sample sizes, lack of stability of the number
of clusters,2 and the effects of input order.5 Our study used
a very large sample—almost a quarter-million job ads—and
had some assurance of stability by using repeated runs with
random ordering. This ensured a consistent and stable num-
ber and nature of clusters. We iteratively found an average
90 percent consistency between the sets of clusters.
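The article does not say how consistency between runs was measured. One common way to compare two clusterings of the same items is the share of item pairs that both runs treat alike (the Rand index); the sketch below uses that measure purely as an illustrative assumption, with made-up labels.

```python
from itertools import combinations

def pair_agreement(labels_a, labels_b):
    """Share of item pairs that both runs treat alike (Rand index)."""
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b  # both "together" or both "apart"
        total += 1
    return agree / total

# Two hypothetical runs over six ads: cluster ids are renamed between runs
# and one ad (the fifth) ends up in a cluster of its own.
run1 = [0, 0, 1, 1, 2, 2]
run2 = [1, 1, 0, 0, 2, 0]
print(pair_agreement(run1, run1))  # 1.0
print(pair_agreement(run1, run2))  # 0.8
```

A pair-based measure is convenient here because cluster numbers are arbitrary: two runs can assign different ids to the same grouping and still agree fully.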
We then named and interpreted each cluster as an IT job
type, independently evaluating the clusters, proposed names,
and descriptions of the job types on the basis of the frequen-
cies of skills in each cluster. There was 95 percent agreement
in the jobs’ initial names and characteristics. We achieved
consensus through discussion that led to minor revisions of the
names.
The results underwent a final validation to test the accuracy of the parsing
software and the cluster analysis. We drew stratified random samples of job ads
for each of the 20 clusters. Each author manually classified the ads into one
of the 20 clusters judged to be the best fit on the basis of each ad’s original
text.6 This test uncovered issues with the software, which we corrected before
reanalyzing the data. The final iteration of this test (with a different random
sample) found three ads that were misclassified because of limitations in the
parsing software and another six that were either placed in the wrong cluster
or didn’t have an appropriate cluster. Neither of these types of errors seemed
to affect the clustering process, and this rate has reliability similar to
studies exclusively employing human judges.7 The tests indicated that we
achieved an overall successful classification rate of 91 percent, which is
significantly better than comparable data-mining studies.6
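The 91 percent figure follows from the error counts above if the final sample is the 100 randomly selected ads mentioned in the main text. That pairing is an assumption; the sidebar itself does not restate the sample size.

```python
sample_size = 100    # ads reviewed manually (figure from the main text)
parser_errors = 3    # misclassified because of parsing limitations
cluster_errors = 6   # wrong cluster, or no appropriate cluster

correct = sample_size - parser_errors - cluster_errors
rate = correct / sample_size
print(f"{correct} of {sample_size} correct: {rate:.0%}")  # 91 of 100 correct: 91%
```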
References
1. U.M. Fayyad, G. Piatetsky-Shapiro, and R. Uthurusamy, “Summary from
the KDD-03 Panel: Data Mining: The Next 10 Years,” Explorations, vol. 5,
no. 2, 2003, pp. 191–196.
2. S. Dolnicar, “Using Cluster Analysis for Market Segmentation—Typical
Misconceptions, Established Methodological Weaknesses and Some
Recommendations for Improvement,” J. Market Research, vol. 11, no. 2,
2003, pp. 5–12.
3. J.F. Hair et al., Multivariate Data Analysis, 6th ed., Pearson Education,
2006.
4. N. Zhong, J. Liu, and Y. Yao, “Envisioning Intelligent Information Technologies through the Prism of Web Intelligence,” Comm. ACM, vol. 50, no. 3, 2007, pp. 89–94.
5. G.D. Garson, Statnotes: Topics in Multivariate Analysis, 23 Nov. 2007;
www2.chass.ncsu.edu/garson/pa765/statnote.htm.
6. V. Jijkoun and M. de Rijke, “Retrieving Answers from Frequently Asked Questions Pages on the Web,” Proc. 14th Int’l Conf. Information and Knowledge Management (CIKM 05), ACM Press, 2005, pp. 76–83.
7. E.J. Barry, C.F. Kemerer, and S.A. Slaughter, “On the Uniformity of Soft-
ware Evolution Patterns,” Proc. 25th Int’l Conf. Software Eng. (ICSE 03),
IEEE CS Press, 2003, pp. 106–113.
Cluster Analysis
employers is therefore probably leading to a new career path. Jobs in the
project analysts/managers cluster demanded project management skills, closely
followed by leadership. This traditional career path is still in high demand.9
Implications for Career Development
In any discipline, and especially in a discipline with a dynamic, highly
competitive technology environment, professionals should periodically review
the skill sets in high demand and identify industry trends in which their own
skill sets might be falling behind. Where downsizing and outsourcing are
common, keeping up with current skill sets is critical.
Microsoft is a dominant presence in software. However, our research indicates
that few organizations focus exclusively on Microsoft-oriented technologies.
Although five of 20 clusters require skills specifically in Microsoft
technologies, they account for only about 14 percent of all job ads. Open
source and Java jobs are very competitively positioned, and they comprise about
12 percent of the ads. Open source technologies might soon match the demand for
Microsoft technologies in terms of the number of jobs.10 Microsoft development
skills and open source skills should constitute a significant component of
skill sets for both first-time job seekers and established computing
professionals.
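The "about 14 percent" figure for Microsoft-oriented clusters can be checked against the counts in Table 2. The sketch assumes the five clusters meant are the ones with MS in their titles; the article does not list them explicitly.

```python
TOTAL_ADS = 209_655  # ads retained for the cluster analysis

# Assumed to be the five Microsoft-oriented clusters (counts from Table 2).
ms_clusters = {
    "MS Web developers": 4297,
    "MS Web application project analysts": 3843,
    "MS developers": 7018,
    "MS Visual Basic (VB) database application developers": 6849,
    "MS database application developers": 6975,
}

ms_ads = sum(ms_clusters.values())
ms_share = ms_ads / TOTAL_ADS
print(ms_ads, f"{ms_share:.1%}")  # 28982 13.8%
```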
The Web developers group accounts for 12 percent of all jobs, yet when
considered among all development groups, Web development jobs account for
almost a quarter of all programming work. This finding highlights the
surprisingly low number of jobs whose only focus is Web-based programming.
Additionally, although Microsoft isn’t a dominant force in overall Web
development, its technologies account for a third of all such work, whereas
Java and open source development account for over 40 percent. It appears that
Web development is an arena without a dominant technology but rather with
focused niches from all platforms and that many different skills are marketable
in this arena.
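The proportions in this paragraph can be reproduced from the Table 2 counts. The sketch below assumes that "all development groups" means the Web, software, and database developer groups, and that the Microsoft and Java/open source portions are the Web clusters named for those platforms; the variable names are illustrative.

```python
# Counts from Table 2 for the five Web developer clusters.
web = {
    "Web programmers": 6187,
    "MS Web developers": 4297,
    "MS Web application project analysts": 3843,
    "Java database Web developers": 6766,
    "Open source Web application developers": 3185,
}
software_dev_ads = 12380 + 7018 + 8258 + 12919 + 6485   # five software clusters
database_dev_ads = 14580 + 7210 + 6849 + 6975           # four database clusters

web_ads = sum(web.values())
web_of_development = web_ads / (web_ads + software_dev_ads + database_dev_ads)
ms_of_web = (web["MS Web developers"]
             + web["MS Web application project analysts"]) / web_ads
java_open_source_of_web = (web["Java database Web developers"]
                           + web["Open source Web application developers"]) / web_ads

print(f"Web share of development ads: {web_of_development:.1%}")       # 22.7%
print(f"Microsoft share of Web ads: {ms_of_web:.1%}")                  # 33.5%
print(f"Java/open source share of Web ads: {java_open_source_of_web:.1%}")  # 41.0%
```

The remaining quarter or so of Web ads belongs to the generic Web programmers cluster, which is not tied to a single platform.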
In the database administrators cluster, 91 percent of all jobs require Oracle.
There’s no corresponding cluster with such a high preference for any other
database management system. The strong preference for Oracle skills is
significant for computing professionals’ career planning.
Figure 1. Distribution of jobs by job type. The figure graphically represents the relative frequency and number of job ads placed into each cluster.
Job type Number of ads
Open-source Web application developers 3,185
MS Web application project analysts 3,843
MS Web developers 4,297
Web programmers 6,187
System-level C++ programmers 6,485
Java database Web developers 6,766
MS VB DB application developers 6,849
MS DB application developers 6,975
Microsoft developers 7,018
Java DB application developers 7,210
Network administrators 7,982
Database administrators 8,062
Open-source developers 8,258
Java developers 12,380
C/C++ programmers 12,919
Database developers 14,580
System administrators 15,248
Project analysts/managers 21,942
Security specialists 22,813
IT managers 26,656
Implications for Organizations
From the 1970s through the 1990s, the programmer’s role expanded to include not
only technical programming but also an increasing knowledge of business,
communication skills, and critical thinking.11 The programmer became a
programmer/analyst. Through 2003, programmer/analyst was the job term occurring
most frequently in job ads.12 In a study of ads placed online between 2001 and
2003, ads for programmer/analysts specifically requested skills in software
development and software (98 percent), but also business (83 percent), social
(83 percent), problem-solving (77 percent), management (67 percent),
architecture/network (67 percent), and hardware (42 percent) skills.13
Our study determined job titles composed of
collections of skill sets that appeared together in on-
line ads. The results indicate that a split is occurring
between programmer jobs and analyst jobs. There
are three different developer job types: Web, soft-
ware, and database developer. Each leans heavily
on technical skills. The other two job types, manag-
ers and project analysts, have more general techni-
cal skills but also include skills such as leadership,
strategy, and security.
Development jobs requiring considerable technical knowledge but needing
relatively little business or management knowledge are more easily outsourced
than jobs requiring a great deal of business domain knowledge. Perhaps
organizations are consciously or unconsciously preparing to outsource more
development jobs by separating the programmer from the analyst. This has clear
implications for people seeking to upgrade their skill sets. For organizations,
this result might indicate where the industry is headed, and they might wish to
prepare accordingly.
Our study is constrained by the data sources we mined. Although since 2006
employers have placed most US job ads online14 and offline or print ads are
often replicated online,15 we can’t incorporate ads that are exclusively in
print into the data collection. However, because print ads have space
constraints that aren’t imposed on online ads, they can’t be as comprehensive
in listing needed skills as online ads. So, not including print ads could
improve our results. Our data sources included only ads for the US national job
market. Future research could also incorporate international job ads.
Our analysis focuses on a snapshot of the current job market and doesn’t
attempt to predict the future skills industry might need, or even show which
skills might be becoming more or less popular. Although there’s a significant
amount of prior research in this area, the number of skills that the current
research methodology covers is far greater than previous research, so we can’t
reliably accomplish trend analysis across these very different techniques.
However, as more research uses these information extraction techniques, it will
be possible to track past and future trends in skill popularity, job
availability, and shifts in skill combinations.
Figure 2. Types of information technology jobs. The figure shows the five groups of clusters, the job types in each group, and the relative distribution of job ads for those groups.
Web developers (12%)
General: Web programmers; MS Web developers
Web application developers: MS Web application project analysts; Java database Web developers; Open source Web application developers
Software developers (22%)
General: Java programmers; MS developers; Open source developers
C/C++ programmers: C/C++ programmers; System-level C/C++ programmers
Database developers (17%)
General: Database developers
Application developers: Java database application developers; MS Visual Basic database application developers; MS database application developers
Managers (39%)
General: IT managers
Technology: System administrators; Network administrators; Database administrators; Security specialists
Project analysts (10%)
General: Project analysts/managers
This article has statistically grouped related computing job ads into consistent job definitions based on current skill sets for the first time. Human resource managers and IT managers can take these job definitions as a set of widely used US jobs they can contrast with their organizations’ job definitions to better word their own job ads. More standardized job titles and skill requirements will improve employers’ abilities to find and correctly place needed employees. Similarly, educators might use these job definitions to identify the necessary skills for their graduates to sustain a growing economy and continued expansion of computing jobs.[9] Students can also use them to plan which courses they should take to get the job they desire.
This article can serve as a baseline for future research because it quantifies the current relative frequency of specific skills in job definitions. As time passes, we expect job definitions to change and emerging technologies to replace older technologies. Documenting this trend should provide an interesting timeline for the evolution of the computing field. Once more data has been collected, it will also be useful to compare the differences between the skill requirements for graduates of different computing degree programs to help students and industry understand the differences among computer science, management information systems, and IT degrees.
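
The baseline idea above can be illustrated with a minimal sketch, assuming each job ad has already been reduced to a set of skill terms. The function names and the toy data are hypothetical and are not taken from the study; the sketch only shows how a relative skill frequency could be computed for one snapshot and compared with a later one.

```python
from collections import Counter


def relative_frequencies(ads):
    """Return the share of ads that mention each skill.

    `ads` is a list of sets of skill terms, one set per job ad
    (an assumed representation, not the study's actual data format).
    """
    counts = Counter(skill for ad in ads for skill in ad)
    total = len(ads)
    return {skill: n / total for skill, n in counts.items()}


def frequency_shift(baseline, later):
    """Change in relative frequency per skill between two snapshots."""
    skills = set(baseline) | set(later)
    return {s: later.get(s, 0.0) - baseline.get(s, 0.0) for s in skills}


# Hypothetical toy snapshots, invented for illustration only.
baseline_ads = [{"java", "sql"}, {"java", "html"}, {"sql"}, {"c++"}]
later_ads = [{"java", "sql"}, {"python", "sql"}, {"python"}, {"sql", "html"}]

base = relative_frequencies(baseline_ads)
later = relative_frequencies(later_ads)
shift = frequency_shift(base, later)

# List skills from the largest gain to the largest loss.
for skill in sorted(shift, key=shift.get, reverse=True):
    print(f"{skill:8s} {base.get(skill, 0.0):.2f} -> "
          f"{later.get(skill, 0.0):.2f} ({shift[skill]:+.2f})")
```

A skill that is missing from one snapshot is treated as having a frequency of zero there, so newly emerging and disappearing skills both show up in the comparison.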
References
1. K.S. Koong, L.C. Liu, and X. Liu, “A Study of the Demand for Information Technology Professionals in Selected Internet Job Portals,” J. Information Systems Education, vol. 13, no. 1, 2002, pp. 21–28.
2. R. Kosala and H. Blockeel, “Web Mining Research: A Survey,” Explorations, vol. 2, no. 1, 2000, pp. 1–15.
3. B. Prabhakar, C.R. Litecky, and K. Arnett, “IT Skills in a Tough Job Market,” Comm. ACM, vol. 48, no. 10, 2005, pp. 91–94.
4. A.J. Aken and M.D. Michalisin, “The Impact of the Skills Gap on the Recruitment of MIS Graduates,” Proc. 2007 ACM Special Interest Group Management Information Systems Computer Personnel Research Conf. (CPR 07), ACM Press, 2007, pp. 105–111.
5. J.F. Hair et al., Multivariate Data Analysis, 6th ed., Pearson Education, 2006.
6. S. Chakrabarti, Mining the Web, Morgan Kaufmann, 2002.
7. R. Cooley, B. Mobasher, and J. Srivastava, “Web Mining: Information and Pattern Discovery on the World Wide Web,” Proc. 9th Int’l Conf. Tools with Artificial Intelligence (ICTAI 97), IEEE CS Press, 1997, pp. 558–567.
8. I.A. Tondel, M.G. Jaatun, and P.H. Meland, “Security Requirements for the Rest of Us: A Survey,” IEEE Software, vol. 25, no. 1, 2008, pp. 20–27.
9. D. Callahan and B. Pedigo, “Educating Experienced IT Professionals by Addressing Industry’s Needs,” IEEE Software, vol. 19, no. 5, 2002, pp. 57–62.
10. D. Spinellis, “Open Source and Professional Advancement,” IEEE Software, vol. 23, no. 5, 2006, pp. 70–71.
11. L. Chen, A. Muthitacharoen, and M.N. Frolick, “Investigating the Use of Role Play Training to Improve the Communication Skills of IS Professionals,” J. Computer Information Systems, vol. 43, no. 3, 2004, pp. 67–74.
12. M.J. Gallivan, D.P. Truex, and L. Kvasny, “Changing Patterns in IT Skill Sets 1988–2003: A Content Analysis of Classified Advertising,” Database for Advances in Information Systems, vol. 35, no. 3, 2004, pp. 64–87.
13. K.L. Choong and H. Hyo-Joo, “Analysis of Skills Requirements for Entry-Level Programmer/Analysts in Fortune 500 Corporations,” J. Information Systems Education, vol. 19, no. 1, 2008, pp. 17–27.
14. “Online Help-Wanteds Outstripping Print,” MarketingVox.com, 21 Dec. 2006; www.marketingvox.com/online-help-wanteds-outstripping-print-026135.
15. M. Oneal, “5 Newspaper Giants in Talks about Online Ad Network,” ChicagoTribune.com, 6 Nov. 2007; http://archives.chicagotribune.com/2007/nov/06/business/chi-tue_tribune_1106nov06.
About the Authors
Chuck Litecky is a professor of management information systems at Southern Illinois University, Carbondale. His research interests include IT career development and technology adoption. He worked as a commercial programmer before entering academia. Litecky has a PhD in management information systems from the University of Minnesota. He’s a member of the ACM Special Interest Group on Computer Personnel Research. Contact him at clitecky@cba.siu.edu.

Andrew Aken is a visiting research programmer at the University of Illinois in Urbana-Champaign and a PhD candidate in management information systems at Southern Illinois University, Carbondale. His research interests include Web content mining, environmental sustainability, computing curriculum development and strategy, and software development methodologies. Aken has a master’s in computer science from Southern Illinois University. He’s a member of the IEEE Computer Society and the ACM (including SIGMIS, SIGITE, and SIGCSE). Contact him at ajaken@illinois.edu.

Altaf Ahmad is a PhD student of management information systems at Southern Illinois University, Carbondale. His research interests include information privacy, knowledge management, and job skills research. Ahmad has an MBA from the University of Technology, Sydney. Contact him at altaf@siu.edu.

H. James Nelson is an assistant professor of management information systems at Southern Illinois University, Carbondale. His research interests include developing theoretically grounded models of information systems quality, investigating how people make IT paradigm shifts, and determining the business value of information technology. Nelson has a PhD in information systems from the University of Colorado at Boulder. He’s a member of the IEEE, the ACM, the Academy of Management, and the Association for Information Systems. Contact him at jimbo@cba.siu.edu.