
Online community information seeking: The queries of three communities in Southwestern Ontario

May 2010
Information Processing & Management 46(3):343-361


DOI:10.1016/j.ipm.2009.10.008


Authors:

Frank Lambert

Middle Tennessee State University

This paper presents not only mycommunityinfo.ca (MCI) as an innovative World Wide Web (WWW)-based community information (CI) site, but also how its unique approach to facilitating online CI searching on the Web reveals through empirical data how people use such information and communication technologies (ICTs) to address their everyday information needs. The geographic focus for this study is on three communities in Southwestern Ontario. MCI collects unobtrusively query data that are logged daily from its own Web site, the Web sites of three municipal governments, and one municipal agency from this region. One year’s worth of these data was supplied to determine the types of CI that are sought through Web searching. A content analysis of a large purposive sample of all of MCI’s query data reveals more specific and diverse conceptual CI needs between and within communities than those reported in other studies employing different data collection methods. As a result, using a centralized approach to online CI access via the WWW by other CI providers such as the 211 network may be a disservice to its users. Additionally, the findings demonstrate how a thorough analysis of such data may improve the informational content and overall design of municipal government Web sites. The analysis of these data also has the potential of improving current CI taxonomies.




School of Library and Information Science, Kent State University, P.O. Box 5190, 314 University Library, Kent, OH 44242, United States

article info

Article history:

Received 16 February 2009

Received in revised form 18 September 2009

Accepted 29 October 2009

Available online 27 November 2009

Keywords:

Community information

Community informatics

Information seeking

Web search

Web log analysis


1. Introduction

There is no shortage of research that studies information seekers’ queries logged by various types of search engines. An incomplete but demonstrative list of such research might include Silverstein, Henzinger, Marais, and Moricz (1999), Spink, Wolfram, Jansen, and Saracevic (2001), Beitzel, Jensen, Chowdhury, Frieder, and Grossman (2007), Chau, Fang, and Sheng (2005), Ross and Wolfram (2000), Wang, Berry, and Yang (2003), Pu, Chuang, and Yang (2002), Lau and Goh (2006) and Rieh and Xie (2006). These and additional studies examine information seeking behaviours and information retrieval within online environments to address a number of research problems. However, the published Web search, community information (CI) seeking, and community informatics literatures lack analyses of large amounts of unobtrusively collected data, such as those found in Web logs, to determine the conceptual types of CI that are sought through local online resources. This study begins to address this deficiency by examining a large purposive sample of 1 year’s worth of query data that resulted from hundreds of thousands of Web searches submitted by community information seekers through an online CI provider located in Southwestern Ontario, Canada, called mycommunityinfo.ca (MCI) and through the Web sites of its municipal government partners and clients.


*Tel.: +1 330 672 0015; fax: +1 330 672 7965.

E-mail address: flamber1@kent.edu


1.1. Defining concepts related to Web queries, and community information and informatics

With queries being the primary units of analysis for this study, it is important to clarify what constitutes a query. It is in essence the final product of integrated parts as constructed by the information seeker. Or, as Spink and Jansen (2004) elaborate, “Terms are the basic building blocks through which a Web searcher expresses their information problem when searching on a Web search engine. Single or multiple term and operators form a Web query” (p. 55). While operators do not form any significant part of the queries in the data analyzed and presented in this paper, the single or multiple terms that constitute the queries are essential.
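The distinction between terms and operators can be illustrated with a minimal Python sketch. It is not a procedure described in this paper: the function name `split_query` and the operator set (upper-case AND, OR, NOT) are illustrative assumptions, and real search engines recognize different operator syntaxes.

```python
# Hypothetical operator set, assumed for illustration only.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def split_query(query):
    """Split a raw query string into (terms, operators).

    Tokens are separated on whitespace. Upper-case tokens found in
    BOOLEAN_OPERATORS are treated as operators; every other token is
    treated as a term and lower-cased.
    """
    terms, operators = [], []
    for token in query.split():
        if token in BOOLEAN_OPERATORS:
            operators.append(token)
        else:
            terms.append(token.lower())
    return terms, operators


# Example with an invented query string.
terms, operators = split_query("garbage AND recycling schedule")
```

A query with no operators simply yields an empty operator list, which matches the situation described above, where terms carry nearly all of the meaning.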

It is also worth examining some definitions and concepts related to the study of community information provision and the application of various information and communications technologies (ICTs) to facilitate access to this type of information. This is important to consider because of MCI’s distinctive ICT model, which relies primarily on keyword searching, an attribute seen very rarely in online CI providers. This has allowed hundreds of thousands of queries targeted toward local online government and CI resources to be captured unobtrusively by MCI for analysis.

Community information (CI) and information and referral (I&R) have often been used interchangeably. Durrance (1984), for instance, examined then-current scholarship in an attempt to distinguish between CI and I&R so that the parameters of the two services could be clarified and applied by information centres and especially public libraries. As a result, Durrance identified a working definition of community information that works as an umbrella. First, CI must be recognized as a service. Then,

two types of information may be provided by such a service: (1) “survival information such as that related to health, housing, income, legal protection, economic opportunity, political rights, etc.” and (2) “citizen action information, needed for effective participation as individual or as member of a group in the social, political, legal, economic process” (Donohue, 1976, p. 126; cited in Durrance, 1984b, p. 108).

Childers (1984) offers a broad definition of I&R: it facilitates “the link between a person with a need and the resource or resources outside the library which can meet the need” (p. 1). The resource or resources that Childers refers to “connotes any service, activity, individual, organization, information, or advice that may fulfill a need” (p. 1). As Durrance (1984) alluded to above, Childers also acknowledged that there is some disagreement as to what I&R is and what constitutes legitimate I&R activities. However, when compared to the first facet of CI services as defined by Donohue, Childers’s definition is not far off the mark. It is Donohue’s second facet of CI’s definition, which implies some form of empowering an individual or group of individuals to participate more fully in civic life, that distinguishes CI from I&R.

Prior to the introduction of computers as a tool to access local information, a “community network was a sociological concept that described the pattern of communications and relationships in a community” (Schuler, 1996, p. 25). With the application of computer and Internet based technology, the definition of community network changed somewhat, but its intent remained the same. The computer based community networks (CNs) that emerged in the 1990s were designed with the intention of being a tool to revitalize, strengthen, and expand existing people based community networks. The intent of CNs then was to “advance social goals, such as building community awareness, encouraging involvement in local decision-making, or developing economic opportunities in disadvantaged communities” by “supporting smaller communities within the larger community and by facilitating the exchange of information between individuals and these smaller communities” (Schuler, 1996, p. 25). They did this by:

being run by and for the local community; address the information and communication needs of everyday life; foster equal access to new media; strengthen cohesion of the local community; be provided at no or little cost; and, refer to a geographic space containing community members in close physical proximity, rather than issue-oriented ‘virtual communities’ or communities of interest. (Kubicek & Wagner, 2002, p. 292).

To accomplish these goals, CNs were to offer what Schuler (1996) defines as electronic “one-stop shopping” by offering a plethora of online resources to their users. While this was undeniably a worthwhile goal, it placed a tremendous strain on CNs’ finances in the 1990s, and it would, as predicted by Cohill (2000), continue to be a significant challenge for some time. However, much like the definition of CI offered above by Donohue, CNs offer a form of empowerment, just through a different medium (electronic versus personal communication).

While most CNs offered access to the Internet through dial-up, the emergence and spread of high-speed, low cost Internet service providers (ISPs) led CNs to abandon this “one-stop shopping” and server maintenance role (Gurstein, 2000). CNs still play a role to this day, although they still operate under various pressures, especially financial, despite the greater affordability of Internet services. Research and development in the area of information and communications technologies (ICTs) in the very late 1990s and going forward focused on creating hardware and software to find further cost efficiencies (Gurstein, 2000). However, there seemed to be little concern over the potential users of these faster, better (based on one’s perception), and more affordable Web-based technologies, especially those for whom there were various educational, access, and financial barriers. Community informatics concerns itself with the study of those individuals, groups, or communities who were and who are continuing to be excluded from the many potential benefits of ICTs (Gurstein, 2000). The Journal of Community Informatics thus defines community informatics as

the study and the practice of enabling communities with Information and Communications Technologies (ICTs). [Community informatics] seeks to work with communities towards the effective use of ICTs to improve their processes, achieve their objectives, overcome the “digital divides” that exist both within and between communities, and empower communities and citizens in the range of areas of ICT application including for health, cultural production, civic management, e-governance among others. (Accessed January 25, 2007, from http://www.ci-journal.net/index.php/ciej).

Fundamental to how community informatics studies how ICTs can help communities realize their social, economic, political, or cultural goals on the path to community empowerment is access to ICTs “since without at least minimal access, little can be accomplished” (Gurstein, 2000, p. 3). Thus the focus for community informatics is still, like the study of CI, on how information can empower individuals and groups of individuals. However, the role of ICTs as a tool, especially as it relates to online access and issues of the so-called “Digital Divide,” is an added facet of concern to researchers.

1.2. Conceptual framework for this study

Broder’s (2002) model for IR augmented for the World Wide Web (WWW) provides a helpful framework for this study. Unlike the classic IR model, the augmented model recognizes that human–computer interaction factors and cognitive aspects play a role in Web searching. These additional attributes take into account that the information seeker’s information need is associated with some task. As a result, “this need is verbalized (usually mentally, not out loud) and translated into a query posed to a search engine.” (Broder, 2002, p. 4). What makes Broder’s model useful in this paper’s context is that it links a mentally verbalized information need, based on some task and submitted through a Web-based tool, to the definitions above of CI and I&R as resources for solving everyday information needs. Since MCI is a Web search engine with a community focus for information sources, the data that result from its activities (captured query data) become an important source for attempting to model online community information needs. While Broder defines the needs behind the query that is posed to the search engine as one of three types (navigational, informational, and transactional), this study’s goal is to move beyond a cursory classification of the queries collected by MCI based on these criteria (cf. Broder, 2002; more recently Jansen, Booth, & Spink, 2008). In order to give the units of analysis, the queries, more meaning in terms of trying to link the CI seeker’s task and information need, further conceptual categorization was required to address the research questions presented in Section 4. This further categorization is intended to make a contribution not only to the Web search literature, but also to the community informatics literature by providing the first empirical study that demonstrates how and for what purpose information and communications technologies (ICTs) are used as a tool to meet the various local information needs of the communities they have been designed to serve. While the findings are limited to the Middlesex–London and Waterloo regions in Southwestern Ontario, this study intends to demonstrate that using a centralized approach to online CI access via the WWW may be a disservice to its users. Additionally, the findings demonstrate how proper analysis of such data may improve the informational content and overall design of municipal government Web sites.

2. Rationale for the study

Despite the limitations of Web log query analysis as a methodology (e.g. Spink & Jansen, 2004; Thelwall, Vaughan, & Björneborn, 2005) that are dealt with in further detail later in this paper, it “can reveal first-hand and real-world behaviour and interests of users. It enables researchers to better understand Web site user behaviours and the service quality that the Web site provides. It also can be used to optimize the effectiveness of information services.” (Zhang, Wolfram, Wang, Hong, & Gillis, 2008, p. 1934). The observation of first-hand and real-world behaviour and interests of users in Web log data is possible due to the unobtrusive nature of data collection (Spink & Jansen, 2004). MCI is used as a case site to investigate issues related to Web search behaviours and interests as well as the effectiveness of its information services and those of its partner government Web sites (City of London and County of Middlesex municipal governments and London Police Service) and its one client (Region of Waterloo municipal government). This study addresses Spink and Jansen’s (2004) assertion that “further single Web site studies are needed to replicate and extend the previous studies” cited by the authors and other studies that have been published since 2004 (p. 25). Additionally, this study addresses Durrance and Pettigrew’s (2002) concerns about the lack of comprehensive data and its subsequent analysis regarding citizens’ use of networked CI services. One of the benefits of this current study is that it analyzes a large purposive sample of the query data collected from all of the single Web sites mentioned above through specially designed search bars, rather than focusing on only a single Web site. Behind these Web sites a Google search appliance is used for crawling, indexing, and retrieval purposes. Additionally, MCI uses customized data collection software created by the City of London’s Technology Services Division to archive these data from these Web sites in a Microsoft Access database even though these Web sites are hosted on different servers. Thus, all incoming query data from all Web sites are treated the same, facilitating data processing and analysis procedures.

2.1. Background – a brief history of mycommunityinfo.ca

MCI was chosen as a case site because its information retrieval (IR) model of delivering online CI has been shaped and designed purposely to be a simple, innovative, cost-effective, and potentially sustainable approach to providing this information during a time when CI providers are faced with the challenges of remaining financially viable (e.g. Cohill, 2000; Gurstein, 2000; Hearn, Kimber, Lennie, & Simpson, 2005; Rideout & Reddick, 2005). MCI’s service has been available to approximately 942,000 residents in 24 urban and rural municipalities located in the Middlesex–London region and the Region of Waterloo since 2003–2004 (Cummings, 2004). These rural and urban municipalities are located in the geographic region of Southwestern Ontario, Canada, between Detroit, Michigan, and Toronto, Ontario. Additionally, MCI not only offers ever-expanding single point access to the online information resources of local non-profit community organizations and municipal governments, but also allows immediate access to the online information resources of the Ontario provincial and the Canadian federal governments. MCI’s approach may thus prove to be an affordable and effective model for other communities to consider as an alternative method of delivering information to community members (cf. Lambert, 2008).

MCI was first conceived in 1999 when representatives of various ministries, departments, and agencies from the municipal, provincial, and federal governments met to address the question of how these groups could provide “a cost-effective integration of information service offerings from three levels of government” (Cummings, 2004, p. 1) in the Middlesex–London region of Southwestern Ontario. These meetings culminated in the creation of an online community information portal (Fig. 1) that avoids the traditional, often considerably more expensive and difficult to maintain, networked directory model of organized, static links to government and non-profit organization information resources. The drawback to the directory model, such as that exemplified by the North American-wide 211 service, is that the links have to be monitored and maintained on an ongoing basis by a complement of paid staff. While this model may be useful to many CI seekers, the excessive control inherent in systems that rely on a directory approach may be a potential disservice to their users. The online CI directory model assumes primarily a search method where the information seeker does not make a formal expression of information needs but rather navigates Web sites through the chain of links found in the Web sites’ pages. “However, when some specific information is searched, this point-and-click access paradigm is unpractical, and the effectiveness of the results strongly depends on the starting page” (Herrera-Viedma & Pasi, 2006, p. 511). Instead of devoting its resources to maintaining links and metadata, MCI relies for indexing and retrieval purposes on search engine technology that targets specifically local public sector Web sites through its own index. Queries aimed at only provincial and federal government Web sites can also be launched from this one Web site through the publicly available Google search engine.

On July 31, 2003, the automated collection of queries submitted through MCI’s Google search appliance, and of queries sent to the public version of Google via MCI, began, with the query data being logged in a Microsoft Access database (Cummings, 2004). As of early 2007, 223 local municipal governments, their associated agencies, and local non-profit community organizations in London and Middlesex County (198) and in the Region of Waterloo (25) have had their Web sites indexed and made accessible through MCI’s search appliance.

Fig. 1. Mycommunityinfo.ca search interface.

3. Review of related work

3.1. Community information seeking and uses

Within the context of community information seeking research, Durrance and Pettigrew (2002) note that there is a lack of comprehensive data regarding citizens’ use of networked CI services. In-depth examinations of citizens’ information behaviour in networked CI environments are also lacking. This study has been designed to help begin to fill the voids identified by Durrance and Pettigrew.

There is little literature that focuses exclusively on online CI Web searching or seeking. Any studies that do provide insight on this type of information searching do so within contexts that have different foci. Closely related literature includes Bishop, Tidline, Shoemaker, and Salela (1999), who examined community information use and computer access in a low-income locale in the United States populated primarily by African-Americans. Through household interviews, their sample group (n = 34) reported seven subject areas of information, ranked in priority order based on their open responses, that they would like to have access to online: community services and activities, resources for children, healthcare, education, employment, crime and safety, and general reference tools such as dictionaries. Similarly, through the use of online surveys posted on NorthStarNet in Northeastern Illinois, Three Rivers Free-Net (TRFN) in Pittsburgh, PA, and CascadeLink in Portland, OR, Durrance and Pettigrew (2002) discerned 20 community information categories of interest from their respondents (n = 197). The alphabetical list of categories of digital CI needs included:

Business
Computer and technical information
Education
Employment opportunities
Financial support
Government and civic
Health
Housing
Library operations and services
Local events
Local history and genealogy
Local information (local accommodations, community features)
Local news (weather, traffic, school closures)
Organizations and groups
Other people (both local and beyond the community)
Parenting
Recreation and hobbies
Sale, exchange, or donation of goods
Social services
Volunteerism

(Durrance & Pettigrew, 2002, p. 2)

While the data collection methods used by Bishop et al. (1999) and Durrance and Pettigrew (2002) were certainly appropriate, neither study may claim that the findings are generalizable. This study is limited too in its generalizability as it focuses on two small regions within Canada. Additionally, this study does not consider navigational information seeking, in which information seekers go from link to link within Web sites such as those owned by the municipal governments introduced in this paper until they reach their ultimate information destination. However, for this study large samples of the units of analysis may be quantified, conceptualized, and then ranked. This is a distinct advantage that the studies above were not able to exploit. The conceptual meaning of these queries may then be compared within and between the communities in these regions to give a more comprehensive picture of online everyday information seeking through Web search from a relatively large portion of the province of Ontario’s population.

3.2. Web log analysis

This study fits well within the research that studies search engine log files where data are recorded about what the user is actually seeking as represented by the actual keyword queries that the user submits (Thelwall et al., 2005). Research published on the analysis of Web query logs has focused on querying behaviours through commercial search engines, search engines that are used exclusively for searching through a Web site, or library OPACs. Silverstein et al. (1999) examined a data set recorded over 43 days from the AltaVista search engine comprising over 900 million total requests. Their interest lay in determining which queries were most common, the average length of queries, how many queries were submitted during an individual session and, especially, correlations between query terms and other field values. Spink et al. (2001) conducted a similar study involving over one million logged queries (531,416 of which were unique) sent to the Excite search engine on one day by 211,063 users. They examined the number of queries submitted per identified user, the change in unique queries submitted by each identified user, the number of results pages viewed, and whether multi-term queries used advanced search features such as Boolean operators. Rieh and Xie (2006) also analyzed a data set of 313 search sessions from the Excite search engine to characterize facets of query reformulation and identify patterns of multiple query reformulation in Web searching. Their goal was to explore “the ways in which search engines can support query reformulation more effectively in Web searching” (p. 752) by using Saracevic’s stratified interaction model as an analytical framework (Rieh & Xie, 2006).

Following the parsing of multi-term queries, Ross and Wolfram (2000) used the frequency of binary term co-occurrence to determine facets of multi-term queries without having to look at every query. This was a bottom-up grounded theory method with no a priori categorization. Pu et al. (2002) used this same approach to develop their subject taxonomy for the automatic classification of Web query terms into broad subject categories. Beitzel et al.’s study (2007) involved the analysis of the Web query logs of the America Online (AOL) Web search service. However, the authors used a longitudinal analysis “to examine static and topical changes (in querying) over longer periods such as days, weeks, and months” (Beitzel et al., 2007, p. 167). Additionally, the authors “analyzed the queries representing different topics using a topical categorization of (their) query stream” in an effort to determine how querying behaviour for some categories would either change or remain static over time (p. 167).
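As a rough illustration of what counting binary term co-occurrence involves, the following Python sketch tallies how many queries contain each pair of terms. It is a generic sketch, not the procedure used by Ross and Wolfram (2000) or Pu et al. (2002); the function name `cooccurrence_counts`, the whitespace tokenization, and the sample queries are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(queries):
    """Count, for every unordered pair of terms, the number of queries
    in which both terms appear.

    "Binary" here means a pair is counted at most once per query, no
    matter how often each term is repeated inside that query.
    """
    counts = Counter()
    for query in queries:
        # Lower-case, split on whitespace, and drop repeated terms.
        terms = sorted(set(query.lower().split()))
        for pair in combinations(terms, 2):
            counts[pair] += 1
    return counts


# Invented sample queries, for illustration only.
sample = ["garbage pickup schedule", "garbage schedule", "skating schedule"]
pairs = cooccurrence_counts(sample)
```

Pairs that recur across many queries, such as ("garbage", "schedule") in the invented sample, are the kind of signal that could then be inspected by hand to suggest categories; single-term queries contribute no pairs.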

Chau, Fang, and Sheng’s study (2005) focused on keyword queries that were submitted through the Utah state government Web site. The authors’ goal was not only to determine the characteristics of queries that were submitted through this Web site search engine but also to compare those same queries to those submitted through general-purpose search engines. They determined “that Web users behave similarly when using a Web site engine and a general-purpose search engines in terms of the average number of terms per query and the average number of result pages viewed per sessions (sic)” (p. 1374). However, the users of the Utah state government Web site search engine submitted, on average, fewer queries per session than users of general-purpose search engines. Additionally, Utah government Web site users used different sets of terms and topics in their queries compared to general-purpose search engine users.
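Measures such as the average number of terms per query and the average number of queries per session can be computed directly from a query log. The Python sketch below shows one way to do so under stated assumptions: the log is represented as (session identifier, query text) pairs, terms are separated by whitespace, and the function name `session_stats` and the sample rows are hypothetical. It does not reproduce the computation of any study cited above.

```python
from collections import defaultdict


def session_stats(log_rows):
    """Return (mean terms per query, mean queries per session).

    log_rows is an iterable of (session_id, query_text) pairs.
    An empty log yields (0.0, 0.0).
    """
    queries_per_session = defaultdict(int)
    total_terms = 0
    total_queries = 0
    for session_id, query in log_rows:
        queries_per_session[session_id] += 1
        total_terms += len(query.split())
        total_queries += 1
    if total_queries == 0:
        return 0.0, 0.0
    mean_terms = total_terms / total_queries
    mean_queries = total_queries / len(queries_per_session)
    return mean_terms, mean_queries


# Invented log rows, for illustration only.
rows = [
    ("s1", "garbage pickup"),
    ("s1", "garbage pickup schedule"),
    ("s2", "skating"),
]
mean_terms, mean_queries = session_stats(rows)
```

Computing both figures from the same rows makes it straightforward to compare a site search engine with a general-purpose one, provided sessions are identified consistently in both logs.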

Lau and Goh (2006) analyzed 641,991 queries from the Nanyang Technological University OPAC to determine what caused failed search sessions. Their objective was to identify areas where the OPAC could be improved to enhance users’ search experiences, primarily through the use of failure analysis. As a result, the authors recommend improvements to the OPAC through enhancements to interactive query reformulation, browsing, and context-sensitive assistance.

4. Research questions

What should be noticed from this literature review is that Web log analysis used as a research method to analyze Web search needs and behaviours has continued to evolve in innovative and informative ways. Web log analysis’s applicability as a methodological tool for determining how CI seekers use online resources should continue to be considered seriously as an approach for improving the quality of and access to local information, regardless of the design of the online CI provider’s electronic infrastructure.

This study thus attempts to address the following research questions:

What types of conceptual CI are being sought in an online environment through MCI’s service approach and the Web sites of its partners?

Are there differences in the types of CI being sought between MCI and its partner Web sites?

Will the collection and analysis of these data present findings different than those of other researchers who rely primarily on other methodological approaches?

The findings of this study will provide a perspective of CI seeking that may make other online CI organizations and municipal governments consider further the design and content of their respective Web sites. Additionally, this study and other studies that may employ a similar methodological approach may lead to the further consideration of theoretical issues of user warrant versus literary warrant concerning the creation of new or additional taxonomies for community and social services information.

5. Methods

5.1. Attributes of mycommunityinfo.ca’s query logs

MCI voluntarily supplied its complete Web query log covering August 2005–July 2006 in a Microsoft Access database. The full year’s worth of cleaned data were analyzed to provide a snapshot of the local online information inquiries of the communities under examination. The following gives a description of the different attributes of the Web log data and then details how the data were processed and analyzed.

The Microsoft Access database table that captures the Web query data comprises eleven attributes, with each attribute recording different aspects of each query event. Table 1 gives brief descriptions of the relevant attributes in the Access database table. These descriptions are necessary because a number of these fields play a major role in data preparation and analysis. Additionally, it is likely that the Web logs of many other Web sites will not contain the same types of attributes as MCI’s log, because the Access table is a customized, in-house solution for capturing these data.


5.2. The processing and analysis of mycommunityinfo.ca’s query logs

The MCI query logs, like those in Ross and Wolfram’s (2000) study, included many examples of identical queries being submitted in succession under the same session identifier. These arose when users requested the next page of hits for the originally submitted query within a session. For this study, a ‘find duplicates’ query was used in Access to retrieve all of the duplicate session numbers. Then, another ‘find duplicates’ query was used to group, within each of those sessions, the duplicate query text that resulted from additional page views. All but one of the identical queries within each session were deleted manually to create a new Microsoft Access table of “clean” queries for frequency analysis. Similar to Ross and Wolfram (2000) and Wang et al. (2003), universal resource locators (URLs) and email addresses were not included in the analysis. Once “cleaned,” the MCI and municipal government query data were processed again using a separate ‘find duplicates’ query in Microsoft Access. This final process was refined further by using the site field (Table 1) to focus on the respective Web site through which the queries were submitted.
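The cleaning steps described above can be sketched in code. This is only an illustration under stated assumptions, not the procedure actually used, which relied on ‘find duplicates’ queries and manual deletion in Microsoft Access. The function names, the tuple layout, and the pattern used to recognize URLs and email addresses are all hypothetical, and queries are compared after trimming whitespace only, since the source does not say whether letter case was normalized.

```python
import re
from collections import Counter

# Assumed pattern for spotting URL-like or e-mail-like queries (hypothetical;
# the source does not describe how these were identified).
URL_OR_EMAIL = re.compile(r"(https?://|www\.|\S+@\S+\.\S+)", re.IGNORECASE)


def clean_queries(events):
    """Keep one copy of each query text per session; drop URLs and e-mails.

    `events` is an iterable of (session_id, site, query_text) tuples,
    a hypothetical stand-in for rows of the query log table.
    """
    seen = set()
    cleaned = []
    for session_id, site, text in events:
        normalized = " ".join(text.split())  # trim and collapse whitespace
        if not normalized or URL_OR_EMAIL.search(normalized):
            continue
        key = (session_id, normalized)
        if key in seen:
            # Repeat of the same query in the same session, e.g. a
            # request for the next page of hits.
            continue
        seen.add(key)
        cleaned.append((session_id, site, normalized))
    return cleaned


def query_frequencies(cleaned, site=None):
    """Count cleaned queries, optionally restricted to one site alias."""
    return Counter(
        text for _, s, text in cleaned if site is None or s == site
    )
```

Calling `query_frequencies(cleaned, site="City").most_common(100)` would then give a top 100 list for one site, assuming ‘City’ is the alias stored in the site field.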

Following the parsing of multi-term queries, Ross and Wolfram (2000) used the frequency of binary term co-occurrence to determine facets of multi-term queries without having to look at every query. This was a bottom-up grounded theory method with no a priori categorization. Pu et al. (2002) used this same approach to develop their subject taxonomy for the automatic classification of Web query terms into broad subject categories. In this study, an empirical approach is used to gather and group the top 100 most frequently occurring cleaned queries from each Web site based on their manifest or visible surface content (Babbie, 2008). Several attempts were then made to conceive and construct a list of categories that were well grounded in the Web log data, based on the queries’ conceptually similar latent content, to give the queries some conceptual meaning (Babbie, 2008). This was done to provide a valid and reliable list of categories that could define what sorts of community information people are searching for online. Once the list of categories was deemed sufficient to ensure mutual exclusivity, a copy of this list along with a representative systematic sample of the top 100 most frequently occurring queries from each Web site was given to a Ph.D. student to classify. The first purpose of this exercise was to determine the content validity of the categories by ensuring that the queries covering the range of meanings included in the conceptual categories were as clearly understandable and mutually exclusive as possible (Babbie, 2008). The second purpose was to test the reliability of the categories for coding the query data by measuring the rate of intercoder agreement between my coding and that of the Ph.D. student (Babbie, 2008). The end result was an intercoder rate of agreement of 89%. When the coding of the queries using the sub-categories was taken into consideration, the rate of agreement dropped slightly to 85%. Following a discussion, some minor changes were made to the wording of the categories. The final list of categories is shown in Figs. 2 and 3.
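A rate of intercoder agreement of this kind can be computed as simple percent agreement. The sketch below assumes that this is the measure behind the reported figures; the source does not state the formula, and the function name and the four example codes are hypothetical.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items to which two coders assigned the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty item list")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical codes for four sampled queries (not data from the study).
coder_1 = ["Work", "Health", "Family", "Work"]
coder_2 = ["Work", "Health", "Family", "Taxation"]
agreement = percent_agreement(coder_1, coder_2)  # three of four codes match
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a different measure from the one sketched here.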

6. Findings

6.1. Basic statistical ﬁndings of the mycommunityinfo.ca Web log data

Over the course of the 1 year under study, 733,331 raw query events were captured in the Microsoft Access database designed by Technology Services Division, City of London. Following the cleansing of the queries, the Web sites contributed the breakdown of data for analysis shown in Table 2.

Table 1
Types of fields/attributes in MCI Web log database table with descriptions.

Attribute/field | Description
Time and date | Time and date that the query was received
Query text | Actual text of query as typed and submitted by user
Site | Identifies the site from which the search was launched using an alias created by TSD technicians and then assigned by logic to referring URLs. If the referring URL comes from the City of London Web site, then ‘City’ is entered in the site field
Type of search | Has two values: “0” for MCI’s search appliance; or, “1” for Google Web API. “0” shows the query targeting the 223 Web sites after being submitted through MCI or searching within one of the five municipal Web sites. “1” is recorded when the search is launched through public Google using the search domain restrictions of “.gc.ca” or “.gov.on.ca”. This is done by the user clicking on the “Search Province of Ontario and Government of Canada Resources” button at either the top or the bottom of the returned results page
Estimated number of hits | The returned results from the searcher’s query through MCI or the municipal Web sites
Referrer | Records the referring URL accompanying the incoming query to MCI’s search appliance from the request header of the site issuing the request
Start page | A pagination of the number of pages of returned hits that the user has examined during his/her search session
Session ID | An artificial session identification number is assigned every time the MCI search appliance is “touched” by an incoming query request from the municipal Web sites or from MCI. This is initiated by the incoming IP address of the user’s computer or router
Number of searches | A recording of the number of queries submitted during the search session
Site search | Records “community” for queries targeting Web sites in MCI’s local index (using MCI’s Web site); and, “site search” that indicates the search was launched on a Web site from its own search bar, to search only its collection of Web pages (e.g. City of London, County of Middlesex, London Police Services, and Waterloo County)
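The way the site field is derived from the referrer, as described in Table 1, can be illustrated with a short sketch. Only the ‘City’ alias is named in the source; the host names, the second alias, the default value, and the function name below are hypothetical placeholders, and the actual logic written by the TSD technicians is not documented here.

```python
from urllib.parse import urlparse

# Hypothetical host-to-alias rules. Only the 'City' alias appears in Table 1;
# the host names and the 'County' alias are placeholders.
SITE_ALIASES = {
    "www.city.example": "City",
    "www.county.example": "County",
}


def site_alias(referrer, default="MCI"):
    """Map a referring URL to a site alias (the default is an assumption)."""
    host = urlparse(referrer).netloc.lower()
    return SITE_ALIASES.get(host, default)
```

For instance, `site_alias("http://www.city.example/search?q=map")` would yield the ‘City’ alias under these placeholder rules.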


Considering the number of raw query events in relation to the total number of cleaned queries in column one of Table 2, 48% of all events that were in MCI’s query log were users requesting views of additional pages of “hits” returned by the search engine. Table 2 also shows that municipal Web sites, with the exception of County of Middlesex and the more topically specialized London Police Service Web site, were queried far more as potential sources of local information compared to the more diverse MCI Web site that indexes 223 Web sites from London–Middlesex and Region of Waterloo.

Fig. 2. Major conceptual categories and sub-categories.

Fig. 3. Minor conceptual categories derived from MCI Web query log data.


The disproportionate total of MCI-Middlesex–London’s top 100 most frequently occurring queries relative to that of the other Web sites needs to be noted. This anomaly is due to MCI’s use of Life Events bundles (Cummings, 2002). MCI uses a number of actively hyperlinked keywords and very short keyword phrases that have been embedded in Life Events pages to offer the CI seeker the option of launching a search of MCI’s index if the particular Life Event does not contain the information that is needed. Once the hyperlinked keyword or keyword phrase is clicked on, a search is launched in the same way as typing the actual query and clicking on the “go!” button. Sixty-nine out of seventy Life Event bundle keywords were exact matches to sixty-nine of the queries that appeared in the top 100 of MCI-Middlesex–London’s Web log data. These Life Events may be an attractive alternative for CI organization and searching rather than relying solely on either keyword searching or a directory of fixed links.

6.1.1. Mean number of terms per query

The mean number of terms used per query for the top 100 most frequently occurring queries and for all cleaned queries was calculated to compare some of the descriptive statistical findings of past Web log studies to this study in an effort to determine if CI seekers in Southwestern Ontario tend to use more or fewer words to find the information being sought.

For the top 100 most sought queries, the proportion of multi-term queries ranged widely based on the Web site used. MCI-Middlesex–London had the highest proportion of multi-term queries at 49%. Again, this was influenced by the use of active hyperlinked keywords and short keyword phrases in MCI-Middlesex–London’s Life Events pages. While at least 50% of all top 100 queries are composed of only one word, the average number of words or terms that form the queries in Table 2 is lower than the averages reported in other studies. Wang et al. (2003) report an average of two words per query submitted by users. Beitzel et al. (2007) found over one week in December 2003 that users submitted on average 2.2 terms per query session. When this was expanded to six months, the average number of query terms increased slightly to 2.7 terms. Wang, Berry, and Yang’s (2003) and Beitzel et al.’s (2007) studies match more closely the mean number of terms per query for all queries shown in Table 2. Additionally, the average “popular query” length in Beitzel et al.’s (2007) study was 1.7 terms per session for the two time periods they examined. While Beitzel et al. (2007) do not define what constitutes a “popular query,” an average of 1.7 terms per session is the closest match in the Web log literature to the findings for the top 100 queries presented in Table 2.
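The two query-length measures discussed here, the mean number of words per query and the percentage of multi-term queries, can be sketched as follows. The sketch assumes that terms are separated by whitespace and that each distinct query counts once; the source does not say whether the reported means were weighted by query frequency. The function name and the example queries are hypothetical.

```python
def query_length_stats(queries):
    """Return (mean words per query, percentage of queries with >1 word).

    Terms are split on whitespace, which is an assumption about how
    words were counted in the study.
    """
    if not queries:
        raise ValueError("at least one query is required")
    lengths = [len(q.split()) for q in queries]
    mean_words = sum(lengths) / len(lengths)
    pct_multi_term = 100.0 * sum(n > 1 for n in lengths) / len(lengths)
    return mean_words, pct_multi_term


# Hypothetical queries, not taken from the log data.
sample = ["jobs", "garbage collection", "map", "city hall hours"]
mean_words, pct_multi_term = query_length_stats(sample)
```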

6.2. Online community information seeking in London–Middlesex and Region of Waterloo

The sections that follow highlight some of the more interesting conceptual categories that emerged from the Web query log data collected from MCI’s Web site and the City of London, County of Middlesex, Region of Waterloo, and London Police Service Web sites. The primary goal of this analysis is to present empirical evidence of actual Web-based CI inquiries that focus on the geographic region of Southwestern Ontario. Within each of the select categories, an examination of the CI inquiries that emerged from the individual Web sites is presented to demonstrate how CI inquiries may differ between municipalities or smaller regions within Southwestern Ontario. Not every single category from all Web sites will be discussed; rather, the focus will be on any interesting trends or patterns that are typical of particular categories. Additionally, this analysis of CI seeking in Southwestern Ontario is considered within the context of the design of online CI portals and the design of CI taxonomies.

The conceptual categories that emerged from the queries were divided into major and minor categories (Figs. 2 and 3). Within the major categories, more specific sub-categories also emerged. Minor categories were either very Web site specific (e.g. “Special” Middlesex County) or could not be categorized. Categories were considered “minor” if their proportion of all of the most frequently occurring queries was equal to or less than the percentage of the lowest percentage sub-category (cf. Tables 3 and 4).

Table 2
Descriptive statistics of queries for all MCI and municipal government Web sites.

Web site | Total # of cleaned queries/Web site | Mean words/query for all cleaned queries | Total of frequencies for top 100 queries | Proportion of top 100 to all cleaned queries (%) | Mean words/query, top 100 | Percentage of top 100 queries with >1 word (%)
City of London | 217,162 | 2.1 | 36,838 | 17 | 1.28 | 25
Region of Waterloo | 92,584 | 2.25 | 14,016 | 15 | 1.4 | 34
MCI-Middlesex–London | 42,873 | 1.95 | 21,876 | 51 | 1.53 | 49
London Police Service | 16,355 | 1.97 | 3087 | 19 | 1.31 | 28
MCI-Region of Waterloo | 10,230 | 2.15 | 2172 | 21 | 1.4 | 32
County of Middlesex | 7867 | 1.85 | 1986 | 25 | 1.2 | 19
Grand total | 387,071 | 2.11 | 79,975 | 21 | 1.37 | 33

See http://www.mycommunityinfo.ca/life/default.asp.

As may be seen in Tables 5–10, each individual Web site’s conceptual categories may be ranked, showing the conceptual “popularity” of the submitted queries rather than relying simply on an alphabetical list to report findings (e.g. Durrance & Pettigrew, 2002). This gives some sense as to what really matters to information seekers as they search for information online to help them address their everyday inquiries.
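Ranking categories by frequency rather than alphabetically can be sketched as below. The counts are the non-zero major category totals reported for the London Police Service Web site in Table 6; the function name and the alphabetical tie-breaking rule are assumptions made for illustration.

```python
def rank_categories(category_counts):
    """Rank categories by query frequency and attach each one's share (%).

    Ties are broken alphabetically, which is an assumption.
    """
    total = sum(category_counts.values())
    ordered = sorted(category_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (rank, name, count, round(100.0 * count / total, 1))
        for rank, (name, count) in enumerate(ordered, start=1)
    ]


# Non-zero major category counts for London Police Service (Table 6).
london_police = {
    "Work": 254,
    "Crime and Public Safety": 1347,
    "Recreation, Entertainment, and Leisure": 873,
    "Transportation": 118,
    "Municipal Government Business": 107,
    "Population, Demographics and Statistics": 81,
    "Geographic Information": 63,
    "Media": 54,
    "Family": 26,
    "Place Names": 16,
}
ranked = rank_categories(london_police)
```

With these counts the top two rows come out as “Crime and Public Safety” at 45.8% and “Recreation, Entertainment, and Leisure” at 29.7%, matching the percentages reported in Table 6.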

Table 3
Proportion of top 100 queries, all Web sites, by major conceptual category.

Major categories | Number of queries | Percentage of categories of total frequencies
1. Recreation, Entertainment, and Leisure | 15,186 | 19.0
    General | 7288 | 9.1
    Food and Drink | 2169 | 2.7
    Holidays | 1833 | 2.3
    Parks | 1371 | 1.7
    Shopping | 1406 | 1.8
    Sports and Physical Fitness | 612 | 0.8
    Worship | 507 | 0.6
2. Work | 10,954 | 13.7
    Employment and Training | 9781 | 12.2
    Human Resources | 669 | 0.8
    Volunteerism | 504 | 0.6
3. Family | 6934 | 8.7
    Marriage | 767 | 1.0
    Children and/or Parenting | 6167 | 7.7
4. Municipal Government Business | 5356 | 6.7
5. Transportation | 4715 | 5.9
    Ground Transportation | 2334 | 2.9
    Roads | 1877 | 2.3
    Air Transportation | 504 | 0.6
6. Housing and Shelter | 4540 | 5.7
7. Solid Waste Collection and Recycling | 4160 | 5.2
8. Crime and Public Safety | 3386 | 4.2
    Crime | 798 | 1.0
    Public Safety | 2588 | 3.2
9. Animals | 2836 | 3.5
10. Taxation | 2758 | 3.4
11. Geographic Information | 2413 | 3.0
12. Population, Demographics and Statistics | 2410 | 3.0
13. Health | 2409 | 3.0
14. Libraries | 1936 | 2.4
15. Place Names | 1794 | 2.2
16. Water | 1709 | 2.1
17. Ageing, Dying, and Death | 1398 | 1.7
18. Education | 1271 | 1.6
19. Business | 901 | 1.1
20. Media | 610 | 0.8
Total frequencies, major categories | 77,676 | 97.1
Total frequencies, all queries | 79,975 | 100

Table 4
Proportion of top 100 queries, all Web sites, by minor conceptual category.

Minor categories | Number of queries | Percentage of categories of total frequencies
Environment | 480 | 0.6
Construction | 236 | 0.3
Historical | 172 | 0.2
“Special” Middlesex County | 166 | 0.2
Groups | 141 | 0.2
Culture | 92 | 0.1
Persons by Name | 78 | 0.1
Job titles | 31 | 0.0
Unable to categorize | 903 | 1.1
Total frequencies, minor categories | 2299 | 2.9
Total frequencies, all queries | 79,975 | 100


6.2.1. “Recreation, Entertainment, and Leisure” as the most frequently sought local information

Nearly one in every five queries submitted through the six combined CI-based Web sites pertained to this conceptual category (Table 3). Users of the City of London Web site in particular were the largest absolute contributors of queries pertaining to “Recreation, Entertainment, and Leisure.” This is hardly surprising since most of the overall queries came from the City of London Web site to begin with. However, as Table 5 shows, this category also had the largest impact on the types of CI sought through the City of London Web site.

Even though the proportion of queries devoted to the “Recreation, Entertainment, and Leisure” category is quite high, there are often considerable differences within its sub-categories. Queries related to “Holidays” focused on finding information about statutory holidays (e.g. “Victoria day,” “Canada day”) and events usually associated with them. Note that the MCI-Region of Waterloo Web site is used more frequently than the municipal Region of Waterloo Web site to find this particular type of information. However, MCI-Region of Waterloo will search all of the lower-tier municipal Web sites (e.g. the cities of Kitchener, Waterloo, and Cambridge, the townships of North Dumfries, Wellesley, Wilmot, and Woolwich, and the Region of Waterloo) all at once. This will thus give the MCI-Region of Waterloo user a greater chance of finding information pertaining to “Holidays” than if they only searched the Region of Waterloo Web site. City of London CI searchers were the largest absolute contributors to queries concerning “Holidays” but as a proportion of the top 100 cleaned queries submitted through the City of London Web site, the percentage is quite small (3.7%). Surprisingly, London Police Service had a very large proportion of queries (nearly 30%) that focused on “Recreation, Entertainment, and Leisure.” However, all of these “Recreation, Entertainment, and Leisure” queries fall into the more specific subcategory of “Shopping.” These information seekers wanted to know where and when police auctions were being held and used “auction” and “auctions” as the first and second most popular queries, respectively.

It was unanticipated that “Recreation, Entertainment, and Leisure” information would be the most sought after type of local information. Despite what may be considered more “serious” categories (e.g. “Work”, “Housing and Shelter”, “Health”), residents from the two geographic regions served by MCI and its partner Web sites seem to place some priority on seeking information related to leisure activities. Considering the time pressures that working people often feel, this makes sense since “the average daily time spent on paid work, housework and other unpaid household duties (including child care) for those aged 25–54 (in Canada) has increased steadily over the past two decades, rising from 8.2 h in 1986 to 8.8 h in 2005” (Marshall, 2006, pp. 5 and 7). This overall increase in work of all kinds is due solely to the amount of paid work hours Canadian workers perform (Marshall, 2006). Thus trying to find leisure activities within close proximity to where the MCI and municipal Web site CI seekers live takes some precedence.

Table 5
Proportion of top 100 queries, City of London, by major conceptual category.

Major categories | Number of queries | Categories as % of queries
1. Recreation, Entertainment, and Leisure | 10,679 | 29.9
    General | 4825 | 13.5
    Food and Drink | 2064 | 5.8
    Holidays | 1314 | 3.7
    Parks | 1139 | 3.2
    Shopping | 514 | 1.4
    Sports and Physical Fitness | 368 | 1.0
    Worship | 455 | 1.3
2. Work | 6789 | 19.0
    Employment and Training | 6069 | 17.0
    Human Resources | 377 | 1.1
    Volunteerism | 343 | 1.0
3. Municipal Government Business | 3563 | 10.0
4. Transportation | 2024 | 5.7
    Ground Transportation | 975 | 2.7
    Roads | 695 | 1.9
    Air Transportation | 354 | 1.0
5. Solid Waste Collection and Recycling | 1931 | 5.4
6. Housing and Shelter | 1918 | 5.4
7. Population, Demographics and Statistics | 1910 | 5.3
8. Taxation | 1546 | 4.3
9. Libraries | 1343 | 3.8
10. Family | 1129 | 3.2
    Marriage | 723 | 2.0
    Children and/or Parenting | 406 | 1.1
11. Geographic Information | 1083 | 3.0
12. Crime and Public Safety | 971 | 2.7
    Crime | 0 | 0.0
    Public Safety | 971 | 2.7
13. Media | 484 | 1.4
14. Education | 204 | 0.6
15. Water | 179 | 0.5
16. Animals | 0 | 0.0
17. Health | 0 | 0.0
18. Place Names | 0 | 0.0
19. Ageing, Dying, and Death | 0 | 0.0
20. Business | 0 | 0.0
Total queries | 35,753 | 100.0

6.2.2. Job seeking

Notwithstanding the emphasis on searching for recreational activities in Southwestern Ontario, finding information about employment is still a much sought after information-seeking category (Table 3). Even with reasonably low and declining unemployment rates in Middlesex–London and Region of Waterloo during 2005 and 2006 (6.8% and 6.2% for the former, respectively, and 5.7% and 5.2% for the latter, respectively), there was still a large focus on finding employment information (City of Kitchener, 2007; LEDC, xxxx). There is quite a contrast in the percentage of queries devoted to finding employment between the very rural County of Middlesex (which surrounds City of London) and City of London. While finding information about “Work” by querying the City of London Web site was the second most popular category, it was the primary type of inquiry for County of Middlesex (Table 7).

6.2.3. Health ‘‘Matters”

A disproportionate number of the queries (>50%) related to “Health” actually come from the Region of Waterloo municipal government Web site, where “Health” ranks as the fifth most sought local information inquiry (Table 9). The majority of Region of Waterloo’s health-related queries were concerned with virulent diseases (influenza and its iterations, ‘west Nile,’ and ‘pandemic’) and two narrow public health issues (‘smoking’ and ‘food safety’). A closer examination of the Region of Waterloo Web site’s main page reveals a link called “Pandemic Influenza Planning” (http://www.region.waterloo.on.ca/web/region.nsf/fmFrontPage?OpenForm. Accessed June 30, 2007). Clicking on this link opens a Web site in a new browser window called Waterloo Region Pandemic Influenza Planning (http://www.waterlooregionpandemic.ca/en/index.shtml. Accessed June 30, 2007). However, prior to August 18th, 2006, this link did not exist on the Region of Waterloo Web site. Confirmation of this fact is possible by using the Internet Archive Wayback Machine (http://www.archive.org/web/web.php. Accessed July 1, 2007). Examining past versions of the Region of Waterloo Web site using the Wayback Machine for the time span during which the Web log data were collected did not reveal any links to the Waterloo Region Pandemic Influenza Planning Web site, let alone links to Web pages within the Region of Waterloo Web site that addressed pandemic planning or virulent disease. Due to the relatively large number of health-related queries in MCI’s Web logs, the Region of Waterloo Web site may be perceived to be a very important source of health information for residents of Region of Waterloo. However, this perception could be attributed to poor design of the main Web page.

Table 6
Proportion of top 100 queries, London Police Service, by major conceptual category.

Major categories | Number of queries | Categories as % of queries
1. Crime and Public Safety | 1347 | 45.8
    Crime | 639 | 21.7
    Public Safety | 708 | 24.1
2. Recreation, Entertainment, and Leisure | 873 | 29.7
    General | 0 | 0.0
    Food and Drink | 0 | 0.0
    Holidays | 0 | 0.0
    Parks | 0 | 0.0
    Shopping | 873 | 29.7
    Sports and Physical Fitness | 0 | 0.0
    Worship | 0 | 0.0
3. Work | 254 | 8.6
    Employment and Training | 149 | 5.1
    Human Resources | 52 | 1.8
    Volunteerism | 53 | 1.8
4. Transportation | 118 | 4.0
    Ground Transportation | 94 | 3.2
    Roads | 24 | 0.8
    Air Transportation | 0 | 0.0
5. Municipal Government Business | 107 | 3.6
6. Population, Demographics and Statistics | 81 | 2.8
7. Geographic Information | 63 | 2.1
8. Media | 54 | 1.8
9. Family | 26 | 0.9
    Marriage | 0 | 0.0
    Children and/or Parenting | 26 | 0.9
10. Place Names | 16 | 0.5
11. Housing and Shelter | 0 | 0.0
12. Solid Waste Collection and Recycling | 0 | 0.0
13. Animals | 0 | 0.0
14. Taxation | 0 | 0.0
15. Health | 0 | 0.0
16. Libraries | 0 | 0.0
17. Water | 0 | 0.0
18. Ageing, Dying, and Death | 0 | 0.0
19. Education | 0 | 0.0
20. Business | 0 | 0.0
Total queries | 2939 | 100.0

6.2.4. ‘‘Municipal Government Business”

For the County of Middlesex, City of London, and Region of Waterloo, the proportions of queries that pertain to finding information concerning “Municipal Government Business” are very similar. This similarity in the proportion of queries between the different Web sites for “Municipal Government Business” indicates that, regardless of whether an online information seeker lives in either a primarily rural or a primarily urban setting, he or she relies on the World Wide Web to a fairly significant extent to find information about his or her government’s operations and policies. Some examples of these information inquiries are concerned with “bylaws”, “zoning”, “development charges”, “social services”, and other topics that are too numerous to list.

MCI-Middlesex–London and MCI-Region of Waterloo were used rarely if at all to find information concerning “Municipal Government Business.” When CI seekers are looking for information about their local government in an online environment, they seem to prefer to use the most direct and relevant source: municipal government Web sites. Also, MCI-Middlesex–London and MCI-Region of Waterloo index more than one municipal government’s Web site. A local information seeker looking for information from his/her municipality would not necessarily be interested in the information that pertains to other municipalities.

Table 7
Proportion of top 100 queries, County of Middlesex, by major conceptual category.

Major categories | Number of queries | Categories as % of queries
1. Work | 562 | 33.6
    Employment and Training | 550 | 32.9
    Human Resources | 12 | 0.7
    Volunteerism | 0 | 0.0
2. Place Names | 182 | 10.9
3. Municipal Government Business | 180 | 10.8
4. Education | 101 | 6.0
5. Recreation, Entertainment, and Leisure | 90 | 5.4
    General | 31 | 1.9
    Food and Drink | 0 | 0.0
    Holidays | 12 | 0.7
    Parks | 17 | 1.0
    Shopping | 0 | 0.0
    Sports and Physical Fitness | 30 | 1.8
    Worship | 0 | 0.0
6. Taxation | 81 | 4.8
7. Family | 73 | 4.4
    Marriage | 44 | 2.6
    Children and/or Parenting | 29 | 1.7
8. Housing and Shelter | 63 | 3.8
9. Crime and Public Safety | 63 | 3.8
    Crime | 0 | 0.0
    Public Safety | 63 | 3.8
10. Libraries | 55 | 3.3
11. Geographic Information | 53 | 3.2
12. Solid Waste Collection and Recycling | 52 | 3.1
13. Health | 36 | 2.2
14. Population, Demographics and Statistics | 32 | 1.9
15. Ageing, Dying, and Death | 29 | 1.7
16. Water | 19 | 1.1
17. Transportation | 0 | 0.0
    Ground Transportation | 0 | 0.0
    Roads | 0 | 0.0
    Air Transportation | 0 | 0.0
18. Animals | 0 | 0.0
19. Business | 0 | 0.0
20. Media | 0 | 0.0
Total queries | 1671 | 100.0

6.2.5. “Geographic Information” seeking in Southwestern Ontario

While ‘‘Geographic Information” only makes up 3% of the queries submitted through the City of London Web

site, only three queries were used to make up the total frequency for this category: ‘‘map,” ‘‘maps,” and ‘‘city map.” This

focused querying, like Region of Waterloo Web site’s experience with health information pertaining to virulent disease, says

something about Web site design and how users are interacting with it through its links to information sources such as

maps.

7. Discussion

The findings presented above show that the types of CI categories being sought by CI seekers in London–Middlesex and Region of Waterloo are, in many instances, considerably different between these communities. These categories are different also from those discovered and presented by past research. This not only reveals interesting trends in CI seeking, but it also offers insight into the design of CI Web sites and municipal government Web sites. It raises questions about the effectiveness of adopting a standardized approach to providing online CI amongst different communities. This standardized approach, which is used by a variety of CI and information and referral services such as the 211 service, may actually be doing the residents of those communities a disservice.

Table 8
Proportion of top 100 queries, MCI-Middlesex–London, by major conceptual category.

Major categories | Number of queries | Categories as % of queries
1. Family | 5458 | 25.6
    Marriage | 0 | 0.0
    Children and/or Parenting | 5458 | 25.6
2. Animals | 2836 | 13.3
3. Recreation, Entertainment, and Leisure | 2280 | 10.7
    General | 1924 | 9.0
    Food and Drink | 82 | 0.4
    Holidays | 60 | 0.3
    Parks | 88 | 0.4
    Shopping | 0 | 0.0
    Sports and Physical Fitness | 84 | 0.4
    Worship | 42 | 0.2
4. Housing and Shelter | 2263 | 10.6
5. Work | 1781 | 8.4
    Employment and Training | 1686 | 7.9
    Human Resources | 0 | 0.0
    Volunteerism | 95 | 0.4
6. Ageing, Dying, and Death | 1358 | 6.4
7. Taxation | 1131 | 5.3
8. Health | 1100 | 5.2
9. Business | 800 | 3.8
10. Crime and Public Safety | 731 | 3.4
    Crime | 159 | 0.7
    Public Safety | 572 | 2.7
11. Libraries | 514 | 2.4
12. Education | 486 | 2.3
13. Place Names | 296 | 1.4
14. Geographic Information | 158 | 0.7
15. Population, Demographics and Statistics | 61 | 0.3
16. Media | 48 | 0.2
17. Municipal Government Business | 0 | 0.0
18. Transportation | 0 | 0.0
    Ground Transportation | 0 | 0.0
    Roads | 0 | 0.0
    Air Transportation | 0 | 0.0
19. Solid Waste Collection and Recycling | 0 | 0.0
20. Water | 0 | 0.0
Total queries | 21,301 | 100.0

See 211 Toronto (http://www.211toronto.ca/index.jsp), 211 Niagara (http://www.211toronto.ca/ont/index.jsp?partner_code=211nia), and 211 Simcoe

(http://www.211toronto.ca/ont/index.jsp?partner_code=211sim) as examples of the blanket application of online CI design and taxonomy organization.


7.1. Emergence of help-seeking mismatches

Based on the queries that were collected and analyzed for this study, there is an impression that a relatively large number of Web searchers using the municipal government Web sites presented in this paper seem to perceive that online government-based information resources are a relatively important source for “Recreation, Entertainment, and Leisure” information. For instance, “movies”, “restaurants”, “bars”, “shopping”, and “malls” are all examples of very frequently occurring navigational and informational queries that are part of this category submitted specifically through municipal government Web sites. However, when I personally queried the City of London Web site using each of those same keywords, no hits related to these queries were returned. The City of London Web site does not contain any Web documents that pertain to many of these activities. Those who are querying City of London’s Web site in particular might be experiencing what Dewdney and Harris (1992) define as a help-seeking mismatch where “the types of help that might be expected from an agency are not those which it provides” (p. 23). Thus, users looking for this information in Middlesex–London are not using the best source to retrieve this type of information (MCI-Middlesex–London does index Web sites related to tourism). If this is indeed a common practice with other similar government Web sites, municipal governments should evaluate more closely how their Web sites are used to help minimize this unintentional information barrier.

Table 9
Proportion of top 100 queries, Region of Waterloo, by major conceptual category.

Major categories | Number of queries | Categories as % of queries
1. Transportation | 2471 | 17.8
    Ground Transportation | 1226 | 8.8
    Roads | 1116 | 8.1
    Air Transportation | 129 | 0.9
2. Solid Waste Collection and Recycling | 1962 | 14.2
3. Water | 1413 | 10.2
4. Municipal Government Business | 1409 | 10.2
5. Health | 1265 | 9.1
6. Work | 1237 | 8.9
    Employment and Training | 1017 | 7.3
    Human Resources | 220 | 1.6
    Volunteerism | 0 | 0.0
7. Geographic Information | 999 | 7.2
8. Place Names | 972 | 7.0
9. Recreation, Entertainment, and Leisure | 606 | 4.4
    General | 362 | 2.6
    Food and Drink | 0 | 0.0
    Holidays | 81 | 0.6
    Parks | 85 | 0.6
    Shopping | 0 | 0.0
    Sports and Physical Fitness | 78 | 0.6
    Worship | 0 | 0.0
10. Education | 421 | 3.0
11. Population, Demographics and Statistics | 279 | 2.0
12. Crime and Public Safety | 255 | 1.8
    Crime | 0 | 0.0
    Public Safety | 255 | 1.8
13. Housing and Shelter | 252 | 1.8
14. Family | 236 | 1.7
    Marriage | 0 | 0.0
    Children and/or Parenting | 236 | 1.7
15. Business | 80 | 0.6
16. Media | 0 | 0.0
17. Libraries | 0 | 0.0
18. Ageing, Dying, and Death | 0 | 0.0
19. Animals | 0 | 0.0
20. Taxation | 0 | 0.0
Total queries | 13,857 | 100.0

7.2. Is health information seeking still an important Web search activity?

The low ranking for “Health” information seeking was somewhat unexpected as searching for health or medical related information is generally a relatively popular Internet activity in Canada. While a direct comparison cannot be made using this study’s data, Statistics Canada’s 2005 survey reports that searching for health or medical related information was the sixth most popular Internet activity of individual Canadians (Statistics Canada, Nov. 1, 2006). Additionally, there is a significant body of information science literature that emphasizes consumer health information seeking on the WWW. For instance, Harris, Wathen, and Chan (2005) cite two studies that suggest anywhere from 20% to 80% of the population of the United States with Internet access seek health information. Gillaspy (2005) found contradictory studies concerning the proportion of the population that use the Internet for finding health information. Gillaspy cites the same Pew Internet Project report as Harris, Wathen, and Chan where 80% of the population reported they use the Internet to seek health information. However, a study from the Center for Studying Health Care Change reported that only 16% of the United States’ population used the Internet to find health information. Gillaspy considers these data within the context of various factors that may

affect the provision of consumer health information in public libraries. While Gillaspy wonders which study is the most

accurate, she concludes that ‘‘in fact, in terms of providing consumer health information in public libraries, it may not mat-

ter. Adults learn at the point of need, when learning is relevant to their life situation, and certainly health issues become

pertinent to almost everyone at some point in their lives” (p. 482). Spink and Jansen’s (2004) studies conﬁrmed what was

found in this study; that only a small percentage of Web queries are medical or health-related. Considering the plethora

of health information Web sites available on the WWW, it is very possible that MCI and the municipal government Web sites

are not considered to be good or even necessary resources despite the number of local health-related Web sites that MCI

indexes.

7.3. Effectiveness and importance of well designed home pages

The City of London has an impressive number of online maps available through its Web site, including an outstanding, easy-to-use interactive map of the city that can be used to find addresses, aerial photographs, schools, parks and recreation centres, points of interest, and even assessment parcels and electoral wards, among other features (City of London, n.d.). As a result, it is not difficult to assume that this collection of geographic resources could be a fairly popular tool for finding “Geographic Information” about the city. However, the City of London Web site also has a rather noticeable link to this interactive map on its main page. The question that is raised, then, is why are keyword queries being submitted through the Web site to find maps when a relatively prominent link is provided on the main page? This, like Region of Waterloo’s rather prominent link to its pandemic planning Web site, may be an example of the same issue articulated by Herrera-Viedma and Pasi (2006) cited earlier in this study: that the likelihood of a successful search for information based on a point-and-click access paradigm is dependent largely on the design and related information provided on the starting Web page. Since such a relatively large number of queries are related to “Geographic Information,” and more specifically to maps, it is possible that an unsuccessful point-and-click access paradigm is being demonstrated. Without Web site navigation data related to the number of times users have actually clicked on the maps link, this theory cannot be tested. However, further research into this aspect of human–computer interaction may reveal aspects of online searching behaviour that should be taken into account in the design of Web sites.

Table 10
Proportion of top 100 queries, MCI-Region of Waterloo, by major conceptual category.

Major categories                              Number of queries   Categories as % of queries
1. Recreation, Entertainment, and Leisure                   658                         30.5
    General                                                 146                          6.8
    Food and Drink                                           23                          1.1
    Holidays                                                366                         17.0
    Parks                                                    42                          1.9
    Shopping                                                 19                          0.9
    Sports and Physical Fitness                              52                          2.4
    Worship                                                  10                          0.5
2. Work                                                     331                         15.4
    Employment and Training                                 310                         14.4
    Human Resources                                           8                          0.4
    Volunteerism                                             13                          0.6
3. Place Names                                              328                         15.2
4. Solid Waste Collection and Recycling                     215                         10.0
5. Transportation                                           102                          4.7
    Ground Transportation                                    39                          1.8
    Roads                                                    42                          1.9
    Air Transportation                                       21                          1.0
6. Water                                                     98                          4.5
7. Municipal Government Business                             97                          4.5
8. Education                                                 59                          2.7
9. Geographic Information                                    57                          2.6
10. Population, Demographics and Statistics                  47                          2.2
11. Housing and Shelter                                      44                          2.0
12. Libraries                                                24                          1.1
13. Media                                                    24                          1.1
14. Business                                                 21                          1.0
15. Crime and Public Safety                                  19                          0.9
    Crime                                                     0                          0.0
    Public Safety                                            19                          0.9
16. Family                                                   12                          0.6
    Marriage                                                  0                          0.0
    Children and/or Parenting                                12                          0.6
17. Ageing, Dying, and Death                                 11                          0.5
18. Health                                                    8                          0.4
19. Animals                                                   0                          0.0
20. Taxation                                                  0                          0.0
Total queries                                              2155                        100.0

The respective municipal governments mentioned above might find it interesting to know how their citizens rely on their Web sites to find information concerning the municipalities’ operations, policies, and responsibilities. Municipal governments have followed the lead of the federal and provincial governments in offering more services and information through the WWW. This has occurred despite the fact that municipal governments already tended to interact with a country’s citizens much more closely than a national government. This type of online interactivity between a municipality’s residents and local government should also enhance residents’ perception that municipal governments still tend to be better at delivering services compared to the provincial or federal governments (cf. Canadian Centre for Management Development, 1999; Erin Research, Inc., 1998).

7.4. Implications of this study’s findings and the findings of other similar studies on information organization

There are some, but few, similarities between Durrance and Pettigrew’s categories and the major categories that emerged from the MCI Web log data. Most importantly, like Durrance and Pettigrew’s findings (2002), the categories in Tables 3 and 4 “are markedly different from those traditionally used to classify CI needs” (p. 2). This should hopefully start a dialogue about possible changes to Sales’s Taxonomy of Human Services (1994) and its more current iteration, the AIRS/INFOLINE Taxonomy of Human Services, since so many of the more than 8,500 controlled vocabulary terms contained in the taxonomy are not remotely similar to MCI’s conceptual CI inquiries (Sales, 2003).

The top 100 most frequently occurring queries from six Web sites formed the basis for the categorization and subsequent discussion of online CI inquiries. This means that only six hundred queries represented 21% of the tens of thousands of distinct queries submitted through MCI and its partner municipal Web sites over the course of one year. This demonstrates a remarkable conceptual consistency in local information seeking that may be boiled down to the 20 main conceptual categories shown in Table 3. Why, then, are bulky and often complex taxonomies needed to organize online CI directories when their controlled vocabulary may not reflect actual CI seekers’ queries? Despite the artificiality inherent in usability studies (e.g. Toms & Kinnucan, 1996), perhaps additional usability studies are needed to determine more ideal approaches to searching CI Web sites.
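The coverage figure discussed above can be illustrated with a short calculation. The following Python sketch is not the procedure used in this study. It assumes a flat list of submitted query strings and a simple normalization (lower-casing and trimming whitespace), and both the function name `top_n_coverage` and the sample data are hypothetical. It computes the share of all query submissions accounted for by the n most frequent queries, which is one plausible reading of a figure such as the 21% reported here.

```python
from collections import Counter


def top_n_coverage(queries, n=100):
    """Return the share of all submissions made up by the n most frequent queries.

    Normalization (lower-casing, trimming whitespace) is an assumption made
    for this illustration; the study's own cleaning rules may differ.
    """
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top_total = sum(count for _, count in counts.most_common(n))
    return top_total / total


# Hypothetical submissions, not taken from the MCI logs.
sample = ["maps", "Maps", "jobs", "garbage", "maps ", "jobs", "parks", "transit"]
share = top_n_coverage(sample, n=2)  # "maps" (3) + "jobs" (2) out of 8 submissions
print(f"Top 2 queries cover {share:.1%} of submissions")
```

Applied separately to each Web site's log with n = 100, a calculation of this kind would show how much of the overall querying activity the categorized queries represent.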

8. Limitations of the study

Thelwall et al. (2005) summarize many of the challenges to using Web log analysis as a research method. Many of these challenges occurred in this study. One of the most significant limitations in the case of MCI’s query log data is accurately identifying individual users from the search session identifiers. For session identification, Web site designers have implemented the use of cookies because the proliferation of dynamic IP addresses has meant that it is impossible to identify individual users of a Web site by IP address alone (Rubin, 2001). However, users may disable cookies on their computers, thus leaving this identifying field in Web logs blank and consequently affecting individual query submission analysis (Silverstein et al., 1999; Thelwall et al., 2005). MCI does not use cookies. This means that individual users and the respective computers they use cannot be identified as unique users since only IP addresses are logged. While the session identification for the incoming queries through the MCI and other Web sites is not perfect in identifying individual users, it does provide the ability to distinguish a unique search session initiated through a particular computer. The Web searcher’s querying behaviour, from the text submitted to the number of page requests, for the search session started by the computer’s incoming IP address is recorded faithfully in the logs until the session is ended by either the Web browser being closed or the search session being inactive for 20 minutes, which causes the session to time out.
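The session rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than MCI's actual logging code: the record layout (IP address, timestamp, query text), the function name `assign_sessions`, and the sample records are all hypothetical. Because a browser being closed leaves no trace in a list of queries, the sketch applies only the 20-minute inactivity rule, and it assumes that a gap of exactly 20 minutes counts as a timeout.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=20)  # inactivity limit stated in the text


def assign_sessions(records):
    """Group (ip, timestamp, query) records into search sessions.

    A new session starts the first time an IP address appears, or when that
    IP address has been inactive for SESSION_TIMEOUT or longer.
    """
    sessions = []
    open_session = {}  # ip -> index of that IP's current session
    last_seen = {}     # ip -> timestamp of that IP's latest record

    for ip, timestamp, query in sorted(records, key=lambda r: r[1]):
        expired = ip in last_seen and timestamp - last_seen[ip] >= SESSION_TIMEOUT
        if ip not in open_session or expired:
            sessions.append({"ip": ip, "queries": []})
            open_session[ip] = len(sessions) - 1
        sessions[open_session[ip]]["queries"].append(query)
        last_seen[ip] = timestamp
    return sessions


# Hypothetical log records: (IP address, timestamp, query text).
example_log = [
    ("192.0.2.1", datetime(2006, 5, 1, 9, 0), "garbage pickup"),
    ("192.0.2.1", datetime(2006, 5, 1, 9, 5), "recycling schedule"),
    ("192.0.2.2", datetime(2006, 5, 1, 9, 6), "maps"),
    ("192.0.2.1", datetime(2006, 5, 1, 9, 40), "jobs"),
]
sessions = assign_sessions(example_log)
for session in sessions:
    print(session["ip"], session["queries"])
```

Under this rule, two people who share one IP address within the timeout window are merged into a single session, which is why the session, and not the individual user, is the unit that can be distinguished.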

The other notable limitation of this study is that it does not include the examination of navigational Web logs for the Web sites that are under scrutiny. No assumption is made that every user of the Web sites under examination uses only keyword querying to find the information that he or she is seeking. Users may find the information that they are seeking by clicking on relevant hyperlinks within a Web site until they reach their final information destination. However, this study attempts to attach meaning to the keywords users submit through the MCI and municipal government Web sites in their pursuit of community information. This is what makes this study unique in the area of Web searching that focuses on the local community. It takes large sets of data (keyword queries from Web query logs) and attaches meaning to these data to provide a thorough analysis of CI needs based on what the CI seeker actually types and submits through a Web site. The limitations of query analysis should not inordinately hamper the main purpose of the study: discovering, by using relative comparisons, which CI Web resources are being used most by CI seekers in Southwestern Ontario and, most importantly, what the users’ information needs are, based on the type of topical or conceptual CI being sought from these resources.

9. Conclusion

The online local information inquiries of a particular population or populations, and how these inquiries are being addressed, are anything but simple and predictable. Even multiple sources of online information that are designed to serve the same population, such as those presented in this paper, often demonstrate that a population’s Web searching will vary depending on how the information seeker perceives the scope of information and the potential utility of the information that those sources provide. This study, which for the first time had access to a very large set of data that provides unobtrusive evidence of CI inquiries for specific geographic regions, is a first step in exploring the complexities of CI seeking in an online environment. Despite this paper’s focus on a limited geographic region, the findings have shown that: local information seekers during the course of their Web searching may often choose the wrong resource to address their information need; health information seeking from the local community is not generally an important task despite the fact that people are generally likely to get medical attention in the immediate geographic vicinity depending on their situation; what may appear to be well designed home pages to Web sites may not indeed be so; and, that while the types of community information sought through MCI and its partner municipal government Web sites are more diverse and specific than those revealed by two other studies, this study’s categories, like those reported by Durrance and Pettigrew (2002), are markedly different from those found in CI taxonomies. This last point especially will hopefully stimulate debate as to whether these taxonomies should continue to rely on literary warrant for their empirical basis or whether it might be beneficial to consider user warrant to go beyond a taxonomic structure and create full-fledged thesauri for indexing and retrieval purposes for those CI organizations that wish to continue using a directory style model.

9.1. Future research directions

Unlike much past published research that focused on Web searching conducted on individual Web sites, further studies are under way that will examine the same Web sites presented in this paper. However, these future studies will take a longitudinal approach by analyzing three years’ worth of query data. Will the conceptual categories that emerged from the data in this study still apply? How, if at all, do CI seekers’ Web searches change over this time period? Will the three years of query data provide an accurate depiction of societal, political, and economic changes over this time period? Between this study and the proposed studies, better models of information needs as expressed through Web searching through single Web sites will hopefully emerge to give a more complete portrait of how local CI Web sites are used to address citizens’ everyday information seeking behaviours.

Acknowledgements

The author wishes to thank: Drs. Liwen Vaughan, Catherine Ross, and Daniel Robinson, Faculty of Information and Media Studies, University of Western Ontario, for their feedback on a much larger version of this study; Drs. Yin Zhang and Marcia Zeng, School of Library and Information Science, Kent State University, for their assistance with this current paper; and, Dr. Stephen Cummings, mycommunityinfo.ca, for his assistance in facilitating access to the log data discussed herein.

References

Babbie, E. (2008). The basics of social research (4th ed.). Belmont, CA: Wadsworth.

Beitzel, S. M., Jensen, E. C., Chowdhury, A., Frieder, O., & Grossman, D. (2007). Temporal analysis of a very large topically categorized Web query log. Journal of the American Society for Information Science and Technology, 58(2), 166–178.

Bishop, A. P., Tidline, T. T., Shoemaker, S., & Salela, P. (1999). Public libraries and networked information services in low-income communities. Library and Information Science Research, 21(3), 361–390.

Broder, A. (2002). A taxonomy of Web search. ACM SIGIR Forum, 36(2), 3–10.

Canadian Centre for Management Development (1999). Citizen-centred service: Responding to the needs of Canadians; for the Citizen-Centred Service Network. Ottawa: Canadian Centre for Management Development, Citizen-Centred Service Network [Microlog #100-01945].

Chau, M., Fang, X., & Sheng, O. (2005). Analysis of the query logs of a Web site search engine. Journal of the American Society for Information Science and Technology, 56(13), 1363–1376.

Childers, T. (1984). Information and referral: Public libraries. Norwood, NJ: Ablex Publishing.

City of Kitchener (2007). Demographic profile/labour force profile: Fast facts. City of Kitchener. <http://www.kitchener.ca/pdf/fast_facts.pdf> Retrieved 19.04.07.

City of London (n.d.). E-services and maps: Interactive maps: City map. City of London. <http://www.london.ca/_private/Maps/Maps.htm> Retrieved 27.04.07.

Cohill, A. M. (2000). The future of community networks. In A. M. Cohill & A. L. Kavanaugh (Eds.), Community networks: Lessons from Blacksburg, Virginia (2nd ed., pp. 357–380). Boston: Artech House.

Cummings, S. (2002). Life events, life episodes: Instances of the approach in the United Kingdom, Ireland, Spain, Canada and the private sector and large public institutions. London, Ontario: Mycommunityinfo.ca.

Cummings, S. (2004). Mycommunityinfo.ca. London, Ontario: Mycommunityinfo.ca.

Dewdney, P., & Harris, R. M. (1992). Community information needs: The case of wife assault. Library and Information Science Research, 14, 5–29.


Donohue, J. C. (1976). Community information services – a proposed definition. In Information politics: Proceedings of the ASIS annual meeting (p. 126). Washington, DC: ASIS [Cited in Durrance, J. C. (1984b). Community information services: An innovation at the beginning of its second decade. In W. Simonton (Ed.), Advances in librarianship (Vol. 13, pp. 99–128). Orlando: Academic Press].

Durrance, J. C. (1984). Community information services: An innovation at the beginning of its second decade. In W. Simonton (Ed.), Advances in librarianship (Vol. 13, pp. 99–128). Orlando: Academic Press.

Durrance, J. C., & Pettigrew, K. E. (2002). Online community information: Creating a nexus at your library. Chicago: American Library Association.

Erin Research, Inc. (1998). Citizens first: Summary report. Ottawa: Canadian Centre for Management Development, Citizen-Centred Service Network.

Gillaspy, M. L. (2005). Factors affecting the provision of consumer health information in public libraries: The last five years. Library Trends, 53(3), 480–495.

Gurstein, M. (2000). Introduction. Community informatics: Enabling community uses of information and communications technology. In M. Gurstein (Ed.), Community informatics: Enabling community uses of information and communications technology (pp. 1–30). Hershey, PA: Idea Group.

Harris, R., Wathen, C. N., & Chan, D. (2005). Public library responses to a consumer health inquiry in a public health crisis: The SARS experience in Ontario. Reference and User Services Quarterly, 45(2), 147–154.

Hearn, G., Kimber, M., Lennie, J., & Simpson, L. (2005). A way forward: Sustainable ICTs and regional sustainability. The Journal of Community Informatics, 1(2). <http://www.ci-journal.net/index.php/ciej/article/view/201/160> Accessed 17.06.07.

Herrera-Viedma, E., & Pasi, G. (2006). Soft approaches to information retrieval and information access on the Web: An introduction to the special topic section. Journal of the American Society for Information Science and Technology, 57(4), 511–514.

Jansen, B. J., Booth, D. L., & Spink, A. (2008). Determining the informational, navigational, and transactional intent of Web queries. Information Processing and Management, 44, 1251–1266.

Kubicek, H., & Wagner, R. M. (2002). Community networks in a generational perspective: The change of an electronic medium within three decades. Information, Communication and Society, 5(3), 291–319.

Lambert, F. (2008). Rewriting the “rules” of online networked community information services: A case study of the mycommunityinfo.ca model. London, Ontario: Faculty of Graduate Studies, University of Western Ontario.

Lau, E. P., & Goh, D. H.-L. (2006). In search of query patterns: A case study of a university OPAC. Information Processing and Management, 42(5), 1316–1329.

LEDC (London Economic Development Corporation) (n.d.). Unemployment rate. <http://www.ledc.com/forsiteselectors/workforce/unemploymentrate/> Retrieved 19.04.07.

Marshall, K. (2006). Converging gender roles. Perspectives on Labour and Income, 7(7), 5–17. <http://www.statcan.ca/english/freepub/75-001-XIE/10706/art-1.htm> Retrieved 05.07.06.

Pu, H., Chuang, S., & Yang, C. (2002). Subject categorization of query terms for exploring Web users’ search interests. Journal of the American Society for Information Science and Technology, 53(8), 617–630.

Rideout, V. N., & Reddick, A. J. (2005). Sustaining community access to technology: Who should pay and why! The Journal of Community Informatics, 1(2), 45–62. <http://www.ci-journal.net/index.php/ciej/article/view/202/162> Retrieved 05.01.06.

Rieh, S. Y., & Xie, H. (2006). Analysis of multiple query reformulations on the Web: The interactive information retrieval context. Information Processing and Management, 42(3), 751–768.

Ross, N. C. M., & Wolfram, D. (2000). End user searching on the Internet: An analysis of term pair topics submitted to the Excite search engine. Journal of the American Society for Information Science, 51(10), 949–958.

Rubin, J. H. (2001). Introduction to log analysis techniques: Methods for evaluating networked services. In C. R. McClure & J. C. Bertot (Eds.), Evaluating networked information services: Techniques, policy, and issues (pp. 197–212). Medford, NJ: Information Today.

Sales, G. (1994). A taxonomy of human services: A conceptual framework with standardized terminology and definitions for the field (3rd ed.). El Monte, CA: Information and Referral Federation of Los Angeles County, Inc.

Sales, G. (2003). An orientation to the structure and contents of the AIRS/INFO LINE taxonomy. The Journal of the Alliance of Information and Referral Systems. <http://www.211taxonomy.org/publicfiles/view/Taxonomy_Orientation.pdf> Accessed 28.02.07 [last revision, August 2006].

Schuler, D. (1996). New community networks: Wired for change. New York: ACM Press.

Silverstein, C., Henzinger, M., Marais, H., & Moricz, M. (1999). Analysis of a very large Web search engine query log. SIGIR Forum, 33(1), 6–12.

Spink, A., & Jansen, B. J. (2004). Web search: Public searching of the Web. Dordrecht: Kluwer Academic Publishers.

Spink, A., Wolfram, D., Jansen, M. B. J., & Saracevic, T. (2001). Searching the Web: The public and their queries. Journal of the American Society for Information Science and Technology, 52(3), 226–234.

Statistics Canada (2006). Internet use by individuals, by type of activity. Summary Tables. <http://www40.statcan.ca/l01/cst01/comm16.htm> Accessed 22.06.07 [November 1, last update].

Thelwall, M., Vaughan, L., & Björneborn, L. (2005). Webometrics. In B. Cronin (Ed.), Annual review of information science and technology (Vol. 39, pp. 81–135). Medford, NJ: Information Today.

Toms, E. G., & Kinnucan, M. T. (1996). The effectiveness of the electronic city metaphor for organizing the menus of Free-Nets. Journal of the American Society for Information Science, 47(12), 919–931.

Wang, P., Berry, M. W., & Yang, Y. (2003). Mining longitudinal Web queries: Trends and patterns. Journal of the American Society for Information Science and Technology, 54(8), 743–758.

Zhang, J., Wolfram, D., Wang, P., Hong, Y., & Gillis, R. (2008). Visualization of health-subject analysis based on query term co-occurrences. Journal of the American Society for Information Science and Technology, 59(12), 1933–1947.

F. Lambert / Information Processing and Management 46 (2010) 343–361 361

Exploring the Incentive Function of Virtual Academic Degrees in a Chinese Online Smoking Cessation Community: Qualitative Content Analysis

Article

Full-text available

Jul 2023
J MED INTERNET RES

Background: Previous studies on online smoking cessation communities (OSCCs) have shown how such networks contribute to members' health outcomes from behavior influence and social support perspectives. However, these studies rarely considered the incentive function of OSCCs. One of the ways OSCCs motivate smoking cessation behaviors is through digital incentives. Objective: This study aims to explore the incentive function of a novel digital incentive in a Chinese OSCC-the awarding of academic degrees-to promote smoking cessation. It specifically focuses on "Smoking Cessation Bar," an OSCC in the popular web-based Chinese forum Baidu Tieba. Methods: We collected discussions about the virtual academic degrees (N= 1193) from 540 members of the "Smoking Cessation Bar." The time frame of the data set was from November 15, 2012, to November 3, 2021. Drawing upon motivational affordances theory, 2 coders qualitatively coded the data. Results: We identified five key topics of discussion, including members' (1) intention to get virtual academic degrees (n=38, 2.47%), (2) action to apply for the degrees (n=312, 20.27%), (3) feedback on the accomplishment of goals (n=203, 13.19%), (4) interpersonal interaction (n=794, 51.59%), and (5) expression of personal feelings (n=192, 12.48%). Most notably, the results identified underlying social and psychological motivations behind using the forum to discuss obtaining academic degrees for smoking cessation. Specifically, members were found to engage in sharing behavior (n=423, 27.49%) over other forms of interaction such as providing recommendations or encouragement. Moreover, expressions of personal feelings about achieving degrees were generally positive. It was possible that members hid their negative feelings (such as doubt, carelessness, and dislike) in the discussion. Conclusions: The virtual academic degrees in the OSCC created opportunities for self-presentation for participants. 
They also improved their self-efficacy to persist in smoking cessation by providing progressive challenges. They served as social bonds connecting different community members, triggering interpersonal interactions, and inducing positive feelings. They also helped realize members' desire to influence or to be influenced by others. Similar nonfinancial rewards could be adopted in various smoking cessation projects to enhance participation and sustainability.

Virtual Trace: A Framework for Applying Physical Trace Research Methodology in a Virtual Electronic Context

Article

Full-text available

Jan 2014

Frank Lambert

Virtual trace is presented as a reconceptualized methodological framework inspired particularly by Webb et al"s physical trace methodology and a variety of Webometric data collection and analysis methods from the LIS literature. With the ongoing proliferation of data from current and new Internet-based sources, virtual trace is intended to be considered by experienced and novice researchers as a comprehensive research approach for studies whose designs are similar conceptually to those described in this article. Other online methodologies such as social tagging analysis and virtual ethnography are examined to provide virtual trace further definition.

Seeking information from government resources: A comparative analysis of two communities' Web searching of municipal government Web sites

Article

Jan 2013
GOV INFORM Q

Frank Lambert

Dynamic Graph Learning Convolutional Networks for Semi-supervised Classification

Article

Mar 2021

Over the past few years, graph representation learning (GRL) has received widespread attention on the feature representations of the non-Euclidean data. As a typical model of GRL, graph convolutional networks (GCN) fuse the graph Laplacian-based static sample structural information. GCN thus generalizes convolutional neural networks to acquire the sample representations with the variously high-order structures. However, most of existing GCN-based variants depend on the static data structural relationships. It will result in the extracted data features lacking of representativeness during the convolution process. To solve this problem, dynamic graph learning convolutional networks (DGLCN) on the application of semi-supervised classification are proposed. First, we introduce a definition of dynamic spectral graph convolution operation. It constantly optimizes the high-order structural relationships between data points according to the loss values of the loss function, and then fits the local geometry information of data exactly. After optimizing our proposed definition with the one-order Chebyshev polynomial, we can obtain a single-layer convolution rule of DGLCN. Due to the fusion of the optimized structural information in the learning process, multi-layer DGLCN can extract richer sample features to improve classification performance. Substantial experiments are conducted on citation network datasets to prove the effectiveness of DGLCN. Experiment results demonstrate that the proposed DGLCN obtains a superior classification performance compared to several existing semi-supervised classification models.

The functional requirements for community information

Article

Jan 2016
J DOC

Philip Martin Hider

Purpose – The purpose of this paper is to consider the nature of community information (CI) and proposes a data model, based on the entity-relationship approach adopted in the Functional Requirements for Bibliographic Records (FRBR), which may assist with the development of future metadata standards for CI systems. Design/methodology/approach – The two main data structure standards for CI, namely the element set developed by the Alliance of Information and Referral Systems (AIRS) and the MARC21 Format for CI, are compared by means of a mapping exercise, after which an entity-relationship data model is constructed, at a conceptual level, based on the definitions of CI found in the literature. Findings – The AIRS and MARC21 data structures converge to a fair degree, with MARC21 providing for additional detail in several areas. However, neither structure is systematically and unambiguously defined, suggesting the need for a data model. An entity-relationship data modelling approach, similar to that taken in FRBR, yielded a model that could be used as the basis for future standards development and research. It was found to effectively cover both the AIRS and MARC21 element sets. Originality/value – No explicit data model exists for CI, and there has been little discussion reported about what data elements are required to support CI seeking.

Community information portals: content and design issues for information access

Article

Sep 2014
LIBR HI TECH

Purpose – The purpose of this paper is to report on the findings of an audit of community information (CI) portals to provide an overview of how CI is being organised and presented on the web by aggregating services, and how CI is being shaped and shared in community networks. It also investigates the role that public libraries play in online CI provision. Design/methodology/approach – The research sampled CI portals online within the Australian web domain (.au). An audit of 88 portals was undertaken to establish the scope, role and usefulness of the portals. The audit included a comprehensive usability analysis of a sub set of 20 portals evaluated for 20 different heuristics based on Nielsen's heuristic model. Findings – The research finds that the challenge facing portals is not a lack of information, it is the need to improve the mediation between the community services and people that CI portals promise useful and usable information for. While public libraries remain integral to the provision of CI in their geographical area, they now form part of a larger online network for CI provision, involving a wide range of organisations. Originality/value – The paper discusses the ways CI portals contribute to the provision of information about community services and identifies areas where improvements are needed. In particular, it discusses how these sites function as part of larger CI networks and where more innovative, and more standardised, design could lead to greater levels of engagement and utility.

Community Driven Search Engine Based on Community's Proxy Server Log

Article

Jan 2012

This paper introduces a method for improving search engine capabilities by using user preferences derived from a community's proxy logs. The goal is to build a custom search engine that provides community-specific results. To do so, we use proxy server logs from the Network Operation Center of EEPIS-ITS and extract the unique URL and user fields as raw data. We then crawl the title and meta information for each unique URL and build document vectors that turn this textual data into machine-friendly numerical data. To find topics, we group the URLs and their meta information into 10 or more clusters using the k-means algorithm. These results form the basis of the search engine, which uses the vector space model to return results for a user's query.
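The vector space step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the TF-IDF weighting, the whitespace tokenizer, the function names, and the sample URLs are all assumptions made for illustration, and the k-means clustering stage is left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def weight(terms, idf):
    """Build a sparse TF-IDF vector, ignoring terms that are not in the index."""
    tf = Counter(t for t in terms if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}


def build_index(docs):
    """Return (idf, vectors) for a dict mapping URL -> crawled title/meta text."""
    tokenized = {url: tokenize(text) for url, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for terms in tokenized.values() for term in set(terms))
    # Smoothed inverse document frequency, so every weight stays positive.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = {url: weight(terms, idf) for url, terms in tokenized.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, idf, vectors, top_k=3):
    """Rank URLs by cosine similarity to the query, dropping zero-score matches."""
    q = weight(tokenize(query), idf)
    scored = [(cosine(q, v), url) for url, v in vectors.items()]
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: p[0], reverse=True)
    return [url for _, url in ranked[:top_k]]


# Hypothetical crawled title/meta text keyed by URL (not data from the paper).
sample_docs = {
    "http://example.org/a": "campus library opening hours",
    "http://example.org/b": "network operation center proxy status",
    "http://example.org/c": "library catalogue search",
}
sample_idf, sample_vectors = build_index(sample_docs)
results = search("library hours", sample_idf, sample_vectors)
```

In a fuller pipeline, the same document vectors would also be the input to the clustering stage, so the index and the topic clusters share one representation.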

The Persistence of Participation: Community, Disability, and Social Networks

Article

Communication-oriented Internet technologies and activities, such as social media sites and blogs, have become an important component of community and employment participation, not just in the specific function of activities, but as a link to larger communities of practice and professional connections. The occurrence of these activities, evident in their presence on Facebook, LinkedIn and other online communities, represents an important opportunity to reframe and re-conceptualize the manifestation of communities, especially those in which distributed networks and communities substitute for geographic proximity, offering new opportunities for engagement, especially for those who might be functionally limited in terms of mobility. For people with disabilities, as well as the aging, who increasingly interact online, the readiness of social networking sites to accommodate their desire to participate, in conjunction with their readiness as users to maximize the potential of platform interfaces and architecture, is critical to achieving the medium’s potential for enhancing community and employment benefits. This paper explores the representation and presence of disability and aging, using Facebook and LinkedIn groups as frames. Target identity/member groups on Facebook and LinkedIn were catalogued to explore the presence and representation of disability and aging identities in a socially networked setting. The groups for this study were identified using the search feature built into the platform architecture, which allows a user to search on specifically designated entities or keywords. Findings suggest that, from a policy perspective, institutions need to be cognizant of population characteristics as well as platform opportunities when implementing advocacy and relevant support services for people with disabilities and older adults to fully ensure engagement and participation.

Searching the web: The public and their queries

Article

Jan 2001
J AM SOC INF SCI TEC

In studying actual Web searching by the public at large, we analyzed over one million Web queries by users of the Excite search engine. We found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features. A small number of search terms are used with high frequency, and a great many terms are unique; the language of Web queries is distinctive. Queries about recreation and entertainment rank highest. Findings are compared to data from two other large studies of Web queries. This study provides an insight into the public practices and choices in Web searching.

End user searching on the Internet: An analysis of term pair topics submitted to the Excite search engine

Article

Aug 2000

Queries submitted to the Excite search engine were analyzed for subject content based on the cooccurrence of terms within multiterm queries. More than 1000 of the most frequently cooccurring term pairs were categorized into one or more of 30 developed subject areas. Subject area frequencies and their cooccurrences with one another were tallied and analyzed using hierarchical cluster analysis and multidimensional scaling. The cluster analyses revealed several anticipated and a few unanticipated groupings of subjects, resulting in several well-defined high-level clusters of broad subject areas. Multidimensional scaling of subject cooccurrences revealed similar relationships among the different subject categories. Applications that arise from a better understanding of the topics users search and their relationships are discussed.
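The first step of this analysis, counting how often two terms co-occur within the same multiterm query, can be sketched as follows. This is an illustrative sketch only: the function name, the whitespace tokenization, and the sample queries are assumptions, and the study's later stages (assigning pairs to subject areas, hierarchical cluster analysis, multidimensional scaling) are not shown.

```python
from collections import Counter
from itertools import combinations


def term_pair_counts(queries):
    """Count unordered term pairs that co-occur within the same query."""
    counts = Counter()
    for query in queries:
        # De-duplicate and sort terms so each pair has one canonical form.
        terms = sorted(set(query.lower().split()))
        # Single-term queries yield no pairs and are skipped naturally.
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical queries, not taken from the Excite log.
sample_queries = [
    "free music download",
    "music download",
    "weather",
    "download free music",
]
pair_counts = term_pair_counts(sample_queries)
top_pairs = pair_counts.most_common(3)
```

Sorting the terms before pairing means that "music download" and "download music" contribute to the same pair, which matches the idea of co-occurrence independent of word order.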

Community information services - An innovation at the beginning of its second decade

Article

Jan 1984

Joan C. Durrance

Annual Review of Information Science and Technology Vol. 29

Article

Jul 1996

Christine L. Borgman

Web Search: Public Searching of the Web

Article

Jan 2004

Public Libraries and Networked Information Services in Low-Income Communities

Article

Sep 1999
LIBR INFORM SCI RES

This article presents findings from an empirical study of community information exchange and computer access and use among low-income, predominantly African-American residents in one locale. Data were collected through household interviews, focus groups, and surveys. Results indicate that, while computer use is minimal, many low-income community members are poised to participate in the local development of networked information services. The article emphasizes appropriate roles for public libraries in community-wide efforts to bridge the digital divide that cuts computer use along socioeconomic lines.

The future of community networks

Conference Paper

Feb 2000

Andrew Michael Cohill

Converging Gender Roles

Article

Nov 2005

Katherine Marshall

Public Library Responses to a Consumer Health Inquiry in a Public Health Crisis: The SARS Experience in Ontario

Article

Dec 2005
REF USER SERV Q

This article addresses the extent to which public libraries in Ontario were able to respond to inquiries for health information during a major public health crisis. The 2003 outbreak of Severe Acute Respiratory Syndrome (SARS) in Toronto, Ontario, represented a challenge to those charged with providing accurate and timely information to the public. At the onset of the outbreak, the disease was not well understood and information about SARS was sketchy. As the outbreak progressed, information was in flux as more became known about the nature of the disease, methods of transmission, and treatment protocols. Against this background, sixty-nine randomly selected libraries in Ontario were queried by phone and by e-reference service (if it was offered by the library) for information about SARS, its symptoms, and prevention methods. The responses of the libraries were analyzed for the quality of the reference service and types of referrals, particularly Internet sources given the growing popularity of e-health initiatives. The results raise serious questions about the appropriate role of public libraries in the delivery of consumer health information and the preparedness of public library staff to respond to health-related inquiries, particularly in times of crisis.

Community Information Needs: The Case of Wife Assault

Article

Jan 1992
LIBR INFORM SCI RES

The purpose of this study was to examine the relationship between community needs for information about wife assault and the information response offered through social service networks in six communities of varying sizes. In the first phase of the research, 543 randomly selected women were interviewed during household surveys to assess their knowledge of different kinds of information resources that would be helpful in a hypothetical situation involving wife assault. In the second phase, 179 interviews were conducted with agencies and professionals identified as likely sources of help during the household interviews. Analyses of both surveys provided a map of the degree to which the information delivery system's response overlaps with the public's need for information. Consistent with previous studies of information-seeking patterns, the results indicated that many residents expected types of help that these agencies did not, in fact, provide. Agencies and professionals themselves were not always aware of appropriate sources of help, and did not routinely assess the severity of the presenting situation or the kind of help wanted. The implications for the design of community information delivery systems, including information and referral (I & R) centers, are discussed.
