Online community information seeking: The queries of three communities in Southwestern Ontario

Frank Lambert*
School of Library and Information Science, Kent State University, P.O. Box 5190, 314 University Library, Kent, OH 44242, United States
article info
Article history:
Received 16 February 2009
Received in revised form 18 September 2009
Accepted 29 October 2009
Available online 27 November 2009
Keywords:
Community information
Community informatics
Information seeking
Web search
Web log analysis
abstract
This paper presents not only mycommunityinfo.ca (MCI) as an innovative World Wide Web
(WWW)-based community information (CI) site, but also how its unique approach to
facilitating online CI searching on the Web reveals through empirical data how people use such
information and communication technologies (ICTs) to address their everyday information
needs. The geographic focus for this study is on three communities in Southwestern
Ontario. MCI collects unobtrusively query data that are logged daily from its own Web site,
the Web sites of three municipal governments, and one municipal agency from this region.
One year’s worth of these data was supplied to determine the types of CI that are sought
through Web searching. A content analysis of a large purposive sample of all of MCI’s query
data reveals more specific and diverse conceptual CI needs between and within communities
than those reported in other studies employing different data collection methods. As a
result, using a centralized approach to online CI access via the WWW by other CI providers
such as the 211 network may be a disservice to its users. Additionally, the findings
demonstrate how a thorough analysis of such data may improve the informational content and
overall design of municipal government Web sites. The analysis of these data also has
the potential of improving current CI taxonomies.
© 2009 Elsevier Ltd. All rights reserved.
1. Introduction
There is no shortage of research that studies information seekers’ queries logged by various types of search engines. An
incomplete but demonstrative list of such research might include Silverstein, Henzinger, Marais, and Moricz (1999), Spink,
Wolfram, Jansen, and Saracevic (2001), Beitzel, Jensen, Chowdhury, Frieder, and Grossman (2007), Chau, Fang, and Sheng
(2005), Ross and Wolfram (2000), Wang, Berry, and Yang (2003), Pu, Chuang, and Yang (2002), Lau and Goh (2006) and Rieh
and Xie (2006). These and additional studies examine information seeking behaviours and information retrieval within
online environments to address a number of research problems. However, the published Web search, community information
(CI) seeking, and community informatics literatures lack analyses of large amounts of unobtrusively collected data such as
those found in Web logs to determine the conceptual types of CI that are sought through local online resources. This study
begins to address this deficiency by examining a large purposive sample of one year’s worth of query data that
resulted from hundreds of thousands of Web searches submitted by community information seekers through an online CI
provider located in Southwestern Ontario, Canada, called mycommunityinfo.ca (MCI) and through the Web sites of its
municipal government partners and clients.
doi:10.1016/j.ipm.2009.10.008
*Tel.: +1 330 672 0015; fax: +1 330 672 7965.
E-mail address: flamber1@kent.edu
Information Processing and Management 46 (2010) 343–361
1.1. Defining concepts related to Web queries, and community information and informatics
With queries being the primary units of analysis for this study, it is important to clarify what constitutes a query. It is in
essence the final product of integrated parts as constructed by the information seeker. Or, as Spink and Jansen (2004)
elaborate, “Terms are the basic building blocks through which a Web searcher expresses their information problem when
searching on a Web search engine. Single or multiple term and operators form a Web query” (p. 55). While operators do not form
any significant part of the queries in the data analyzed and presented in this paper, the single or multiple terms that
constitute the queries are essential.
It is also worth examining some definitions and concepts related to the study of community information provision and
the application of various information and communications technologies (ICTs) to facilitate access to this type of
information. This matters because MCI’s ICT model relies primarily on keyword searching,
an attribute seen very rarely in online CI providers. This has allowed hundreds of thousands of queries targeted toward local
online government and CI resources to be captured unobtrusively by MCI for analysis.
Community information (CI) and information and referral (IandR) have often been used interchangeably. Durrance
(1984), for instance, examined then-current scholarship in an attempt to distinguish between CI and IandR so that the
parameters of the two services could be clarified and applied by information centres and especially public libraries. As a
result, Durrance identified a working definition of community information that works as an umbrella. First, CI must be
recognized as a service. Then,
two types of information may be provided by such a service: (1) “survival information such as that related to health,
housing, income, legal protection, economic opportunity, political rights, etc.” and (2) “citizen action information, needed
for effective participation as individual or as member of a group in the social, political, legal, economic process” (Donohue,
1976, p. 126; cited in Durrance, 1984b, p. 108).
Childers (1984) offers a broad definition of IandR: it facilitates “the link between a person with a need and the resource or
resources outside the library which can meet the need” (p. 1). The resource or resources that Childers refers to “connotes any
service, activity, individual, organization, information, or advice that may fulfill a need” (p. 1). As Durrance (1984) alluded to
above, Childers also acknowledged that there is some disagreement as to what IandR is and what constitutes legitimate
IandR activities. However, when compared to the first facet of CI services as defined by Donohue, Childers’s definition is
not far off the mark. It is Donohue’s second facet of CI’s definition that implies some form of empowering an individual
or group of individuals to participate more fully in civic life that distinguishes CI from IandR.
Prior to the introduction of computers as a tool to access local information, a “community network was a sociological concept
that described the pattern of communications and relationships in a community” (Schuler, 1996, p. 25). With the application of
computer- and Internet-based technology, the definition of community network changed somewhat, but its intent remained the
same. The computer-based community networks (CNs) that emerged in the 1990s were designed with the intention of being a tool to revitalize,
strengthen, and expand existing people-based community networks. The intent of CNs then was to “advance social goals, such
as building community awareness, encouraging involvement in local decision-making, or developing economic opportunities in
disadvantaged communities” by “supporting smaller communities within the larger community and by facilitating the
exchange of information between individuals and these smaller communities” (Schuler, 1996, p. 25). They did this by:
being run by and for the local community; address the information and communication needs of everyday life; foster
equal access to new media; strengthen cohesion of the local community; be provided at no or little cost; and, refer to
a geographic space containing community members in close physical proximity, rather than issue-oriented ‘virtual
communities’ or communities of interest. (Kubicek & Wagner, 2002, p. 292).
To accomplish these goals, CNs were to offer what Schuler (1996) defines as electronic “one-stop shopping” by offering a
plethora of online resources to their users. While this was undeniably a worthwhile goal, it placed a tremendous strain on
CNs’ finances in the 1990s, and it would, as predicted by Cohill (2000), continue to be a significant challenge for some time.
However, much like the definition of CI offered above by Donohue, CNs offer a form of empowerment, just through a
different medium (electronic versus personal communication).
While most CNs offered access to the Internet through dial-up, the emergence and spread of high-speed, low-cost Internet
service providers (ISPs) led CNs to abandon this “one-stop shopping” and server maintenance role (Gurstein, 2000). CNs still
play a role to this day, although they still operate under various pressures, especially financial, despite the greater
affordability of Internet services. Research and development in the area of information and communications technologies (ICTs) in the
very late 1990s and going forward focused on creating hardware and software to find further cost efficiencies (Gurstein,
2000). However, there seemed to be little concern over the potential users of these faster, better (based on one’s perception),
and more affordable Web-based technologies, especially those for whom there were various educational, access, and
financial barriers. Community informatics concerns itself with the study of those individuals, groups, or communities who were
and who are continuing to be excluded from the many potential benefits of ICTs (Gurstein, 2000). The Journal of Community
Informatics thus defines community informatics as
the study and the practice of enabling communities with Information and Communications Technologies (ICTs).
[Community informatics] seeks to work with communities towards the effective use of ICTs to improve their processes, achieve
their objectives, overcome the “digital divides” that exist both within and between communities, and empower
communities and citizens in the range of areas of ICT application including for health, cultural production, civic management,
e-governance among others. (Accessed January 25, 2007, from http://www.ci-journal.net/index.php/ciej).
Fundamental to how community informatics studies how ICTs can help communities realize their social, economic,
political, or cultural goals on the path to community empowerment is access to ICTs “since without at least minimal access, little
can be accomplished” (Gurstein, 2000, p. 3). Thus the focus for community informatics is still, like the study of CI, on how
information can empower individuals and groups of individuals. However, the role of ICTs as a tool, especially as it relates to
online access and issues of the so-called “Digital Divide,” is an added facet of concern to researchers.
1.2. Conceptual framework for this study
Broder’s (2002) model for information retrieval (IR) augmented for the World Wide Web (WWW) provides a helpful framework for this study.
Unlike the classic IR model, the augmented model recognizes that human–computer interaction factors and cognitive
aspects play a role in Web searching. These additional attributes take into account that the information seeker’s information
need is associated with some task. As a result, “this need is verbalized (usually mentally, not out loud) and translated into a
query posed to a search engine” (Broder, 2002, p. 4). What makes Broder’s model useful in this paper’s context is that it links a
mentally verbalized information need, based on some task and submitted through a Web-based tool, to the
definitions above of CI and IandR as resources for solving everyday information needs. Since MCI is a Web search engine with a
community focus for information sources, the data that result from its activities (captured query data) become an important
source for attempting to model online community information needs. While Broder defines the needs behind the query that
is posed to the search engine as one of three types (navigational, informational, and transactional), this study’s goal is to
move beyond a cursory classification of the queries collected by MCI based on these criteria (cf. Broder, 2002; more recently
Jansen, Booth, & Spink, 2008). In order to give the units of analysis, the queries, more meaning in terms of trying to link the CI
seeker’s task and information need, further conceptual categorization was required to address the research questions
presented in Section 4. This further categorization is intended to make a contribution not only to the Web search literature,
but also to the community informatics literature by providing the first empirical study that demonstrates how and for what
purpose information and communications technologies (ICTs) are used as a tool to meet the various local information needs
by the communities it has been designed to serve. While the findings are limited to the Middlesex–London and Waterloo
regions in Southwestern Ontario, this study intends to demonstrate that using a centralized approach to online CI access
via the WWW may be a disservice to its users. Additionally, the findings demonstrate how proper analysis of such data
may improve the informational content and overall design of municipal government Web sites.
2. Rationale for the study
Despite the limitations of Web log query analysis as a methodology (e.g. Spink & Jansen, 2004; Thelwall, Vaughan, &
Björneborn, 2005) that are dealt with in further detail later in this paper, it “can reveal first-hand and real-world behaviour
and interests of users. It enables researchers to better understand Web site user behaviours and the service quality that the
Web site provides. It also can be used to optimize the effectiveness of information services” (Zhang, Wolfram, Wang, Hong, &
Gillis, 2008, p. 1934). The observation of first-hand and real-world behaviour and interests of users in Web log data is
possible due to the unobtrusive nature of data collection (Spink & Jansen, 2004). MCI is used as a case site to investigate issues
related to Web search behaviours and interests as well as the effectiveness of its information services and those of its partner
government Web sites (City of London and County of Middlesex municipal governments and London Police Service) and its
one client (Region of Waterloo municipal government). This study addresses Spink and Jansen’s (2004) assertion that
“further single Web site studies are needed to replicate and extend the previous studies” cited by the authors and other studies
that have been published since 2004 (p. 25). Additionally, this study addresses Durrance and Pettigrew’s (2002) concerns
about the lack of comprehensive data and its subsequent analysis regarding citizens’ use of networked CI services. One of
the benefits of this current study is that it analyzes a large purposive sample of the query data from all of the single Web
sites mentioned above through the use of specially designed search bars rather than focusing on only a single Web site.
Behind these Web sites a Google search appliance is used for crawling, indexing, and retrieval purposes. Additionally, MCI uses
customized data collection software created by the City of London’s Technology Services Division to archive these data from
these Web sites in a Microsoft Access database even though these Web sites are hosted on different servers. Thus, all
incoming query data from all Web sites are treated the same, facilitating data processing and analysis procedures.
2.1. Background: a brief history of mycommunityinfo.ca
MCI was chosen as a case site because its information retrieval (IR) model of delivering online CI has been shaped and
designed purposely to be a simple, innovative, cost-effective, and potentially sustainable approach to providing this
information during a time when CI providers are faced with the challenges of remaining financially viable (e.g. Cohill, 2000;
Gurstein, 2000; Hearn, Kimber, Lennie, & Simpson, 2005; Rideout & Reddick, 2005). MCI’s service has been available to
approximately 942,000 residents in 24 urban and rural municipalities located in the Middlesex–London region and the
Region of Waterloo since 2003–2004 (Cummings, 2004). These rural and urban municipalities are located in the geographic
region of Southwestern Ontario, Canada, between Detroit, Michigan, and Toronto, Ontario. Additionally, MCI offers
ever-expanding single point access to the online information resources of local non-profit community organizations
and municipal governments, and it also allows immediate access to the online information resources of the Ontario
provincial and the Canadian federal governments. MCI’s approach may thus prove to be an affordable and effective model for other
communities to consider as an alternative method of delivering information to community members (cf. Lambert, 2008).
MCI was first conceived in 1999 when representatives of various ministries, departments, and agencies from the
municipal, provincial, and federal governments met to address the question of how these groups could provide “a cost-effective
integration of information service offerings from three levels of government” (Cummings, 2004, p. 1) in the
Middlesex–London region of Southwestern Ontario. The end result of these meetings (Fig. 1) was the creation of an online
community information portal that avoids the traditional, often considerably more expensive and difficult to maintain,
networked directory model of organized, static links to government and non-profit organization information resources.
The drawback to the directory model, such as that exemplified by the North American-wide 211 service, is that the links have
to be monitored and maintained on an ongoing basis by a complement of paid staff. While this model may be useful to many
CI seekers, the excessive control inherent in systems that rely on a directory approach may be a potential disservice to their
users. The online CI directory model assumes primarily a search method where the information seeker does not make a
formal expression of information needs but rather navigates Web sites through the chain of links found in the Web sites’
pages. “However, when some specific information is searched, this point-and-click access paradigm is unpractical, and
the effectiveness of the results strongly depends on the starting page” (Herrera-Viedma & Pasi, 2006, p. 511). Instead of
devoting its resources to maintaining links and metadata, MCI relies for indexing and retrieval purposes on search engine
technology that targets specifically local public sector Web sites through its own index. Queries aimed at only provincial
and federal government Web sites can also be launched from this one Web site through the publicly available Google search
engine.
On July 31, 2003, the automated collection of queries submitted through MCI’s Google search appliance, and of queries
sent to the public version of Google via MCI, began, with the query data being logged in a Microsoft Access database
(Cummings, 2004). As of early 2007, 223 local municipal governments, their associated agencies, and local non-profit community
organizations in London and Middlesex County (198) and in the Region of Waterloo (25) have had their Web sites indexed
and made accessible through MCI’s search appliance.
Fig. 1. Mycommunityinfo.ca search interface.
3. Review of related work
3.1. Community information seeking and uses
Within the context of community information seeking research, Durrance and Pettigrew (2002) note that there is a lack of
comprehensive data regarding citizens’ use of networked CI services. In-depth examinations are also lacking regarding
citizens’ information behaviour in networked CI environments. This study has been designed to help begin to fill the voids
identified by Durrance and Pettigrew.
There is little literature that focuses exclusively on online CI Web searching or seeking. Any studies that do provide
insight on this type of information searching do so within contexts that have different foci. Closely related literature includes
Bishop, Tidline, Shoemaker, and Salela (1999), who examined community information use and computer access of a
low-income locale in the United States populated primarily by African-Americans. Their sample group (n = 34) reported through
household interviews seven subject areas of information, ranked in priority order based on their open responses, that they
would like to have access to online: community services and activities, resources for children, healthcare, education,
employment, crime and safety, and general reference tools such as dictionaries. Similarly, through the use of online surveys posted
on NorthStarNet in Northeastern Illinois, Three Rivers Free-Net (TRFN) in Pittsburgh, PA, and CascadeLink in Portland, OR,
Durrance and Pettigrew (2002) discerned 20 community information categories of interest from their respondents
(n = 197). The alphabetical list of categories of digital CI needs included:
Business
Computer and technical information
Education
Employment opportunities
Financial support
Government and civic
Health
Housing
Library operations and services
Local events
Local history and genealogy
Local information (local accommodations, community features)
Local news (weather, traffic, school closures)
Organizations and groups
Other people (both local and beyond the community)
Parenting
Recreation and hobbies
Sale, exchange, or donation of goods
Social services
Volunteerism
(Durrance & Pettigrew, 2002, p. 2)
While the data collection methods used by Bishop et al. (1999) and Durrance and Pettigrew (2002) were certainly
appropriate, neither study may claim that the findings are generalizable. This study is limited too in its generalizability as it focuses
on two small regions within Canada. Additionally, this study does not consider navigational information seeking as
information seekers go from link to link within Web sites such as those owned by the municipal governments introduced in this
paper until they reach their ultimate information destination. However, for this study large samples of the units of analysis
may be quantified, conceptualized, and then ranked. This is a distinct advantage that the studies above were not able to
exploit. The conceptual meaning of these queries may be compared then from within and between the communities within
these regions to give a more comprehensive picture of online everyday information seeking through Web search from a
relatively large portion of the province of Ontario’s population.
3.2. Web log analysis
This study fits well within the research that studies search engine log files where data are recorded about what the user is
actually seeking as represented by the actual keyword queries that the user submits (Thelwall et al., 2005). Research
published on the analysis of Web query logs has focused on querying behaviours through commercial search engines,
search engines that are used exclusively for searching through a single Web site, or library OPACs. Silverstein
et al. (1999) examined a data set recorded over 43 days from the AltaVista search engine comprising over 900 million total
requests. Their interest lay in determining which queries were most common, the average length of queries, how many
queries were submitted during an individual session and, especially, correlations between query terms and other field values.
Spink et al. (2001) conducted a similar study involving over one million query logs (531,416 of which were unique) sent
to the Excite search engine on one day by 211,063 users. They examined the number of queries submitted per identified user,
measured the change in unique queries submitted by each identified user, the number of results pages viewed, and whether
multi-term queries used advanced search features such as Boolean operators. Rieh and Xie (2006) also analyzed a data set of
313 search sessions from the Excite search engine to characterize facets of query reformulation and identify patterns of
multiple query reformulation in Web searching. Their goal was to explore “the ways in which search engines can support query
reformulation more effectively in Web searching” (p. 752) by using Saracevic’s stratified interaction model as an analytical
framework (Rieh & Xie, 2006).
Following the parsing of multi-term queries, Ross and Wolfram (2000) used the frequency of binary term co-occurrence
to determine facets of multi-term queries without having to look at every query. This was a bottom-up grounded theory
method with no a priori categorization. Pu et al. (2002) used this same approach to develop their subject
taxonomy for the automatic classification of Web query terms into broad subject categories. Beitzel et al.’s study (2007) involved
the analysis of the Web query logs of the America Online (AOL) Web search service. However, the authors used a longitudinal
analysis “to examine static and topical changes (in querying) over longer periods such as days, weeks, and months” (Beitzel
et al., 2007, p. 167). Additionally, the authors “analyzed the queries representing different topics using a topical
categorization of (their) query stream” in an effort to determine how querying behaviour for some categories would either change or
remain static over time (p. 167).
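As an illustration only, the kind of term co-occurrence counting attributed above to Ross and Wolfram (2000) might be sketched as follows. The sketch assumes that "binary" co-occurrence means recording whether two distinct terms appear together in the same query, and that queries can be split into terms on whitespace after lowercasing. The function name term_pair_counts and the sample queries are hypothetical and are not taken from the cited studies.

```python
from collections import Counter
from itertools import combinations


def term_pair_counts(queries):
    """Count, for each unordered pair of distinct terms, how many queries contain both.

    Assumption: terms are whitespace-separated and case is ignored.
    """
    pairs = Counter()
    for query in queries:
        # Use a set so a term repeated inside one query is counted once (binary presence).
        terms = sorted(set(query.lower().split()))
        # Sorting makes each pair appear in one canonical order, e.g. ("bus", "london").
        pairs.update(combinations(terms, 2))
    return pairs


# Hypothetical sample queries, invented for this sketch.
sample_queries = ["london bus schedule", "bus schedule", "london jobs", "jobs"]
counts = term_pair_counts(sample_queries)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Single-term queries contribute no pairs, so a count of this kind only describes the multi-term portion of a log. Pairs with high counts could then be inspected by hand as candidate facets, which is in keeping with the bottom-up approach described above.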
Chau, Fang, and Sheng’s study (2005) focused on keyword queries that were submitted through the Utah state
government Web site. The authors’ goal was not only to determine the characteristics of queries that were submitted through this
Web site search engine but also to compare those same queries to those submitted through general-purpose search engines.
They determined “that Web users behave similarly when using a Web site engine and a general-purpose search engines in
terms of the average number of terms per query and the average number of result pages viewed per sessions [sic]” (p. 1374).
However, the users of the Utah state government Web site search engine submitted, on average, fewer queries per session
than users of general-purpose search engines. Additionally, Utah government Web site users use different sets of terms and
topics in their queries compared to general-purpose search engine users.
Lau and Goh (2006) analyzed 641,991 queries from the Nanyang Technological University OPAC to determine
what caused failed search sessions. Their objective was to identify, primarily through failure analysis, areas where the OPAC
could improve users’ search experiences. As a result, the authors recommend
improvements to the OPAC through enhancements to interactive query reformulation, browsing, and context-sensitive
assistance.
4. Research questions
What should be noticed from this literature review is that Web log analysis used as a research method to analyze Web
search needs and behaviours has continued to evolve in innovative and informative ways. Web log analysis’s applicability as
a methodological tool for determining how CI seekers use online resources should continue to be considered seriously as an
approach for improving the quality of and access to local information, regardless of the design of the online CI provider’s
electronic infrastructure.
This study thus attempts to address the following research questions:
What types of conceptual CI are being sought in an online environment through MCI’s service approach and the Web sites
of its partners?
Are there differences in the types of CI being sought between MCI and its partner Web sites?
Will the collection and analysis of these data present findings different than those of other researchers who rely primarily
on other methodological approaches?
The findings of this study will provide a perspective of CI seeking that may make other online CI organizations and
municipal governments consider further the design and content of their respective Web sites. Additionally, this study and other
studies that may employ a similar methodological approach may lead to the further consideration of theoretical issues of
user warrant versus literary warrant concerning the creation of new or additional taxonomies for community and social
services information.
5. Methods
5.1. Attributes of mycommunityinfo.ca’s query logs
MCI voluntarily supplied its complete Web query log covering August 2005–July 2006 in a Microsoft Access database.
The full year’s worth of cleaned data were analyzed to provide a snapshot of the local online information inquiries of the
communities under examination. The following gives a description of the different attributes of the Web log data and then
details how the data were processed and analyzed.
The Microsoft Access database table that captures the Web query data comprises eleven attributes, with each
attribute recording different aspects of each query event. Table 1 gives brief descriptions of the relevant attributes in the
Access database table. These descriptions are necessary because a number of these fields play a major role in data
preparation and analysis. Additionally, it is likely that the Web logs of many other Web sites will not contain the same types of
attributes as MCI’s customized log. This is because the Access table is a customized, in-house solution for capturing these
data.
348 F. Lambert / Information Processing and Management 46 (2010) 343–361
Author's personal copy
5.2. The processing and analysis of mycommunityinfo.ca’s query logs
The MCI query logs, like those in Ross and Wolfram’s study (2000), included many examples of identical queries being
submitted by the same session identifier in succession. This was due to users submitting a request to view the next page
of hits for the originally submitted query within each session. For this study a ‘find duplicates’ query was used in Access
to retrieve all of the duplicate session numbers. Then, another ‘find duplicates’ query was used to group all of the duplicate
query text within each session with duplicate identifiers that resulted from additional page views. All but one of the identical
additional queries within each session were deleted manually to create a new Microsoft Access table of ‘‘clean” queries for
frequency analysis. Similar to Ross and Wolfram (2000) and Wang et al. (2003), uniform resource locators (URLs) and email
addresses were not included in the analysis. Once ‘‘cleaned,” the MCI and municipal government query data were processed
again using another separate ‘find duplicates’ query in Microsoft Access. The final process was refined further by focusing on
the respective Web site through which the queries were submitted by using the site field (Table 1).
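The cleaning steps described above can be sketched in code. This is a minimal illustration only, not the Access 'find duplicates' procedure actually used: the record type, field names, and the URL and email patterns are assumptions introduced here, and the sketch treats two queries as identical only when their text matches exactly within one session.

```python
import re
from collections import Counter
from typing import NamedTuple


class QueryEvent(NamedTuple):
    """One logged query event; field names are hypothetical stand-ins for the Table 1 attributes."""
    session_id: int
    query_text: str
    site: str


# Rough patterns for URL and email queries (assumed; the paper does not give its exact rules).
URL_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_queries(events):
    """Keep one copy of each identical query per session and drop URL/email queries."""
    seen = set()
    cleaned = []
    for event in events:
        text = event.query_text.strip()
        if not text or URL_RE.match(text) or EMAIL_RE.match(text):
            continue
        key = (event.session_id, text)
        if key in seen:
            # Repeat of the same query in the same session, e.g. a next-page request.
            continue
        seen.add(key)
        cleaned.append(event)
    return cleaned


def top_queries(cleaned, site, n=100):
    """Return the n most frequent cleaned queries submitted through one site."""
    counts = Counter(e.query_text.strip() for e in cleaned if e.site == site)
    return counts.most_common(n)
```

Filtering on the site value before counting mirrors the final step of the process, in which frequencies were examined per Web site.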
Following the parsing of multi-term queries, Ross and Wolfram (2000) used the frequency of binary term co-occurrence
to determine facets of multi-term queries without having to look at every query. This was a bottom-up grounded theory
method with no a priori categorization. Pu et al. (2002) used this same approach to develop their subject taxonomy
for the automatic classification of Web query terms into broad subject categories. In this study, an empirical approach is
used to gather and group the 100 most frequently occurring cleaned queries from each Web site based on their manifest
or visible surface content (Babbie, 2008). Several attempts then were made to conceive and construct a list of categories that
were well grounded in the Web log data based on the queries’ conceptually similar latent content to give the queries some
conceptual meaning (Babbie, 2008). This was done to provide a valid and reliable list of categories that could define what
sorts of community information people are searching for online. Once the list of categories was deemed sufficient to ensure
mutual exclusivity, a copy of this list along with a representative systematic sample of the top 100 most frequently occurring
queries from each Web site was given to a Ph.D. student to classify. This was done to determine first the content validity of
the categories by ensuring that the queries covering the range of meanings included in the conceptual categories were as
clearly understandable and mutually exclusive as possible (Babbie, 2008). The second purpose of this exercise was to test
the reliability of the categories for coding the query data by measuring the rate of intercoder agreement between my coding
and that of the Ph.D. student (Babbie, 2008). The end result was an intercoder rate of agreement of 89%. When the coding of
the queries using the sub-categories was taken into consideration, the rate of agreement dropped slightly to 85%. Following a
discussion, some minor changes were made to the wording of the categories. The final list of categories is shown in Figs. 2
and 3.
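A rate of intercoder agreement of the kind reported above can be computed as simple percent agreement. The sketch below assumes that this is the measure used (the text does not name a chance-corrected statistic such as Cohen's kappa), and the category labels in the usage note are illustrative only.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items to which two coders assigned the same category."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must classify the same set of queries.")
    if not coder_a:
        raise ValueError("At least one coded query is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)
```

For example, if two coders agree on three of four queries, `percent_agreement(["Work", "Health", "Family", "Work"], ["Work", "Health", "Work", "Work"])` returns 75.0.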
6. Findings
6.1. Basic statistical findings of the mycommunityinfo.ca Web log data
Over the course of the 1 year under study, 733,331 raw query events were captured in the Microsoft Access database de-
signed by Technology Services Division, City of London. Following the cleansing of the queries, the Web sites contributed the
breakdown of data for analysis shown in Table 2.
Table 1
Types of fields/attributes in MCI Web log database table with descriptions.
Attribute/field Description
Time and date Time and date that the query was received
Query text Actual text of query as typed and submitted by user
Site Identifies the site from which the search was launched using an alias created by TSD technicians and then assigned by logic
to referring URLs. If the referring URL comes from the City of London Web site, then ’City’ is entered in the site field
Type of search Has two values: ‘‘0” for MCI’s search appliance; or, ‘‘1” for Google Web API. ‘‘0” shows the query targeting the 223 Web sites
after being submitted through MCI or searching within one of the five municipal Web sites. ‘‘1” is recorded when the search
is launched through public Google using the search domain restrictions of ‘‘.gc.ca” or ‘‘.gov.on.ca”. This is done by the user
clicking on the ‘‘Search Province of Ontario and Government of Canada Resources” button at either the top or the bottom of
the returned results page
Estimated number of hits The returned results from the searcher’s query through MCI or the municipal Web sites
Referrer Records the referring URL accompanying the incoming query to MCI’s search appliance from the request header of the site
issuing the request
Start page A pagination of the number of pages of returned hits that the user has examined during his/her search session
Session ID An artificial session identification number is assigned every time the MCI search appliance is ‘‘touched” by an incoming query
request from the municipal Web sites or from MCI. This is initiated by the incoming IP address of the user’s computer or
router
Number of searches A recording of the number of queries submitted during the search session
Site search Records ‘‘community” for queries targeting Web sites in MCI’s local index (using MCI’s Web site); and, ‘‘site search” that
indicates the search was launched on a Web site from its own search bar, to search only its collection of Web pages (e.g. City
of London, County of Middlesex, London Police Services, and Waterloo County)
Considering the number of raw query events in relation to the total number of cleaned queries in column one of Table 2,
48% of all events that were in MCI’s query log were users requesting views of additional pages of ‘‘hits” returned by the
search engine. Table 2 also shows that municipal Web sites, with the exception of County of Middlesex and the more top-
ically specialized London Police Services Web site, were queried far more as potential sources of local information compared
to the more diverse MCI Web site that indexes 223 Web sites from London–Middlesex and Region of Waterloo.
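The column of Table 2 that reports the proportion of the top 100 queries to all cleaned queries can be reproduced from the two totals given for each Web site. The sketch below uses the figures from Table 2; the dictionary layout and the function name are illustrative choices made here.

```python
# (total cleaned queries, total of frequencies for top 100 queries), from Table 2.
TABLE_2 = {
    "City of London": (217_162, 36_838),
    "Region of Waterloo": (92_584, 14_016),
    "MCI-Middlesex–London": (42_873, 21_876),
    "London Police Service": (16_355, 3_087),
    "MCI-Region of Waterloo": (10_230, 2_172),
    "County of Middlesex": (7_867, 1_986),
}


def top100_share(cleaned_total, top100_total):
    """Top 100 query frequencies as a percentage of all cleaned queries."""
    return 100.0 * top100_total / cleaned_total


shares = {site: round(top100_share(*totals)) for site, totals in TABLE_2.items()}

# Grand totals are the column sums over the six Web sites.
grand_cleaned = sum(cleaned for cleaned, _ in TABLE_2.values())
grand_top100 = sum(top100 for _, top100 in TABLE_2.values())
grand_share = round(top100_share(grand_cleaned, grand_top100))
```

Rounded to whole percentages, the computed shares match the published column, including the much higher value for MCI-Middlesex–London.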
Fig. 2. Major conceptual categories and sub-categories.
Fig. 3. Minor conceptual categories derived from MCI Web query log data.
The disproportionate total of MCI-Middlesex–London’s top 100 most frequently occurring queries relative to that of the
other Web sites needs to be noted. This anomaly is due to MCI’s use of Life Events bundles¹ (Cummings, 2002). MCI uses a
number of actively hyperlinked keywords and very short keyword phrases that have been embedded in Life Events pages to
offer the CI seeker the option of launching a search of MCI’s index if the particular Life Event does not contain the information
that is needed. Once the hyperlinked keyword or keyword phrase is clicked on, a search is launched in the same way as typing
the actual query and clicking on the ‘‘go!” button. Sixty-nine out of seventy Life Event bundle keywords were exact matches to
sixty-nine of the queries that appeared in the top 100 of MCI-Middlesex–London’s Web log data. These Life Events may be an
attractive alternative for CI organization and searching rather than relying solely on either keyword searching or a directory of
fixed links.
6.1.1. Mean number of terms per query
The mean number of terms used per query for the top 100 most frequently occurring queries and for all cleaned queries
was calculated to compare some of the descriptive statistical findings of past Web log studies to this study in an effort to
determine if CI seekers in Southwestern Ontario tend to use more or fewer words to find the information being sought.
For the top 100 most sought queries, the proportion of multi-term queries ranged widely based on the Web site used.
MCI-Middlesex–London had the highest proportion of multi-term queries at 49%. Again, this was influenced by the use of
active hyperlinked keywords and short keyword phrases in MCI-Middlesex–London’s Life Events pages. While at least
50% of all top 100 queries are composed of only one word, the average number of words or terms per query in Table 2
is lower than that reported in other studies. Wang et al. (2003) report an average of two words per query submitted by
users. Beitzel et al. (2007) found over one week in December 2003 that users submitted on average 2.2 terms per query
session. When this was expanded to six months, the average number of query terms increased slightly to 2.7 terms. Wang,
Berry, and Yang’s (2003) and Beitzel et al.’s (2007) studies match more closely the mean number of terms per query for all
queries shown in Table 2. Additionally, the average ‘‘popular query” length in Beitzel et al.’s (2007) study was 1.7 terms
per session for the two time periods they examined. While Beitzel et al. (2007) do not define what constitutes a ‘‘popular
query,” an average of 1.7 terms per session is the closest match in the Web log literature to the findings for the top 100 que-
ries presented in Table 2.
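The two query-length measures discussed in this section can be sketched as follows. The sketch assumes that a term is any whitespace-delimited token and that each entry in the list counts once; the paper does not state whether the means in Table 2 were weighted by query frequency, so both points are assumptions.

```python
def query_length_stats(queries):
    """Return (mean terms per query, percentage of queries with more than one term)."""
    if not queries:
        raise ValueError("At least one query is required.")
    lengths = [len(q.split()) for q in queries]
    mean_terms = sum(lengths) / len(lengths)
    multi_term_pct = 100.0 * sum(n > 1 for n in lengths) / len(lengths)
    return mean_terms, multi_term_pct
```

For the illustrative list `["jobs", "garbage collection", "city map", "parks"]`, the function returns a mean of 1.5 terms per query, with 50.0% of queries containing more than one term.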
6.2. Online community information seeking in London–Middlesex and Region of Waterloo
The sections that follow highlight some of the more interesting conceptual categories that emerged from the Web query
log data collected from MCI’s Web site and the City of London, County of Middlesex, Region of Waterloo, and London Police
Service Web sites. The primary goal of this analysis is to present empirical evidence of actual Web-based CI inquiries that
focus on the geographic region of Southwestern Ontario. Within each of the select categories, an examination of the CI inqui-
ries that emerged from the individual Web sites is presented to demonstrate how CI inquiries may differ between munici-
palities or smaller regions within Southwestern Ontario. Not every single category from all Web sites will be discussed;
rather, the focus will be on any interesting trends or patterns that are typical of particular categories. Additionally, this anal-
ysis of CI seeking in Southwestern Ontario is considered within the context of the design of online CI portals and the design
of CI taxonomies.
The conceptual categories that emerged from the queries were divided into major and minor categories (Figs. 2 and 3).
Within the major categories, more specific sub-categories also emerged. Minor categories were either very Web site specific
(e.g. ‘‘Special” Middlesex County) or could not be categorized. Categories were considered ‘‘minor” if their proportion of all of
Table 2
Descriptive statistics of queries for all MCI and municipal government Web sites.
Web site | Total # of cleaned queries/Web site | Mean words/query for all cleaned queries | Total of frequencies for top 100 queries | Proportion of top 100 to all cleaned queries (%) | Mean words/query, top 100 | Percentage of top 100 queries with >1 word (%)
City of London | 217,162 | 2.1 | 36,838 | 17 | 1.28 | 25
Region of Waterloo | 92,584 | 2.25 | 14,016 | 15 | 1.4 | 34
MCI-Middlesex–London | 42,873 | 1.95 | 21,876 | 51 | 1.53 | 49
London Police Service | 16,355 | 1.97 | 3087 | 19 | 1.31 | 28
MCI-Region of Waterloo | 10,230 | 2.15 | 2172 | 21 | 1.4 | 32
County of Middlesex | 7867 | 1.85 | 1986 | 25 | 1.2 | 19
Grand total | 387,071 | 2.11 | 79,975 | 21 | 1.37 | 33
¹ See http://www.mycommunityinfo.ca/life/default.asp.
the most frequently occurring queries was equal to or less than the percentages of the lowest percentage subcategory (cf.
Tables 3 and 4).
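The threshold rule for separating major from minor categories can be illustrated with a few counts taken from Tables 3 and 4. Because every category is measured against the same total (79,975), comparing raw counts is equivalent to comparing proportions. The function name and the choice of sample categories are assumptions made for this sketch.

```python
def split_major_minor(category_counts, lowest_subcategory_count):
    """Treat a category as minor when its count does not exceed the smallest sub-category count."""
    major = {c: n for c, n in category_counts.items() if n > lowest_subcategory_count}
    minor = {c: n for c, n in category_counts.items() if n <= lowest_subcategory_count}
    return major, minor


# Sample counts from Tables 3 and 4; 504 is the smallest sub-category count in Table 3
# (Volunteerism and Air Transportation).
sample_counts = {"Business": 901, "Media": 610, "Environment": 480, "Construction": 236}
major, minor = split_major_minor(sample_counts, lowest_subcategory_count=504)
```

With these values, ‘‘Business” and ‘‘Media” fall on the major side and ‘‘Environment” and ‘‘Construction” on the minor side, as in the tables. Queries that could not be categorized are listed with the minor categories for a different reason, so this rule does not apply to them.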
As may be seen in Tables 5–10, each individual Web site’s conceptual categories may be ranked, showing the conceptual
‘‘popularity” of the submitted queries rather than relying simply on an alphabetical list to report findings (e.g. Durrance &
Pettigrew, 2002). This gives some sense as to what really matters to information seekers as they search for information on-
line to help them address their everyday inquiries.
Table 3
Proportion of top 100 queries, all Web sites, by major conceptual category.
Major categories Number of queries Percentage of categories of total frequencies
1. Recreation, Entertainment, and Leisure 15,186 19.0
General 7288 9.1
Food and Drink 2169 2.7
Holidays 1833 2.3
Parks 1371 1.7
Shopping 1406 1.8
Sports and Physical Fitness 612 0.8
Worship 507 0.6
2. Work 10,954 13.7
Employment and Training 9781 12.2
Human Resources 669 0.8
Volunteerism 504 0.6
3. Family 6934 8.7
Marriage 767 1.0
Children and/or Parenting 6167 7.7
4. Municipal Government Business 5356 6.7
5. Transportation 4715 5.9
Ground Transportation 2334 2.9
Roads 1877 2.3
Air Transportation 504 0.6
6. Housing and Shelter 4540 5.7
7. Solid Waste Collection and Recycling 4160 5.2
8. Crime and Public Safety 3386 4.2
Crime 798 1.0
Public Safety 2588 3.2
9. Animals 2836 3.5
10. Taxation 2758 3.4
11. Geographic Information 2413 3.0
12. Population, Demographics and Statistics 2410 3.0
13. Health 2409 3.0
14. Libraries 1936 2.4
15. Place Names 1794 2.2
16. Water 1709 2.1
17. Ageing, Dying, and Death 1398 1.7
18. Education 1271 1.6
19. Business 901 1.1
20. Media 610 0.8
Total frequencies, major categories 77,676 97.1
Total frequencies, all queries 79,975 100
Table 4
Proportion of top 100 queries, all Web sites, by minor conceptual category.
Minor categories Number of queries Percentage of categories of total frequencies
Environment 480 0.6
Construction 236 0.3
Historical 172 0.2
‘‘Special” Middlesex County 166 0.2
Groups 141 0.2
Culture 92 0.1
Persons by Name 78 0.1
Job titles 31 0.0
Unable to categorize 903 1.1
Total frequencies, minor categories 2299 2.9
Total frequencies, all queries 79,975 100
6.2.1. ‘‘Recreation, Entertainment, and Leisure” as the most frequently sought local information
Nearly one in every five queries submitted through the six combined CI-based Web sites pertained to this conceptual
category (Table 3). Users of the City of London Web site in particular were the largest absolute contributors of queries pertaining
to ‘‘Recreation, Entertainment, and Leisure.” This is hardly surprising since most of the overall queries came from the City of
London Web site to begin with. However, as Table 5 shows, this category also had the largest impact on the types of CI sought
through the City of London Web site.
Even though the proportion of queries devoted to the ‘‘Recreation, Entertainment, and Leisure” category is quite high,
there are often considerable differences within its sub-categories. Queries related to ‘‘Holidays” focused on finding informa-
tion about statutory holidays (e.g. ‘‘Victoria day,” ‘‘Canada day”) and events usually associated with them. Note that MCI-Re-
gion of Waterloo Web site is used more frequently than the municipal Region of Waterloo Web site to find this particular
type of information. However, MCI-Region of Waterloo will search all of the lower-tier municipal Web sites (e.g. the cities
of Kitchener, Waterloo, and Cambridge, the townships of North Dumfries, Wellesley, Wilmot, and Woolwich, and the Region
of Waterloo) all at once. This will thus give the MCI-Region of Waterloo user a greater chance of finding information pertain-
ing to ‘‘Holidays” than if they only searched the Region of Waterloo Web site. City of London CI searchers were the largest
absolute contributors to queries concerning ‘‘Holidays” but as a proportion of the top 100 cleaned queries submitted through
the City of London Web site, the percentage is quite small (3.7%). Surprisingly, London Police Service had a very large pro-
portion of queries (nearly 30%) that focused on ‘‘Recreation, Entertainment, and Leisure.” However, all of these ‘‘Recreation,
Entertainment, and Leisure” queries fall into the more specific subcategory of ‘‘Shopping.” These information seekers wanted
to know where and when police auctions were being held and used ‘‘auction” and ‘‘auctions” as the first and second most
popular queries, respectively.
It was unanticipated that ‘‘Recreation, Entertainment, and Leisure” information would be the most sought after type of
local information. Despite what may be considered more ‘‘serious” categories (e.g. ‘‘Work”, ‘‘Housing and Shelter”, ‘‘Health”),
residents from the two geographic regions served by MCI and its partner Web sites seem to place some priority on seeking
Table 5
Proportion of top 100 queries, City of London, by major conceptual category.
Major categories Number of queries Categories as % of queries
1. Recreation, Entertainment, and Leisure 10,679 29.9
General 4825 13.5
Food and Drink 2064 5.8
Holidays 1314 3.7
Parks 1139 3.2
Shopping 514 1.4
Sports and Physical Fitness 368 1.0
Worship 455 1.3
2. Work 6789 19.0
Employment and Training 6069 17.0
Human Resources 377 1.1
Volunteerism 343 1.0
3. Municipal Government Business 3563 10.0
4. Transportation 2024 5.7
Ground Transportation 975 2.7
Roads 695 1.9
Air Transportation 354 1.0
5. Solid Waste Collection and Recycling 1931 5.4
6. Housing and Shelter 1918 5.4
7. Population, Demographics and Statistics 1910 5.3
8. Taxation 1546 4.3
9. Libraries 1343 3.8
10. Family 1129 3.2
Marriage 723 2.0
Children and/or Parenting 406 1.1
11. Geographic Information 1083 3.0
12. Crime and Public Safety 971 2.7
Crime 0 0.0
Public Safety 971 2.7
13. Media 484 1.4
14. Education 204 0.6
15. Water 179 0.5
16. Animals 0 0.0
17. Health 0 0.0
18. Place Names 0 0.0
19. Ageing, Dying, and Death 0 0.0
20. Business 0 0.0
Total queries 35,753 100.0
information related to leisure activities. Considering the time pressures that working people often feel, this makes sense
since ‘‘the average daily time spent on paid work, housework and other unpaid household duties (including child care)
for those aged 25–54 (in Canada) has increased steadily over the past two decades, rising from 8.2 h in 1986 to 8.8 h in
2005” (Marshall, 2006, pp. 5 and 7). This overall increase in work of all kinds is due solely to the amount of paid work hours
Canadian workers perform (Marshall, 2006). Thus trying to find leisure activities within close proximity to where the MCI
and municipal Web site CI seekers live takes some precedence.
6.2.2. Job seeking
Notwithstanding the emphasis on searching for recreational activities in Southwestern Ontario, finding information about
employment is still a much sought after information-seeking category (Table 3). Even with reasonably low and declining
unemployment rates in Middlesex–London and Region of Waterloo during 2005 and 2006 (6.8% and 6.2% for the former,
respectively, and 5.7% and 5.2% for the latter, respectively), there was still a large focus on finding employment information
(City of Kitchener, 2007; LEDC, xxxx). There is quite a contrast in the percentage of queries devoted to finding employment
between the very rural County of Middlesex (which surrounds City of London) and City of London. While finding information
about ‘‘Work” by querying the City of London Web site was the second most popular category, it was the primary type of
inquiry for County of Middlesex (Table 7).
6.2.3. Health ‘‘Matters”
More than half (>50%) of the queries related to ‘‘Health” come from the Region of Waterloo municipal
government Web site, where ‘‘Health” ranks as the fifth most sought category of local information (Table 9). The majority of
Region of Waterloo’s health-related queries were concerned with virulent diseases (influenza and its iterations, ‘west Nile,’ and
‘pandemic’) and two narrow public health issues (‘smoking’ and ‘food safety,’). A closer examination of the Region of Water-
loo Web site’s main page reveals a link called ‘‘Pandemic Influenza Planning” (http://www.region.waterloo.on.ca/web/re-
Table 6
Proportion of top 100 queries, London Police Service, by major conceptual category.
Major categories Number of queries Categories as % of queries
1. Crime and Public Safety 1347 45.8
Crime 639 21.7
Public Safety 708 24.1
2. Recreation, Entertainment, and Leisure 873 29.7
General 0 0.0
Food and Drink 0 0.0
Holidays 0 0.0
Parks 0 0.0
Shopping 873 29.7
Sports and Physical Fitness 0 0.0
Worship 0 0.0
3. Work 254 8.6
Employment and Training 149 5.1
Human Resources 52 1.8
Volunteerism 53 1.8
4. Transportation 118 4.0
Ground Transportation 94 3.2
Roads 24 0.8
Air Transportation 0 0.0
5. Municipal Government Business 107 3.6
6. Population, Demographics and Statistics 81 2.8
7. Geographic Information 63 2.1
8. Media 54 1.8
9. Family 26 0.9
Marriage 0 0.0
Children and/or Parenting 26 0.9
10. Place Names 16 0.5
11. Housing and Shelter 0 0.0
12. Solid Waste Collection and Recycling 0 0.0
13. Animals 0 0.0
14. Taxation 0 0.0
15. Health 0 0.0
16. Libraries 0 0.0
17. Water 0 0.0
18. Ageing, Dying, and Death 0 0.0
19. Education 0 0.0
20. Business 0 0.0
Total queries 2939 100.0
gion.nsf/fmFrontPage?OpenForm. Accessed June 30, 2007). Clicking on this link opens a Web site in a new browser window
called Waterloo Region Pandemic Influenza Planning (http://www.waterlooregionpandemic.ca/en/index.shtml. Accessed June
30, 2007). However, prior to August 18th, 2006, this link did not exist on the Region of Waterloo Web site. Confirmation of
this fact is possible by using the Internet Archive Wayback Machine (http://www.archive.org/web/web.php. Accessed July 1,
2007). Examining past versions of the Region of Waterloo Web site using the Wayback Machine for the time span during
which the Web log data were collected did not reveal any links to the Waterloo Region Pandemic Influenza Planning Web site
let alone links to Web pages within the Region of Waterloo Web site that addressed pandemic planning or virulent disease.
Due to the relatively large number of health-related queries in MCI’s Web logs, the Region of Waterloo Web site may be per-
ceived to be a very important source of health information for residents of Region of Waterloo. However, this perception
could be attributed to poor design of the main Web page.
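The share of ‘‘Health” queries contributed by each Web site can be checked against the tables. The sketch below uses the ‘‘Health” counts reported for the individual sites in Tables 5 through 9 and the all-sites total from Table 3; the variable and function names are illustrative assumptions.

```python
# "Health" counts among each site's top 100 queries (Tables 5-9).
HEALTH_BY_SITE = {
    "City of London": 0,
    "London Police Service": 0,
    "County of Middlesex": 36,
    "MCI-Middlesex–London": 1100,
    "Region of Waterloo": 1265,
}
HEALTH_ALL_SITES = 2409  # "Health" total for all Web sites (Table 3).


def site_share(site_count, all_sites_count):
    """One site's queries in a category as a percentage of that category across all sites."""
    return 100.0 * site_count / all_sites_count


waterloo_share = round(site_share(HEALTH_BY_SITE["Region of Waterloo"], HEALTH_ALL_SITES), 1)
```

The Region of Waterloo share comes to roughly 52.5%, which agrees with the statement that more than half of the ‘‘Health” queries originate from that site.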
6.2.4. ‘‘Municipal Government Business”
For the County of Middlesex, City of London, and Region of Waterloo, the proportions of queries that pertain to finding
information concerning ‘‘Municipal Government Business” are very similar. This similarity in the proportion of queries be-
tween the different Web sites for ‘‘Municipal Government Business” indicates that, regardless of whether an online informa-
tion seeker lives in either a primarily rural or a primarily urban setting, he or she relies on the World Wide Web to a fairly
significant extent to find information about his or her government’s operations and policies. Some examples of these information
inquiries are concerned with ‘‘bylaws”, ‘‘zoning”, ‘‘development charges”, ‘‘social services”, and other topics that are
too numerous to list.
MCI-Middlesex–London and MCI-Region of Waterloo were used rarely if at all to find information concerning ‘‘Municipal
Government Business.” When CI seekers are looking for information about their local government in an online environment,
they seem to prefer to use the most direct and relevant source: municipal government Web sites. Also, MCI-Middlesex–Lon-
don and MCI-Region of Waterloo index more than one municipal government’s Web site. A local information seeker looking
Table 7
Proportion of top 100 queries, County of Middlesex, by major conceptual category.
Major categories Number of queries Categories as % of queries
1. Work 562 33.6
Employment and Training 550 32.9
Human Resources 12 0.7
Volunteerism 0 0.0
2. Place Names 182 10.9
3. Municipal Government Business 180 10.8
4. Education 101 6.0
5. Recreation, Entertainment, and Leisure 90 5.4
General 31 1.9
Food and Drink 0 0.0
Holidays 12 0.7
Parks 17 1.0
Shopping 0 0.0
Sports and Physical Fitness 30 1.8
Worship 0 0.0
6. Taxation 81 4.8
7. Family 73 4.4
Marriage 44 2.6
Children and/or Parenting 29 1.7
8. Housing and Shelter 63 3.8
9. Crime and Public Safety 63 3.8
Crime 0 0.0
Public Safety 63 3.8
10. Libraries 55 3.3
11. Geographic Information 53 3.2
12. Solid Waste Collection and Recycling 52 3.1
13. Health 36 2.2
14. Population, Demographics and Statistics 32 1.9
15. Ageing, Dying, and Death 29 1.7
16. Water 19 1.1
17. Transportation 0 0.0
Ground Transportation 0 0.0
Roads 0 0.0
Air Transportation 0 0.0
18. Animals 0 0.0
19. Business 0 0.0
20. Media 0 0.0
Total queries 1671 100.0
for information from his/her municipality would not necessarily be interested in the information that pertains to other
municipalities.
6.2.5. ‘‘Geographic Information” seeking in Southwestern Ontario
While ‘‘Geographic Information” only makes up 3% of the queries submitted through the City of London Web
site, only three queries were used to make up the total frequency for this category: ‘‘map,” ‘‘maps,” and ‘‘city map.” This
focused querying, like Region of Waterloo Web site’s experience with health information pertaining to virulent disease, says
something about Web site design and how users are interacting with it through its links to information sources such as
maps.
7. Discussion
The findings presented above show that the types of CI categories being sought by CI seekers in London–Middlesex and
Region of Waterloo are, in many instances, considerably different between these communities. These categories are different
also from those discovered and presented by past research. This not only reveals interesting trends in CI seeking, but it also
offers insight into the design of CI Web sites and municipal government Web sites. It raises questions about the effectiveness
of adopting a standardized approach to providing online CI amongst different communities. This standardized approach that
is used by a variety of CI and information and referral services such as the 211 service² may actually be doing the residents of
those communities a disservice.
Table 8
Proportion of top 100 queries, MCI-Middlesex–London, by major conceptual category.
Major categories Number of queries Categories as % of queries
1. Family 5458 25.6
Marriage 0 0.0
Children and/or Parenting 5458 25.6
2. Animals 2836 13.3
3. Recreation, Entertainment, and Leisure 2280 10.7
General 1924 9.0
Food and Drink 82 0.4
Holidays 60 0.3
Parks 88 0.4
Shopping 0 0.0
Sports and Physical Fitness 84 0.4
Worship 42 0.2
4. Housing and Shelter 2263 10.6
5. Work 1781 8.4
Employment and Training 1686 7.9
Human Resources 0 0.0
Volunteerism 95 0.4
6. Ageing, Dying, and Death 1358 6.4
7. Taxation 1131 5.3
8. Health 1100 5.2
9. Business 800 3.8
10. Crime and Public Safety 731 3.4
Crime 159 0.7
Public Safety 572 2.7
11. Libraries 514 2.4
12. Education 486 2.3
13. Place Names 296 1.4
14. Geographic Information 158 0.7
15. Population, Demographics and Statistics 61 0.3
16. Media 48 0.2
17. Municipal Government Business 0 0.0
18. Transportation 0 0.0
Ground Transportation 0 0.0
Roads 0 0.0
Air Transportation 0 0.0
19. Solid Waste Collection and Recycling 0 0.0
20. Water 0 0.0
Total queries 21,301 100.0
² See 211 Toronto (http://www.211toronto.ca/index.jsp), 211 Niagara (http://www.211toronto.ca/ont/index.jsp?partner_code=211nia), and 211 Simcoe
(http://www.211toronto.ca/ont/index.jsp?partner_code=211sim) as examples of the blanket application of online CI design and taxonomy organization.
7.1. Emergence of help-seeking mismatches
Based on the queries that were collected and analyzed for this study, there is an impression that a relatively large number
of Web searchers using the municipal government Web sites presented in this paper seem to perceive that online govern-
ment-based information resources are a relatively important source for ‘‘Recreation, Entertainment, and Leisure” informa-
tion. For instance, ‘‘movies”, ‘‘restaurants”, ‘‘bars”, ‘‘shopping”, and ‘‘malls” are all examples of very frequently occurring
navigational and informational queries that are part of this category submitted specifically through municipal government
Web sites. However, after querying personally the City of London Web site using each of those same keywords, no hits re-
lated to these queries were returned. The City of London Web site does not contain any Web documents that pertain to many
of these activities. Those who are querying City of London’s Web site in particular might be experiencing what Dewdney and
Harris (1992) define as a help-seeking mismatch where ‘‘the types of help that might be expected from an agency are not
those which it provides” (p. 23). Thus, users looking for this information in Middlesex–London are not using the best source
to retrieve this type of information (MCI-Middlesex–London does index Web sites related to tourism). If this is indeed a com-
mon practice with other similar government Web sites, municipal governments should evaluate how their Web sites are
used more closely to help minimize this unintentional information barrier.
7.2. Is health information seeking still an important Web search activity?
The low ranking for ‘‘Health” information seeking was somewhat unexpected as searching for health or medical related
information is generally a relatively popular Internet activity in Canada. While a direct comparison cannot be made using
this study’s data, Statistics Canada’s 2005 survey reports that searching for health or medical related information was the
sixth most popular Internet activity of individual Canadians (Statistics Canada, Nov. 1, 2006). Additionally, there is a signif-
icant body of information science literature that emphasizes consumer health information seeking on the WWW. For
Table 9
Proportion of top 100 queries, Region of Waterloo, by major conceptual category.
Major categories Number of queries Categories as % of queries
1. Transportation 2471 17.8
Ground Transportation 1226 8.8
Roads 1116 8.1
Air Transportation 129 0.9
2. Solid Waste Collection and Recycling 1962 14.2
3. Water 1413 10.2
4. Municipal Government Business 1409 10.2
5. Health 1265 9.1
6. Work 1237 8.9
Employment and Training 1017 7.3
Human Resources 220 1.6
Volunteerism 0 0.0
7. Geographic Information 999 7.2
8. Place Names 972 7.0
9. Recreation, Entertainment, and Leisure 606 4.4
General 362 2.6
Food and Drink 0 0.0
Holidays 81 0.6
Parks 85 0.6
Shopping 0 0.0
Sports and Physical Fitness 78 0.6
Worship 0 0.0
10. Education 421 3.0
11. Population, Demographics and Statistics 279 2.0
12. Crime and Public Safety 255 1.8
Crime 0 0.0
Public Safety 255 1.8
13. Housing and Shelter 252 1.8
14. Family 236 1.7
Marriage 0 0.0
Children and/or Parenting 236 1.7
15. Business 80 0.6
16. Media 0 0.0
17. Libraries 0 0.0
18. Ageing, Dying, and Death 0 0.0
19. Animals 0 0.0
20. Taxation 0 0.0
Total queries 13,857 100.0
instance, Harris, Wathen, and Chan (2005) cite two studies that suggest anywhere from 20% to 80% of the population of the
United States with Internet access seek health information. Gillaspy (2005) also found contradictory studies concerning the
proportion of the population that uses the Internet to find health information. Gillaspy cites the same Pew Internet Project
report as Harris, Wathen, and Chan, in which 80% of the population reported that they use the Internet to seek health information.
However, a study from the Center for Studying Health Care Change reported that only 16% of the United States’ population
used the Internet to find health information. Gillaspy considers these data within the context of various factors that may
affect the provision of consumer health information in public libraries. While Gillaspy wonders which study is the most
accurate, she concludes that “in fact, in terms of providing consumer health information in public libraries, it may not matter.
Adults learn at the point of need, when learning is relevant to their life situation, and certainly health issues become
pertinent to almost everyone at some point in their lives” (p. 482). Spink and Jansen’s (2004) studies are consistent with what was
found in this study: only a small percentage of Web queries are medical or health-related. Considering the plethora
of health information Web sites available on the WWW, it is quite possible that users do not consider MCI and the municipal
government Web sites to be good or even necessary resources for health information, despite the number of local
health-related Web sites that MCI indexes.
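As an illustration only, the “Health” shares reported in Tables 9 and 10 can be recomputed from the published counts. The sketch below is not the author's procedure; the function name `category_share` and the choice of categories are assumptions made for the example, while the counts and totals are copied from the two tables.

```python
# Counts copied from Table 9 (Region of Waterloo) and Table 10 (MCI-Region of Waterloo).
# Only a few major categories are listed to keep the sketch short.
TABLE_9_TOTAL = 13857
TABLE_10_TOTAL = 2155

table_9 = {"Transportation": 2471, "Health": 1265, "Work": 1237}
table_10 = {"Recreation, Entertainment, and Leisure": 658, "Work": 331, "Health": 8}


def category_share(count, total):
    """Return a category's share of all top-100 query submissions, in percent."""
    return round(100.0 * count / total, 1)


if __name__ == "__main__":
    for name, count in table_9.items():
        print("Table 9 ", name, category_share(count, TABLE_9_TOTAL))
    for name, count in table_10.items():
        print("Table 10", name, category_share(count, TABLE_10_TOTAL))
```

Run this way, the Region of Waterloo site shows “Health” at roughly 9% of its top 100 query submissions, whereas the MCI-Region of Waterloo site shows it below 1%, which is the contrast discussed in this section.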
7.3. Effectiveness and importance of well-designed home pages
The City of London has an impressive number of online maps available through its Web site, including an outstanding,
easy-to-use interactive map of the city that can be used to find addresses, view aerial photographs, and locate schools, parks and
recreation centres, points of interest, and even assessment parcels and electoral wards, among other features (City of London,
n.d.). As a result, it is not difficult to assume that this collection of geographic resources could be a fairly popular tool for
finding “Geographic Information” about the city. However, the City of London Web site also has a rather noticeable link to
this interactive map on its main page. The question that is raised, then, is why are keyword queries being submitted through
Table 10
Proportion of top 100 queries, MCI-Region of Waterloo, by major conceptual category.

Major categories                              Number of queries  Categories as % of queries
1. Recreation, Entertainment, and Leisure                   658                        30.5
    General                                                 146                         6.8
    Food and Drink                                           23                         1.1
    Holidays                                                366                        17.0
    Parks                                                    42                         1.9
    Shopping                                                 19                         0.9
    Sports and Physical Fitness                              52                         2.4
    Worship                                                  10                         0.5
2. Work                                                     331                        15.4
    Employment and Training                                 310                        14.4
    Human Resources                                           8                         0.4
    Volunteerism                                             13                         0.6
3. Place Names                                              328                        15.2
4. Solid Waste Collection and Recycling                     215                        10.0
5. Transportation                                           102                         4.7
    Ground Transportation                                    39                         1.8
    Roads                                                    42                         1.9
    Air Transportation                                       21                         1.0
6. Water                                                     98                         4.5
7. Municipal Government Business                             97                         4.5
8. Education                                                 59                         2.7
9. Geographic Information                                    57                         2.6
10. Population, Demographics and Statistics                  47                         2.2
11. Housing and Shelter                                      44                         2.0
12. Libraries                                                24                         1.1
13. Media                                                    24                         1.1
14. Business                                                 21                         1.0
15. Crime and Public Safety                                  19                         0.9
    Crime                                                     0                         0.0
    Public Safety                                            19                         0.9
16. Family                                                   12                         0.6
    Marriage                                                  0                         0.0
    Children and/or Parenting                                12                         0.6
17. Ageing, Dying, and Death                                 11                         0.5
18. Health                                                    8                         0.4
19. Animals                                                   0                         0.0
20. Taxation                                                  0                         0.0
Total queries                                              2155                       100.0
the Web site to find maps when a relatively prominent link is provided on the main page? This, like Region of Waterloo’s
rather prominent link to its pandemic planning Web site, may be an example of the same issue articulated by Herrera-Viedma
and Pasi (2006) cited earlier in this study: that the likelihood of a successful search for information based on a point-and-click
access paradigm is dependent largely on the design and related information provided on the starting Web page. Since
such a relatively large number of queries are related to “Geographic Information,” and more specifically to maps, it is possible
that an unsuccessful point-and-click access paradigm is being demonstrated. Without Web site navigation data related
to the number of times users have actually clicked on the maps link, this theory cannot be tested. However, further research
into this aspect of human–computer interaction may reveal aspects of online searching behaviour that should be taken into
account in the design of Web sites.
The respective municipal governments mentioned above might find it interesting to know how their citizens rely on their
Web sites to find information concerning the municipalities’ operations, policies, and responsibilities. Municipal govern-
ments have followed the lead of the federal and provincial governments in offering more services and information through
the WWW. This has occurred despite the fact that municipal governments already tended to interact with a country’s citizens
much more closely than a national government does. This type of online interactivity between a municipality’s residents and
local government should also reinforce residents’ perception that municipal governments still tend to be better at delivering
services than the provincial or federal governments (cf. Canadian Centre for Management Development, 1999; Erin
Research, Inc., 1998).
7.4. Implications of this study’s findings and the findings of other similar studies on information organization
There are some, but few, similarities between Durrance and Pettigrew’s categories and the major categories that emerged
from the MCI Web log data. Most importantly, like Durrance and Pettigrew’s findings (2002), the categories in Tables 3 and 4
“are markedly different from those traditionally used to classify CI needs” (p. 2). This should hopefully start a dialogue about
possible changes to Sales’s Taxonomy of Human Services (1994) and its more current iteration, the AIRS/INFOLINE Taxonomy of
Human Services, since so many of the more than 8,500 controlled vocabulary terms contained in the taxonomy are not remotely
similar to MCI’s conceptual CI inquiries (Sales, 2003).
The top 100 most frequently occurring queries from each of six Web sites formed the basis for the categorization and subsequent
discussion of online CI inquiries. This means that only six hundred queries represented 21% of the tens of thousands of distinct
queries submitted through MCI and its partner municipal Web sites over the course of 1 year. This demonstrates a
remarkable conceptual consistency in local information seeking that may be reduced to the 20 main conceptual categories
shown in Table 3. Why then are bulky and often complex taxonomies needed to organize online CI directories when
their controlled vocabulary may not reflect actual CI seekers’ queries? Despite the artificiality inherent in usability studies
(e.g., Toms & Kinnucan, 1996), perhaps additional usability studies are needed to determine more ideal approaches to searching
CI Web sites.
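The idea of a small set of frequent queries accounting for a sizeable share of a log can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `top_n_coverage`, the normalisation step (trimming and lowercasing), and the sample queries are all hypothetical, and coverage is measured here as a share of all logged submissions.

```python
from collections import Counter


def top_n_coverage(queries, n=100):
    """Return the n most frequent queries and the share of all submissions they cover.

    Normalisation (strip whitespace, lowercase) is an assumption of this sketch.
    """
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    total = sum(counts.values())
    top = counts.most_common(n)
    covered = sum(count for _, count in top)
    return top, (covered / total if total else 0.0)


if __name__ == "__main__":
    # Hypothetical log excerpt; real logs would hold tens of thousands of entries.
    sample = ["maps", "Maps ", "jobs", "garbage", "maps", "jobs"]
    top, coverage = top_n_coverage(sample, n=2)
    print(top, round(coverage, 2))
```

Applied separately to each Web site's log with n=100, a calculation of this kind yields the six sets of top 100 queries on which the categorization rests.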
8. Limitations of the study
Thelwall et al. (2005) summarize many of the challenges of using Web log analysis as a research method. Many of these
challenges occurred in this study. One of the most significant limitations in the case of MCI’s query log data is the difficulty of
accurately identifying individual users from the search session identifiers. For session identification, Web site designers have
implemented the use of cookies because the proliferation of dynamic IP addresses has meant that it is impossible to identify
individual users of a Web site (Rubin, 2001). However, users may disable cookies on their computers, thus leaving this identifying
field in Web logs blank and consequently affecting individual query submission analysis (Silverstein et al., 1999; Thelwall
et al., 2005). MCI does not use cookies. This means that individual users and the respective computers they use cannot
be identified uniquely, since only IP addresses are logged. While the session identification for the incoming queries
through the MCI and other Web sites is not perfect in identifying individual users, it does provide the ability to distinguish
a unique search session initiated through a particular computer. The Web searcher’s querying behaviour, from the text submitted
to the number of page requests, for the search session started by the computer’s incoming IP address is recorded
faithfully in the logs until the session is ended by either the Web browser being closed or the search session being inactive
for 20 min, which causes the session to time out.
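The timeout rule described above can be sketched in a few lines. This is a hedged illustration, not MCI's actual logging code: the function name `assign_sessions`, the tuple layout of a log entry, and the sample IP labels are assumptions. Only the 20 min inactivity rule is modelled, because a browser being closed cannot be inferred from timestamps alone.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=20)  # inactivity limit stated in the text


def assign_sessions(entries, timeout=SESSION_TIMEOUT):
    """Label each (ip, timestamp, query) entry with a session number.

    A new session starts when an IP address is seen for the first time or
    when it has been inactive for at least `timeout`.
    """
    last_seen = {}   # ip -> timestamp of its most recent request
    current = {}     # ip -> session number currently open for that ip
    next_id = 0
    labelled = []
    for ip, ts, query in sorted(entries, key=lambda e: e[1]):
        if ip not in last_seen or ts - last_seen[ip] >= timeout:
            next_id += 1
            current[ip] = next_id
        last_seen[ip] = ts
        labelled.append((current[ip], ip, ts, query))
    return labelled


if __name__ == "__main__":
    start = datetime(2006, 1, 1, 10, 0)
    log = [
        ("A", start, "maps"),
        ("B", start + timedelta(minutes=1), "jobs"),
        ("A", start + timedelta(minutes=5), "bus schedule"),
        ("A", start + timedelta(minutes=30), "maps"),  # 25 min gap: new session
    ]
    for row in assign_sessions(log):
        print(row)
```

Because sessions are keyed on IP address, two people sharing one computer, or one person whose dynamic IP address changes, would be grouped or split incorrectly, which is the limitation noted in this section.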
The other notable limitation of this study is that it does not include the examination of navigational Web logs for the
Web sites that are under scrutiny. No assumption is made that every user of the Web sites under examination uses only keyword
querying to find the information that he or she is seeking. Users may find the information that they are seeking by
clicking on relevant hyperlinks within a Web site until they reach their final information destination. However, this study
attempts to attach meaning to the keywords users submit through the MCI and municipal government Web sites in their
pursuit of community information. This is what makes this study unique in the area of Web searching that focuses on
the local community. It takes large sets of data (keyword queries from Web query logs) and attaches meaning to these data
to provide a thorough analysis of CI needs based on what the CI seeker actually types and submits through a Web site. The
limitations of query analysis should not inordinately hamper the main purpose of the study: discovering which CI Web
resources are being used most by CI seekers in Southwestern Ontario by using relative comparisons, and, most
importantly, what the users’ information needs are, based on the type of topical or conceptual CI being sought from these
resources.
9. Conclusion
The online local information inquiries of a particular population or populations, and how these inquiries are being addressed,
are anything but simple and predictable. Even multiple sources of online information that are designed to serve the same
population, such as those presented in this paper, often demonstrate that a population’s Web searching will vary
depending on how the information seeker perceives the scope of information and the potential utility of the information
that those sources provide. This study, which for the first time had access to a very large set of data that provides unobtrusive
evidence of CI inquiries for specific geographic regions, is a first step in exploring the complexities of CI seeking in an online
environment. Despite this paper’s focus on a limited geographic region, the findings have shown that: local information
seekers during the course of their Web searching may often choose the wrong resource to address their information
need; health information seeking from the local community is not generally an important task despite the fact that people
are generally likely to get medical attention in the immediate geographic vicinity depending on their situation; what may
appear to be well-designed home pages to Web sites may not indeed be so; and, while the types of community information
sought through MCI and its partner municipal government Web sites are more diverse and specific than those revealed
by two other studies, this study’s categories, like those reported by Durrance and Pettigrew (2002), are markedly
different from those found in CI taxonomies. This last point especially will hopefully stimulate debate as to whether these
taxonomies should continue to rely on literary warrant for their empirical basis or whether it might be beneficial to consider
user warrant to go beyond a taxonomic structure and create full-fledged thesauri for indexing and retrieval purposes for
those CI organizations that wish to continue using a directory-style model.
9.1. Future research directions
Further studies are under way that will examine the same Web sites presented in this paper. Unlike much of the past published
research that focused on Web searching conducted on individual Web sites, these future studies will take a longitudinal
approach by analyzing 3 years’ worth of query data. Will the conceptual categories that emerged from the data in
this study still apply? How, if at all, do CI seekers’ Web searches change over this time period? Will the 3 years of
query data provide an accurate depiction of societal, political, and economic changes over this time period? Between this
study and the proposed studies, better models of information needs as expressed through Web searching on single
Web sites will hopefully emerge to give a more complete portrait of how local CI Web sites are used to address citizens’
everyday information seeking behaviours.
Acknowledgements
The author wishes to thank: Drs. Liwen Vaughan, Catherine Ross, and Daniel Robinson, Faculty of Information and Media
Studies, University of Western Ontario, for their feedback on a much larger version of this study; Drs. Yin Zhang and Marcia
Zeng, School of Library and Information Science, Kent State University, for their assistance with this current paper; and, Dr.
Stephen Cummings, mycommunityinfo.ca, for his assistance in facilitating access to the log data discussed herein.
References
Babbie, E. (2008). The basics of social research (4th ed.). Belmont, CA: Wadsworth.
Beitzel, S. M., Jensen, E. C., Chowdhury, A., Frieder, O., & Grossman, D. (2007). Temporal analysis of a very large topically categorized Web query log. Journal
of the American Society for Information Science and Technology, 58(2), 166–178.
Bishop, A. P., Tidline, T. T., Shoemaker, S., & Salela, P. (1999). Public libraries and networked information services in low-income communities. Library and
Information Science Research, 21(3), 361–390.
Broder, A. (2002). A taxonomy of Web search. ACM SIGIR Forum, 36(2), 3–10.
Canadian Centre for Management Development (1999). Citizen-centred service: Responding to the needs of Canadians; for the Citizen-Centred Service Network.
Ottawa: Canadian Centre for Management Development, Citizen-Centred Service Network [Microlog #100-01945].
Chau, M., Fang, X., & Sheng, O. (2005). Analysis of the query logs of a Web site search engine. Journal of the American Society for Information Science and
Technology, 56(13), 1363–1376.
Childers, T. (1984). Information and referral: Public libraries. Norwood, NJ: Ablex Publishing.
City of Kitchener (2007). Demographic profile/labour force profile: Fast facts. City of Kitchener. <http://www.kitchener.ca/pdf/fast_facts.pdf> Retrieved
19.04.07.
City of London (n.d.). E-services and maps: Interactive maps: City map. City of London. <http://www.london.ca/_private/Maps/Maps.htm> Retrieved
27.04.07.
Cohill, A. M. (2000). The future of community networks. In A. M. Cohill & A. L. Kavanaugh (Eds.), Community networks: Lessons from Blacksburg, Virginia (2nd
ed., pp. 357–380). Boston: Artech House.
Cummings, S. (2002). Life events, life episodes: Instances of the approach in the United Kingdom, Ireland, Spain, Canada and the private sector and large public
institutions. London, Ontario: Mycommunityinfo.ca.
Cummings, S. (2004). Mycommunityinfo.ca. London, Ontario: Mycommunityinfo.ca.
Dewdney, P., & Harris, R. M. (1992). Community information needs: The case of wife assault. Library and Information Science Research, 14, 5–29.
Donohue, J. C. (1976). Community information services a proposed definition. In Information politics: Proceedings of the ASIS annual meeting (p. 126).
Washington, DC: ASIS [Cited in Durrance, J. C. (1984b). Community information services: An innovation at the beginning of its second decade. In W.
Simonton (Ed.), Advances in librarianship (Vol. 13, pp. 99–128). Orlando: Academic Press].
Durrance, J. C. (1984). Community information services: An innovation at the beginning of its second decade. In W. Simonton (Ed.). Advances in librarianship
(Vol. 13, pp. 99–128). Orlando: Academic Press.
Durrance, J. C., & Pettigrew, K. E. (2002). Online community information: Creating a nexus at your library. Chicago: American Library Association.
Erin Research, Inc. (1998). Citizens first: Summary report. Ottawa: Canadian Centre for Management Development, Citizen-Centred Service Network.
Gillaspy, M. L. (2005). Factors affecting the provision of consumer health information in public libraries: The last five years. Library Trends, 53(3), 480–495.
Gurstein, M. (2000). Introduction. Community informatics: Enabling community uses of information and communications technology. In M. Gurstein (Ed.),
Community informatics: Enabling community uses of information and communications technology (pp. 1–30). Hershey, PA: Idea Group.
Harris, R., Wathen, C. N., & Chan, D. (2005). Public library responses to a consumer health inquiry in a public health crisis: The SARS experience in Ontario.
Reference and User Services Quarterly, 45(2), 147–154.
Hearn, G., Kimber, M., Lennie, J., & Simpson, L. (2005). A way forward: Sustainable ICTs and regional sustainability. The Journal of Community Informatics,
1(2). <http://www.ci-journal.net/index.php/ciej/article/view/201/160> Accessed 17.06.07.
Herrera-Viedma, E., & Pasi, G. (2006). Soft approaches to information retrieval and information access on the Web: An introduction to the special topic
section. Journal of the American Society for Information Science and Technology, 57(4), 511–514.
Jansen, B. J., Booth, D. L., & Spink, A. (2008). Determining the informational, navigational, and transactional intent of Web queries. Information Processing and
Management, 44, 1251–1266.
Kubicek, H., & Wagner, R. M. (2002). Community networks in a generational perspective: The change of an electronic medium within three decades.
Information, Communication and Society, 5(3), 291–319.
Lambert, F. (2008). Rewriting the “rules” of online networked community information services: A case study of the mycommunityinfo.ca model. London, Ontario:
Faculty of Graduate Studies, University of Western Ontario.
Lau, E. P., & Goh, D. H.-L. (2006). In search of query patterns: A case study of a university OPAC. Information Processing and Management, 42(5), 1316–1329.
LEDC (London Economic Development Corporation) (n.d.). Unemployment rate. <http://www.ledc.com/forsiteselectors/workforce/unemploymentrate/>
Retrieved 19.04.07.
Marshall, K. (2006). Converging gender roles. Perspectives on Labour and Income, 7(7), 5–17. <http://www.statcan.ca/english/freepub/75-001-XIE/10706/art-
1.htm> Retrieved 05.07.06.
Pu, H., Chuang, S., & Yang, C. (2002). Subject categorization of query terms for exploring Web users’ search interests. Journal of the American Society for
Information Science and Technology, 53(8), 617–630.
Rideout, V. N., & Reddick, A. J. (2005). Sustaining community access to technology: Who should pay and why! The Journal of Community Informatics, 1(2), 45–
62. <http://www.ci-journal.net/index.php/ciej/article/view/202/162> Retrieved 05.01.06.
Rieh, S. Y., & Xie, H. (2006). Analysis of multiple query reformulations on the Web: The interactive information retrieval context. Information Processing and
Management, 42(3), 751–768.
Ross, N. C. M., & Wolfram, D. (2000). End user searching on the Internet: An analysis of term pair topics submitted to the Excite search engine. Journal of the
American Society for Information Science, 51(10), 949–958.
Rubin, J. H. (2001). Introduction to log analysis techniques: Methods for evaluating networked services. In C. R. McClure & J. C. Bertot (Eds.), Evaluating
networked information services: Techniques, policy, and issues (pp. 197–212). Medford, NJ: Information Today.
Sales, G. (1994). A taxonomy of human services: A conceptual framework with standardized terminology and definitions for the field (3rd ed.). El Monte, CA:
Information and Referral Federation of Los Angeles County, Inc.
Sales, G. (2003). An orientation to the structure and contents of the AIRS/INFO LINE taxonomy. The Journal of the Alliance of Information and Referral Systems.
<http://www.211taxonomy.org/publicfiles/view/Taxonomy_Orientation.pdf> Accessed 28.02.07 [last revision, August 2006].
Schuler, D. (1996). New community networks: Wired for change. New York: ACM Press.
Silverstein, C., Henzinger, M., Marais, H., & Moricz, M. (1999). Analysis of a very large Web search engine query log. SIGIR Forum, 33(1), 6–12.
Spink, A., & Jansen, B. J. (2004). Web search: Public searching of the Web. Dordrecht: Kluwer Academic Publishers.
Spink, A., Wolfram, D., Jansen, M. B. J., & Saracevic, T. (2001). Searching the Web: The public and their queries. Journal of the American Society for Information
Science and Technology, 52(3), 226–234.
Statistics Canada (2006). Internet use by individuals, by type of activity. Summary Tables. <http://www40.statcan.ca/l01/cst01/comm16.htm> Accessed
22.06.07 [November 1, last update].
Thelwall, M., Vaughan, L., & Björneborn, L. (2005). Webometrics. In B. Cronin (Ed.). Annual review of information science and technology (Vol. 39, pp. 81–135).
Medford, NJ: Information Today.
Toms, E. G., & Kinnucan, M. T. (1996). The effectiveness of the electronic city metaphor for organizing the menus of Free-Nets. Journal of the American Society
of Information Science, 47(12), 919–931.
Wang, P., Berry, M. W., & Yang, Y. (2003). Mining longitudinal Web queries: Trends and patterns. Journal of the American Society for Information Science and
Technology, 54(8), 743–758.
Zhang, J., Wolfram, D., Wang, P., Hong, Y., & Gillis, R. (2008). Visualization of health-subject analysis based on query term co-occurrences. Journal of the
American Society of Information Science, 59(12), 1933–1947.
F. Lambert / Information Processing and Management 46 (2010) 343–361 361
... Therefore, the theory can serve as a framework to guide the design of ICTs to attract potential consumers, increase consumer loyalty, and stimulate continued use [27]. It has been applied in various contexts [28][29][30] and is especially suitable for the context of OSCCs because online communities are typical of ICTs [31,32]. ...
Article
Full-text available
Background: Previous studies on online smoking cessation communities (OSCCs) have shown how such networks contribute to members' health outcomes from behavior influence and social support perspectives. However, these studies rarely considered the incentive function of OSCCs. One of the ways OSCCs motivate smoking cessation behaviors is through digital incentives. Objective: This study aims to explore the incentive function of a novel digital incentive in a Chinese OSCC-the awarding of academic degrees-to promote smoking cessation. It specifically focuses on "Smoking Cessation Bar," an OSCC in the popular web-based Chinese forum Baidu Tieba. Methods: We collected discussions about the virtual academic degrees (N= 1193) from 540 members of the "Smoking Cessation Bar." The time frame of the data set was from November 15, 2012, to November 3, 2021. Drawing upon motivational affordances theory, 2 coders qualitatively coded the data. Results: We identified five key topics of discussion, including members' (1) intention to get virtual academic degrees (n=38, 2.47%), (2) action to apply for the degrees (n=312, 20.27%), (3) feedback on the accomplishment of goals (n=203, 13.19%), (4) interpersonal interaction (n=794, 51.59%), and (5) expression of personal feelings (n=192, 12.48%). Most notably, the results identified underlying social and psychological motivations behind using the forum to discuss obtaining academic degrees for smoking cessation. Specifically, members were found to engage in sharing behavior (n=423, 27.49%) over other forms of interaction such as providing recommendations or encouragement. Moreover, expressions of personal feelings about achieving degrees were generally positive. It was possible that members hid their negative feelings (such as doubt, carelessness, and dislike) in the discussion. Conclusions: The virtual academic degrees in the OSCC created opportunities for self-presentation for participants. 
They also improved their self-efficacy to persist in smoking cessation by providing progressive challenges. They served as social bonds connecting different community members, triggering interpersonal interactions, and inducing positive feelings. They also helped realize members' desire to influence or to be influenced by others. Similar nonfinancial rewards could be adopted in various smoking cessation projects to enhance participation and sustainability.
... 167). Lambert (2010aLambert ( , 2010bLambert ( , 2012 used methodological approaches similar to Ross and Wolfram, Pu, Chuang, and Yang, and Beitzel et al. to examine information seeking needs and behaviours expressed through querying on community information and municipal government Web sites. However, Lambert"s studies (2010b, 2012) examined such information seeking over a period of three years, one of the longest time periods in the literature. ...
Article
Full-text available
Virtual trace is presented as a reconceptualized methodological framework inspired particularly by Webb et al"s physical trace methodology and a variety of Webometric data collection and analysis methods from the LIS literature. With the ongoing proliferation of data from current and new Internet-based sources, virtual trace is intended to be considered by experienced and novice researchers as a comprehensive research approach for studies whose designs are similar conceptually to those described in this article. Other online methodologies such as social tagging analysis and virtual ethnography are examined to provide virtual trace further definition.
... This suggests that the length of time that these data are collected is not a variable that affects necessarily the average number of terms submitted per query. This is supported further by Lambert's (2010a) original study that included an analysis of City of London queries among others for 1 year only. He found that the mean words per query for all cleaned queries and the top 100 most frequently occurring queries were 2.1 and 1.28, respectively. ...
Article
Over the past few years, graph representation learning (GRL) has received widespread attention on the feature representations of the non-Euclidean data. As a typical model of GRL, graph convolutional networks (GCN) fuse the graph Laplacian-based static sample structural information. GCN thus generalizes convolutional neural networks to acquire the sample representations with the variously high-order structures. However, most of existing GCN-based variants depend on the static data structural relationships. It will result in the extracted data features lacking of representativeness during the convolution process. To solve this problem, dynamic graph learning convolutional networks (DGLCN) on the application of semi-supervised classification are proposed. First, we introduce a definition of dynamic spectral graph convolution operation. It constantly optimizes the high-order structural relationships between data points according to the loss values of the loss function, and then fits the local geometry information of data exactly. After optimizing our proposed definition with the one-order Chebyshev polynomial, we can obtain a single-layer convolution rule of DGLCN. Due to the fusion of the optimized structural information in the learning process, multi-layer DGLCN can extract richer sample features to improve classification performance. Substantial experiments are conducted on citation network datasets to prove the effectiveness of DGLCN. Experiment results demonstrate that the proposed DGLCN obtains a superior classification performance compared to several existing semi-supervised classification models.
Article
Purpose – The purpose of this paper is to consider the nature of community information (CI) and proposes a data model, based on the entity-relationship approach adopted in the Functional Requirements for Bibliographic Records (FRBR), which may assist with the development of future metadata standards for CI systems. Design/methodology/approach – The two main data structure standards for CI, namely the element set developed by the Alliance of Information and Referral Systems (AIRS) and the MARC21 Format for CI, are compared by means of a mapping exercise, after which an entity-relationship data model is constructed, at a conceptual level, based on the definitions of CI found in the literature. Findings – The AIRS and MARC21 data structures converge to a fair degree, with MARC21 providing for additional detail in several areas. However, neither structure is systematically and unambiguously defined, suggesting the need for a data model. An entity-relationship data modelling approach, similar to that taken in FRBR, yielded a model that could be used as the basis for future standards development and research. It was found to effectively cover both the AIRS and MARC21 element sets. Originality/value – No explicit data model exists for CI, and there has been little discussion reported about what data elements are required to support CI seeking.
Article
Purpose – The purpose of this paper is to report on the findings of an audit of community information (CI) portals to provide an overview of how CI is being organised and presented on the web by aggregating services, and how CI is being shaped and shared in community networks. It also investigates the role that public libraries play in online CI provision. Design/methodology/approach – The research sampled CI portals online within the Australian web domain (.au). An audit of 88 portals was undertaken to establish the scope, role and usefulness of the portals. The audit included a comprehensive usability analysis of a sub set of 20 portals evaluated for 20 different heuristics based on Nielsen's heuristic model. Findings – The research finds that the challenge facing portals is not a lack of information, it is the need to improve the mediation between the community services and people that CI portals promise useful and usable information for. While public libraries remain integral to the provision of CI in their geographical area, they now form part of a larger online network for CI provision, involving a wide range of organisations. Originality/value – The paper discusses the ways CI portals contribute to the provision of information about community services and identifies areas where improvements are needed. In particular, it discusses how these sites function as part of larger CI networks and where more innovative, and more standardised, design could lead to greater levels of engagement and utility.
Article
Full-text available
In this paper, we are introducing a method to improve search engine capabilities by using user preference achieved with the help of community's proxy logs. The goal is focused to build a custom search engine that providing community-specific results. To achieve such search engine, we use proxy server logs from Network Operation Center of EEPIS-ITS and fetch the unique URL and user field as raw data. Getting the needed data, then we crawl the title and meta information from all of the unique URLs. Then, document vector is created in order to make those textual data turn into a machine-friendly numerical data. To find the topic, based on those URLs and its meta information, we cluster it into 10 or more preferable clusters using k-means algorithm. Those results, finally, would be our base to create the search engine and we use vector space model to provide a search result from user's query.
Article
Communication-oriented Internet technologies and activities, such as social media sites and blogs, have become an important component of community and employment participation, not just in the specific function of activities, but as a link to larger communities of practice and professional connections. The occurrence of these activities, evident in their presence on Facebook, LinkedIn and other online communities, represents an important opportunity to reframe and re-conceptualize the manifestation of communities, especially those in which distributed networks and communities substitute for geographic proximity, offering new opportunities for engagement, especially for those who might be functionally limited in terms of mobility. For people with disabilities, as well as the aging, who are increasingly interacting online, the readiness of social networking sites to accommodate their desire to participate, in conjunction with their readiness as users to maximize the potential of platform interfaces and architecture, is critical to achieving the medium’s potential for enhancing community and employment benefits. This paper explores the representation and presence of disability and aging, using Facebook and LinkedIn groups as frames. Target identity/member groups on Facebook and LinkedIn were catalogued to explore the presence and representation of disability and aging identities in a socially networked setting. The groups for this study were identified using the search feature designed into the platform architecture, which allows a user to search on specifically designated entities or keywords. Findings suggest that, from a policy perspective, institutions need to be cognizant of population characteristics as well as platform opportunities when implementing advocacy and relevant support services for people with disabilities and older adults to fully ensure engagement and participation.
Article
In studying actual Web searching by the public at large, we analyzed over one million Web queries by users of the Excite search engine. We found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features. A small number of search terms are used with high frequency, and a great many terms are unique; the language of Web queries is distinctive. Queries about recreation and entertainment rank highest. Findings are compared to data from two other large studies of Web queries. This study provides an insight into the public practices and choices in Web searching.
Article
Queries submitted to the Excite search engine were analyzed for subject content based on the cooccurrence of terms within multiterm queries. More than 1000 of the most frequently cooccurring term pairs were categorized into one or more of 30 developed subject areas. Subject area frequencies and their cooccurrences with one another were tallied and analyzed using hierarchical cluster analysis and multidimensional scaling. The cluster analyses revealed several anticipated and a few unanticipated groupings of subjects, resulting in several well-defined high-level clusters of broad subject areas. Multidimensional scaling of subject cooccurrences revealed similar relationships among the different subject categories. Applications that arise from a better understanding of the topics users search and their relationships are discussed.
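The first step of the analysis described in this abstract, tallying how often term pairs co-occur within multiterm queries, can be sketched as follows. This is an illustrative assumption about the counting step only: the function name `cooccurring_pairs` and the whitespace tokenisation are hypothetical, and the subject categorisation, cluster analysis, and multidimensional scaling are not shown.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(queries):
    # Count each unordered pair of distinct terms once per query.
    counts = Counter()
    for query in queries:
        # Sorting makes (a, b) and (b, a) map to the same key.
        terms = sorted(set(query.lower().split()))
        counts.update(combinations(terms, 2))
    return counts
```

Calling `cooccurring_pairs(queries).most_common(1000)` would then yield the most frequently co-occurring pairs, which could be assigned to subject areas by hand. Single-term queries contribute no pairs.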
Article
This article presents findings from an empirical study of community information exchange and computer access and use among low-income, predominantly African-American residents in one locale. Data were collected through household interviews, focus groups, and surveys. Results indicate that, while computer use is minimal, many low-income community members are poised to participate in the local development of networked information services. The article emphasizes appropriate roles for public libraries in community-wide efforts to bridge the digital divide that cuts computer use along socioeconomic lines.
Article
This article addresses the extent to which public libraries in Ontario were able to respond to inquiries for health information during a major public health crisis. The 2003 outbreak of Severe Acute Respiratory Syndrome (SARS) in Toronto, Ontario, represented a challenge to those charged with providing accurate and timely information to the public. At the onset of the outbreak, the disease was not well understood and information about SARS was sketchy. As the outbreak progressed, information was in flux as more became known about the nature of the disease, methods of transmission, and treatment protocols. Against this background, sixty-nine randomly selected libraries in Ontario were queried by phone and by e-reference service (if it was offered by the library) for information about SARS, its symptoms, and prevention methods. The responses of the libraries were analyzed for the quality of the reference service and types of referrals, particularly Internet sources given the growing popularity of e-health initiatives. The results raise serious questions about the appropriate role of public libraries in the delivery of consumer health information and the preparedness of public library staff to respond to health-related inquiries, particularly in times of crisis.
Article
The purpose of this study was to examine the relationship between community needs for information about wife assault and the information response offered through social service networks in six communities of varying sizes. In the first phase of the research, 543 randomly selected women were interviewed during household surveys to assess their knowledge of different kinds of information resources that would be helpful in a hypothetical situation involving wife assault. In the second phase, 179 interviews were conducted with agencies and professionals identified as likely sources of help during the household interviews. Analyses of both surveys provided a map of the degree to which the information delivery system's response overlaps with the public's need for information. Consistent with previous studies of information-seeking patterns, the results indicated that many residents expected types of help that these agencies did not, in fact, provide. Agencies and professionals themselves were not always aware of appropriate sources of help, and did not routinely assess the severity of the presenting situation or the kind of help wanted. The implications for the design of community information delivery systems, including information and referral (I & R) centers, are discussed.