
Abstract

Textual data has become increasingly common in business analytic data sets. While concept-based text mining offers a method of extracting meaningful information from text data, methods for monitoring customer perceptions of business processes and products discussed in customer-generated documents are not immediately available. We explore the results of two text-mining algorithms and review issues observed in the data that affect uploading the results onto a newly proposed methodological monitoring platform analogous to statistical process control charts. Finally, we discuss several topics for future research in text mining.
Quantitative Quality Control from Qualitative Data:
Control Charts with Latent Semantic Analysis
Triss Ashton*
Assistant Professor
The University of Texas Pan American
1201 West University, Edinburg, TX 78539
e-mail: AshtonTA@utpa.edu
(956) 665-3353
(956) 665-3367 (fax)
* Corresponding Author
Nicholas Evangelopoulos,
Associate Professor
University of North Texas
1155 Union Circle # 311160
Denton, TX 76203-5017
Victor R. Prybutok
Regents Professor
University of North Texas
1155 Union Circle # 311160
Denton, TX 76203-5017
ABSTRACT Large quantities of data, often referred to as Big Data, are now held by
companies. This Big Data includes statements of customer opinion regarding product or service
quality in an unstructured textual form. While many tools exist to extract meaningful
information from Big Data, automation tools do not exist to monitor the ongoing conceptual
content of that data. We use latent semantic analysis to extract concept factors related to service
quality categories. Customer comments found in the data that express dissatisfaction are then
considered as representing a non-conforming observation in a process. Once factors are
extracted, proportions of nonconformities for service quality failure categories are plotted on a
control chart. The results are easily interpreted and the approach allows for the quantitative
evaluation of customer acceptance of system process improvement initiatives.
Keywords Analysis of text data, big data, control charts, customer satisfaction, latent semantic
analysis (LSA), statistical process control (SPC), text mining.
For the final published version of this paper, see Quality & Quantity, DOI 10.1007/s11135-014-0036-5, available at http://link.springer.com/article/10.1007%2Fs11135-014-0036-5
1. INTRODUCTION
As a result of growth in the internet and personal communication devices, businesses of all types
now inexpensively collect information from customers on a continuous basis. Companies
including manufacturers, service providers, retailers, and online businesses are soliciting input
from customers about their experience with products and services. Increasingly, the collected
data consist of open-ended text and include information that describes the customer’s
perceptions of service or product attributes. Fig. 1 shows that this information provides
important and powerful feedback on the firm’s offerings. In fact, in a recent survey of data
management professionals, 35-percent of firms indicated they were now collecting unstructured
data that included textual information (Russom, 2011).
Tremendous strides have been made in recent years to automate the analysis of unstructured
text data. Those efforts primarily focused on technology for decomposing corpora with varying
latent structural dimensionality into smaller collections based on dimensions (Deerwester et al.,
1990), factors (Sidorova et al., 2008), topics (Arora et al., 2012), sentiment (Liu, 2012), or
concepts (Shehata et al., 2013). While these latent semantic approaches sort the elements of the
corpus into subgroups that possess similar characteristics, these smaller groups of data are not
directly consumable in a decision making environment. Generally, in order to have the results of
the text analysis process produce an actionable output, subsequent analysis is required to link the
data to actions.
Complexities in the analysis of unstructured textual data often result in only minimal use
of the data. Frequently, the lack of use of text data occurs because the results are not presented in
a form consumable by either the business leadership or the engineering communities. For
example, in the quality assurance area a plethora of system monitoring tools exists. The path
shown with the dashed arrow in Fig. 1 depicts the needed link for methods to extract meaningful,
substantive information from textual data analysis results and generate monitoring tools that
complete the feedback loop. Therefore, a firm that is determined to discover the data’s hidden
value might be required to commit considerable resources to the effort using traditional manual
analysis methods.
One proposal (Ashton et al., 2013a) suggests the application of control charts to the
solution generated by a text mining algorithm. That work proposed the use of EWMA and
p-charts, discussed how the charts track different statistics from the solution, and
reviewed methodological issues with the implementation of control-charted text. This
work extends and builds upon the agenda posited in Ashton et al. (2013a) by introducing quality
control of service quality through customer generated text data.
In this study, we propose an approach that uses latent semantic analysis (LSA) to reduce
raw, unstructured textual information to a set of latent semantic factors or topics. Broadly
speaking, when these topics are derived from a business origin corpus, many will quantitatively
describe service quality, product quality, service characteristics the customer prefers, and desired
product design characteristics. We consider each of the discovered topics that constitute a
category of nonconformity as topics that require further investigation and analysis. These
categories of nonconformity include topics that explain sources of customer satisfaction or
dissatisfaction with a business process or offering.
Fig. 1. Product or service quality monitoring through analysis of customer comments.
Presenting text data on control charts provides a tool usable by both the manager and the engineer.
Control charting text data allows management to understand the severity of nonconformity
categories, conduct comparisons among the categories of nonconformity, and prioritize the
categories for subsequent corrective action. Such charts are valuable to operations managers
because they limit the likelihood of ad hoc responses to a few isolated comments. While this
control charting text procedure will only be demonstrated in the service quality domain, the
procedure potentially works just as well with data in which the customers discuss desired design
characteristics. The desired characteristics of design will fall out as discussion topics and the
chart parameters document how prevalent the discussion of the characteristic is among the
corpus members. Such results could then guide design and redesign efforts, as well as product
positioning.
An added benefit is that, when the stage-division technique is applied, the resulting control
chart allows judging the success or failure of process improvement projects. Staging divides the
control chart based upon project implementation timing. The parameters of the chart are then
recomputed based on new observations. When a process improvement project is successful, the
process mean will reflect the improvement.
The rest of this paper is organized as follows: related work is discussed in Section 2; we
review latent semantic analysis in Section 3; we describe the data used in the experimental phase
and briefly review LSA’s decomposition of the data in Section 4; we then review the
characteristics of the working data that is applied to the SPC process in Section 5; we review
control chart alternatives in Section 6; we develop our text data example with control charts in
Sections 7 and 8; and we provide the conclusions for our work in Section 9.
2. RELATED WORK AND PROPOSED METHODOLOGY
Little related work exists for extending text analytics beyond the decomposition phase. The first
attempt at applying text analytic results to a statistical process control tool occurred in Lo (2008).
That work proposed a new tool known as Web-complaint Quality Control (WebQC). WebQC
examined customer originated comments that were collected through a website message board.
The collected data were analyzed with a support vector machine (SVM), which classified each
comment in the corpus into one of four categories or topics: disorder codes, technical problems,
dissatisfaction reports, and other. Once the corpus was analyzed by the SVM, all topics that
consisted of customer comments that related to complaints and technical problems were
combined or pooled into one single data stream and then plotted on a p-chart.
Text-mined data were applied to the exponentially weighted moving average (EWMA)
control chart in Ashton et al. (2013b). Unlike Lo (2008), once the corpus was decomposed, the k
extracted topics retained their separate identities and were plotted onto k EWMA charts. The
advantage of this strategy is that each control chart graphically displays the system state
regarding a specific nonconformity cause. That work was premised on the fact that eigenvector
component values are measures of length given by the Euclidean norm

||x|| = √(x1² + x2² + … + xn²).   (1)

With this interpretation, the factor loadings generated by LSA were averaged based on a
variable-sized weekly sample and then loaded onto the EWMA chart.
While these methods of extending text mining to statistical quality control tools are
sound, they have limitations. In Lo (2008), the resulting chart loses its specificity with regard to
the cause for an out of control condition due to the pooling. Lo’s control chart will alert
management to a problem; however, the specific issue(s) that caused the out of control condition
are lost. Ashton et al. (2013b) corrects for that issue; however, the plotted values are the loading
values. There are a number of fundamental problems with that method which are described as
limitations in Ashton et al. (2013a). These problems are a result of the nature of text based
communication and the principle of least effort (Zipf, 1949). Zipf (1949) states that a person
attempts to communicate efficiently with the least effort. Consistent with the principle, we find
customers of the business enterprise will frequently express an opinion in a terse fashion, yet
other customers are extremely elaborate, detailed, and even verbose when expressing the same
opinion. When the EWMA charting strategy described in Ashton et al. (2013b) encounters these
two uniquely styled writers, the first is usually scored with a much lower factor value than the
second. Therefore, the two communications, each expressing the same level of non-conformity,
are charted with dissimilar values. The EWMA method allocates a
level of severity to the non-conformity based on how completely the author replicated the list of
terms that defined the topic in the analytic solution and not to the degree of true perceived non-
conformance. For example, the terse customer comment, “late deliveries” receives a lower
loading value, and hence a lower plotting point on the control chart, than the verbose (and more
list-complete) customer comment “your logistics processing caused the delivery of the baby
dolls I ordered for my daughter’s birthday to be late!!” when the topic-defining terms are logistic,
process, late, order, and delivery.
To address these limitations, we first propose that the factor solutions be kept separate.
Then each factor represents a specific issue. Second, since few authors use the entire collection
of terms that define a topic, we consider all documents that meet a certain threshold equally
representative of the topic’s definition. Then we discretize the documents’ factor-loading values
with those exceeding some minimum threshold coded as ‘one’ and those below this threshold
coded as ‘zero’. We are then able to use the p-chart with each chart representing non-conforming
observations for a specific cause.
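As a rough illustration of this discretization step, a minimal Python sketch follows; the loading values are invented, the array and function names are hypothetical, and the threshold anticipates the common value reported later in Section 5.

```python
import numpy as np

def discretize_loadings(LD, threshold=0.2702):
    """Convert a d x k document-by-factor loading matrix into 0/1 indicators.

    A document counts as a nonconforming observation for factor j when its
    loading on factor j meets or exceeds the threshold.
    """
    return (np.asarray(LD, dtype=float) >= threshold).astype(int)

# Toy loadings for 4 documents and 3 factors (illustrative values only).
LD = np.array([
    [0.41, 0.05, 0.02],   # terse comment that loads on the first factor
    [0.33, 0.30, 0.01],   # verbose comment attributing two causes
    [0.10, 0.08, 0.04],   # comment below the threshold on every factor
    [0.02, 0.01, 0.55],
])
B = discretize_loadings(LD)
print(B)               # binary outcomes per document and factor
print(B.sum(axis=0))   # nonconforming count for each specific cause
```

Each column sum then gives the nonconforming count for one cause, which is the quantity plotted on the corresponding p-chart.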
The p-chart method also addresses another limitation of the prior techniques. As
described by attribution theory (Heider, 1958), a dissatisfied customer may express many
reasons or justifications for dissatisfaction. In such a situation the customer is often rationalizing
their opinion. When this happens, they write about more than one of the latent factors that exist
in the corpus. This is a common occurrence in actual textual data sets. During the singular value
decomposition phase of LSA, because two factors are present, lower eigenvector component
values are computed for the document against each of the factors. Therefore, the presence of the
second factor in the document reduces the eigenvector for the first factor and vice-versa. In the
EWMA instance, one document is scored against, or accounted for in, the control chart of each of
the nonconformity causes, but with low values. Discretizing fixes this limitation and allows
multiple causes for non-conformity that appear in a text to be fully valued in each corresponding
control chart.
Attribution is a multifaceted issue. In the data used in the subsequent analysis, customers
in 28% of the observations attribute two causes for closing their account, 11.5% cite three
causes, 3.2% report four, 1% cites five, and 0.07% cites six causes for account closure. Only
35% of respondents report exactly one reason for closing their account. In total, 8215 customers
cited 11,937 justifications for closing their account.
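These attribution counts can be recovered directly from the discretized loading matrix; the sketch below uses a toy binary matrix rather than the study's 8,215 comments.

```python
import numpy as np

# Rows are comments, columns are discretized "cause cited" indicators
# (toy values, not the study data).
B = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
])
causes_per_comment = B.sum(axis=1)
share = np.bincount(causes_per_comment) / len(B)
print(dict(enumerate(np.round(share, 2))))   # proportion citing 0, 1, 2, ... causes
print(int(B.sum()))                          # total justifications across all comments
```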
3. LATENT SEMANTIC ANALYSIS
In this section, we review only those key equations of LSA required to support the analysis
undertaken. For the reader wishing a more detailed overview of LSA, we refer them first to
Appendix C of Deerwester et al. (1990), Sidorova et al. (2008), and Manning et al. (2008). For a
more extensive discussion on matrix decompositions, see Schott (2005). For a more extensive
discussion on LSA base transformations see Hu et al. (2007).
LSA was originally introduced as latent semantic indexing for information retrieval
queries (Deerwester et al., 1990). LSA was subsequently proposed as a theory in psychology and
cognitive science (Evangelopoulos, 2013) for extracting and representing word meaning via
latent concepts (Landauer, 2007). Sidorova et al. (2008) expanded on this approach, introducing
matrix rotations in order to extract interpretable factors. In this paper, we build upon the
recommendations proposed by Evangelopoulos et al. (2012) for the factor analysis variant of
LSA to fit the unique context of the present control chart application.
LSA starts with the compilation of a matrix Xt×d, a t×d frequency count matrix where t is
the number of terms and d is the number of documents in the collection, known as the vector
space model (Salton, 1975). In our work, the documents are the individual customer comments
that originally produced a 4786 × 8215 raw Xt×d matrix. Matrix X is typically reduced in size by
eliminating common or low information value words such as prepositions and conjunction
coordinators (and, as, at, but, by, for, from, in, nor, of, on, or, so, to, with, and yet), retaining
only terms with frequencies greater than one (Deerwester et al., 1990), and then weighted and
normalized (Salton and Buckley, 1988). Our final Xt×d matrix after normalization was 256 ×
8215. It is then decomposed as Xt×d = Ut×r Σr×r VTr×d using Singular Value Decomposition
(SVD), the extension of principal components analysis that estimates simultaneous least-squares
principal components for two sets of variables. Ut×r is a term-by-factor matrix of eigenvectors of
XXT, the t×t term covariance matrix, that define r semantic themes in the data. The terms
associated with these eigenvectors can be used to interpret or define the latent semantic factors.
Σr×r is a diagonal matrix of singular values. VTr×d is a factor-by-document matrix of eigenvectors
of XTX, the d×d document covariance matrix, that associates the original documents to the
factors.
The SVD products are truncated to a reduced space of only the first k singular values,
σ1,…,σk, such that X̂t×d = Ut×k Σk×k VTk×d is the best rank-k least squares estimate of Xt×d. In
the data set used in this study, k = 20 was used for this estimate.
The outcomes of interest from SVD are the two loading matrices: LT, a t×k matrix of
term loadings, and LD, a d×k matrix of document loadings. Using the orthonormality property
VTV = I, where I is the k×k identity matrix, we obtain

LT = Xt×d Vk = Uk Σk.   (2)

Matrix LT associates the various terms with specific factors. This term-factor association
is used to facilitate the factor labeling process. To obtain document factor loadings, we use the
orthonormality property UTU = I, where I is the k×k identity matrix, and the fact that Σ = ΣT.
Post-multiplying both sides of XT = Vk Σk UkT by Uk gives

LD = XT Uk = Vk Σk.   (3)

Now, given any k×k orthonormal rotation matrix M, having the property MMT = I, we can obtain
rotated document loadings L′D = LD M.
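A compact sketch of this decomposition, using general-purpose Python libraries rather than the exact implementation used in this study (the sample comments, weighting choices, and k below are illustrative only), is:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

comments = [
    "late deliveries",
    "shipping process caused my order to arrive late",
    "prices increased and I want lower prices",
    "received a damaged product in the mail",
]

# Vector space model: a weighted, normalized term-document matrix.
vectorizer = TfidfVectorizer(stop_words="english")
X_dt = vectorizer.fit_transform(comments).toarray()   # documents x terms
X = X_dt.T                                            # terms x documents, as in X_{t x d}

# Truncated SVD: X is approximated by U_k Sigma_k V_k^T.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Uk, Sk, Vk = U[:, :k], np.diag(s[:k]), Vt[:k, :].T

LT = Uk @ Sk   # term loadings,     as in equation (2): LT = X V_k = U_k Sigma_k
LD = Vk @ Sk   # document loadings, as in equation (3): LD = X^T U_k = V_k Sigma_k

# Any orthonormal rotation M preserves the fit; the identity is a placeholder here.
M = np.eye(k)
LD_rot = LD @ M
print(LT.shape, LD.shape)   # (t, k) and (d, k)
```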
4. EXPERIMENTAL DATA
The data used in this analysis comes from a Fortune 500 specialty retailer offering an online
service to U.S. customers. Because this service provider is a member of a highly competitive
industry, various facts were altered to prevent revealing the provider’s identity. In the discussion
that follows, a descriptive word in square brackets [ ] replaces service and product names, e.g.
[SKU], to mask the identity of the firm. Certain figures that could disclose the identity of the
firm are modified proportionately without altering meaningful ratios.
The original data set consists of over 820,000 comments made by customers that had
decided to cancel their subscription service with the provider over a 27-month period. As a step
in the cancellation process, the company asked the customer why they were cancelling their
subscription service and provided an open-ended text box in which the response was recorded.
Customers received no reference cues; however, they were forced to provide text in the box in
order to complete the cancellation. Because the original data set is exceptionally large, the
analysis presented here considered a one-percent random sample.
Throughout the 27 months of data collection, the service provider was proactive with the
data using traditional analytic techniques, e.g. manually reading, interpreting, and running follow
up queries based on analyst interpretations. The firm also actively introduced strategic initiatives
that affected various business processes in response to nonconformities found in the customer
comments. These initiatives affect attempts at quality control charting because they change the
underlying business process and therefore will be used to establish stage change points in the
control charts. Table 1 lists the initiatives, as reported by the firm, that we expect to cause shifts
in the mean.
After trying a number of alternative factor solutions with LSA, the original customer
comments were reduced to a series of 20 factors (k = 20). Determining an optimal number of
factors remains an open problem in text mining research (Bradford, 2008). However, the service
provider’s feedback was that they found the 20 factors helpful in understanding the customer
comments. After SVD, we examined the high loading terms found in the LT matrix and labeled
the factors. We had three experts in quality independently review and label (define) each factor.
A subsequent meeting among the experts resulted in a labeling consensus.
The twenty factors extracted by LSA, with definitions, total frequency counts, and
percentage of variance explained, are listed in Table 2.

Table 1 List of key initiatives

Nr   Week   Initiative
1    1      Started with K1 distribution centers, 2,500 [SKUs]
2    7      Opened K2 new distribution centers
3    9      Increased to 3,000 [SKUs]
4    22     Major software revision: billing, shipping, website all affected
5    24     Increased price
6    26     Increased number of [SKUs] to 4,000
7    38     Opened K3 new distribution centers
8    47     Increased number of [SKUs] to 5,000
9    48     Started shipping [SKUs] from brick-and-mortar stores
10   54     Major software revision: billing, shipping, website all affected
11   58     Increased price
12   68     Introduced lower-priced subscriptions
13   79     Major software revision: billing, shipping, website all affected. This version crashed badly.
14   89     Introduced the ability to put subscription on hold
15   92     Increased number of [SKUs] to 6,000
16   100    Increased number of [SKUs] to 6,500

When a factor definition generated in the text-mining process describes a failure in a business
process or product, each observation in that factor is considered a fault, or nonconformity, in
that business process.
analysis focuses on a subset of six factors (j = 6) that describe service quality attributes of a
business process. The six quality oriented factors are labeled F3, F6, F7, F11, F14, and F16.
They represent 31-percent of cancellations and are the domain of a logistics business process. Of
the remaining factors, most are personal reasons for not continuing, e.g. do not use the product
enough (F1), moving (F2), or simply prefer the competitor (F9).
At first glance, it is tempting to combine factors 6, 7, and 11 into one factor and describe
it as “shipping delays”. However, a manual reading of the customers’ comments clearly reveals
that the LSA methodology in this instance was able to delineate a difference between these
factors. It separated those dissatisfied with the mail delivery process (F6), from those
disgruntled with delays caused in the shipping and receiving area (F7), and from those dissatisfied
because of delays in product availability (F11). Such differentiation in logistics services is likely
helpful for developing appropriate interventions.
Factor F3 is a combination of positive and negative sentiments similar to that described in
Hu and Liu (2004) and Bollegala et al. (2013). It requires additional categorizing before
analysis. While analysis through control charting of qualitative data with mixed sentiment is
noteworthy, we elected to simplify this demonstration and will leave control charts in bimodal
data to future research. This leaves five factors (F6, F7, F11, F14, and F16), all describing
topics of business processes in the logistics function and carrying a single semantic sentiment,
for use in this illustration.
5. WORKING DATA FOR CONTROL CHARTS
The relationship between the individual customer comments and the factors was established
based on the factor loadings in the LD matrix. As one scans a column of factor-loading data
in the LD matrix, the loadings do not terminate abruptly. At the top of the
column are those documents that most closely relate to the factor because of the terms the author
(customer) used. As one scans the column data, variation in the mixture of terms used by the
author reduces the association of the document to the factor. In the case of weaker loadings, the
author might be more terse, verbose, or abrupt in writing style about that factor. LSA, being a
mathematical procedure, then associates the document with the factor, but at a reduced factor
loading, when the term usage does not closely match the composition of the factor in the LT
matrix.
Documents with weaker factor loadings do not share a strong association with the
semantic definition of the factor. However, these documents are mathematically associated with
the factors because they share a limited number of the factors’ terms. We consider this noise
within the factor. To prevent the injection of noise into the analysis, a minimal threshold value
was established and only those documents that exceeded the threshold are considered high
loading and were included in subsequent analytic steps. Initially, threshold values θj for factor
loading components were determined empirically for each factor j by identifying the level at
which comments stopped being relevant to the factor. Because of the small variation in the
threshold values across all the factors in our data set, a common threshold value, θD, equal to
0.2702 was used. This value represents the point where, on average, each document loads on
exactly one factor.
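A minimal sketch of this threshold search, under the assumption that candidate values are taken from the observed loadings themselves (the loadings below are simulated, not the study data, and the function name is hypothetical), is:

```python
import numpy as np

def common_threshold(LD, target_loadings_per_doc=1.0):
    """Find the smallest loading threshold at which each document loads,
    on average, on at most the target number of factors (1.0 in this study).

    Absolute values are used here, which is a simplifying assumption about
    how negative loadings are handled."""
    LD = np.abs(np.asarray(LD, dtype=float))
    candidates = np.unique(LD)          # every observed loading, ascending
    for theta in candidates:
        if (LD >= theta).sum(axis=1).mean() <= target_loadings_per_doc:
            return theta
    return candidates[-1]

# Illustrative loadings for 5 documents and 3 factors.
LD = np.random.default_rng(0).uniform(0, 0.6, size=(5, 3))
theta_D = common_threshold(LD)
print(theta_D, (LD >= theta_D).sum(axis=1).mean())
```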
For an analysis of the distribution of the LD matrix, we convert the LD matrix to a
discrete form by considering the comments whose loadings exceed the threshold value θD. When
a comment’s loading value exceeds θD it is recoded as 1, while all comments with loadings that
fall below θD are coded 0. Then, we let nj be the count of high-loading comments for factor j,
defined using the indicator function I(lij ≥ θD) that compares the loading of the ith comment on the
jth factor (lij) to θD:

nj = Σi=1…N I(lij ≥ θD).   (4)
The sum of all high-loading comments is fixed at N by the value of the θD threshold. Then the
observed probability for any given factor j, i.e., the observed probability that a randomly selected
comment has a loading on factor j that exceeds the threshold θD, is the count of all comments
loading on factor j over the total high-loading count N,

pj = nj / N,

where pj ≥ 0 for j = 1, …, k, and Σj=1…k pj = 1. Each comment may discuss factor j with
probability pj independently of whether or not it discusses factor i with probability pi, when j ≠ i.
Therefore, the k factors correspond to k independent Bernoulli processes characterized by the
random variables Bj, j = 1, …, k. Table 3 illustrates the structure of our data set. The left panel
shows factor loadings for N comments (shown as rows) and k factors (shown as columns). Each
comment can load highly on zero, one, more than one, or up to all k factors. Looking at Table 3 in
a column-wise fashion, the total loadings count for each factor is nj, as shown in (4). Had the
company required, at the data collection point, that each customer pick one out of k available
issues, the distribution of customer-reported issues would have been multinomial over k
outcomes. The current approach, however, allows the issues to occur naturally in the voice of the
customer, in its original unstructured text form, and leads to a set of k independent Bernoulli
variables Bj, j = 1, …, k, representing whether or not each of the k identified concepts is raised by
the customer comment at a significant level. Looking at each factor separately allows us to
construct a control chart where the process that produces comments loading high on factor j is
monitored. The basis for such a control chart is the binary data shown in the right panel of
Table 3. The k independent Bernoulli variables shown in Table 3 are not identically distributed
because they have different probabilities pj.

Table 2 Factors resulting from LSA-based text mining

Factor   Label                                                                      Frequency count   Percent variance explained
F1       I do not use enough [SKUs]                                                       770              9.86
F2       I am moving                                                                      390              6.91
F3       The service (mixed comments)                                                     475              6.72
F4       Prefer [service] from store                                                      573              5.92
F5       Do not use enough to make it worthwhile                                          517              5.87
F6       Wait time too long for [SKU] to arrive (mail)                                    470              5.63
F7       Delays in shipping and receiving                                                 545              5.61
F8       Duplicate, fraud, or unused account                                              444              5.48
F9       Prefer competitor                                                                452              5.07
F10      I do not have enough time to use                                                 330              4.97
F11      Product availability waiting time is too long                                    415              4.66
F12      The service is great (with sub reason)                                           412              4.39
F13      Will return after an extended absence                                            442              4.16
F14      The customer wants a more extensive selection                                    310              3.79
F15      Trial membership ending, not continuing                                          279              3.78
F16      Received damaged products                                                        317              3.67
F17      Just wanted to try the service                                                   312              3.64
F18      Costs too much. Budget constraints                                               257              3.41
F19      Price increases. Want lower prices                                               254              3.25
F20      Sign-up issues (accidental sign-up, did not sign up, will sign up later)         251              3.19
Total                                                                                    8222            100
Note: Quality factors are F3, F6, F7, F11, F14, and F16.
Table 3 k independent Bernoulli random variables and their relationship to the factor loadings

LSA factor loadings:
Comment                       F1       F2       …    Fk
C1                            l1,1     l1,2     …    l1,k
C2                            l2,1     l2,2     …    l2,k
…                             …        …        …    …
CN                            lN,1     lN,2     …    lN,k
High-loading comment count    n1       n2       …    nk

Bernoulli outcomes:
Trial            B1      B2      …    Bk
1                0       0       …    1
2                1       0       …    0
…                …       …       …    …
N                1       1       …    0
Success count    n1      n2      …    nk
P(success)       n1/N    n2/N    …    nk/N
Independence of the processes is evaluated with Pearson’s correlation coefficient. Of the 190
possible pairwise correlations, only 4 have |R| ≥ 0.1, and only 13 fall in the range 0.05 ≤ |R| < 0.1.
Of those 17 correlations, the highest coefficient is 0.268. Therefore, the assumption of independence
appears reasonable.
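This independence screen is straightforward to reproduce; the sketch below computes the pairwise correlations for a simulated binary matrix rather than the study data.

```python
import numpy as np

def max_abs_correlation(B):
    """Pairwise Pearson correlations among the k binary indicator columns.

    Returns the largest off-diagonal |R| along with all off-diagonal values;
    with k = 20 there are k*(k-1)/2 = 190 distinct pairs, as in the text.
    """
    R = np.corrcoef(np.asarray(B, dtype=float), rowvar=False)
    off_diag = np.abs(R[np.triu_indices_from(R, k=1)])
    return off_diag.max(), off_diag

# Illustrative binary matrix: 1,000 comments x 20 factors, ~5% positive rate.
rng = np.random.default_rng(1)
B = (rng.uniform(size=(1000, 20)) < 0.05).astype(int)
max_r, all_r = max_abs_correlation(B)
print(round(max_r, 3), int((all_r >= 0.1).sum()))
```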
The time series nature of the comments allowed us to count, for each factor j and within
each time period t (in our case study, one week), the observations i that exceed the common
threshold θD. The number of high-loading documents (customer comments in this application) is

Dj,t = Σi=1…nt I(li,j,t ≥ θD),   (5)

where I(li,j,t ≥ θD) is an indicator function for the ith comment loading on factor j during period t
at or above the loading threshold, and nt is the total number of customer comments sampled in
period t. In this case study, the number of factors is k = 20, and the number of periods (weeks) is
T = 116. The counting process also included a cumulative count of all high-loading documents
per time period t across all factors j,

mt = Σj=1…k Dj,t.   (6)
The counts in (5) and (6) are combined to produce the proportion of high-loading documents
specifically addressing factor j over all high-loading documents in time period t,

p̂j,t = Dj,t / mt.   (7)
For a first overview of the working data, Fig. 2 presents a time series plot of the high
loading documents Dj,t as defined in (5), for factor j = 6. Fig. 2 shows a high count level in the
first six weeks as well as the increased activity in weeks 41, 85, and 95. Throughout this 116-
week period, the service provider membership was increasing, rendering a traditional time series
plot ineffective for presenting the data. Fig. 3 presents the proportions of high-loading
documents as defined in (7). An overall decreasing trend is observable in Fig. 3 which was not
present in Fig. 2. Also note that the increased activity now appears in weeks 41, 58, and 88 when
we consider the proportions of high-loading documents. While for some other applications the
total counts would be valuable, in this case we focused on the proportions. This makes more
sense in this application because the organization had a growing number of customers over the
observed 27-month period. In different business environments, it is potentially advantageous to
monitor the counts as well as the proportions.
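For reference, a minimal sketch of the weekly counting and proportion calculations in (5)-(7), using a hypothetical data layout with one row per comment and illustrative indicator columns, is:

```python
import pandas as pd

# Long-format toy data: one row per comment, with its week and the binary
# indicators for two factors (column names are hypothetical).
df = pd.DataFrame({
    "week": [1, 1, 1, 2, 2, 2, 2],
    "F6":   [1, 0, 1, 0, 0, 1, 0],
    "F7":   [0, 0, 1, 1, 0, 0, 1],
})

D = df.groupby("week")[["F6", "F7"]].sum()   # D_{j,t}: high-loading counts per factor and week
m = D.sum(axis=1)                            # m_t: all high-loading comments in week t
p = D.div(m, axis=0)                         # p_{j,t} = D_{j,t} / m_t, as in equation (7)
print(p)
```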
Fig. 2 Time series plot of factor F6 observations
Given the Dj,t and mt variable data, we are now ready to transition our approach from
LSA to SPC issues. In the next section, we show how control charts are used to monitor the
factor data. We also describe how such monitoring allows evaluation of the company’s
initiatives via stage testing.
Fig. 3 Time series plot of factor F6 as a proportion of all observations
6. CHART SELECTION FOR TEXT
At this point, the data are fully converted to binomial count data, defined as nonconforming for a
specific cause, and loadable to the chart of the researcher’s choice. The p-chart was selected for
this illustration for several reasons. First, the objective of the analysis is to demonstrate the
application of control charts to the monitoring of text-mined customer comment data and a basic
p-chart allows such demonstration without the complication of having to explain and develop a
more complex control chart. Second, the p-chart is a fundamental control chart for examining
this data and is well understood by many professionals from a variety of specializations. While
the np-chart would have performed equally as well as the p-chart, the c-chart and u-chart are
inappropriate in this study because we consider each observation a nonconformity and we are not
counting the nonconformities per unit. An added advantage of the p-chart is its straightforward
interpretation: it shows variation as a fraction of nonconforming output (Duncan, 1986). From
a practitioner’s perspective, the chart is easy to construct, easy to interpret, and well understood
(Montgomery, 1996).
One limitation of the p-chart is that a sufficient sample size n is required to avoid
alarming on a single or small number of nonconformities if p, the actual process value, is low
(Duncan, 1986). One strategy for avoiding small values of n is to adjust the time frame used for
taking samples. Our data set, like many customer comment data sets, was exceptionally large
and did not present a sample size problem. In fact, our data set was sufficiently large (>820,000)
to allow sampling on a daily basis. However, in most business organizations, performing this
analysis daily is not recommended. Daily review may not reveal central themes and as a result
can miss changes that are strategic in nature. However, monthly processing is also not
recommended because such a time period may delay detecting shifts caused by competitor
actions. For example, while a customer may be satisfied with your delivery time today, they
may be even more satisfied with a competitor that suddenly cuts the delivery time. When
compared with machine processes, customer sentiment typically develops over longer time
periods. Finally, the analytic processing may require significant commitment of human
resources. Skilled staffing was self-reported as a barrier to conducting big data analytics by 46-
percent of the respondents in a recent survey (Russom, 2011). For these reasons, we
experimented with different alternatives and chose to use weekly interval sampling because it is
a convenient natural time period that fits the process change timing concerns. The weekly
sampling plan worked out nicely, resulting in samples of 30 to 60 observations per factor in the
early period of the analysis. The number of observations per factor increased to a range of 100
to 190 per week in the latter time periods because of the increase in subscribership. While an
interval sampling plan could be implemented, with samples of size n taken at specified times t,
the chart also works well with variable sample sizes. This aspect of p-charts allows us to
automate processes and, if desired, implement a 100-percent sampling plan.
Since each factor defines a different issue with a business process, we plot the data of
each factor separately. The p-chart for factor j uses as a baseline the number of nonconforming
units Dj,t divided by the sample size mt, the count of high-loading documents summed across all
factors j, pooled over the charted time periods, such that (Montgomery, 1996)

p̄j = Σt Dj,t / Σt mt.   (8)

In many settings, e.g. service quality, data populated on control charts will not have
predetermined objective measurements, and the researcher or management has to establish a
standard acceptable level of p. In this research, we use the mean of the data within each stage as
the centerline. For control limits, we use the three-sigma standard given by (Montgomery, 1996)

UCLt, LCLt = p̄j ± 3 √( p̄j (1 − p̄j) / mt ).   (9)
Because we use stages in the charts, shifts in the mean are tested under the hypotheses
H0: p1 = p2 and H1: p1 ≠ p2 using the z statistic

Z = (p̂1 − p̂2) / √( p̂ (1 − p̂) (1/n1 + 1/n2) ),   (10)

where p̂ = (n1 p̂1 + n2 p̂2) / (n1 + n2) (Montgomery, 1996). We evaluate results at a 5-percent
significance level, for which Z0.025 = 1.96.
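A minimal sketch of the chart parameters in (8)-(10), with hypothetical function names and weekly arrays, is shown below; the final call reproduces one z value reported in Table 4 as a check.

```python
import numpy as np

def p_chart_limits(D, m):
    """Centerline and three-sigma limits for a variable-sample-size p-chart,
    following equations (8) and (9); D and m are per-week arrays."""
    D, m = np.asarray(D, float), np.asarray(m, float)
    p_bar = D.sum() / m.sum()
    sigma = np.sqrt(p_bar * (1 - p_bar) / m)
    ucl = p_bar + 3 * sigma
    lcl = np.clip(p_bar - 3 * sigma, 0, None)   # lower limit truncated at zero
    return p_bar, ucl, lcl

def two_proportion_z(p1, n1, p2, n2):
    """Stage-shift test of H0: p1 = p2, as in equation (10)."""
    p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Reproduces the factor 6, stage 1 vs stage 2 comparison in Table 4.
z = two_proportion_z(0.161290, 310, 0.107649, 706)
print(round(z, 4))   # approximately 2.3886, matching the tabled 2.388644
```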
7. P-CHARTS AND STAGE TESTING
The quality related factors in Table 2 (F6, F7, F11, F14 and F16) each describe a specific quality
issue found within the customer comments. Three of those five factors involve time. The
time-oriented factors are: waiting time too long for mailed products (F6), delays in the shipping
and handling process (F7), and availability waiting time too long for preferred products (F11).
Logistics activities provide the place and time utility (Stock and Lambert, 2001) and each of
these factors could be considered a nonconformity category in time utility.
During data collection, the company experienced several strategic changes and process
changes that altered the mix of the logistics system and, as a result, could potentially have affected
the time utility and, more specifically, the customer service level. Of the items in Table 1, six changes
have the potential to shift the mean within these time oriented factors. These six changes include
the opening of additional distribution centers in weeks 7 and 38. These actions pushed supply
closer to the customer and theoretically reduced the transportation time.
The initiation of shipping from local brick-and-mortar sites in week 48 also pushed
supply closer to the customer, reducing transportation time. However, this initiative could
possibly conflict with the store employees’ core competencies. The software revisions in weeks
22, 54, and 79 were key initiatives that affected all facets of the business. If a shift in the mean
occurs, then these events also signify a stage change in the p-charts. Process changes create a
potentially uncomfortable experience and are often associated with a startup learning curve.
Furthermore, intentionally undertaking a change for improvement requires a well-defined
process (Evans and Lindsay, 2008). Therefore, changes in the distribution options could
potentially increase the process mean if the implementation went badly or if the change was
poorly received.
Fig. 4 presents the final staged p-charts for factors 6, 7, and 11. All computations
associated with these factors are in Table 4.

Fig. 4. p-charts of factors F6, F7, and F11 with associated stage shifts

During stage testing, the initiation of product
shipments from stores (week 48) and the second software roll out (week 54) were not significant
at the α = 0.05 level for either factor 6 (mail wait too long) or factor 11 (product availability too
long). Since the initiation of shipments from local stores in addition to the existing distribution
centers reduced the distance from product to customer, this process change was expected to
reduce customer departures because of mail delivery delays. The failure to reduce the departure
rate due to delays was surprising but potentially attributable to such issues as the learning curve
at the local store, conflicts in priorities at the local store, and possible routing delays with local
mail.
Stage testing in factor 7 (delays in shipping and receiving) found that the software
deployment (week 22) and initiation of shipments from stores (week 48) were not significantly
different with respect to shipping and handling process delays. For factor 7, the technology
applied did not reduce the process mean and customers continued to depart because of delays in
shipping and receiving at the same rate. While technology improvements, such as the major
software deployment in week 22, are typically expected to improve the efficiency of a system,
the system efficiency is limited by its weakest link.
Table 4 Parameter estimates and z-testing of stages for factors 6, 7, and 11

Stage   Week   p̄          UCL      LCL      n1     n2     p̂ (pooled)   Z           Conclusion
Factor 6
1       1      0.161290   0.3087   0.0138
2       7      0.107649   0.2626   0        310    706    0.124016     2.388644    Reject
3       22     0.053333   0.1550   0        706    900    0.07721      4.047477    Reject
4       38     0.082452   0.1991   0        900    473    0.063365     -2.10467    Reject
5       48     0.060729   0.2171   0        473    247    0.075        1.050618    FTR
4Rev    38     0.075000   0.2474   0        720
6       54     0.065588   0.1667   0        720    1113   0.06368      -0.41614    FTR
4Rev2   38     0.069285   0.1730   0        1833
7       79     0.037685   0.0863   0        1833   4458   0.048557     6.256838    Reject
Factor 7
1       1      0.122581   0.2541   0
2       7      0.066572   0.1912   0        310    706    0.083661     2.968923    Reject
3       22     0.052222   0.1528   0        706    900    0.058531     1.215931    FTR
2Rev    7      0.058531   0.1647   0        1606
4       38     0.103594   0.2329   0        1606   473    0.068783     -3.40358    Reject
5       48     0.097166   0.2911   0        473    247    0.101389     0.271277    FTR
4Rev    38     0.101389   0.2990   0        720
6       54     0.078167   0.1878   0        720    1113   0.087289     1.720214    Reject
7       79     0.056528   0.1155   0        1113   4458   0.060851     2.701458    Reject
Factor 11
1       1      0.138710   0.2773   0
2       7      0.073654   0.2043   0        310    706    0.093504     3.279601    Reject
3       22     0.037778   0.1240   0        706    900    0.053549     3.169836    Reject
4       38     0.057082   0.1555   0        900    473    0.044428     -1.64975    Reject
5       48     0.072874   0.2430   0        473    247    0.0625       -0.83105    FTR
4Rev    38     0.062500   0.2210   0        720
6       54     0.067385   0.1697   0        720    1113   0.065466     -0.41298    FTR
4Rev2   38     0.065466   0.1664   0        1833
7       79     0.037236   0.0856   0        1833   4458   0.045462     4.88408     Reject
In all three charts, only three violations of the upper control limit are present. In the F6
chart (Fig. 4), weeks 86 and 87 require evaluation by management. These out of control states
are potentially associated with the software revision initiated in week 79. That revision, installed
seven weeks prior, initially crashed badly. We would not expect customers to respond instantly
to a software failure that impacted distribution but to become irritated over time. This software
change affected all facets of the business and possibly caused the out of control states observed
in weeks 86 and 87. The opening of new distribution centers in week 38 also went badly. Those
openings increased the number of distribution centers by 50-percent, after which the mean of all
three factors increased. Nevertheless, even though weeks 86 and 87 were out of control, the
mean of the stage initiated at week 79 was the lowest for any stage with respect to mail delays
(factor 6) and product availability delays (factor 11). This supports the contention that the
initiatives undertaken by the firm reduced the nonconformities.
Factor 14 (Fig. 5) observations represent nonconformities because these customers
wanted a more extensive product selection than was currently offered. While product mix is a
marketing issue, an increase in the product SKUs carried by the company directly increases the
workloads in the logistics system. There were several occasions during the data collection
period that the company expanded their product offerings. While these increases potentially
reduced the frequency of factor 14 nonconformities, they could also result in nonconformities in
other areas of the logistics system. The increases in product offerings occurred in weeks 9, 26,
47, 92, and 100. Parameter estimates and z-test data are in Table 5 while Fig. 5 has the final
chart of the data. Stage testing found no significant difference in the means of data until the
increase in offerings in week 92. Interestingly enough, at that time the mean of the factor shifted
up. Since the company was actively increasing offerings all along, the expectation was that the
mean would shift down. Observations in this chart are possibly best explained by the actions of
a competitor although the information needed to support that hypothesis is not available. The
chart signals out-of-control conditions only twice, at weeks 90 and 91, just before the mean shifted
up. While not out of control, four points within the first stage are worth further consideration.
These are weeks 41, 53, 54, and 85. At various points along the data collection period, the
company’s marketing department was offering free trial subscriptions through nationwide
marketing campaigns. We believe these four observations are a result of a sudden surge in
subscriptions followed by cancellations extending from the experience.
Fig. 5. Final p-chart of factor F14 after stage testing
Factor 16 reflects customers departing because they received damaged products. For
example, changes in operator procedures, training of personnel, or changes in the packaging used in
shipments can influence the amount of damaged product delivered. This data is presented in a
single-stage chart format using the factor grand mean p̄ = 0.0386256 as the centerline
(Fig. 6). The chart signals only one out of control condition when in week 28 it violates the
upper control limit, but four observations, in weeks 42, 43, 44, and 50, violated the rule of nine
points in a row on the same side of the centerline. In addition, the tail area beyond week 85 visually
appears to have a tighter standard deviation, the cause of which we cannot determine. This pattern
occurs in all the charts and might suggest stability in the system.
Fig. 6. p-chart of factor F16 (customers receiving damaged products)
8. EWMA CHARTS AND TEXT DATA
The p-chart has one added disadvantage we have not previously addressed: it detects large shifts
well but not small shifts. In the instance of the data set at hand, this has not been an issue
because the full data set consists of over 820,000 observations. In other settings, the data set may
not be so robust, meaning the p-chart could be disadvantaged.
Table 5 Parameter estimates and z-testing of factor 14

Stage   Week   p̄          UCL      LCL   n1     n2     p̂ (pooled)   Z           Conclusion
Factor 14
1       1      0.023419   0.0830   0
2       9      0.033069   0.1279   0     427    756    0.029586     -0.94074    FTR
1Rev1   1      0.029586   0.1194   0     1183
3       26     0.021626   0.0976   0     1183   1156   0.025652     1.217375    FTR
1Rev2   1      0.025652   0.1082   0     2339
4       47     0.032712   0.0850   0     2339   2201   0.029075     -1.41506    FTR
1Rev3   1      0.029075   0.0785   0     4540
5       92     0.046473   0.0964   0     4540   1205   0.032724     -3.01766    Reject
6       100    0.049553   0.1050   0     1205   2462   0.048541     -0.40767    FTR
5Rev1   92     0.048541   0.1034   0     3667
For this study, we also illustrate the
use of the EWMA chart with text data; however, recall from Section 2 that a more comprehensive
application of the EWMA to text data is available in Ashton et al. (2013b). The EWMA chart
performs relatively better in detecting small shifts, particularly when the values of λ are small
(Montgomery, 1996). In many instances, however, the EWMA is problematic in text data given
cross-loading topics (attribution) and terse text tendencies. For the EWMA chart demonstrated
here, we set the value of λ to 0.10, the control limit width L to 2.814, and use a moving range of 3
(Lucas and Saccucci, 1990). These values are selected to give us the greatest sensitivity to small
shifts, with an expected average run length ARL0 of 500 when no shift is present, yet a very short
ARL1 of 10.3 when the system goes out of control by one standard deviation (Lucas and Saccucci,
1990). For this illustration, only one factor of data, factor F7, is presented. In this instance, the
EWMA presents a picture that is consistent with that presented by the p-chart (see Fig. 7).
Factor F7’s p-chart (Panel B) is also provided for convenient comparison.

Fig. 7. Factor F7 (delays in shipping and receiving). Panel A: EWMA with λ = 0.10 and
L = 2.814. Panel B: p-chart
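A minimal EWMA sketch with the (λ, L) pair used here is given below; the weekly proportions are invented, and estimating the in-control mean and standard deviation from the series itself is a simplifying assumption rather than the exact procedure applied to the study data.

```python
import numpy as np

def ewma_chart(x, lam=0.10, L=2.814):
    """EWMA statistic and time-varying control limits for a series of
    weekly proportions x (Lucas and Saccucci, 1990)."""
    x = np.asarray(x, dtype=float)
    mu0, sigma = x.mean(), x.std(ddof=1)   # simplifying in-control estimates
    z = np.empty_like(x)
    z[0] = lam * x[0] + (1 - lam) * mu0
    for t in range(1, len(x)):
        z[t] = lam * x[t] + (1 - lam) * z[t - 1]
    t_idx = np.arange(1, len(x) + 1)
    width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t_idx)))
    return z, mu0 + width, mu0 - width

# Illustrative weekly proportions for a factor such as F7 (values are made up).
p_hat = np.array([0.10, 0.09, 0.11, 0.12, 0.10, 0.14, 0.13, 0.15])
z, ucl, lcl = ewma_chart(p_hat)
print(np.round(z, 3))
```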
9. CONCLUSION
In this study, we develop and present an approach to evaluate textual data with multiple latent
causes via control charts. We achieve that by extracting the concept structure using LSA. In the
singular value decomposition step of LSA, we extract factor loadings for the quality related
factors, discretize them, and count the number of high-loading customer comments per time
frame of interest. These counts are used to populate the control chart. Since the selected factors
are associated with a specific cause for nonconformity, the resulting control charts allow
evaluation of the company’s efforts to improve its business processes.
This study examined data collected during an on-line subscription cancellation process.
However, the method developed has broader application. In this study, the customer was
required to complete a text box before the cancellation process would complete. Given the
development of the internet, personal communication devices, and social networking, it is possible
to collect business-oriented data related to product quality, service quality, perceptions, and
offerings on a nearly continuous basis. Many such venues use short text responses and rely
on compatible technologies such as short message service (SMS), or limits specified by the
hosting service (Twitter). As a result of the imposed message limits, instrument development
becomes crucial. Cocco and Tuzzi (2012) found a question-wording effect as a result of the
method of survey completion. That study also found that the question-order effect increased
depending on the survey completion mode. Cocco and Tuzzi’s (2012) work was based on Likert-
scale instruments; however, their work suggests that instrument development designed for text data
collection should consider the behavioral norms of the technology user. In recent years, effort
has gone into the development of hybrid survey instruments that collect a mixture of Likert-scale
and open-ended question data (Fielding et al., 2013). While our current method was developed
with data exclusively generated from open-ended questions, applying the control charting
technique presented here with the text portion of a hybrid instrument is a potentially valuable
approach.
Consistent with the text analysis method presented, two business oriented text analytics
strategies present themselves. The first strategy depends on a text-only response instrument
designed consistent with Cocco and Tuzzi’s (2012) recommendations. As a first step of analysis, apply one of the
many sentiment analysis techniques (Feldman, 2013; Scharkow, 2013). Separate the respondents
based on the sentiment score into at least three different subsets reflecting positive, neutral, and
negative sentiment. Text-mine the three subsets independently, thereby uncovering the cause(s)
for the opinion. Finally, monitor the results via control chart.
The second strategy depends on the development of a hybrid instrument (Fielding et al.,
2013). When the data are analyzed, text responses could be compartmentalized into subsets
according to the Likert response. The Likert response item essentially functions as a sentiment
analysis score and allows the respondents to self-select among categories reflecting positive,
neutral, and negative sentiment. Text mining of these separate subsets would then reveal the
cause(s) of the sentiment. Those results could then be monitored by a control chart.
If either of these instruments were deployed into a mobile environment, then it is likely
the instrument items will need to be consistent with Cocco and Tuzzi’s (2012) approach. Such an
instrument could be deployed into the market on a regularly scheduled, predefined period of
time, generating as many samples as necessary.
Selection of a particular control chart for text-mined data is situation dependent and is
compounded by the data structure and the monitoring objectives. In the present research,
customers frequently attribute multiple causes for nonconformity in their comments.
Consistent with such data, we describe the occurrence of comments that are relevant to
nonconformity categories by a number of independent Bernoulli processes. In other applications,
the survey subject or survey questions may be narrower in scope and can
result in data that consist of exactly one explanation selected out of a set of available
nonconformity explanations. In such a case, the process would be multinomial. Alternatively,
when the objective is to monitor how many nonconformities occur per comment, the data can
also be collapsed into counts of nonconformities per comment. In such a case, the Poisson
binomial distribution would be applicable because it describes the sum of k independent,
non-identical Bernoulli processes. This study is an important effort toward opening the possibility of
such options for future work.
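As an illustration of that last option, the Poisson binomial probabilities can be computed by iterative convolution of the k Bernoulli components; the probabilities below are illustrative, not estimates from the study data.

```python
import numpy as np

def poisson_binomial_pmf(p):
    """Distribution of the number of nonconformity categories raised per comment,
    i.e., the sum of k independent Bernoulli variables with probabilities p_j."""
    pmf = np.array([1.0])
    for pj in p:
        pmf = np.convolve(pmf, [1 - pj, pj])   # add one Bernoulli component at a time
    return pmf

p_j = [0.06, 0.06, 0.05, 0.04, 0.04]   # illustrative per-factor probabilities
pmf = poisson_binomial_pmf(p_j)
print(np.round(pmf, 4))                # P(0 causes), P(1 cause), ..., P(5 causes)
```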
The present work makes an important contribution to the body of literature because it is
the first work that takes unstructured textual feedback and establishes a mechanism for
transitioning the analysis of that text into actionable SPC. In a traditional analytic approach to
textual data, manual content analysis is labor intensive and more subject to interpretation bias
than the LSA control chart approach we develop and illustrate in this work. This LSA based
approach represents a contribution to manufacturing and service quality because it provides
managers and researchers a means to systematically analyze text based data without the bias and
labor intensive problems associated with traditional manual text analysis systems. The use of
raw textual data is also potentially superior to the use of structured survey instruments because in
many instances, the extracted factors represent nonconformities that were not previously
anticipated and analysis can proceed to root cause analysis. It is also a reasonably efficient
process given the power of current desktop computers, and it is likely to become of increasing
importance in this era of Big Data and the emerging era of mega data.
REFERENCES
Arora, S., Ge, R., Moitra, A.: Learning topic models going beyond SVD. In Proc. IEEE 53rd
Annu. Symp. on Found. of Comput. Sci., New Brunswick, NJ, 1-10 (2012)
Ashton, T., Evangelopoulos, N., Prybutok, V.: Extending monitoring methods to textual data: A
research agenda. Qual. Quant., (2013a). doi 10.1007/s11135-013-9891-8
Ashton, T., Evangelopoulos, N., Prybutok, V.: Exponentially weighted moving average control
charts for monitoring customer service quality comments. Int. J. of Serv. and Stand. 8(3),
230-246 (2013b) doi 10.1504/IJSS.2013.057237
Bollegala, D., Weir, D., Carroll, J.: Cross-domain sentiment classification using a sentiment
sensitive thesaurus. IEEE Trans. Knowl. Data Eng., 25(8), 1719-1731 (2013)
Bradford, R.: An empirical study of required dimensionality for large-scale latent semantic
indexing applications. CIKM ’08: Proc. of the 17th ACM Conf. on Inf. and Knowl. Manag.,
ACM, pp. 153-162, New York (2008)
Cocco, M., Tuzzi, A.: New data collection modes for surveys: a comparative analysis of the
influence of survey mode on question-wording effects. Qual. Quant., 47, 3135-3152 (2013)
Deerwester, S., Dumais, S., Furnas, G., Landauer, T., Harshman, R.: Indexing by latent semantic
analysis. J. of the Am. Soc. for Inf. Sci., 41(6), 391-407 (1990)
Duncan, A.: Quality control and industrial statistics (5th ed.). pp. 436-451. Irwin, Homewood,
IL. (1986)
Evangelopoulos, N.: Latent Semantic Analysis. Wiley Interdisc. Rev.: Cogn. Sci., 4(6), 683-692
(2013). doi 10.1002/wcs.1254
Evangelopoulos, N., Zhang, X., Prybutok, V.: Latent semantic analysis: Five methodological
recommendations. Eur. J. of Inf. Syst., 21, 70-86 (2012). doi 10.1057/ejis.2010.61
Evans, J., Lindsay, W.: Managing for quality and performance excellence (7th ed.). pp. 462.
Thomson/South-Western, Mason, OH, (2008)
Feldman, R.: Techniques and applications for sentiment analysis. Commun. of the ACM, 56(4),
82-89 (2013)
Fielding, J., Fielding, N., Hughes, G.: Opening up open-ended survey data using qualitative
software. Qual. Quant., 47, 3261-3276 (2013)
Heider, F.: The psychology of interpersonal relations. New York: John Wiley & Sons (1958)
Hu, M., Liu, B.: Mining and summarizing customer reviews. Proc. of Knowl., Discov., and
Datamining, pp.168-177. Association for Computing Machinery, Seattle, WA. (2004)
Hu, X., Cai, Z., Wiemer-Hastings, P., Graesser, A.C., McNamara, D.S.: Strengths, Limitations,
and Extensions of LSA. In: T. K. Landauer, D.S. McNamara, S. Dennis, W. Kintsch, (Eds.),
Handbook of latent semantic analysis, pp. 401-425. Lawrence Erlbaum Associates, Mahwah,
NJ (2007)
Landauer, T.: LSA as a theory of meaning. In: T. K. Landauer, D.S. McNamara, S. Dennis, W.
Kintsch, (Eds.), Handbook of Latent Semantic Analysis, pp. 1-32. Lawrence Erlbaum
Associates, Mahwah, NJ (2007)
Liu, B.: Sentiment Analysis and Opinion Mining. San Rafael, CA: Morgan & Claypool, (2012)
Lo, S.: Web service quality control based on text mining using support vector machine. Expert
Syst. with Appl., 34, 603-610 (2008)
Lucas, J., Saccucci, M.: Exponentially Weighted Moving Average Control Schemes: Properties
and Enhancements. Technometrics, 32 (1), 1-12 (1990)
Manning, C., Raghavan, P., Schütze, H.: Matrix decompositions and latent semantic indexing.
In: Introduction to Information Retrieval, pp. 403-420. Cambridge University Press, New
York, NY (2008)
Montgomery, D.: Introduction to statistical quality control (3rd ed.). pp. 252-266. John Wiley &
Sons, New York, NY. (1996)
Russom, P.: Big data analytics. TDWI Best Practices Report. The Data Warehouse Institute,
Reston, Va. [Online]. Available: http://tdwi.org/research/2011/09/best-practices-report-q4-
big-data-analytics.aspx (2011, Fourth Quarter).
Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. and
Manag., 24(5), 513-523 (1988)
Salton, G.: A Vector Space Model for Automatic Indexing. Commun. of the ACM, 18(11),
613-620 (1975)
Scharkow, M.: Thematic content analysis using supervised machine learning: An empirical
evaluation using German online news. Qual. Quant., 47, 761-773 (2013)
Schott, J.: Systems of linear equations. In: Matrix analysis for statistics (2nd ed.), pp. 221-254.
John Wiley & Sons, Hoboken, NJ (2005)
Shehata, S., Karray, F., Kamel, M.S.: An efficient concept-based mining model for
enhancing text clustering. IEEE Trans. Knowl. Data Eng., 22(10), 1360-1371 (2013)
Sidorova, A., Evangelopoulos, N., Valacich, J., Ramakrishnan, T.: Uncovering the intellectual
core of the information systems discipline. MIS Q., 32(3), 467-A20 (2008)
Stock, J., Lambert, D.: Strategic logistics management (4th ed.), pp. 9-29. McGraw-Hill/Irwin, Boston, MA (2001)
Szarka, J., Woodall, W.: On the equivalence of the Bernoulli and geometric CUSUM charts. J.
of Qual. Technology, 44(1), 54-62 (2012)
Zipf, G.K.: Human behavior and the principle of least effort. Addison-Wesley Press,
Cambridge, MA (1949)