A New Perspective for Neural Networks: Application to a Marketing Management Problem

JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 25, 1605-1616 (2009)
JAESOO KIM AND HEEJUNE AHN+
Department of Computer Science and Engineering
+Department of Control and Instrumentation Engineering
Seoul National University of Technology
Seoul, 139-743 Korea
E-mail: heejune@snut.ac.kr
Over the last few years, connectionism or neural networks (NN) have successfully
been applied to a wide range of areas and have demonstrated their capabilities in solving
complex problems. Current indications show that these techniques are very important and
rapidly developing areas of research and applications, particularly, in the area of data
mining for knowledge discovery. One particular neural network model, the back-propa-
gation (BP) algorithm, has performed very well in this regard and it is now accepted as a
reliable method for data mining. However, these models have their shortcomings. The
major difficulty lies in the fact that the relationships between specific variables and the
neural network results are, at best, difficult to explain. This article presents an innovative
but simple method for using NN to understand the pattern/outcome correlation to inter-
pret a cause and effect relationship. A comparative analysis and experimental results are
also presented to show the validity of the proposed scheme.
Keywords: neural networks, sensitivity analysis, CART, logistic regression, data mining
1. INTRODUCTION
1.1 Background
Information has become a very important commodity. Every second, hundreds of thousands
of new records are generated. This information needs to be summarized and synthesized
if it is to support effective decision-making, which involves the challenge of dealing
with huge, dynamic, incomplete or imprecise, and noisy data sets, with missing
attributes, and with redundant or insignificant data. One successful approach to modeling
non-linear relationships under these conditions is the use of artificial neural networks
(ANN), or connectionism, which can be trained on the set of available data [1, 8]. One
particular neural network type, the network trained with the back-propagation (BP)
algorithm, has performed very well in this regard and is now accepted as a reliable
method for data mining [2].
Neural networks share several advantages with many other data mining tools. An
advantage they have over classical models used to analyze data, such as regression
analysis, is that they can fit data where the relation between independent and dependent
variables is nonlinear and where the specific form of the nonlinear relationship is
unknown. Also, decision trees, a method of splitting data into homogeneous clusters with
similar expected values for the dependent variable, are often less effective when the
predictor variables are continuous than when they are nominal (or categorical). Neural
networks work well with both nominal and continuous variables. They do not require that
the relationships between predictor and dependent variables be linear, whether or not the
variables are transformed. The neural network method is more robust and has better
predictive accuracy than classical methods, such as discriminant and logistic analysis, in
many data mining applications. As the focus of this paper is neural networks, the other
data mining techniques will not be discussed further.
Received October 19, 2007; revised April 9 & July 17, 2008; accepted August 28, 2008.
Communicated by Chin-Teng Lin.
+ Corresponding author.
In spite of their advantages, neural networks trained with the BP algorithm have
shortcomings. The major difficulty is that the relationships between specific or
causal variables and the neural network results are, at best, difficult to explain because
of the complexity of the functions used in the neural network approximations. The output
of a neural network is a predicted value and some goodness-of-fit statistics; the
functional form of the relationship between predictor and target variables is not made
explicit. So the nature and strength of the relationship between the independent and
dependent variables, i.e., the importance of each variable, is usually not revealed.
Validating unexplainable results can be a significant challenge. Yet a trained network
must have captured some general regularity in the data that led it to its result. The
challenge is to find an appropriate way to uncover these interrelationships, so that
they can be used in the future without requiring additional knowledge about the character
of the task.
The aim of this paper is to show that neural network modeling may offer significant
advantages over the commonly used estimation procedures, in that it can summarize the
large amount of collected data into relevant, concrete and effective action
recommendations for decision makers. To meet the growing demands of decision makers, we
have to focus on systems that find adequate explanation models.
In this study, we also compare the abilities of a neural network model, logistic
regression, and classification and regression trees (CART) to capture the
interrelationships between the independent variables and the dependent variables. The
paper further describes how a model of the factors influencing consumer behavior, from
which initial measures can be used, can be produced using a neural network based on
consumer survey data.
1.2 Cultural Orientation and Consumer Behavior
Culturally based norms (appropriate behavior in a situation) and values (desirable
behavior across situations) lead to differences in consumer behavior across
cultures. These values and norms are passed on from the community to an individual as
he or she is socialized within the community. Consumers learn values and norms about
the acquisition, consumption and disposal of products through socialization in their
communities. Thus cultural values and norms become a primary explanation of similarities
in the behavior of individuals within a community, and of differences in the behavior of
individuals across communities [4].
In particular, Chinese consumer behavior is distinctive because of its unique
cultural, social and economic roots [13-15]. The behavior of Chinese consumers has
even been distinguished from that of consumers in other Asian countries [5]. Sun &
Collins [13] studied consumers' attitudes towards imported fruit in Guangzhou, finding
that fruit attributes relating to symbolic and hedonic values were of primary importance
in the decision to purchase.
Imported fruits have been widely available in China since about 1993. They are re-
tailed throughout the year in every major city in the country. Though imported fruit is
more expensive than locally produced fruit, there are still many willing buyers and the
imported fruit business has experienced burgeoning demand and high profits.
For this study, survey data were collected through structured intercept interviews
with consumers at point of sale immediately after they had purchased imported fruit.
Results will help to broaden our understanding of Chinese consumer behavior and pro-
vide valuable information when formulating marketing strategies.
The rest of the article is organized as follows. Section 2 reviews the studies on neu-
ral networks. In section 3, we introduce a sensitivity measure that assesses the relative
importance of the input factors used by the network to arrive at its targets and review the
existing relative importance measure for neural network input elements. In section 4, we
apply the method discussed in section 3 to a marketing management problem. Then, sec-
tion 5 compares and contrasts how neural networks and classical modeling techniques
deal with the specific modeling challenges and how the output of neural networks can be
used to better understand the relationship in the data through sensitivity analysis. Subse-
quently, we examine the results of our studies and test how the neural network model per-
forms in practice using the real-world data set. Finally, we draw conclusions in section 6.
2. NEURAL NETWORKS AND UNDERSTANDING THEIR OUTPUT
Neural networks are based on an early model of human brain function. Although it
is described as a network, a neural network is nothing more than a mathematical function
that computes an output based on a set of input variables. The network paradigm makes
it easy to decompose the larger function to a set of related subfunctions, and it enables a
variety of learning algorithms that can estimate the parameters of the subfunctions.
There are many different types of neural networks. The feedforward neural network
with one hidden layer considered in this paper is known as a multilayer perceptron
(MLP); it is one of the most popular kinds of neural networks and uses supervised
learning. Its effectiveness is well established and software for applying it is
widely available. It has also been proved that a network with only one hidden layer is
enough to approximate any continuous function, provided there are enough nodes in the
hidden layer [3, 9]. The hidden layer is used to model the nonlinearities in the
relationship between inputs and output [11]. Neural networks might therefore represent a
viable alternative to multivariate statistical methods.
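The view of a network as a composition of subfunctions can be made concrete with a minimal sketch. It assumes sigmoid units and a single output node; the function names and the toy 2-2-1 weights are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Output of a one-hidden-layer network with sigmoid units and one output node."""
    # Each hidden node is a subfunction: a weighted sum passed through a sigmoid.
    hidden = [sigmoid(b + sum(w * xi for w, xi in zip(row, x)))
              for row, b in zip(w_hidden, b_hidden)]
    # The output node combines the hidden activations in the same way.
    return sigmoid(b_out + sum(v * h for v, h in zip(w_out, hidden)))

# Hypothetical 2-2-1 network. With all weights zero, every unit returns sigmoid(0).
y_zero = mlp_forward([1.0, -2.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [0.0, 0.0], 0.0)
# Arbitrary non-zero weights, chosen only for illustration.
y_any = mlp_forward([1.0, -2.0], [[0.5, -0.3], [0.1, 0.8]], [0.0, 0.2], [1.5, -2.0], 0.1)
```

Training only changes the numbers in `w_hidden`, `b_hidden`, `w_out` and `b_out`; the functional form stays fixed.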
Although neural networks can be applied to a number of data mining problems,
including classification, regression, and clustering, their complexity, combined with the
nondescriptive nature of neural network models, often discourages all but the most
experienced scientists and researchers from employing this data mining technique.
Neural networks are trained by adjusting the weights with an automatic learning
algorithm so that, once training has stabilized, the network output approximates the
desired outcomes for the provided inputs. The output from neural networks varies greatly.
Besides the predicted values, common outputs are accuracy measures for validating the
model, such as the confusion matrix and R². The output from a neural network is purely
predictive. Unfortunately, none of these outputs aids the user in understanding the model
or the underlying data relationships. Because there is no descriptive component to a
neural network model, a neural network's choices are hard to understand, and this often
discourages its use. In fact, the technique is often referred to as a black-box
technology.
Because of the more complicated functions involved in neural network analysis,
interpretation of the variables is more challenging. One approach is to examine the
weights connecting the input variables to the hidden layer. Those closest to zero are
least important, and a variable is deemed unimportant only if all of its connections are
near zero. This procedure is typically used to eliminate variables from a model, not to
quantify their impact on the outcome. Because of the homogeneous structure of a neural
network, it is hard to extract structured knowledge from either the weights or the
configuration of the network in question. It should be emphasized that the weights in a
neural network with a hard-limiter as its activation function do have physical meaning
[12]: the weights of a given node represent the coefficients of the hyperplane, or
discriminant function, that partitions the input space into two regions with different
output values. However, this interpretation becomes weaker if the network's activation
function is a sigmoid or hyperbolic tangent function and the dependent variables are
continuous instead of binary [10]. The weights are therefore relatively uninformative
for determining the influence of the variables on the fitted values.
Another approach to assessing the predictor variables’ importance is to compute a
sensitivity analysis for each variable. The sensitivity is a measure of how much the pre-
dicted value’s error increases when the variables are excluded from the model one at a
time. Through the sensitivity analysis, it is possible to generate an estimate of the general
level of influence exhibited by each parameter from an analysis of the network weights
in a systematic manner. This can be used to rank each variable’s importance. In the fol-
lowing section, a method for calculating output sensitivities to inputs’ variations from a
trained neural network is discussed in some detail.
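The error-based notion of sensitivity described above can be sketched as follows. This is a minimal illustration, assuming that "excluding" a variable is approximated by replacing it with its sample mean; the function names and the toy model are hypothetical.

```python
def mse(predict, rows, targets):
    """Mean squared error of a prediction function over a data set."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def ablation_sensitivity(predict, rows, targets):
    """Error increase when each input is replaced by its mean, one at a time."""
    base = mse(predict, rows, targets)
    scores = []
    for i in range(len(rows[0])):
        mean_i = sum(r[i] for r in rows) / len(rows)
        # Neutralize input i only; all other inputs keep their observed values.
        ablated = [r[:i] + [mean_i] + r[i + 1:] for r in rows]
        scores.append(mse(predict, ablated, targets) - base)
    return scores

# Toy model (hypothetical): the prediction depends on the first input only.
rows = [[0.0, 5.0], [1.0, 3.0], [2.0, 4.0]]
targets = [2 * r[0] for r in rows]
scores = ablation_sensitivity(lambda r: 2 * r[0], rows, targets)
```

Sorting the inputs by their score gives an importance ranking; an input the model ignores scores zero.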
3. SENSITIVITY ANALYSIS
In general, one of the key factors that affect the success of process modeling is the
ability to extract information about the model structure and the relationships between its
inputs and outputs from the trained network. Such information is essential for model
validation and for process optimization, control and safety assessments. Moreover, in
some cases where the original process is not well understood, this information can be
employed as a basis for the analysis of the process and in determining the most signifi-
cant factors that affect it.
For multilayer feedforward networks with n input nodes, one hidden layer with h
nodes and k output nodes the relative importance (RI) of the ith component of the input
vector can be estimated as follows:
$$
RI_{ik} = \frac{\displaystyle\sum_{j=1}^{h} \frac{|w_{ji}|\,|w_{kj}|}{\sum_{i=0}^{n} |w_{ji}|}}{\displaystyle\sum_{j=0}^{h} |w_{kj}|} \tag{1}
$$
where $w_{ji}$ is the weight from the ith input node to the jth hidden node and $w_{kj}$
is the weight from the jth hidden node to the kth output node. Biases are given the
subscript 0.
Hence, the RI measure incorporates certain rates of change of the strengths of signals
as they flow through the network. For example, $w_{ji}$ is the partial derivative of the
input to hidden node j with respect to network input i. Similarly, $w_{kj}$ is the
partial derivative of the input to output node k with respect to the output of hidden
node j. The RI measure is thus simply a compound of weighted averages and is independent
of the activation function; it is therefore applicable to networks trained with a range
of monotonically increasing activation functions.
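A minimal sketch of the RI computation follows. It assumes Eq. (1) is read this way: for each hidden node, the absolute weight from input i is divided by the sum of the absolute weights into that node (bias included, index 0) and multiplied by the absolute hidden-to-output weight; these terms are summed over the hidden nodes and divided by the sum of the absolute hidden-to-output weights (output bias included). The function name, the list layout and the toy 2-2-1 weights are illustrative assumptions.

```python
def relative_importance(w_hidden, w_out, k=0):
    """RI of each input for output k, under the stated reading of Eq. (1).

    w_hidden[j-1][i]: weight into hidden node j from input i (i = 0 is the bias).
    w_out[k][j]:      weight into output k from hidden node j (j = 0 is the bias).
    """
    n_inputs = len(w_hidden[0]) - 1
    out_total = sum(abs(w) for w in w_out[k])          # includes the output bias
    ri = []
    for i in range(1, n_inputs + 1):
        share = 0.0
        for j, row in enumerate(w_hidden, start=1):
            row_total = sum(abs(w) for w in row)       # includes the hidden bias
            share += abs(row[i]) / row_total * abs(w_out[k][j])
        ri.append(share / out_total)
    return ri

# Hypothetical 2-2-1 network: the first column of each row is the bias weight.
w_hidden = [[1.0, 3.0, 0.0],
            [0.0, 1.0, 1.0]]
w_out = [[1.0, 2.0, 1.0]]
ri = relative_importance(w_hidden, w_out)   # [0.5, 0.125]
```

Because the bias weights take part in both normalizing sums, the RI values of the inputs need not add up to one; the remainder is the share attributed to the biases. The sketch does not guard against all-zero weights, which would make a denominator zero.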
Eq. (1) includes a component to normalize for the effect of extreme weights con-
necting input and hidden nodes. This additional component is also included in a closely
related formula given by Garson [7]:
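$$
RI_{ik} = \frac{\displaystyle\sum_{j=1}^{h} \frac{|w_{ji}|\,|w_{kj}|}{\sum_{i=1}^{n} |w_{ji}|}}{\displaystyle\sum_{i=1}^{n}\sum_{j=1}^{h} \frac{|w_{ji}|\,|w_{kj}|}{\sum_{i=1}^{n} |w_{ji}|}}. \tag{2}
$$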
Thus, for each hidden node j of the h hidden nodes, form the product of the
input-to-hidden connection weight from input node i to hidden node j (normalized by the
sum of the weights into hidden node j) and the connection weight from hidden node j to
output node k, and sum these products over the hidden nodes; then divide by the sum of
such quantities for all variables. The result is the percentage of all output weights
attributable to the given independent variable, excluding the bias weights arising from
the back-propagation algorithm.
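The procedure just described can be sketched as below. This is a minimal illustration of Garson's measure for a single output node, assuming absolute weights are used and the biases are left out; the function name and the toy 2-2-1 weights are hypothetical.

```python
def garson_importance(w_hidden, w_out):
    """Share of the output weights attributed to each input (biases ignored).

    w_hidden[j][i]: weight into hidden node j from input i (no bias column).
    w_out[j]:       weight from hidden node j to the single output node.
    """
    n_inputs = len(w_hidden[0])
    contrib = [0.0] * n_inputs
    for row, v in zip(w_hidden, w_out):
        row_total = sum(abs(w) for w in row)
        for i, w in enumerate(row):
            # Input i's share of hidden node j, weighted by the node's output weight.
            contrib[i] += abs(w) / row_total * abs(v)
    total = sum(contrib)
    return [c / total for c in contrib]

# Hypothetical 2-2-1 network without bias weights.
shares = garson_importance([[3.0, 1.0], [1.0, 1.0]], [2.0, 1.0])
```

In this form the shares of all inputs add up to one, which is why the result can be read as a percentage.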
However, the related method proposed by Garson does not include the effect of the
bias, which could be a significant omission. Garson's measure emphasizes the connection
strengths into the hidden layer ($w_{ji}$) and from the hidden layer to the output layer
($w_{kj}$), but it does not measure the direction of influence (positive or negative).
That is, during the summation process, positive and negative weights can cancel their
contribution or influence, which leads to inconsistent results. Including the bias
influence allows all the influences to be considered in the context of the complete
network. For instance, it is possible, although unlikely, that the output of a network is
based purely on the bias and that the input signal has no significant effect. With
Garson's approach, the input parameters could still be assigned influence to various
degrees, since the overwhelming bias effect is ignored.
The RI measure given in this paper would, in that case, show the minimal (zero)
influence of the inputs and the large effect of the bias. Moreover, the denominators
in Eq. (1) cannot reduce to zero for non-zero weights. That is, a denominator will be
zero only if all the weights in it are zero: either all weights from the hidden layer to
the output layer are zero, resulting in a network that simply outputs a single value
determined by the activation function for any input signal, or all weights from the
input node under consideration to the hidden layer, including the bias, are zero, in
which case the input parameter will have no effect. Note also that the numerator can
become zero under the same kind of condition, i.e., zero weights. In this case the
strength of the input would be zero. In the following section, it will be shown how this
method can be applied to a real-world problem.
4. APPLICATION EXAMPLE
4.1 Data Collection and Statistical Information
For this study, a supermarket in China was chosen and the survey was conducted as
mall-intercept personal interviews. The fruit section of the shop was deliberately
divided into two subsections, domestic fruit and imported fruit, each marked with a
clear sign. It was a heavily trafficked store, and its management gave approval to
promote the survey as being on behalf of the company.
The purpose of the survey was explained to interviewees as being for the improve-
ment of service so that customers’ needs could be better understood and met. This was in
fact true because the company wanted to use the results. The rationale for this approach
was to ensure that the interviewee’s personal interest was directly associated with the
quality of their answers to the questions. Surveys were administered so as to avoid public
holidays and to achieve a spread across weekdays. The survey involved 520 personal
interviews in Guangzhou and a total of 495 useable responses was recorded for the study.
Respondents were asked about both their beliefs about and their evaluation of imported
fruit on each statement, as well as their intention to purchase imported fruit. Eleven
questions were designed, relating to consumers' attitudes and perceptions that might
motivate their intention to buy imported fruit.
The questions that consumers were asked in relation to these 11 attributes were
framed into statements according to Fishbein’s theory [6]. Fishbein’s proposition is that
people form attitudes towards a product attribute on the basis of their belief about that
attribute (comprised of perceptions and knowledge) and their positive or negative feel-
ings towards that attribute (comprised of their evaluation of that belief). According to
Fishbein, a consumer’s overall attitude toward imported fruit products would be repre-
sented by the sum of the products of their beliefs about each attribute and their evalua-
tion of those beliefs.
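Fishbein's sum of belief-by-evaluation products can be illustrated with a small sketch. The rating values and the function name are hypothetical; the paper does not list individual responses or the rating scales.

```python
def fishbein_attitude(beliefs, evaluations):
    """Overall attitude: sum over attributes of belief times evaluation."""
    return sum(b * e for b, e in zip(beliefs, evaluations))

# Hypothetical ratings of one respondent on three attributes.
beliefs = [4, 2, 5]        # how strongly the attribute is believed to hold
evaluations = [3, 1, 2]    # how favorably that belief is evaluated

# The per-attribute products are the kind of value used as model inputs (see Table 2).
products = [b * e for b, e in zip(beliefs, evaluations)]   # [12, 2, 10]
attitude = fishbein_attitude(beliefs, evaluations)         # 24
```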
Among them, seven were objective characteristics relating to attributes and perceptions
of imported products: appearance, packing, pollution, taste good, taste different,
freshness and price. Four were subjective attributes relating to the symbolic meaning of
purchasing imported fruit products: achievement, wealthy, personality and social status.
Besides assessing consumers' attitudes and perceptions of imported fruit, a behavioral
response measure of consumer intention was elicited.
Table 1 describes some features of the independent variables and the dependent
variable used in this work. Note that there are no missing values for the variables.
Some descriptive statistics for the data set of the consumers' attitudes and perceptions
towards imported fruit are given in Table 2. The skewness values in this example range
from 0 to 2, which is considered acceptable for this task, so that approximate normality
is attained after the data are log-transformed.

Table 1. Description of attributes: the independent variables (x1 ~ x11) and the dependent
variable (y).

Attributes              Variable   Type
Appearance of fruit     x1         continuous
Packing status          x2         continuous
Less pollution          x3         continuous
Taste good              x4         continuous
Taste different         x5         continuous
Freshness               x6         continuous
Person's achievement    x7         continuous
Wealthy                 x8         continuous
Personality             x9         continuous
Social status           x10        continuous
Attitude to price       x11        categorical
Buying intention        y          categorical

Table 2. Descriptive statistics: each value is the product of beliefs by evaluation.

Attributes              Mean    Std. Dev.   Skewness
Appearance of fruit      5.54   4.39        1.76
Packing status           5.70   4.78        1.73
Less pollution           7.60   5.57        1.00
Taste good               7.72   5.91        1.08
Taste different          6.80   5.53        1.51
Freshness                6.42   5.45        1.26
Person's achievement    13.51   7.78        0.18
Wealthy                 13.64   7.77        0.17
Personality             12.37   7.98        0.33
Social status           13.91   8.06        0.08
Attitude to price        1.71   0.46        0.91
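The skewness check and the log transform mentioned in the text can be sketched as follows. This is a minimal illustration using the moment-based skewness coefficient; the paper does not state which estimator was used, and the sample values are hypothetical.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed belief-by-evaluation scores (all positive).
scores = [1, 2, 2, 3, 4, 6, 9, 16, 25]
logged = [math.log(v) for v in scores]

skew_raw = skewness(scores)
skew_log = skewness(logged)   # smaller than skew_raw for this sample
```

The log transform requires strictly positive values, which holds for products of positive ratings.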
4.2 Methods
The data set consists of the 495 respondents to a consumer survey in an imported fruit
market. The cross-validation procedure used in the neural network simulation was
applied to the questionnaire data to prevent overfitting: 30 percent of the sample was
used to train the network, 20 percent was used to determine a stopping point for training,
and the remaining 50 percent was used for hold-out-sample testing of the predictive
accuracy. For the logistic regression and the CART modeling, the data were separated into
a training set of 248 customers and a test set of 247 customers. The variables used are
described in Table 1.
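The 30/20/50 partition can be sketched as below. The paper does not say how fractional counts were handled (30 percent of 495 is 148.5), so the integer rounding, the random seed and the function name here are assumptions.

```python
import random

def three_way_split(n, train_pct=30, val_pct=20, seed=0):
    """Shuffle case indices and split them into training, validation and hold-out sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n * train_pct // 100     # rounds down
    n_val = n * val_pct // 100
    # Whatever remains after training and validation becomes the hold-out sample.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, hold_out = three_way_split(495)
sizes = (len(train), len(val), len(hold_out))   # (148, 99, 248)
```

The validation part is used only to decide when to stop training; the hold-out part is never seen during fitting.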
The logistic regression coefficients, which correspond to the "B" coefficients in the
logistic regression equation, indicate the amount of change expected in the log odds for
a one-unit change in the predictor variable with all of the other variables in the model
held constant. A coefficient close to 0 suggests that the predictor variable produces no
change. The logistic coefficients are related to the odds ratios by odds ratio = Exp(B),
so a coefficient of 0 corresponds to an odds ratio of 1. The odds ratios are used to
compare the relative importance of the independent variables in this work, as can be
seen in Table 3.
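The relation between B and Exp(B) can be checked with a short sketch. The function names are illustrative; the value 2.073 is the Exp(B) reported for x11 in Table 3, and the value 0.926 is the one reported for x6.

```python
import math

def odds_ratio(b):
    """Odds ratio implied by a logistic regression coefficient B."""
    return math.exp(b)

def coefficient(ratio):
    """Logistic regression coefficient B implied by an odds ratio."""
    return math.log(ratio)

b_price = coefficient(2.073)       # about 0.729: log odds rise with x11
b_freshness = coefficient(0.926)   # negative: an odds ratio below 1 lowers the odds
neutral = odds_ratio(0.0)          # exactly 1: no change in the odds
```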
Table 3. Coefficients of the relative importance to the various imported fruit
characteristics.

Attr   Logistic Exp(B) (rank)   NN RI (rank)   CART Scores (%) (rank)
x1     1.062 (3)                0.015 (11)       3.71 (10)
x2     0.928 (10)               0.072 (7)        0.00 (11)
x3     1.060 (4)                0.266 (2)       27.55 (3)
x4     1.107 (2)                0.227 (3)      100.0 (1)
x5     0.996 (8)                0.066 (8)       25.22 (4)
x6     0.926 (11)               0.028 (10)       8.21 (9)
x7     1.030 (5)                0.111 (4)       22.35 (6)
x8     1.019 (6)                0.093 (5)       22.14 (7)
x9     0.991 (9)                0.042 (9)       11.34 (8)
x10    1.005 (7)                0.087 (6)       24.50 (5)
x11    2.073 (1)                0.269 (1)       57.38 (2)
For building the neural network model, we considered only feedforward architectures
with a single hidden layer, as they can approximate any continuous function; the training
algorithm was the Levenberg-Marquardt algorithm. The number of hidden nodes needs to be
only a relatively small fraction of the size of the input layer. One empirical guideline,
used in this study, is to set the number of hidden nodes to twice the square root of the
sum of the numbers of input and output nodes, design multiple networks by varying the
initial weights, and use the validation set to choose the best network. If the network
fails to converge to a solution, more hidden nodes may be required; if it does converge,
fewer hidden nodes may be tried. Application of the fitted models to the test data
indicated that a neural network with 4 hidden nodes provided the best model (i.e.,
11-4-1). The performance measures of each neural network model, such as MSE and
classification accuracy (%), are the averages of 5 trials.
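The hidden-node guideline amounts to a one-line calculation, sketched here for the 11-input, 1-output model. The function name and the idea of enumerating smaller candidate sizes below the guideline value are illustrative assumptions.

```python
import math

def hidden_node_guideline(n_inputs, n_outputs):
    """Twice the square root of the number of input plus output nodes."""
    return 2.0 * math.sqrt(n_inputs + n_outputs)

start = hidden_node_guideline(11, 1)                 # about 6.93
# Candidate sizes to try, from small networks up to the guideline value.
candidates = list(range(1, math.ceil(start) + 1))    # 1 through 7
```

The 4-node network reported in the text lies within this candidate range, consistent with trying fewer hidden nodes once training converges.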
To calculate a variable importance score, CART looks at the improvement measure
attributable to each variable in its role as a surrogate to the primary split. The values of
these improvements are summed over each node and totaled, and are scaled relative to
the best performing variable. In such ways, CART automatically produces the variable
importance ranking (scores) based on the contribution predictors make to the construc-
tion of the tree. The variable importance rankings or predictor rankings (%) are strictly
relative to a specific tree; change the tree and we might get very different rankings.
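The scaling step can be sketched as follows: summed improvements are divided by the largest one and expressed as a percentage. The raw improvement values are hypothetical; only the scaling convention comes from the text.

```python
def scale_importance(raw):
    """Express summed improvements relative to the best-performing variable (in %)."""
    best = max(raw.values())
    return {name: value / best * 100.0 for name, value in raw.items()}

# Hypothetical summed improvements for three variables of one tree.
raw = {"x4": 0.080, "x11": 0.046, "x2": 0.0}
scaled = scale_importance(raw)
```

The top variable always scores 100, and a variable never used as a splitter or surrogate scores 0, which matches the pattern of the CART column in Table 3.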
5. COMPARATIVE ANALYSIS AND RESULTS
In this example, we attempt to assess the relative importance of input or causal vari-
ables by examining a trained neural network model using the real-world data. A neural
network with 4 nodes in the hidden layer was run on a training and validation set. Each
of the three models was tested on consumer survey data and used to rank variables in
importance.
Table 3 displays the results of the sensitivity test for each of the variables: the
coefficients of the model developed with the logistic regression method, the ranking
scores (%) from the CART method, and the neural network RI values computed using Eq. (1).
For the neural network, the table shows that x11 (price) is the most important input
factor, followed closely by x3 (less pollution); x4 (taste good) is of substantial but
somewhat lesser importance. On the basis of this measure, we could also infer that, among
the subjective characteristics, x7 (achievement) is the most likely direct cause of
buying behavior and that x8 (wealthy) is a slightly less important but still substantial
cause of the consumer's buying behavior.
The odds ratio (Exp(B), the exponential of the β weight) in logistic regression is
interpreted here as a measure of the relative importance of the causal or input variables
in the model. As indicated by the Exp(B) values in the table, x11, x4, x1, x3 and x7 have
the highest importance in affecting the choice of the consumer's buying strategy. In this
example, the odds ratio was 2.073 for x11, 1.107 for x4, and 1.062 for x1. On the basis
of the regression, we would infer that x11 was the most important and likely direct cause
of buying intention and that x4, x1, x3 and x7 were roughly equal but lesser causes. This
broadly matches the results from the neural network analysis.
In the CART model, the scores (%) reflect the contribution each variable makes in
classifying or predicting the dependent variable, with the contribution stemming from
both the variable's role as a primary splitter and its role as a surrogate to any of the
primary splitters. In this example, x4, the variable used to split the root node, is
ranked as most important. The variable x2 received a zero score, indicating that it did
not play any role in the analysis, either as a primary splitter or as a surrogate.
x11 (price) is also an important splitting variable and has the second-highest score,
followed by x3, x5, x10 and x7, as shown in Table 3.
In both the neural network and the logistic regression measures, x11 has the highest
contribution; x3 and x4 rank second in these two measures, respectively, whereas in CART
x11 ranks second. Thus, the logistic and neural network methods identify x11 as the most
important variable, whereas the CART model identifies x4 as the most important variable.
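The rankings discussed above can be recomputed directly from the scores in Table 3. The sketch below only restates the published values; the dictionary layout and function name are illustrative.

```python
# Scores from Table 3: (logistic Exp(B), neural network RI, CART score in %).
table3 = {
    "x1": (1.062, 0.015, 3.71),   "x2": (0.928, 0.072, 0.00),
    "x3": (1.060, 0.266, 27.55),  "x4": (1.107, 0.227, 100.0),
    "x5": (0.996, 0.066, 25.22),  "x6": (0.926, 0.028, 8.21),
    "x7": (1.030, 0.111, 22.35),  "x8": (1.019, 0.093, 22.14),
    "x9": (0.991, 0.042, 11.34),  "x10": (1.005, 0.087, 24.50),
    "x11": (2.073, 0.269, 57.38),
}

def ranks(column):
    """Rank variables by one score column, 1 = most important."""
    order = sorted(table3, key=lambda name: table3[name][column], reverse=True)
    return {name: position for position, name in enumerate(order, start=1)}

logistic_rank, nn_rank, cart_rank = ranks(0), ranks(1), ranks(2)
top = (min(logistic_rank, key=logistic_rank.get),
       min(nn_rank, key=nn_rank.get),
       min(cart_rank, key=cart_rank.get))   # ("x11", "x11", "x4")
```

The same dictionaries make it easy to list where the methods disagree most, for example x1, which ranks third under logistic regression and last under the neural network measure.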
According to the neural network sensitivities, x3, x4 and x11 are the most important
variables and x1 is the least important. This contrasts with the ranking of x1 in the
logistic analysis, where x1 was the third most important variable. Note that these are
the sensitivities for the particular models. A different starting point for the neural
network or a different number of hidden nodes could result in a model with different
sensitivities, and the rankings can be quite sensitive to random fluctuations in the
data. Likewise, the importance rankings in CART are strictly relative to a given tree
structure. Overall, the results show that the objective characteristics were more
important than the subjective characteristics in affecting the consumers' purchasing
behavior. The low sensitivities were probably a result of the high correlations of the
variables with each other.
These findings have obvious managerial implications, since the consumers' perceptions
of the various imported fruit characteristics can be influenced by managerial actions.
For some variables (e.g., packing status and freshness), the financial costs needed
to change consumers' perceptions might be large, but these results show that such changes
may have little effect on consumers' purchasing intention and should therefore not be a
priority for managerial action. Conversely, since the ultimate purchasing behavior is
more strongly influenced by other variables, such as price, taste good and less pollution,
managerial actions affecting the consumers' perceptions of and attitudes to these
characteristics of the imported fruit selection can better influence consumers' buying
expectations in the store.
Finally, the classification results for all three models are presented in Table 4. The
cross-validation procedure described earlier was used for all three models. The results in
Table 4 show that the neural network model again exhibits a superior ability to learn
the patterns corresponding to consumer choice (buying intention). Consistent with the
simulation results, the neural network has better hold-out-sample predictive accuracy
than the other models (65.18% versus 63.56% for logistic regression and 61.13% for CART).
Table 4. Summary of classifications of the consumers' attitudes and perceptions by
logistic regression, neural network and CART.

Analysis Method        Training   Hold-Out
Logistic Regression    67.7%      63.56%
Neural Network         89.59%     65.18%
CART                   82.66%     61.13%
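To put the percentages in Table 4 on a concrete scale, they can be converted to counts of correctly classified customers. The sketch assumes a hold-out sample of 247 customers for all three models; the text states this size for logistic regression and CART, and it is an assumption for the neural network.

```python
# Accuracies from Table 4: (training %, hold-out %).
table4 = {
    "Logistic Regression": (67.7, 63.56),
    "Neural Network": (89.59, 65.18),
    "CART": (82.66, 61.13),
}
HOLD_OUT_SIZE = 247   # assumed to apply to all three models

# Number of hold-out customers classified correctly by each model.
correct = {name: round(hold / 100 * HOLD_OUT_SIZE)
           for name, (train, hold) in table4.items()}

# Difference between training and hold-out accuracy, in percentage points.
gap = {name: train - hold for name, (train, hold) in table4.items()}
```

Under this assumption the hold-out percentages correspond to 157, 161 and 151 correctly classified customers, so differences between the models amount to a handful of cases.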
6. CONCLUSIONS
In this paper, we have developed a method for determining the relative importance
of each input or causal variable of a neural network on the target. We have then applied
this method to a neural network model in an empirical examination of consumer behavior
and showed that neural network models can be used to improve the predictions for an
important business management problem, as well as to understand the relative causal
importance and order of the input variables.
Neural networks seem to have an advantage over linear models when they are applied
to complex nonlinear data and may outperform classical models in certain situations,
but interpreting the result is difficult because the nature of the relationship between
predictor and target variables is not usually revealed. Neural networks also have no
problem with trigonometric or logarithmic relationships, either of which could be a
real problem for the other techniques. This is an advantage neural networks share with
other data mining tools not discussed in detail in this paper.
A method for interpreting the results of neural networks is presented here, and
incorporating such a method into neural network models would help address this limitation.
Our RI measure provides a reasonable way of using neural networks for modeling
as well as for classification or prediction, and stands in sharp contrast to misleading
views of neural networks as black boxes whose iterative processes are beyond human
comprehension, even if the predictions are good.
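
To make the idea of a weight-based relative importance measure concrete, the following is a minimal sketch of one common formulation of Garson-style weight partitioning for a network with a single hidden layer and a single output. It is not claimed to be the RI measure defined in this paper; it is a stand-in used for illustration. The function name `garson_importance`, the weight layout, and the example weights are all hypothetical.

```python
def garson_importance(w_in, w_out):
    """Garson-style relative importance of each input (illustrative sketch).

    w_in[i][j] is the weight from input i to hidden unit j, and
    w_out[j] is the weight from hidden unit j to the single output.
    Returns one non-negative share per input; the shares sum to 1.
    """
    n_inputs = len(w_in)
    n_hidden = len(w_out)
    scores = [0.0] * n_inputs
    for j in range(n_hidden):
        # Total absolute weight arriving at hidden unit j.
        col_total = sum(abs(w_in[i][j]) for i in range(n_inputs))
        if col_total == 0.0:
            continue  # this hidden unit carries no input signal
        for i in range(n_inputs):
            # Share of hidden unit j attributed to input i,
            # scaled by the strength of the unit's output weight.
            scores[i] += abs(w_in[i][j]) / col_total * abs(w_out[j])
    total = sum(scores)
    if total == 0.0:
        raise ValueError("all weights are zero; importance is undefined")
    return [s / total for s in scores]


# Hypothetical weights for 3 inputs and 2 hidden units (not from the paper).
w_in = [[0.8, -0.2], [0.1, 0.4], [-0.1, 0.4]]
w_out = [1.0, -0.5]
example_ri = garson_importance(w_in, w_out)
for i, share in enumerate(example_ri):
    print(f"input {i}: relative importance {share:.2f}")
```

Because this formulation uses absolute values of the weights, it ranks inputs by strength of influence but does not indicate the direction of the effect.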
REFERENCES
1. C. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, New
York, 1995.
2. K. J. Cios, W. Pedrycz, and R. W. Swiniarski, Data Mining Methods for Knowledge
Discovery, Kluwer Academic Publishers, Dordrecht, 1998.
3. G. Cybenko, “Approximation by superpositions of a sigmoidal function,” Mathe-
matics of Control, Signals and Systems, Vol. 2, 1989, pp. 303-314.
4. G. J. Tellis and D. S. Ackerman, “Can culture affect prices? A cross-cultural study
of shopping and retail prices,” Journal of Retailing, Vol. 77, 2001, pp. 57-82.
5. J. X. Fan and J. J. Xiao, “Consumer decision-making styles of young-adult Chi-
nese,” Journal of Consumer Affaires, Vol. 32, 1998, pp. 275-289.
6. M. Fishbein, The Relationship Between Beliefs, Attitudes, and Behavior, Cognitive
Consistency, S. Feldman, ed., Academic, New York, 1966, pp. 199-223.
7. G. D. Garson, “Interpreting neural-network connection weights,” AI Expert, 1991,
pp. 47-51.
8. S. Haykin, Neural Networks: A Comprehensive Foundation, MacMillian and IEEE
Computer Society, New York, 1994.
9. K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are
universal approximators,” Neural Networks, Vol. 2, 1989, pp. 359-366.
10. M. Ishikawa, “Structural learning with forgetting,” Neural Networks, Vol. 9, 1996,
pp. 509-521.
11. C. Klimasauskas, “Neural networks: An engineering perspective,” IEEE Communi-
cation Magazine, Vol. 30, 1992, pp. 50-53.
12. R. Lippmann, “An introduction to computing with neural nets,” IEEE ASSP Maga-
zine, Vol. 4, 1987, pp. 4-22.
13. X. Sun and R. Collins, “Attitudes and consumption values of consumers of imported
fruit in Guangzhou, China,” International Journal of Consumer Studies, Vol. 26,
2002, pp. 34-43.
14. X. Sun and R. Collins, “A comparison of attitudes among purchasers of imported
fruit in Guangzhou and Urumqi, China,” Food Quality and Preference, Vol. 15,
2004, pp. 229-237.
15. O. H. M. Yau, Consumer Behavior in China: Customer Satisfaction and Cultural
Values, Routledge, London, 1994.
Jaesoo Kim (金在洙) is an Associate Professor at the Department of Computer Science
and Engineering, Seoul National University of Technology. He received his M.S. and
Ph.D. degrees in Computer Science and Information Science from Monash University,
Australia, and the University of Otago, New Zealand, in 1992 and 1999, respectively,
and his B.S. degree in Computer Science from Seoul National University of Technology,
South Korea, in 1988. Before joining Seoul National University of Technology as a
Professor, he worked as an Assistant Professor at Zayed University, UAE, from 2002 to
2003, and at the University of Queensland, Australia, from 2000 to 2001. Before his
Ph.D., he worked as a research engineer at Otago University, New Zealand, from 1997 to
1999, the University of Auckland, New Zealand, from 1993 to 1996, and Monash University,
Australia, from 1989 to 1992. His interests include artificial intelligence, computing
intelligence, web based data mining, and intelligent business decision systems.
Heejune Ahn (安熙準) received his Ph.D., M.S., and B.S. degrees in Electrical
Engineering from KAIST (Korea Advanced Institute of Science and Technology), Daejeon,
the Republic of Korea, in 1999, 1995 and 1993, respectively. He is an Assistant Professor
in the Department of Control and Instrumentation Engineering at Seoul National University
of Technology, Seoul, the Republic of Korea. He worked as a visiting researcher at the
Telecommunication Lab. of Erlangen-Nuremberg University, Germany, from July 1999 to
February 2002. He was a GSM/GPRS/UMTS wireless mobile protocol software engineer at the
Next Generation Handset Lab., LG Electronics Inc., Korea, from February 2000 to September
2002. From September 2002 to December 2003 he worked as a software architect and
programmer of the J2EE web server system at Tmax Soft Inc. His research interests include
multimedia communications, protocol development, network system performance analysis,
and real-time embedded systems.