A New Perspective for Neural Networks: Application to a Marketing Management Problem

September 2009
Journal of Information Science and Engineering 25(5):1605-1616

JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 25, 1605-1616 (2009)

A New Perspective for Neural Networks: Application

to a Marketing Management Problem

JAESOO KIM AND HEEJUNE AHN+

Department of Computer Science and Engineering

+Department of Control and Instrumentation Engineering

Seoul National University of Technology

Seoul, 139-743 Korea

E-mail: heejune@snut.ac.kr

Over the last few years, connectionism or neural networks (NN) have successfully

been applied to a wide range of areas and have demonstrated their capabilities in solving

complex problems. Current indications show that these techniques are very important and

rapidly developing areas of research and applications, particularly, in the area of data

mining for knowledge discovery. One particular neural network model, the back-propa-

gation (BP) algorithm, has performed very well in this regard and it is now accepted as a

reliable method for data mining. However, these models have their shortcomings. The

major difficulty lies in the fact that the relationships between specific variables and the

neural network results are, at best, difficult to explain. This article presents an innovative

but simple method for using NN to understand the pattern/outcome correlation to inter-

pret a cause and effect relationship. A comparative analysis and experimental results are

also presented to show the validity of the proposed scheme.

Keywords: neural networks, sensitivity analysis, CART, logistic regression, data mining

1. INTRODUCTION

1.1 Background

We all know that information has become a very important commodity. Every sec-

ond hundreds of thousands of new records of information are generated. This informa-

tion needs to be summarized and synthesized if it is to support effective decision-making.

This involves the challenge of dealing with huge sets of data, dynamic data, incomplete

or imprecise data, noisy data and missing attributes, and redundant or insignificant data.

A successful approach to modeling non-linear relationships in these situations is the

use of artificial neural networks (ANN), or connectionism, which can be trained with the

set of available data [1, 8]. One particular neural network type, the back-propagation (BP)

algorithm has performed very well in this regard and it is now accepted as a reliable

method for data mining [2].

Neural networks share advantages with many other data mining tools. An

advantage they have over classical models used to analyze data, such as regression

analysis, is that they can fit data where the relation between independent and dependent

variables is nonlinear and where the specific form of the nonlinear relationship is un-

known. Also, decision trees, a method of splitting data into homogeneous clusters with

Received October 19, 2007; revised April 9 & July 17, 2008; accepted August 28, 2008.

Communicated by Chin-Teng Lin.

+ Corresponding author.

similar expected values for the dependent variable, are often less effective when the pre-

dictor variables are continuous than when they are nominal (or categorical). Neural net-

works work well with both nominal and continuous variables. They do not require that

the relationships between predictor and dependent variables be linear whether or not the

variables are transformed. The neural network method is more robust and has better pre-

dictive accuracy than classical methods, such as discriminant and logistic analysis, in

many data mining applications. As the focus of this paper is neural networks, the other

data mining techniques will not be discussed further.

In spite of their advantages, neural networks with BP algorithm have their short-

comings. The major difficulty lies in the fact that the relationships between specific or

causal variables and the neural network results are, at best, difficult to explain because of

the complexity of the functions used in the neural network approximations. The output of

a neural network is a predicted value and some goodness of fit statistics. However, the

functional form of the relationship between predictor and target variables is not made

explicit. So the nature of the strength of the relationship between the independent and

dependent variables, i.e., the importance of each variable, is usually not revealed. Vali-

dating unexplainable results can be a significant challenge. This means there must be

something more general in their activity that led them to this result. Here we face the

challenge of finding an appropriate way to figure out these interrelationships to make use of

them in the future without requiring some additional knowledge about the character of

the task.

Basically, the aim of this paper is to show that the neural network modeling may

offer significant advantages over the commonly used estimation procedures that can

summarise the large amount of collected data into relevant, concrete and effective action

recommendations for decision makers. In order to meet the growing demand of decision

makers, we have to focus on the systems that find adequate explanation models.

In this study, we also tested the comparative abilities of a neural network model, logistic

regression, and classification and regression trees (CART) at capturing the interrelationships

between the independent variables and the dependent variable. This paper also

describes how a model of factors influencing consumer behavior, from which initial

measures can be taken, can be produced using a neural network based on consumer

survey data.

1.2 Cultural Orientation and Consumer Behavior

The culturally based norms (appropriate behavior in a situation) and values (desir-

able behavior across situations) would lead to differences in consumer behavior across

cultures. These values and norms are passed on from the community to an individual as

he or she is socialized within the community. Consumers learn values and norms about

the acquisition, consumption and disposal of products through socialization in their

communities. Thus cultural values and norms become a primary explanation of similari-

ties in behavior of individuals within the community, and differences in the behavior of

individuals across communities [4].

In particular, Chinese consumer behavior is essentially different because of its unique

cultural, social and economic roots [13-15]. The behavior of Chinese consumers has

even been distinguished from that of consumers in other Asian countries [5]. Sun &

Collins [13] studied consumers’ attitudes towards imported fruit in Guangzhou, finding

that fruit attributes in relation to symbolic and hedonic values were of primary impor-

tance in the decision to purchase.

Imported fruits have been widely available in China since about 1993. They are re-

tailed throughout the year in every major city in the country. Though imported fruit is

more expensive than locally produced fruit, there are still many willing buyers and the

imported fruit business has experienced burgeoning demand and high profits.

For this study, survey data were collected through structured intercept interviews

with consumers at point of sale immediately after they had purchased imported fruit.

Results will help to broaden our understanding of Chinese consumer behavior and pro-

vide valuable information when formulating marketing strategies.

The rest of the article is organized as follows. Section 2 reviews the studies on neu-

ral networks. In section 3, we introduce a sensitivity measure that assesses the relative

importance of the input factors used by the network to arrive at its targets and review the

existing relative importance measure for neural network input elements. In section 4, we

apply the method discussed in section 3 to a marketing management problem. Then, sec-

tion 5 compares and contrasts how neural networks and classical modeling techniques

deal with the specific modeling challenges and how the output of neural networks can be

used to better understand the relationship in the data through sensitivity analysis. Subse-

quently, we examine the results of our studies and test how the neural network model per-

forms in practice using the real-world data set. Finally, we draw conclusions in section 6.

2. NEURAL NETWORKS AND UNDERSTANDING THEIR OUTPUT

Neural networks are based on an early model of human brain function. Although it

is described as a network, a neural network is nothing more than a mathematical function

that computes an output based on a set of input variables. The network paradigm makes

it easy to decompose the larger function to a set of related subfunctions, and it enables a

variety of learning algorithms that can estimate the parameters of the subfunctions.

There are many different types of neural networks. A feedforward neural network

with one hidden layer considered in this paper is known as a multilayer perceptron

(MLP), which is one of the most popular kinds of neural networks and uses supervised

learning. As a result, its effectiveness has been established and software for applying it is

widely available. It has also been proved that a network with only one hidden layer is

enough to approximate any continuous function given there are enough nodes in the

hidden layer [3, 9]. The hidden layers are used to model the nonlinearities in the rela-

tionship between inputs and output [11]. Therefore neural networks might represent a

viable alternative to multivariate statistical methods.
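
The view of a network as a plain mathematical function can be made concrete with a minimal sketch. It assumes a logistic activation in both layers and a single output node; the function names and the weights are illustrative only and are not taken from this study.

```python
import math

def sigmoid(z):
    """Logistic activation, one common monotonically increasing choice."""
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of an n-h-1 feedforward network.

    w_hidden[j][i] is the weight from input i to hidden node j;
    w_out[j] is the weight from hidden node j to the single output node.
    """
    hidden = [
        sigmoid(sum(w_ji * x_i for w_ji, x_i in zip(row, x)) + b_j)
        for row, b_j in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w_kj * h_j for w_kj, h_j in zip(w_out, hidden)) + b_out)

# Hypothetical 2-2-1 network with made-up weights.
y = mlp_forward([1.0, 0.0], [[0.5, -0.5], [0.0, 1.0]], [0.0, 0.0], [1.0, -1.0], 0.0)
```

Each hidden node is one of the subfunctions mentioned above, and the output node combines them.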

Although neural networks can be applied to a number of data mining problems, in-

cluding classification, regression, and clustering, the complexity, combined with the non-

descriptive nature of neural network models, often discourages all but the most scientists

and researchers from employing the data mining technique.

Neural networks are trained by adjusting weights with an automatic learning algorithm

so that, once training has stabilized, the output approximates the desired outcomes for the provided

inputs. The output from neural networks varies greatly. Other common outputs are accuracy

measures such as the confusion matrix, R², and so forth for validating the model. The

output from a neural network is purely predictive. Unfortunately, none of these aids the

user in understanding the model or the underlying data relationships. Because there is no

descriptive component to a neural network model, a neural network's choices are hard to

understand, and this often discourages its use. In fact, this technique is often referred to

as a black-box technology.

Because of the more complicated functions involved in neural network analysis, interpretation

of the variables is more challenging. One approach is to examine the weights

connecting the input variables to the hidden layer. Those which are closest to zero are

least important. A variable is deemed unimportant only if all of these connections are

near zero. This procedure is typically used to eliminate variables from a model, not to

quantify their impact on the outcome. Due to the homogeneous structure of neural net-

work, it is hard to extract structured knowledge from either the weights or the configura-

tion of the neural network in question. It should be emphasized that the weights in a neu-

ral network with hard-limiter as its activation function do have physical meaning [12].

The weights of a given node represent the coefficients of the hyperplane or discriminant

function that partitions the input space into two regions with different output values.

However, this interpretation of weights gets weaker and weaker if the net’s activation

function is either sigmoid or hyperbolic tangent functions and the given dependent vari-

ables are continuous instead of binary [10]. Therefore the weights are relatively unin-

formative for determining the influence of the variables on the fitted values.
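
The hyperplane reading of a hard-limiter node can be illustrated with a small sketch. The node, its weights and its bias below are hypothetical; the threshold convention (output 1 when the weighted sum is at least zero) is an assumption.

```python
def hard_limiter_node(x, weights, bias):
    """Return 1 when the weighted sum reaches the threshold, else 0."""
    activation = sum(w * x_i for w, x_i in zip(weights, x)) + bias
    return 1 if activation >= 0 else 0

# Hypothetical node whose weights define the line x1 + x2 - 1 = 0.
weights, bias = [1.0, 1.0], -1.0
above = hard_limiter_node([1.0, 1.0], weights, bias)  # positive side of the line
below = hard_limiter_node([0.0, 0.0], weights, bias)  # negative side of the line
```

Here the weights are exactly the coefficients of the separating line, which is the physical meaning referred to above. With a sigmoid in place of the hard limiter, the output changes gradually across the line and the weights no longer mark a sharp boundary.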

Another approach to assessing the predictor variables’ importance is to compute a

sensitivity analysis for each variable. The sensitivity is a measure of how much the pre-

dicted value’s error increases when the variables are excluded from the model one at a

time. Through the sensitivity analysis, it is possible to generate an estimate of the general

level of influence exhibited by each parameter from an analysis of the network weights

in a systematic manner. This can be used to rank each variable’s importance. In the fol-

lowing section, a method for calculating output sensitivities to inputs’ variations from a

trained neural network is discussed in some detail.
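
As a rough sketch of the exclusion idea, the error increase can be estimated by neutralizing one variable at a time. Replacing the excluded variable by its sample mean is an assumption made here for illustration (the text does not say how a variable is excluded), and the toy model and data are made up.

```python
def mse(model, rows, targets):
    """Mean squared error of a prediction function over a data set."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def exclusion_sensitivity(model, rows, targets):
    """Error increase when each variable in turn is fixed at its mean."""
    n_vars = len(rows[0])
    base_error = mse(model, rows, targets)
    means = [sum(r[i] for r in rows) / len(rows) for i in range(n_vars)]
    scores = []
    for i in range(n_vars):
        # Assumption: "excluding" variable i means replacing it by its mean.
        neutral_rows = [r[:i] + [means[i]] + r[i + 1:] for r in rows]
        scores.append(mse(model, neutral_rows, targets) - base_error)
    return scores

# Toy model that depends only on the first variable.
toy_model = lambda r: 2.0 * r[0] + 0.0 * r[1]
rows = [[0.0, 1.0], [1.0, 0.0], [2.0, 5.0], [3.0, 2.0]]
targets = [toy_model(r) for r in rows]
scores = exclusion_sensitivity(toy_model, rows, targets)
```

Sorting the variables by these scores gives an importance ranking; the variable the model ignores receives a score of zero.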

3. SENSITIVITY ANALYSIS

In general, one of the key factors that affect the success of process modeling is the

ability to extract information about the model structure and the relationships between its

inputs and outputs from the trained network. Such information is essential for model

validation and for process optimization, control and safety assessments. Moreover, in

some cases where the original process is not well understood, this information can be

employed as a basis for the analysis of the process and in determining the most signifi-

cant factors that affect it.

For multilayer feedforward networks with n input nodes, one hidden layer with h

nodes and k output nodes the relative importance (RI) of the ith component of the input

vector can be estimated as follows:

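[Equation (1): RI_ik expressed through the weights w_ji and w_kj, summed over the h hidden nodes; the expression is not recoverable from the extracted text]   (1)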

where w_ji is the weight from the ith input node to the jth hidden node and w_kj is the

weight from the jth hidden node to the kth output node. Biases are given the subscript 0.

Hence, the RI measure incorporates certain rates of change of the strengths of signals

as they flow through the network. For example, w_ji is the partial derivative of the

input to the jth hidden node with respect to the ith network input. Similarly, w_kj is the

partial derivative of the input to the kth output node with respect to the output of the jth

hidden node. So this RI measure is simply a compounded weighted average and is independent

of the activation function; it is therefore applicable to networks trained with a

range of monotonically increasing activation functions.

Eq. (1) includes a component to normalize for the effect of extreme weights con-

necting input and hidden nodes. This additional component is also included in a closely

related formula given by Garson [7]:

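$$
RI_i = \frac{\displaystyle \sum_{j=1}^{h} \frac{w_{ji}\, w_{kj}}{\sum_{l=1}^{n} w_{jl}}}{\displaystyle \sum_{m=1}^{n} \sum_{j=1}^{h} \frac{w_{jm}\, w_{kj}}{\sum_{l=1}^{n} w_{jl}}} \qquad (2)
$$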

Thus, for each j of h hidden nodes, sum the product formed by multiplying the in-

put-to-hidden connection weight of the input node i of variable for hidden node j, times

the connection weight of the output node k for hidden node j, then divide by the sum of

such quantities for all variables. The result is the percentage of all output weights attrib-

utable to the given independent variable, excepting bias weights arising from the back-

propagation algorithm.

However, the related method proposed by Garson does not include the effect of the

bias, which could result in a significant omission. Garson’s measure places more empha-

sis on the connection strengths from the hidden layer (wji) to the output layer (wkj), but it

does not measure the direction of influence (positive or negative). That is, during the

summation process, positive and negative weights can cancel their contribution or influ-

ence, which leads to inconsistent results. Including the bias influence allows all the influ-

ences to be considered in the context of the complete network. For instance, it is possible,

although unlikely, that the output of a network is based purely on the bias, and the input

signal has no significant effect. Using Garson's approach, the input parameters could be

assigned influence to various degrees since the overwhelming bias effect is ignored.
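
The bias objection can be illustrated with a short sketch. It computes Garson-style shares under the assumption that absolute values of the weights are used, so that the shares are non-negative and sum to one; this choice, the function names and the tiny network are all illustrative and are not taken from the paper.

```python
import math

def garson_shares(w_hidden, w_out):
    """Share of the output weights attributed to each input (biases ignored).

    w_hidden[j][i]: weight from input i to hidden node j.
    w_out[j]: weight from hidden node j to the single output node.
    Absolute values are an assumption made here for illustration.
    """
    n_inputs = len(w_hidden[0])
    contrib = [0.0] * n_inputs
    for row, w_kj in zip(w_hidden, w_out):
        row_total = sum(abs(w) for w in row)
        if row_total == 0.0:
            continue  # this hidden node receives nothing from the inputs
        for i, w_ji in enumerate(row):
            contrib[i] += abs(w_ji) / row_total * abs(w_kj)
    total = sum(contrib)
    if total == 0.0:
        raise ValueError("all input-to-output paths have zero weight")
    return [c / total for c in contrib]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def network_output(x, w_hidden, b_hidden, w_out, b_out):
    """Output of an n-h-1 network with logistic activations."""
    hidden = [sigmoid(sum(w * x_i for w, x_i in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Hypothetical network whose output is dominated by the output bias.
w_hidden, b_hidden = [[0.01, 0.03]], [0.0]
w_out, b_out = [0.01], 5.0

shares = garson_shares(w_hidden, w_out)
spread = abs(network_output([1.0, 1.0], w_hidden, b_hidden, w_out, b_out)
             - network_output([0.0, 0.0], w_hidden, b_hidden, w_out, b_out))
```

The two inputs receive shares of about 0.25 and 0.75 even though changing both inputs from 0 to 1 barely moves the output, because the bias never enters the calculation.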

The RI measure given in this paper would illustrate the minimal (zero) influence of

the inputs and the large effect of the bias. Moreover, it is not possible for the denominator

in Eq. (1) to reduce to zero for non-zero weights. That is, the denominators will only

be zero if all the weights are zero, for instance all weights from the hidden layer to the

output layer are zero resulting in a network which simply outputs a single value deter-

mined by the activation function for any input signal or for all weights from the input

node under consideration to the hidden layer, including the bias, to be zero in which case

the input parameter will have no effect. Also, note that the numerator can become zero

under the same conditions, i.e., non-zero weights. In this case the strength of the input

would be zero. In the following section, it will be shown how this method can be applied

to a real world problem.

4. APPLICATION EXAMPLE

4.1 Data Collection and Statistical Information

For this study, a supermarket in China was chosen and the survey was conducted through

mall-intercept personal interviews. The shop's fruit section was deliberately divided

into two subsections, domestic fruit and imported fruit, both clearly marked

with signs. It was a heavily trafficked store, and its management gave

approval to promote the survey as being on behalf of the company.

The purpose of the survey was explained to interviewees as being for the improve-

ment of service so that customers’ needs could be better understood and met. This was in

fact true because the company wanted to use the results. The rationale for this approach

was to ensure that the interviewee’s personal interest was directly associated with the

quality of their answers to the questions. Surveys were administered so as to avoid public

holidays and to achieve a spread across weekdays. The survey involved 520 personal

interviews in Guangzhou and a total of 495 useable responses was recorded for the study.

Respondents were asked about both their beliefs and evaluation of the imported

fruit, including their intention of purchasing imported fruit on each statement. Eleven

questions were designed relating to consumers' attitudes and perceptions that might

motivate their intention to buy imported fruit.

The questions that consumers were asked in relation to these 11 attributes were

framed into statements according to Fishbein’s theory [6]. Fishbein’s proposition is that

people form attitudes towards a product attribute on the basis of their belief about that

attribute (comprised of perceptions and knowledge) and their positive or negative feel-

ings towards that attribute (comprised of their evaluation of that belief). According to

Fishbein, a consumer’s overall attitude toward imported fruit products would be repre-

sented by the sum of the products of their beliefs about each attribute and their evalua-

tion of those beliefs.
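
In symbols, this proposition can be sketched as follows. The symbols A, b_i, e_i and n are introduced here for illustration and are not the paper's own notation.

```latex
A = \sum_{i=1}^{n} b_i \, e_i
```

Here A is the overall attitude toward the product, b_i is the strength of the belief about attribute i, e_i is the evaluation of that belief, and n is the number of attributes considered.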

Among them, seven were objective characteristics relating to attributes and perceptions

of imported products: appearance, packing, pollution, taste good, taste

different, freshness and price. Four were subjective attributes concerning the symbolic

meaning of purchasing imported fruit products: achievement, wealthy, personality and

social status. Besides assessing consumers' attitudes and perceptions of imported fruit, a

behavioral response measure of consumer intention was elicited.

Table 1 describes some features of the independent variables and the dependent

variable used in this work. Note that there are no missing values for the variables. Some

descriptive statistics for the data set of the consumers' attitudes and perceptions towards

imported fruit are also given in Table 2. The data in this example have skewness values

ranging from −0.91 to 1.76, all below 2 in absolute value, which is considered acceptable for

this task, so that approximate normality is attained after the data are logged.

Table 1. Description of attributes: the independent variables (x1 ~ x11) and the dependent variable (y).

Attributes                    Variable type
Appearance of fruit (x1)      continuous
Packing status (x2)           continuous
Less pollution (x3)           continuous
Taste good (x4)               continuous
Taste different (x5)          continuous
Freshness (x6)                continuous
Person's achievement (x7)     continuous
Wealthy (x8)                  continuous
Personality (x9)              continuous
Social status (x10)           continuous
Attitude to price (x11)       categorical
Buying intention (y)          categorical

Table 2. Descriptive statistics: each value is the product of beliefs by evaluation.

Attributes               Mean     Std. Dev.   Skewness
Appearance of fruit      5.54     4.39        1.76
Packing status           5.70     4.78        1.73
Less pollution           7.60     5.57        1.00
Taste good               7.72     5.91        1.08
Taste different          6.80     5.53        1.51
Freshness                6.42     5.45        1.26
Person's achievement     13.51    7.78        0.18
Wealthy                  13.64    7.77        0.17
Personality              12.37    7.98        0.33
Social status            13.91    8.06        0.08
Attitude to price        1.71     0.46        −0.91

4.2 Methods

The data set consists of 495 responses to a consumer survey in an imported fruit

market. The cross-validation procedures used in the neural network simulation were ap-

plied to the questionnaire data to prevent overfitting; that is 30 percent of the sample was

used to train the network, 20 percent was used to determine a stopping point for training,

and the remaining 50 percent was used for hold-out-sample testing of the predictive ac-

curacy. For the logistic regression and the CART modeling, the data was separated into a

training set of 248 customers and a test set of 247 customers. Variables used are de-

scribed in Table 1.
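
The three-way partition for the neural network can be sketched as below. The random shuffle, the fixed seed and the rounding rule (integer division, with the hold-out part absorbing the remainder) are assumptions; the paper does not state how the 30/20/50 percentages were turned into record counts.

```python
import random

def three_way_split(n_records, percents=(30, 20, 50), seed=0):
    """Shuffle record indices and cut them into train / stopping / hold-out parts."""
    if sum(percents) != 100:
        raise ValueError("percentages must sum to 100")
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    n_train = n_records * percents[0] // 100  # integer arithmetic, rounds down
    n_stop = n_records * percents[1] // 100
    train = indices[:n_train]
    stop = indices[n_train:n_train + n_stop]
    hold_out = indices[n_train + n_stop:]     # remainder absorbs the rounding
    return train, stop, hold_out

train, stop, hold_out = three_way_split(495)
```

Under these assumptions the 495 responses divide into 148 training, 99 stopping and 248 hold-out records, with no record used twice.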

The logistic regression coefficients, the “B” coefficients in the logistic regression

equation, indicate the amount of change expected in the log odds when there is a

one unit change in the predictor variable with all of the other variables in the model held

constant. A coefficient close to 0 suggests that there is no change due to the predictor

variable. There is a relationship between the logistic coefficients and the odds ratios,

odds ratio = Exp(B). These coefficients are used to compare the relative importance of

the independent variables in this work, as can be seen in Table 3.
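
The relation odds ratio = Exp(B) can be inverted to recover a coefficient from a reported odds ratio. The short sketch below does this for two of the Exp(B) values listed in Table 3; the helper name is illustrative.

```python
import math

def coefficient_from_odds_ratio(odds_ratio):
    """Invert odds ratio = exp(B) to obtain the logistic coefficient B."""
    return math.log(odds_ratio)

b_price = coefficient_from_odds_ratio(2.073)      # x11, Exp(B) from Table 3
b_freshness = coefficient_from_odds_ratio(0.926)  # x6, Exp(B) from Table 3
```

An odds ratio above 1 corresponds to a positive coefficient (about 0.729 for x11), and an odds ratio below 1 to a negative one, so an Exp(B) near 1 is the counterpart of a coefficient near 0.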

Table 3. Coefficients of the relative importance of the various imported fruit characteristics.

Attr   Logistic Exp(B) (rank)   NN RI (rank)   CART score (%) (rank)
x1     1.062 (3)                0.015 (11)     3.71 (10)
x2     0.928 (10)               0.072 (7)      0.00 (11)
x3     1.060 (4)                0.266 (2)      27.55 (3)
x4     1.107 (2)                0.227 (3)      100.0 (1)
x5     0.996 (8)                0.066 (8)      25.22 (4)
x6     0.926 (11)               0.028 (10)     8.21 (9)
x7     1.030 (5)                0.111 (4)      22.35 (6)
x8     1.019 (6)                0.093 (5)      22.14 (7)
x9     0.991 (9)                0.042 (9)      11.34 (8)
x10    1.005 (7)                0.087 (6)      24.50 (5)
x11    2.073 (1)                0.269 (1)      57.38 (2)

For building a neural network model, we only considered a feedforward architecture with a single

hidden layer, as such networks can approximate any continuous function; the training

algorithm was the Levenberg-Marquardt algorithm. The number of hidden nodes needs

to be only a relatively small fraction of the input layer. In this study, one empirical guide-

line is to determine the number of hidden nodes as twice the square root of the sum of

input and output nodes, design multiple networks by varying the initial weights, and use

the validation set to choose the best network. If the network fails to converge to a solu-

tion, it may be that more hidden nodes are required. If it does converge, we may try

fewer hidden nodes. Application of the fitted model to the test data indicated that a 4

node neural network provided the best model (i.e., 11-4-1). The performance measures

of each neural network model such as MSE, and classification accuracy (%) are the av-

erage of 5 trials.
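
The empirical guideline for the hidden layer size can be written out as a one-line calculation. The function name is illustrative, and treating the result as an upper starting point for the search is an assumption.

```python
import math

def hidden_node_guideline(n_inputs, n_outputs):
    """Twice the square root of the number of input plus output nodes."""
    return 2.0 * math.sqrt(n_inputs + n_outputs)

start = hidden_node_guideline(11, 1)         # 11 inputs, 1 output
candidates = range(1, math.ceil(start) + 1)  # hidden-layer sizes to try
```

For the 11 inputs and 1 output used here the guideline gives about 6.9, so sizes up to 7 are natural candidates; the 4-node network reported above lies within that range.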

To calculate a variable importance score, CART looks at the improvement measure

attributable to each variable in its role as a surrogate to the primary split. The values of

these improvements are summed over each node and totaled, and are scaled relative to

the best performing variable. In such ways, CART automatically produces the variable

importance ranking (scores) based on the contribution predictors make to the construc-

tion of the tree. The variable importance rankings or predictor rankings (%) are strictly

relative to a specific tree; change the tree and we might get very different rankings.
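
The summing and scaling step can be sketched as follows. This is not the CART software's implementation; the per-node improvement values and variable names are made up for illustration.

```python
def scaled_importance(improvements):
    """Sum each variable's improvements over nodes and scale so the best is 100.

    improvements maps a variable name to the list of improvement values
    credited to it (as primary splitter or surrogate) across tree nodes.
    """
    totals = {var: sum(values) for var, values in improvements.items()}
    best = max(totals.values())
    if best == 0:
        return {var: 0.0 for var in totals}
    return {var: 100.0 * total / best for var, total in totals.items()}

# Hypothetical improvements for three variables in one tree.
scores = scaled_importance({"a": [0.25, 0.25], "b": [0.25], "c": []})
```

The best variable always scores 100 and a variable that never contributes scores 0, which is why the scores are meaningful only relative to the tree that produced them.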

5. COMPARATIVE ANALYSIS AND RESULTS

In this example, we attempt to assess the relative importance of input or causal vari-

ables by examining a trained neural network model using the real-world data. A neural

network with 4 nodes in the hidden layer was run on a training and validation set. Each

of the three models was tested on consumer survey data and used to rank variables in

importance.

Table 3 displays the results of the sensitivity test for each of the variables and shows

the coefficients of the models developed with the logistic regression method, ranking

scores (%) with the CART method and the neural network method using Eq. (1). The

table shows that x11 (price) is the most important input factor, followed closely by x3 (less

pollution); x4 (taste good) is of substantial but somewhat lesser importance. On the basis

of this measure, we could also correctly infer that in the case of subjective characteristics,

x7 (achievement) is the most likely direct cause of buying behavior and that x8 (wealthy)

is a slightly less important but still substantial cause of the consumer’s buying behavior.

The odds ratio (or weights or Exp(B)) in regression is interpreted as the ratio of

the relative importance of the causal or input variables in the model. As indicated by the

Exp(B) for the odds ratio in the table, x11, x4, x1, x3 and x7 have the highest importance in

affecting the choice of the consumer’s buying strategy. For this example, the odds ratio

of x11 was 2.073, 1.107 was x4, and 1.062 was x1. On the basis of regression, we would

correctly infer that x11 was the most important and likely direct cause of buying intention

and that x4, x1, x3 and x7 were roughly equal but lesser causes. This broadly matches the

results from the neural network analysis.

In the CART model case, the scores (%) reflect the contribution each variable

makes in classifying or predicting the dependent variable, with the contribution stem-

ming from both the variable’s role as a primary splitter and its role as a surrogate to any

of the primary splitters. In this example, x4, the variable used to split the root node, is

ranked as most important. The variable, x2, received a zero score, indicating that this

variable did not play any role in the analysis, either as a primary splitter or as a surrogate.

x11 (price) is also an important splitting variable and has the second-highest score, followed by

x3, x5, x10 and x7 as illustrated in Table 3.

In two of the three models’ measures, x11 has the highest contribution. In our measure

and in logistic regression’s measure, x3 and x4 were second, respectively, whereas in

CART, x11 was second. Thus, the logistic and neural network methods identify x11 as the

most important variable, whereas the CART model identifies x4 as the most important

variable.
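
As a quick consistency check, the rankings can be recomputed from the values reported in Table 3. The dictionary below simply transcribes that table; the helper name is illustrative.

```python
# Values transcribed from Table 3: (logistic Exp(B), NN RI, CART score in %).
table3 = {
    "x1": (1.062, 0.015, 3.71),
    "x2": (0.928, 0.072, 0.00),
    "x3": (1.060, 0.266, 27.55),
    "x4": (1.107, 0.227, 100.0),
    "x5": (0.996, 0.066, 25.22),
    "x6": (0.926, 0.028, 8.21),
    "x7": (1.030, 0.111, 22.35),
    "x8": (1.019, 0.093, 22.14),
    "x9": (0.991, 0.042, 11.34),
    "x10": (1.005, 0.087, 24.50),
    "x11": (2.073, 0.269, 57.38),
}

def ranking(column):
    """Variables ordered from most to least important for one method."""
    return sorted(table3, key=lambda var: table3[var][column], reverse=True)

logistic_rank = ranking(0)
nn_rank = ranking(1)
cart_rank = ranking(2)
```

The recomputed orderings place x11 first for the logistic and neural network measures, and x4 first with x11 second for CART.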

According to the sensitivities, x3, x4 and x11 are the most important variables and x1

is the least important. This contrasts with the importance ranking of x1 in the logistic

analysis, where x1 was the third most important variable. Note that these are

the sensitivities for the particular models. A different initial starting point for the neural

network or a different number of hidden nodes could result in a model with different

sensitivities and the rankings can be quite sensitive to random fluctuations in the data.

The importance rankings in CART need to be understood as being relative to a particular

tree and the rankings are strictly relative to a given tree structure. Overall, the results show that

the objective characteristics were more important than the subjective characteristics

in affecting the consumers’ purchasing behavior. The low sensitivities were

probably a result of the high correlations of the variables with each other.

These findings have obvious managerial implications since the consumers’ percep-

tions of the various imported fruit characteristics can be influenced by managerial ac-

tions. For some variables (e.g., packing status and freshness), the financial costs needed

to change consumers’ perceptions might be large, but these results show that it may have

little effect on consumer’s purchasing intention and should therefore not be a priority

item for managerial action. Conversely, since the ultimate purchasing behavior is more

strongly influenced by other variables, such as price/taste good/less pollution, manage-

rial actions affecting the consumers’ perceptions and attitudes of the imported fruit selec-

tion can better influence consumer’s buying expectations in the store.

Finally, the classification results for all three models are presented in Table 4. The

cross-validation procedure described earlier was used for all three models. The results in

Table 4 demonstrate that once again the neural network model exhibits a superior ability

to learn the patterns corresponding to consumer choice (buying intention). Consistent

with the simulation results, the neural network demonstrates better hold-out-sample

predictive accuracy (65.18%) than the other models (63.56% and 61.13%).

Table 4. Summary of classifications of the consumers’ attitudes and perceptions by logistic regression, neural network and CART.

Analysis Method        Training   Hold-Out
Logistic Regression    67.7%      63.56%
Neural Network         89.59%     65.18%
CART                   82.66%     61.13%

6. CONCLUSIONS

In this paper, we have developed a method for determining the relative importance

of each input or causal variable of a neural network on the target. We have then applied

this method to a neural network model of an empirical examination of consumer’s be-

havior and showed that the neural network models can be used to improve the predic-

tions to an important business management problem as well as understand the relative

causal importance and order of the input variables.

Neural networks seem to have an advantage over linear models when applied to complex nonlinear data and may outperform classical models in certain situations, but interpreting the result is difficult because the nature of the relationship between the input and target variables is not usually revealed. Neural networks also have no problem with trigonometric or logarithmic relationships, either of which could be a real problem for the other techniques. This is an advantage neural networks share with other data mining tools not discussed in detail in this paper.

A method for interpreting the results of neural networks is presented here, and incorporating such a method into neural network models would help address this limitation. Our RI measure provides a reasonable way of using neural networks for modeling as well as for classification or prediction, and stands in sharp contrast to the misleading view of neural networks as black boxes whose iterative processes are beyond human comprehension, even if the predictions are good.
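As a rough illustration of how relative importance can be read off a trained network, the sketch below implements the connection-weight partitioning described by Garson (1991) for a network with one hidden layer and a single output. This is an assumption made for illustration: it is not claimed to be the RI measure proposed in this paper, bias terms are ignored, and the function name `garson_importance` and the example weights are hypothetical.

```python
def garson_importance(w_in, w_out):
    """Relative importance of each input by Garson-style weight partitioning.

    w_in[i][j] is the weight from input i to hidden unit j.
    w_out[j] is the weight from hidden unit j to the single output.
    Returns one share per input; the shares sum to 1.
    """
    n_inputs = len(w_in)
    n_hidden = len(w_out)
    totals = [0.0] * n_inputs
    for j in range(n_hidden):
        # Absolute contribution of each input routed through hidden unit j.
        contrib = [abs(w_in[i][j] * w_out[j]) for i in range(n_inputs)]
        denom = sum(contrib)
        if denom == 0.0:
            continue  # this hidden unit passes no signal to the output
        for i in range(n_inputs):
            # Share of hidden unit j's signal that is attributable to input i.
            totals[i] += contrib[i] / denom
    grand_total = sum(totals)
    if grand_total == 0.0:
        raise ValueError("all connection weights are zero")
    return [t / grand_total for t in totals]


if __name__ == "__main__":
    # Hypothetical weights: 3 inputs, 2 hidden units, 1 output.
    w_in = [[0.8, -0.2], [0.1, 0.4], [-0.1, 0.4]]
    w_out = [1.5, -0.5]
    for i, share in enumerate(garson_importance(w_in, w_out), start=1):
        print(f"x{i}: {share:.3f}")
```

Because the partitioning uses absolute values, it ranks inputs by the magnitude of their influence but does not indicate its direction.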



Jaesoo Kim (金在洙) is an Associate Professor in the Department of Computer Science and Engineering, Seoul National University of Technology. He received his M.S. and Ph.D. degrees in Computer Science and Information Science from Monash University, Australia, and the University of Otago, New Zealand, in 1992 and 1999, respectively, and his B.S. degree in Computer Science from Seoul National University of Technology, South Korea, in 1988. Before joining Seoul National University of Technology as a Professor, he worked as an Assistant Professor at Zayed University, UAE, from 2002 to 2003, and at the University of Queensland, Australia, from 2000 to 2001. Before his Ph.D., he worked as a research engineer at Otago University, New Zealand, from 1997 to 1999, the University of Auckland, New Zealand, from 1993 to 1996, and Monash University, Australia, from 1989 to 1992. His interests include artificial intelligence, computing intelligence, web-based data mining, and intelligent business decision systems.

Heejune Ahn (安熙準) received his Ph.D., M.S., and B.S. degrees in Electrical Engineering from KAIST (Korea Advanced Institute of Science and Technology), Daejeon, the Republic of Korea, in 1999, 1995 and 1993, respectively. He is an Assistant Professor in the Department of Control and Instrumentation Engineering at Seoul National University of Technology, Seoul, the Republic of Korea. He worked as a visiting researcher at the Telecommunication Lab. of Erlangen-Nuremberg University, Germany, from July 1999 to February 2002. He was a GSM/GPRS/UMTS wireless mobile protocol software engineer at the Next Generation Handset Lab., LG Electronics Inc., Korea, from February 2000 to September 2002. From September 2002 to December 2003 he worked as a software architect and programmer of the J2EE web server system at Tmax Soft Inc. His research interests include multimedia communications, protocol development, network system performance analysis, and real-time embedded systems.
