1-Norm extreme learning machine for regression and multiclass classification using Newton method

S. Balasundaram, Deepak Gupta, Kapil

School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi 110067, India

Article history:
Received 26 August 2012
Received in revised form 14 March 2013
Accepted 25 March 2013
Available online 25 October 2013

Keywords:

Extreme learning machine

Dual exterior penalty problem

Feedforward neural networks

Linear programming problem

Newton method

Abstract

In this paper, a novel 1-norm extreme learning machine (ELM) for regression and multiclass classification is proposed as a linear programming problem whose solution is obtained by solving its dual exterior penalty problem as an unconstrained minimization problem using a fast Newton method. The algorithm converges from any starting point and can be easily implemented in MATLAB. The main advantage of the proposed approach is that it leads to a sparse model representation, meaning that many components of the optimal solution vector become zero, so the decision function can be determined using far fewer hidden nodes than in ELM. Numerical experiments were performed on a number of real-world benchmark datasets and the results are compared with ELM using additive and radial basis function (RBF) hidden nodes, optimally pruned ELM (OP-ELM) and support vector machine (SVM) methods. Similar or better generalization performance of the proposed method on the test data over ELM, OP-ELM and SVM clearly illustrates its applicability and usefulness.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Recently, Huang et al. [20] proposed a new learning algorithm for the single hidden layer feedforward neural network (SLFN) architecture, called the extreme learning machine (ELM) method, which overcomes the problems of traditional feedforward neural network learning algorithms such as the presence of local minima, imprecise learning rate and slow rate of convergence. The main advantage of ELM is that the hidden layer of SLFNs need not be tuned. In fact, for randomly chosen input weights and hidden layer biases, ELM leads to a least squares solution of a system of linear equations for the unknown output weights having the smallest norm property [20]. ELM is a simple unified algorithm for regression, binary and multiclass classification problems and it has been successfully tested on benchmark problems of practical importance. It was initially proposed for SLFNs and later extended to generalized SLFNs which may not be neuron alike [14,15]. Although ELM is a much faster learning machine with better generalization performance than other learning algorithms, the stochastic nature of the hidden layer output matrix may lower its learning accuracy [5]. Further, it was observed that, because of the random selection of input weights and hidden node biases, a large number of hidden nodes might be required to achieve an acceptable level of performance [22,40]. This suggests looking for compact networks having the ability to achieve good generalization performance [22,25,37,40]. The other issue in ELM is choosing the optimal number of hidden nodes for a given problem, which is usually done by trial and error. Two heuristic approaches, namely constructive [8,14-16] and pruning [24] methods, have been proposed in the literature to address this problem.

By replacing the support vector machine (SVM) kernels with ELM kernels in the SVM formulation [6,34], it was shown in [10] that better generalization can be achieved. In [17], ELMs for classification were extended to support vector networks, where the training error and the norm of the output weight vector were minimized by an optimization method. It was further observed that the proposed method achieves similar or better generalization performance in comparison to SVM and is less sensitive to the input parameters. For the study of ELM as a unified learning algorithm with different types of feature mappings and its relationship with least squares SVM (LS-SVM) and proximal SVM (PSVM), the reader is referred to [19]. As an interesting application of ELM to the simultaneous learning of a function and its derivatives, see [1]. Finally, for an excellent survey on ELM, the reader is referred to [18].

In recent years there has been significant interest in the study of 1-norm regularization or penalty [9,38], since the 1-norm tends to make some of the fitted coefficients of the model become exactly zero and hence gives sparse models that are easily interpretable. The influential work in this direction is the LASSO (Least Absolute Shrinkage and Selection Operator), proposed for linear regression


in [33], wherein least squares estimates are obtained by minimizing the residual sum of squared errors. Keeping in mind the advantage of constructing a sparse ELM model, whose decision function is determined with few hidden nodes and which can therefore also be used for selecting the contributing hidden nodes, a naive 1-norm ELM formulation is proposed in the current paper. Moreover, since the sum of absolute errors is minimized, the proposed formulation results in a robust model representation. It has been empirically shown that the proposed formulation leads to a sparse model with comparable generalization performance. In [24], Miche et al. proposed the optimally pruned ELM (OP-ELM), where the hidden neurons are first selected by applying a 1-norm penalty to the outputs, and then the weights for these selected neurons are computed using classical least squares. As an interesting application of the 1-norm SVM formulation to pre-select the hidden nodes, see [12].

Inspired by the study of the 1-norm SVM problem formulated as a linear programming optimization problem by Mangasarian [23], a linear programming ELM method is described in this work, whose solution is obtained by solving its exterior penalty problem in the dual as an unconstrained optimization problem using the Newton-Armijo algorithm. The main advantage of the proposed Newton linear programming ELM (NLPELM) method is that it gives a sparse model representation whose solution is obtained by solving a system of linear equations a finite number of times. The effectiveness of the proposed method for regression, binary and multiclass classification problems is demonstrated by performing numerical experiments on a number of interesting datasets and comparing the results with ELM, OP-ELM and SVM. Finally, it is interesting to observe [3] that, in comparison to the quadratic programming SVM, the linear programming SVM is an efficient method with the advantage of a reduced number of support vectors.

Throughout this work, all vectors are assumed to be column vectors. For any two vectors $x, y$ in the $n$-dimensional real space $R^n$, the inner product is denoted by $x^t y$, where $x^t$ is the transpose of the vector $x$. The 1-norm and 2-norm of a vector $x$ are denoted by $\|x\|_1$ and $\|x\|_2$ respectively. For any vector $x = (x_1,\dots,x_n)^t \in R^n$, $x_+$ is the vector whose $i$th component is $\max\{0, x_i\}$. Since the continuous, piecewise linear function $\max\{0, x\}$, for any real $x$, is not differentiable at the origin, by defining the average of the left and right derivatives as a 'generalized derivative', one obtains its generalized derivative as a piecewise constant function. In fact, let
$$x_* = \frac{d}{dx}\max\{0, x\} = \begin{cases} 0 & \text{when } x < 0, \\ 0.5 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}$$
Also, for any vector $x = (x_1,\dots,x_n)^t \in R^n$, let the piecewise constant vector $x_*$ be such that [11,23] $(x_*)_i = (x_i)_*$. The diagonal matrix of order $n$ whose diagonal elements are the components of the vector $x$ is denoted by $\mathrm{diag}(x)$. For any real matrix $H \in R^{m\times\ell}$, its transpose is denoted by $H^t$. The identity matrix of order $m$ is denoted by $I_m$. If $f$ is a real-valued function of the variable $x = (x_1,\dots,x_n)^t \in R^n$, then its gradient is denoted by $\nabla f = (\partial f/\partial x_1, \dots, \partial f/\partial x_n)^t$ and its Hessian matrix by $\nabla^2 f = (\partial^2 f/\partial x_i \partial x_j)_{i,j=1}^{n}$.

The paper is organized as follows. Section 2 briefly describes ELM. In Section 3, the sparsity-inducing 1-norm regularization is introduced. In Section 4, the primal linear programming ELM (LPELM) is formulated; further, since LPELM leads to an increase in the number of unknowns and constraints, and hence in the problem size, the method of obtaining its solution by solving its dual exterior penalty problem as an unconstrained minimization problem by the Newton-Armijo algorithm is also described in that section. Experimental results obtained by the proposed NLPELM method with additive and radial basis function (RBF) hidden nodes are compared with the results of LPELM, OP-ELM, ELM and SVM in Section 5. Finally, Section 6 concludes this work.

2. Extreme learning machine method

Let $\{(x_i, y_i)\}_{i=1,\dots,m}$ be a given set of training samples, where for the input example $x_i = (x_{i1},\dots,x_{in})^t \in R^n$ the corresponding desired output value is $y_i \in R$. Then, for randomly assigned values of the weight vector $a_s = (a_{s1},\dots,a_{sn})^t \in R^n$ and the bias $b_s \in R$ connecting the input layer to the $s$th hidden node, the standard SLFNs with $\ell$ hidden nodes approximate the input examples with zero error if and only if there exists an output weight vector $w = (w_1,\dots,w_\ell)^t \in R^\ell$ connecting the hidden nodes to the output node such that
$$y_i = \sum_{s=1}^{\ell} w_s\, G(a_s, b_s, x_i) \quad \text{for } i = 1,\dots,m,$$
where $G(a_s, b_s, x_i)$ is the output of the $s$th hidden node for the input example $x_i$. The above system of linear equations can, equivalently, be written in matrix form as
$$Hw = y, \qquad (1)$$
where
$$H = \begin{bmatrix} G(a_1,b_1,x_1) & \cdots & G(a_\ell,b_\ell,x_1) \\ \vdots & & \vdots \\ G(a_1,b_1,x_m) & \cdots & G(a_\ell,b_\ell,x_m) \end{bmatrix}_{m \times \ell} \qquad (2)$$
is the hidden layer output matrix of the network and $y = (y_1,\dots,y_m)^t \in R^m$ is the vector of desired outputs.

For the randomly assigned values of the parameters $a_s \in R^n$ and $b_s \in R$, training the SLFN is equivalent to obtaining a least squares solution $w$ of the linear system (1). In fact, $w$ is determined to be the minimum norm least squares solution of (1), which can be explicitly obtained as [20]
$$w = H^{\dagger} y,$$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse [30] of the matrix $H$. Finally, having obtained the solution $w \in R^\ell$, the regression estimation function $f(\cdot)$ is determined as: for any input example $x \in R^n$,
$$f(x) = \big(G(a_1,b_1,x), \dots, G(a_\ell,b_\ell,x)\big)\, w. \qquad (3a)$$
For the binary classification problem, however, the decision function $f(\cdot)$ becomes
$$f(x) = \mathrm{sign}\big(\big(G(a_1,b_1,x), \dots, G(a_\ell,b_\ell,x)\big)\, w\big). \qquad (3b)$$
For multiclass classification with $k$ classes, let the ELM have $k$ nodes in the output layer. Then, for every input example $x_i \in R^n$, the network output will be equal to the target output $(y_{i1},\dots,y_{ik})^t \in R^k$ if

$$HW = Y,$$
where $H$ is the hidden layer output matrix given by (2),
$$W = [\,w_1 \;\cdots\; w_k\,] = \begin{bmatrix} w_{11} & \cdots & w_{1k} \\ \vdots & & \vdots \\ w_{\ell 1} & \cdots & w_{\ell k} \end{bmatrix}, \qquad Y = [\,y_1 \;\cdots\; y_k\,] = \begin{bmatrix} y_{11} & \cdots & y_{1k} \\ \vdots & & \vdots \\ y_{m1} & \cdots & y_{mk} \end{bmatrix},$$
the unknown vector $w_j = (w_{1j},\dots,w_{\ell j})^t \in R^\ell$ is the weight vector connecting the hidden nodes with the $j$th output node, and $y_j = (y_{1j},\dots,y_{mj})^t \in R^m$ is the output vector corresponding to the


$j$th output node. For any test example $x \in R^n$, its predicted class label is determined by
$$\arg\max_{j \in \{1,\dots,k\}} f_j(x), \quad \text{where } (f_1(x),\dots,f_k(x)) = \big(G(a_1,b_1,x),\dots,G(a_\ell,b_\ell,x)\big)\, W.$$
Note that, once the values of the weight vectors $a_s \in R^n$ and the biases $b_s \in R$ are randomly assigned at the beginning of the learning algorithm, they remain fixed and therefore the matrix $H$ remains unchanged. Further, since the sigmoidal, radial basis, sine, cosine and exponential functions are infinitely differentiable in any interval of definition, they can be chosen as activation functions [20].
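As a concrete illustration of the training step just described, the following MATLAB sketch builds the hidden layer output matrix (2) for sigmoid additive nodes and computes the output weights through the Moore-Penrose inverse. It is a minimal sketch under stated assumptions, not the authors' code: the function name, the use of pinv and the sampling interval of the random parameters are choices made here for illustration.

```matlab
% Minimal ELM training sketch with sigmoid additive hidden nodes (Section 2).
% X is m x n (rows are training examples), y is m x 1, L is the number of
% hidden nodes. Illustrative only; not the authors' implementation.
function [A, b, w] = elm_train(X, y, L)
    [m, n] = size(X);
    A = rand(L, n) - 0.5;                          % random input weights
    b = rand(L, 1) - 0.5;                          % random hidden biases
    H = 1 ./ (1 + exp(-(X*A' + ones(m,1)*b')));    % hidden layer output matrix (2)
    w = pinv(H) * y;                               % w = H^dagger * y
end
```

For a new example x (a 1 x n row vector), the prediction of (3a) is then (1./(1 + exp(-(x*A' + b')))) * w.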

3. Sparsity-inducing 1-norm formulation

Consider the 1-norm convex optimization problem of the general form:
$$\min_{w,b}\; \sum_{i=1}^{m} L\Big(\sum_{j=1}^{q} w_j h_j(x_i) + b,\; y_i\Big) + \lambda \|w\|_1, \qquad (4)$$
where $L(\cdot,\cdot)$ is a non-negative, convex loss function; $\{h_1(x),\dots,h_q(x)\}$ is a dictionary of basis functions; $w = (w_1,\dots,w_q)^t$ is the unknown vector in $R^q$ and $b$ is the bias. Here $\lambda > 0$ is the regularization parameter which controls the tradeoff between loss and regularization, i.e. the first and second terms of (4) respectively. Typical examples of loss functions are the squared loss, absolute loss and hinge loss, defined, for real numbers $x, y$, by: (i) $L(x,y) = (x - y)^2$, (ii) $L(x,y) = |x - y|$ and (iii) $L(x,y) = (1 - xy)_+$. Note that the hinge loss makes sense only for classification, whereas the squared loss and absolute loss functions are applicable to either regression or classification. Using the solution of (4), the fitted model is obtained: for any $x \in R^n$, $f(x) = \sum_{j=1}^{q} w_j h_j(x) + b$.

For linear regression problems with the squared loss function, (4) leads to the popular LASSO minimization problem [33], i.e.

$$\min_{w,b}\; \sum_{i=1}^{m} \big(y_i - (w^t x_i + b)\big)^2 + \lambda \|w\|_1. \qquad (5)$$
With the objective of obtaining a model representation whose results are not sensitive to outliers, i.e. a robust model representation, the regularized least absolute deviation (RLAD) linear regression model was proposed in [35]. More precisely, since the results obtained by minimizing the mean absolute error are more robust than those obtained with the mean squared error, the squared loss in (5) is replaced by the absolute loss function. The coefficients of the estimator are computed by solving the following minimization problem [35]:
$$\min_{w,b}\; \sum_{i=1}^{m} \big|y_i - (w^t x_i + b)\big| + \lambda \|w\|_1. \qquad (6)$$

Finally, when the hinge loss function is used in (4), 1-norm SVMs [39] are obtained:
$$\min_{w,b}\; \sum_{i=1}^{m} \Big(1 - y_i\Big(\sum_{j=1}^{q} w_j h_j(x_i) + b\Big)\Big)_+ + \lambda \|w\|_1, \qquad (7)$$
with $y_i \in \{-1, 1\}$.

Although the LASSO penalized formulation (4) has the advantage of yielding a sparse model, solving it efficiently is less obvious since its objective function is non-differentiable, which precludes the application of well-known unconstrained methods. Tibshirani [33] proposed an efficient algorithm for tracking the whole one-dimensional solution path of problem (5) as a function of the regularization parameter, by transforming the initial problem (5) into an equivalent LPP and obtaining its solution satisfying the Karush-Kuhn-Tucker optimality conditions. Similar approaches have been followed in [35,39] for solving problems (6) and (7). Other than the classical procedure of transforming problem (4) into a linear programming problem and solving it using packages, many alternative generic algorithms such as sub-gradient, coordinate descent, stochastic gradient descent and interior point methods have been proposed in the literature for solving (4) with various loss functions; the interested reader is referred to [21,31,32,36].

4. Proposed 1-norm extreme learning machine method

In this section, a 1-norm ELM with absolute loss is proposed as a unified method for regression and classification, resulting in a robust and sparse model representation. Further, for solving the proposed optimization problem, whose objective function is non-differentiable, the approach of transforming the initial problem into another problem with differentiable objective and constraint functions is considered in this work. In fact, motivated by the study of [23] on the 1-norm SVM, it is proposed to solve the 1-norm ELM by formulating it as a linear programming problem (LPP) whose solution is obtained by considering its dual exterior penalty problem as an unconstrained minimization problem and solving it by the Newton-Armijo algorithm. The proposed formulation leads to a simple and fast converging iterative method of solution for regression, binary and multiclass classification problems.

For the sake of simplicity, the 1-norm ELM for regression and binary classification problems is considered first in Section 4.1, where a single output node is used to determine the estimation functions; subsequently, Section 4.2 presents the one-against-all (OAA) 1-norm ELM for multiclass classification.

4.1. Regression and binary classification

For a given SLFN having $\ell$ hidden nodes, the ELM learning method determines the unknown output weight vector $w = (w_1,\dots,w_\ell)^t \in R^\ell$ connecting the hidden nodes to the output node, having the smallest norm and minimum training error property [20], i.e. $w$ is the minimum norm least squares solution:
$$\min \|Hw - y\|_2 \quad \text{and} \quad \min \|w\|_2, \qquad (8)$$
where the matrix $H$ is given by (2) and $y = (y_1,\dots,y_m)^t \in R^m$ is the vector of desired outputs.

Consider the minimum norm least squares problem (8) formulated in the 1-norm, defined by
$$\min_{w \in R^\ell} \|w\|_1 + C\,\|Hw - y\|_1, \qquad (9)$$
where $C > 0$ is a constant. Following the procedure of [23], the 1-norm ELM problem (9) is formulated as an LPP as follows: for $r, s \in R^\ell$ and $p, q \in R^m$, let
$$w = r - s \quad \text{and} \quad Hw - y = p - q \qquad (10)$$
be such that
$$r, s \ge 0 \quad \text{and} \quad p, q \ge 0$$
hold. Then, using (10) in (9), one obtains the linear programming ELM (LPELM) problem in primal form:
$$\min_{r,s,p,q}\; e_\ell^t (r + s) + C\, e_m^t (p + q)$$
subject to
$$H(r - s) - p + q = y, \qquad r, s, p, q \ge 0, \qquad (11)$$
where $e_\ell$ and $e_m$ are the column vectors of ones of dimension $\ell$ and $m$ respectively.
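As a rough illustration of how the primal LPP (11) can be passed to a linear programming solver, the following hedged sketch uses the linprog routine of the MATLAB Optimization Toolbox; the stacking order of the variables z = [r; s; p; q] and the function name are choices made here, not taken from the paper.

```matlab
% Sketch of solving the primal LPELM (11) with linprog and recovering w via (10).
% H is the m x L hidden layer output matrix, y the m x 1 target vector, C > 0.
function w = lpelm_primal(H, y, C)
    [m, L] = size(H);
    f   = [ones(2*L, 1); C*ones(2*m, 1)];      % objective: e_l'(r+s) + C e_m'(p+q)
    Aeq = [H, -H, -eye(m), eye(m)];            % equality constraint H(r-s) - p + q = y
    beq = y;
    lb  = zeros(2*L + 2*m, 1);                 % r, s, p, q >= 0
    z   = linprog(f, [], [], Aeq, beq, lb, []);
    r = z(1:L);  s = z(L+1:2*L);
    w = r - s;                                 % output weight vector from (10)
end
```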


Since the linear problem (11) is feasible and its objective function is bounded below by zero, it is solvable. Using the optimization toolbox of MATLAB, one can easily solve the LPELM defined by (11). However, because of the increase in the number of unknowns and constraints, and therefore in the problem size, it is proposed to consider its dual exterior penalty problem as an unconstrained minimization problem in $m$ variables, whose solution can be obtained by the Newton-Armijo method. Finally, an exact solution of the LPELM defined by (11) can be constructed using the following theorem proved in [23].

Theorem 1 ([23]). Assume that the primal LPP given by
$$\min_{(x,y)\in R^{n+l}} \; c^t x + d^t y \quad \text{s.t.} \quad Ax + By \ge b, \;\; Ex + Gy = h, \;\; x \ge 0, \qquad (12)$$
is solvable, where $c \in R^n$, $d \in R^l$, $A \in R^{m\times n}$, $B \in R^{m\times l}$, $b \in R^m$, $E \in R^{k\times n}$, $G \in R^{k\times l}$ and $h \in R^k$. Then its dual exterior penalty problem, defined by
$$\min_{(u,v)\in R^{m+k}} \; \theta(-b^t u - h^t v) + \tfrac{1}{2}\Big( \|(A^t u + E^t v - c)_+\|_2^2 + \|B^t u + G^t v - d\|_2^2 + \|(-u)_+\|_2^2 \Big), \qquad (13)$$
is also solvable for all $\theta > 0$. Moreover, for every $\theta \in (0, \bar\theta]$ for some $\bar\theta > 0$, if the pair $(u,v)$ is a solution of (13), then
$$x = \frac{1}{\theta}(A^t u + E^t v - c)_+, \qquad y = \frac{1}{\theta}(B^t u + G^t v - d) \qquad (14)$$
will be an exact solution to the primal problem (12).

It follows from Theorem 1 that the dual exterior penalty problem corresponding to the LPELM defined by (11) is obtained in the form
$$\min_{u\in R^m} \; L(u) = -\theta\, y^t u + \tfrac{1}{2}\Big( \|(H^t u - e_\ell)_+\|_2^2 + \|(-H^t u - e_\ell)_+\|_2^2 + \|(-u - C e_m)_+\|_2^2 + \|(u - C e_m)_+\|_2^2 \Big), \qquad (15)$$
which is solvable for all $\theta > 0$; moreover, there exists $\bar\theta > 0$ such that for any $\theta \in (0, \bar\theta]$,
$$r = \frac{1}{\theta}(H^t u - e_\ell)_+, \quad s = \frac{1}{\theta}(-H^t u - e_\ell)_+, \quad p = \frac{1}{\theta}(-u - C e_m)_+ \quad \text{and} \quad q = \frac{1}{\theta}(u - C e_m)_+ \qquad (16)$$
will generate an exact solution of the primal problem (11), where $\theta > 0$ is the penalty parameter and $u$ is a solution of (15).

In this work, we solve the above unconstrained non-smooth minimization problem (15) by the Newton-Armijo iterative algorithm. Finally, using its solution and Eqs. (10) and (16), the decision function (3) is determined.

The gradient of $L(\cdot)$, given by
$$\nabla L(u) = -\theta y + H(H^t u - e_\ell)_+ - H(-H^t u - e_\ell)_+ - (-u - C e_m)_+ + (u - C e_m)_+,$$
is not differentiable, and therefore the Hessian matrix of second-order partial derivatives of $L(\cdot)$ does not exist in the usual sense. However, the gradient of $L(\cdot)$ is Lipschitz continuous and hence its 'generalized Hessian' can be obtained as follows [13]: for $u \in R^m$,
$$\nabla^2 L(u) = H\,\mathrm{diag}\big((H^t u - e_\ell)_* + (-H^t u - e_\ell)_*\big)H^t + \mathrm{diag}\big((-u - C e_m)_* + (u - C e_m)_*\big) = H\,\mathrm{diag}\big((|H^t u| - e_\ell)_*\big)H^t + \mathrm{diag}\big((|u| - C e_m)_*\big),$$
where $\mathrm{diag}(\cdot)$ denotes a diagonal matrix. The last equality follows from the result
$$(a - 1)_* + (-a - 1)_* = (|a| - 1)_* \quad \text{for any } a \in R.$$
The generalized Hessian is useful because it satisfies many of the properties of the regular Hessian, and one can apply it in the study of non-smooth optimization problems in the same way the regular Hessian is used for smooth optimization problems. For example, when the smallest eigenvalue of $\nabla^2 L(u)$ is greater than some positive constant for all vectors $u \in R^m$, the objective function $L(\cdot)$ is strongly convex and hence has a unique minimizer [23].
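For illustration, the gradient and the generalized Hessian above translate directly into MATLAB. In this hedged sketch, the helpers plus_ and step_ stand for the functions $(\cdot)_+$ and $(\cdot)_*$ of the notation section; all names are introduced here for convenience.

```matlab
% Sketch of the gradient and generalized Hessian of the dual penalty objective (15).
plus_ = @(t) max(t, 0);                              % (.)_+ applied componentwise
step_ = @(t) double(t > 0) + 0.5*double(t == 0);     % generalized derivative (.)_*

gradL = @(u, H, y, C, theta, eL, em) -theta*y ...
    + H*plus_(H'*u - eL) - H*plus_(-H'*u - eL) ...
    - plus_(-u - C*em) + plus_(u - C*em);

hessL = @(u, H, C, eL, em) ...
    H*diag(step_(abs(H'*u) - eL))*H' + diag(step_(abs(u) - C*em));
```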

4.2. Multiclass classification

Following the notation of Section 2 and the discussion in Section 4.1 on the 1-norm ELM formulation for regression and binary classification, one can solve the 1-norm $k$-class classification problem using the popular one-against-all (OAA) method.

In the proposed 1-norm ELM for multiclass classification, as in the OAA method, $k$ binary classifiers are constructed, in each of which all the training examples are used. In fact, by considering, for each $j \in \{1,\dots,k\}$, all the training examples having original class label $j$ as elements of the positive class and the remaining training examples as elements of the negative class, the $j$th 1-norm ELM is trained. Finally, if $f_1(x),\dots,f_k(x)$ are the decision functions of the $k$ binary classifiers of the form (3a), then for any test example $x \in R^n$ its predicted class label becomes
$$\arg\max_{j \in \{1,\dots,k\}} f_j(x).$$

For this, consider the ELM for the multiclass classification problem, formulated as $k$ binary ELM classification problems of the form
$$Hw_1 = y_1,\; \dots,\; Hw_k = y_k, \qquad (17)$$
where for each $j \in \{1,\dots,k\}$, $w_j = (w_{1j},\dots,w_{\ell j})^t \in R^\ell$ is the unknown weight vector connecting the hidden nodes to the output node, and the output vector $y_j = (y_{1j},\dots,y_{mj})^t \in R^m$ is such that $y_{ij} = 1$ if the original class label of the input example $x_i$ ($i = 1,\dots,m$) is $j$, and $y_{ij} = -1$ otherwise.

The minimum norm least squares $k$-classifier ELM in 1-norm can be formulated as
$$\min_{w_j \in R^\ell} \|w_j\|_1 + C\,\|Hw_j - y_j\|_1, \qquad j = 1,\dots,k, \qquad (18)$$
where $H$ is the hidden layer output matrix given by (2) and $C > 0$ is a constant. It can easily be observed that the proposed multiclass classification formulation leads to a $k$-classifier OAA-ELM in 1-norm.

By extending the procedure explained in the previous subsection on binary classification problems, a binary LPELM can be constructed for each $j \in \{1,\dots,k\}$, whose solution is obtained by formulating its dual exterior penalty problem as an unconstrained optimization problem and solving it using the Newton-Armijo algorithm.
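A short sketch of the resulting OAA prediction rule is given below; the variable names W and Htest are introduced here only for illustration.

```matlab
% One-against-all prediction (Section 4.2): column j of W holds the output weight
% vector of the j-th binary 1-norm ELM classifier; Htest is the hidden layer
% output matrix of the test examples.
scores = Htest * W;                   % decision values f_j(x) for every test example
[~, labels] = max(scores, [], 2);     % predicted class = arg max_j f_j(x)
```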

4.3. Newton-Armijo algorithm

In this section, the Newton-Armijo algorithm [11] used for solving the unconstrained minimization problem (15) discussed above is stated; its proof of convergence follows from Proposition 4 of [23].

Newton Algorithm with Armijo stepsize

Given:
u = initial guess vector in R^m
tol = expected learning accuracy
itmax = maximum number of iterations

Solution phase:
iter = 0
while (iter < itmax and ||∇L(u)||_2 > tol)
    /* Determine the direction vector d ∈ R^m as the solution of the following system of linear equations in m variables */


    ∇²L(u) d = −∇L(u)
    /* Armijo stepsize */
    /* Choose the stepsize λ = max{1, 1/2, 1/4, …} such that L(u) − L(u + λd) ≥ −δ λ ∇L(u)^t d, where δ ∈ (0, 1/2) */
    u = u + λd
    iter = iter + 1
end while

Clearly, $\nabla^2 L$ is a symmetric and positive semi-definite matrix of order $m$. However, since the matrix may be ill-conditioned, we use $(\rho I_m + \nabla^2 L)^{-1}$ in place of the inverse of $\nabla^2 L$, where the regularization parameter $\rho$ is taken to be a very small positive number.

The convergence of the Newton-Armijo algorithm to the solution of the exterior penalty problem defined by (15) follows from the following theorem.

Theorem 2 ([23]). Let the penalty parameter $\theta > 0$ be chosen sufficiently small. Suppose $u$ is an accumulation point of the sequence $\{u^i\}$ generated by the above algorithm. Then $u$ is a solution of the exterior penalty problem (15).

For reasons of simplicity, we solve problem (15) in this work using Newton's method without the Armijo stepsize, i.e. the unknown $u^{i+1}$ at the $(i+1)$th iteration is obtained by solving the matrix equation
$$(\rho I_m + \nabla^2 L(u^i))\,(u^{i+1} - u^i) = -\nabla L(u^i), \qquad (19)$$
where $i = 0, 1, \dots$.
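Putting the pieces together, a hedged MATLAB sketch of NLPELM training by the simplified Newton iteration (19), including the recovery of the output weights through (16) and (10), could look as follows. The zero starting point, the tolerance and the iteration limit are illustrative assumptions, not the authors' settings.

```matlab
% Sketch of NLPELM training: Newton iteration (19) on the dual penalty problem (15),
% followed by recovery of the sparse output weight vector w = r - s via (16) and (10).
function w = nlpelm_train(H, y, C, theta, rho, tol, itmax)
    [m, L] = size(H);
    eL = ones(L, 1);  em = ones(m, 1);
    plus_ = @(t) max(t, 0);
    step_ = @(t) double(t > 0) + 0.5*double(t == 0);
    u = zeros(m, 1);                                      % illustrative starting point
    for i = 1:itmax
        g = -theta*y + H*plus_(H'*u - eL) - H*plus_(-H'*u - eL) ...
            - plus_(-u - C*em) + plus_(u - C*em);         % gradient of (15)
        if norm(g) <= tol, break; end
        GH = H*diag(step_(abs(H'*u) - eL))*H' ...
             + diag(step_(abs(u) - C*em));                % generalized Hessian
        u = u - (rho*eye(m) + GH) \ g;                    % regularized Newton step (19)
    end
    r = plus_(H'*u - eL)/theta;                           % recovery (16)
    s = plus_(-H'*u - eL)/theta;
    w = r - s;                                            % output weights via (10)
end
```

A call such as w = nlpelm_train(H, y, 2^3, 0.1, 1e-3, 1e-3, 50) would use the penalty parameter value theta = 0.1 reported in Section 5; the remaining argument values are purely illustrative.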

5. Numerical experiments and comparison of results

In this section, the performance of NLPELM defined by (15) is investigated by comparing it numerically with the LPELM primal problem (11) solved using the optimization toolbox of MATLAB, ELM and SVM on a number of real-world, publicly available benchmark datasets of regression, binary and multiclass types.

All experiments were carried out in the MATLAB R2010a environment on a PC running 64-bit Windows 7 with a 3.0 GHz Intel(R) Core(TM)2 Duo processor and 8 GB of RAM.

The performance of the proposed method has been tested with additive and RBF hidden nodes. For this, the activation function $G(a,b,x)$ in ELM, LPELM and NLPELM is chosen as the sigmoid function
$$G(a,b,x) = \frac{1}{1 + \exp(-(a^t x + b))}$$
for additive nodes, and both the multiquadric and Gaussian functions, defined by [19]
$$G(a,b,x) = \big(\|x - a\|_2^2 + b^2\big)^{1/2} \quad \text{and} \quad G(a,b,x) = \exp\big(-b\,\|x - a\|_2^2\big)$$
respectively, are considered for the RBF hidden nodes. Also, for regression and binary classification the experimental results of NLPELM are compared with OP-ELM. For the sigmoid and multiquadric activation functions the hidden node parameters were chosen randomly with uniform distribution in [-0.5, 0.5], while for the Gaussian function they were chosen randomly in [0, 1]. The penalty parameter was set to $\theta = 0.1$. The input weights and biases of the hidden nodes are selected randomly at the beginning of the algorithm for NLPELM, LPELM, OP-ELM and ELM, and they remain fixed in each trial of the simulation. The optimal values of the regularization parameter $C$ and the number of hidden nodes $\ell$ were determined by performing 10-fold cross-validation on the training dataset, varying them over pre-defined sets of values. Although the Newton-Armijo algorithm converges globally, for simplicity all the experiments were performed without the Armijo stepsize, i.e. using the Newton method (19).

For the implementation of SVM we used LIBSVM [4]. The toolbox of OP-ELM [26] is used for OP-ELM. In all numerical experiments, the Gaussian nonlinear kernel function defined by [11]: for all $x, y \in R^m$, $K(x,y) = \exp(-\|x - y\|_2^2/(2\sigma^2))$, was applied, and the optimal kernel parameter $\sigma > 0$ was chosen using 10-fold cross-validation by varying $\sigma^2$ over the set $\{2^{-4},\dots,2^{6}\}$. Further, the insensitivity parameter $\varepsilon$, introduced by Vapnik [34], appearing in the support vector regression (SVR) formulation, is chosen by the standard 10-fold cross-validation methodology by varying its value over the set $\{0.001, 0.01, 0.1\}$.

The sparseness of the 1-norm ELM formulation (9) solved using the Newton method can be measured by the number of nonzero components of the optimal solution vector $w \in R^\ell$: the lower the number of nonzero components, the better the sparseness. To illustrate the sparsity of NLPELM, two examples each of regression and classification type were considered, and for each pair of parameter values $(C, \ell)$ the number of nonzero components, or degree of sparsity, was computed as the number of contributing hidden nodes.
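A one-line sketch of this sparsity count follows; the tolerance used to decide when a component is treated as nonzero is an assumption made here.

```matlab
% Degree of sparsity: number of contributing hidden nodes, i.e. components of the
% optimal w with magnitude above a small threshold (threshold value is assumed).
degree_of_sparsity = sum(abs(w) > 1e-6);
```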

5.1. Regression

To demonstrate the effectiveness of NLPELM and LPELM for regression, a performance comparison is carried out on the following benchmark datasets: the Box-Jenkins gas furnace, Auto-Mpg, Machine CPU, Servo, Forest fires, Boston, Concrete CS, Abalone, Wine quality-white and Parkinson from the UCI repository [29]; Kin-fh, Demo and Bank-32fh from DELVE [7]; Pollengrains and Bodyfat from the StatLib collection; the time series datasets generated by the Mackey-Glass differential equation; Sunspots and SantafeA taken from the web site http://www.cse.ogi.edu/~ericwan; and a number of financial stock-index datasets.

In all the regression examples considered, the original data is normalized in the following manner:
$$\bar{x}_{ij} = \frac{x_{ij} - x_j^{\min}}{x_j^{\max} - x_j^{\min}},$$
where $x_j^{\min} = \min_{i=1,\dots,m}(x_{ij})$ and $x_j^{\max} = \max_{i=1,\dots,m}(x_{ij})$ denote the minimum and maximum values, respectively, of the $j$th attribute over all the input examples $x_i$, and $\bar{x}_{ij}$ is the normalized value corresponding to $x_{ij}$. The 2-norm root mean square error (RMSE) was selected as the measure of prediction performance and was calculated using the formula
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (y_i - \tilde{y}_i)^2},$$
where $y_i$ and $\tilde{y}_i$ are the observed value and its corresponding predicted value, respectively, and $N$ is the number of test samples.
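The normalization and the RMSE computation above can be sketched in MATLAB as follows; the variable names are introduced here for illustration.

```matlab
% Min-max normalization of each attribute and the 2-norm RMSE of Section 5.1.
% X is m x n with rows as examples; y and yhat are observed and predicted test outputs.
Xmin = min(X, [], 1);  Xmax = max(X, [], 1);
Xbar = (X - ones(size(X,1),1)*Xmin) ./ (ones(size(X,1),1)*(Xmax - Xmin));
rmse = sqrt(mean((y - yhat).^2));
```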

Since a large value of $\ell$ results in increased computational time, and it was further observed from experiments that better generalization performance could be achieved for small or moderate values of $\ell$, it was decided to vary $\ell$ from 10 to 500. More precisely, the optimal values of the parameters $C$ and $\ell$ were chosen from the sets $\{2^{-1},\dots,2^{10}\}$ and {10, 20, 30, 40, 50, 60, 80, 100, 200, 500} respectively using 10-fold cross-validation. Using the optimal values, the average test accuracy for each dataset was computed by conducting 10 independent trials.


As the first regression example, the Box and Jenkins gas furnace dataset [2] is taken. It consists of 296 input-output pairs of the form $(u(t), y(t))$, where $u(t)$ is the input gas flow rate and the output $y(t)$ is the CO2 concentration from the gas furnace. The output $y(t)$ is predicted based on 6 attributes taken to be of the form $x(t) = (y(t-1), y(t-2), y(t-3), u(t-1), u(t-2), u(t-3))$. Thus, one obtains 293 samples in all, in which each sample is of the form $(x(t), y(t))$. The even samples were chosen for training whereas the odd samples were taken for testing. The performance of the proposed method with sigmoid additive hidden nodes and multiquadric RBF hidden nodes is shown in Figs. 1 and 2 respectively.

Experiments were performed on two time series datasets, denoted MG17 and MG30, generated using the Mackey-Glass time delay differential equation [27,28] defined by
$$\frac{dx(t)}{dt} = -0.1\,x(t) + \frac{0.2\,x(t-\tau)}{1 + x(t-\tau)^{10}},$$
corresponding to the time delays $\tau = 17$ and $30$ respectively. Five previous values were used to predict the current value. Among the total of 1495 samples obtained, the first 500 were considered for training and the rest for testing.
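For illustration only, one way of generating such a series is a simple Euler discretization of the above delay differential equation; the step size, history initialization and number of samples below are assumptions made here and need not match the authors' generation procedure.

```matlab
% Hedged sketch: Euler integration of the Mackey-Glass equation with delay tau = 17.
tau = 17;  dt = 1;  nsteps = 2000;
x = 1.2*ones(1, tau + nsteps);          % assumed constant history for t <= tau
for t = tau+1 : tau+nsteps-1
    xd = x(t - tau);
    x(t+1) = x(t) + dt*(-0.1*x(t) + 0.2*xd/(1 + xd^10));
end
series = x(tau+1:end);                  % samples from which the 5-lag inputs are built
```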

Finally, as examples of financial datasets, the stock indices of Citigroup, Google Inc., IBM, Intel, Microsoft, Redhat Software and Standard & Poor 500 were taken. In total, 755 closing prices from 01-01-2006 to 31-12-2008 were considered. As in the MG17 and MG30 examples, five previous values were used to predict the current value. The first 200 samples were taken for training and the rest for testing.

Further, to verify whether the proposed NLPELM solution approach results in a minimum number of hidden nodes in the determination of the decision function, both the Auto-Mpg and Wine quality-white regression datasets are considered. For each pair of parametric values (C, ℓ) the degree of sparsity is computed. The results are shown in Fig. 3.

It is well known that the performance of SVM is sensitive to the choice of the parameters C and σ [19]. However, it can be observed that NLPELM achieves, in general, good generalization performance even for small values of ℓ, and is also not very sensitive to the user-specified parameters (see Fig. 4). From the figures one can also notice that NLPELM results in a sparse model representation while showing good generalization performance.

For all the real-world regression datasets considered, the number of training and test samples chosen, the number of attributes, the optimal parameter values determined using 10-fold cross-validation and the numerical results obtained by NLPELM, LPELM, OP-ELM, ELM and support vector regression (SVR) are summarized in Table 1. Comparable generalization performance of the proposed method on the benchmark datasets clearly demonstrates its effectiveness and applicability.

5.2. Classification

Fig. 1. Results of comparison for the gas furnace data of Box-Jenkins with sigmoid additive node. (a) Prediction over the training set and (b) prediction over the testing set.

Fig. 2. Results of comparison for the gas furnace data of Box-Jenkins with multiquadric RBF node. (a) Prediction over the training set and (b) prediction over the testing set.

Fig. 3. Number of actually contributing hidden nodes as a function of the user-specified parameters (C, ℓ), showing that NLPELM is a sparse model for regression. (a) NLPELM with sigmoid additive node for the Auto-Mpg dataset; (b) NLPELM with multiquadric RBF node for the Auto-Mpg dataset; (c) NLPELM with sigmoid additive node for the Wine quality-white dataset; (d) NLPELM with multiquadric RBF node for the Wine quality-white dataset.

Fig. 4. Insensitivity performance of NLPELM to the user-specified parameters (C, ℓ) on two regression datasets. (a) NLPELM with sigmoid additive node for the Auto-Mpg dataset; (b) NLPELM with multiquadric RBF node for the Auto-Mpg dataset; (c) NLPELM with sigmoid additive node for the Wine quality-white dataset; (d) NLPELM with multiquadric RBF node for the Wine quality-white dataset.

In order to demonstrate the effectiveness of the proposed method, the performance of NLPELM, LPELM, OP-ELM, ELM and SVM was compared by conducting numerical experiments on 9 binary and 7 multi-class classification datasets. All the datasets considered, for both binary and multi-class classification, were taken from the UCI repository [29]. When the value of the regularization parameter C is varied over the set $\{2^{-5},\dots,2^{20}\}$, it was again observed that better generalization performance for binary classification is obtained, in general, for a moderate number of hidden nodes. Since an increase in the number of hidden nodes results in increased computational time, the value of ℓ is chosen as 200 in all our experiments. The optimal value of the regularization parameter was obtained by performing 10-fold cross-validation. The average test accuracy was computed by performing 10 independent trials.

In order to verify that NLPELM results in a sparse model representation for classification, the Australian credit and Votes datasets are considered and their degrees of sparsity are computed. The results are shown in Fig. 5. Again, as in the case of regression, for each pair of parameter values (C, ℓ) the test accuracies of NLPELM for the sigmoid additive node and the multiquadric RBF node were obtained and are shown in Fig. 6. It can be observed from Fig. 6 that the proposed method is not very sensitive to the user-specified parameter values and shows good generalization performance.

Finally, to verify the performance on multiclass classification, the following datasets were considered: Iris, Wine, Glass, Vehicle, Page-blocks, Segment and Satimage. The OAA-ELM in 1-norm defined by (18) is solved using NLPELM. All the datasets were normalized in the same manner as in the case of regression. In all experiments, ℓ = 200 is assumed. For both SVM and NLPELM, the optimal value of C was chosen from the range $\{2^{-5},\dots,2^{20}\}$ using 10-fold cross-validation. With these optimal values, the average test accuracy for each dataset was again computed by conducting 10 independent trials. Among the 7 datasets considered, NLPELM achieves better generalization performance in 4 cases and comparable performance in the remaining cases.

The number of training and test samples chosen, the number of attributes, the optimal parameter values determined using 10-fold cross-validation and the classification accuracy obtained by NLPELM, LPELM and ELM using the sigmoidal, multiquadric and Gaussian activation functions for binary and multi-class classification problems are summarized in Tables 2 and 3 respectively. Better or comparable generalization performance of the proposed method on the benchmark datasets considered clearly demonstrates its usefulness and applicability.

Table 1
Performance comparison of NLPELM with LPELM and ELM having both sigmoid and multiquadric hidden nodes, OP-ELM having sigmoid hidden nodes, and SVR using the Gaussian kernel for regression. RMSE is used for comparison. The best result is shown in boldface.
Columns: Datasets (train size, test size); SVR (C, s², ε); ELM Sigmoid (ℓ); ELM Multiquadric (ℓ); OP-ELM Sigmoid (ℓ); LPELM Sigmoid (C, ℓ); LPELM Multiquadric (C, ℓ); NLPELM Sigmoid (C, ℓ); NLPELM Multiquadric (C, ℓ).

Gas furnace

(146?6,147?6)

0.0363(26,22,0.01)0.0420(10)0.0472(40)0.0200(100)0.0385(29,80)0.0356(29,80)0.0323(24,30)0.0388(23,500)

Auto-Mpg

(100?7,292?7)

0.1501(22,21,0.001)0.1572(20)0.1646(10)0.4608(100)0.1652(29,200)0.1668(24,100)0.1535(210,20)0.1654(22,100) Machine CPU

(100?7,109?7)

0.0359(27,24,0.01)0.0382(20)0.0385(40)0.0592(100)0.0309(29,500)0.0435(210,200)0.0797(21,100)0.0885(22,200)

Servo(100?4,67?4)0.1221(28,20,0.01)0.1319(50)0.1429(20)0.1049(100)0.1285(210,500)0.1574(27,200)0.1376(210,200)0.1904(210,40) Forest?res

(150?12,317?12)

0.1226(29,2–5,0.001)0.0893(10)0.0943(10)0.0706(100)0.0706(20,500)0.0713(21,500)0.0706(24,10)0.0706(25,10)

Boston

(200?13,306?13)

0.1239(21,20,0.001)0.1468(20)0.1267(20)0.6644(100)0.1531(26,500)0.1335(22,20)0.1345(26,80)0.2455(22,500)

Concrete CS

(700?8,330?8)

0.1583(24,2à1,0.01)0.1654(10)0.1588(10)0.4883(100)0.1406(22,200)0.1346(2à1,500)0.1375(23,500)0.1558(22,80)

Abalone

(1000?8,3177?8)

0.1946(27,23,0.001)0.118(10)0.1571(10)0.1000(100)0.1411(25,30)0.1559(23,30)0.1461(25,80)0.169(22,60)

Wine quality-white

(1000?11,3898?11)

0.1411(2à1,2à6,0.001)0.1375(10)0.1323(10)0.1386(100)0.1929(26,200)0.1944(23,500)0.1783(25,500)0.2078(24,500)

Parkinson

(1000?16,4875?16)

0.2840(2à1,2à5,0.001)0.2489(10)0.2487(10)0.288(100)0.3687(27,500)0.3378(27,500)0.3618(29,500)0.3446(28,500) Kin-fh

(200?32,7992?32)

0.0915(210,24,0.1)0.1127(20)0.1005(40)0.0884(100)0.1065(21,500)0.1005(22,80)0.145(24,60)0.134(23,500) Demo

(1000?4,1048?4)

0.0918(22,2à3,0.001)0.0938(20)0.0934(30)0.0956(100)0.0992(210,500)0.0960(210,200)0.0988(210,80)0.1004(28,200) Bank-32fh

(1000?32,7192?32)

0.1454(21,24,0.1)0.1458(60)0.1459(50)0.1258(100)0.1512(21,200)0.1499(21,500)0.1507(22,100)0.1504(20,500)

Pollengrains

(150?5,3698?5)

0.2890(2à1,24,0.01)0.291(10)0.291(10)0.5719(100)0.2896(22,60)0.3040(2à1,40)0.291(24,10)0.2839(23,500) Bodyfat

(150?14,102?14)

0.0200(21,24,0.001)0.0736(20)0.0455(20)0.0733(100)0.0129(25,500)0.0547(20,200)0.0156(21,500)0.1944(24,50)

Mg17(500?5,995?5)0.0049(28,2à3,0.001)0.0048(200)0.0083(100)0.0051(100)0.008(27,200)0.0258(28,100)0.0276(29,500)0.0253(24,500) Mg30(500?5,995?5)0.0210(25,2à3,0.001)0.0245(200)0.0375(100)0.0222(100)0.0282(27,100)0.0595(27,60)0.0393(29,500)0.062(28,500) Sunspots

(150?5,140?5)

0.0843(24,21,0.01)0.0962(20)0.0819(20)0.0899(100)0.0932(23,80)0.0939(27,80)0.0914(210,500)0.1078(22,500)

SantafeA

(500?5,495?5)

0.0427(24,2à2,0.001)0.0568(50)0.0395(30)0.0399(100)0.0428(27,100)0.0882(26,200)0.0774(29,500)0.047(23,500)

Citigroup

(200?5,550?5)

0.0244(23,24,0.01)0.0227(30)0.0226(20)0.3605(100)0.0238(25,80)0.0311(22,30)0.0215(24,30)0.045(23,500) Google(200?5,550?5)0.0273(25,24,0.01)0.0284(20)0.0285(30)0.3745(100)0.0273(24,10)0.0322(21,20)0.0287(28,60)0.0477(21,60) IBM(200?5,550?5)0.0314(27,24,0.001)0.0319(10)0.0333(30)0.4794(100)0.0319(27,40)0.0336(28,10)0.032(29,10)0.0301(22,20) Intel(200?5,550?5)0.0335(27,24,0.01)0.0336(10)0.0354(20)0.2335(100)0.0375(28,50)0.0374(26,20)0.0314(27,10)0.0455(210,20) Microsoft

(200?5,550?5)

0.0310(24,23,0.001)0.0314(10)0.0338(10)0.2893(100)0.0312(26,10)0.0327(23,80)0.0312(29,10)0.0352(23,10) Redhat

(200?5,,550?5)

0.0335(22,23,0.001)0.0344(10)0.0365(10)0.0618(100)0.0345(26,20)0.0596(22,20)0.0339(28,20)0.0364(23,10)

S&P500

(200?5,550?5)0.0285(26,23,0.1)0.0296(10)0.0266(10)0.2232(100)0.0275(29,100)0.0324(25,80)0.0263(29,200)0.0582(24,40)


Fig. 5. Number of actually contributing hidden nodes as a function of the user-specified parameters (C, ℓ), showing that NLPELM is a sparse model for classification. (a) NLPELM with sigmoid additive node for the Australian Credit dataset; (b) NLPELM with multiquadric RBF node for the Australian Credit dataset; (c) NLPELM with sigmoid additive node for the Votes dataset; (d) NLPELM with multiquadric RBF node for the Votes dataset.

Fig. 6. Insensitivity performance of NLPELM to the user-specified parameters (C, ℓ) on two classification datasets. (a) NLPELM with sigmoid additive node for the Australian Credit dataset; (b) NLPELM with multiquadric RBF node for the Australian Credit dataset; (c) NLPELM with sigmoid additive node for the Votes dataset; (d) NLPELM with multiquadric RBF node for the Votes dataset.


6. Conclusions

In this work, a novel approach for extreme learning machines in 1-norm for regression and multiclass classification has been proposed as a linear programming problem whose solution is obtained by solving its exterior penalty dual problem, formulated as an unconstrained minimization problem, using the Newton-Armijo algorithm. The algorithm converges from any starting point and thus leads to an exact solution. The proposed approach has the advantage that it results in a robust and sparse model, so that a large number of components of the solution vector become zero, which allows selecting only a small number of hidden nodes. Empirical tests on a number of benchmark datasets for regression, binary and multiclass classification show similar or better generalization performance of the proposed method in comparison to ELM, OP-ELM and SVM. The sparse solution and the comparable generalization performance of the proposed method clearly illustrate its efficacy and applicability.

Acknowledgments

The authors are extremely thankful to the learned referees for their critical and constructive comments that greatly improved the earlier version of the paper.

References

[1] S. Balasundaram, Kapil, Application of error minimized extreme learning machine for simultaneous learning of a function and its derivatives, Neurocomputing 74 (2011) 2511-2519.
[2] G.E.P. Box, G.M. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, 1976.
[3] F. Cao, Y. Yuan, Learning errors of linear programming support vector regression, Appl. Math. Model. 35 (2011) 1820-1828.
[4] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[5] Z. Chen, H. Zhu, Y. Wang, A modified extreme learning machine with sigmoidal activation functions, Neural Comput. Appl. 22 (3-4) (2013) 541-550.
[6] N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel Based Learning Methods, Cambridge University Press, Cambridge, 2000.
[7] DELVE, Data for Evaluating Learning in Valid Experiments, 2005. http://www.cs.toronto.edu/~delve/data.
[8] G. Feng, G.-B. Huang, Q. Lin, R. Gay, Error minimized extreme learning machine with growth of hidden nodes and incremental learning, IEEE Trans. Neural Networks 20 (2009) 1352-1357.
[9] S. Floyd, M. Warmuth, Sample compression, learnability, and the Vapnik-Chervonenkis dimension, Mach. Learning 21 (1995) 269-304.
[10] B. Frenay, M. Verleysen, Using SVMs with randomized feature spaces: an extreme learning approach, in: Proceedings of the 18th European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium, 2010, pp. 315-320.
[11] G. Fung, O.L. Mangasarian, Finite Newton method for Lagrangian support vector machine, Neurocomputing 55 (2003) 39-55.
[12] M. Han, J. Yin, The hidden neuron selection of the wavelet networks using support vector machines and ridge regression, Neurocomputing 72 (2008) 471-479.
[13] J.-B. Hiriart-Urruty, J.J. Strodiot, H. Nguyen, Generalized Hessian matrix and second order optimality conditions for problems with C^{1,1} data, Appl. Math. Optim. 11 (1984) 43-56.
[14] G.-B. Huang, L. Chen, Convex incremental extreme learning machine, Neurocomputing 70 (2007) 3056-3062.
[15] G.-B. Huang, L. Chen, Enhanced random search based incremental extreme learning machine, Neurocomputing 71 (2008) 3460-3468.
[16] G.-B. Huang, L. Chen, C.-K. Siew, Universal approximation using incremental constructive feedforward networks with random hidden nodes, IEEE Trans. Neural Networks 17 (2006) 879-892.
[17] G.-B. Huang, X. Ding, H. Zhou, Optimization method based extreme learning machine for classification, Neurocomputing 74 (2010) 155-163.
[18] G.-B. Huang, D.H. Wang, Y. Lan, Extreme learning machines: a survey, Int. J. Mach. Learn. Cybernet. 2 (2011) 107-122.
[19] G.-B. Huang, H. Zhou, X. Ding, R. Zhang, Extreme learning machine for regression and multiclass classification, IEEE Trans. Syst. Man Cybern. Part B: Cybern. 42 (2012) 513-528.

Table 2
Performance comparison of NLPELM with LPELM and ELM having both sigmoid and multiquadric hidden nodes, OP-ELM having sigmoid hidden nodes, and SVM using the Gaussian kernel for binary classification. The test accuracy is shown for the optimal parameter values, with ℓ = 100 for OP-ELM and ℓ = 200 for NLPELM, LPELM and ELM. The best result is shown in boldface.
Columns: Datasets (train size, test size); SVM (C, s²); ELM Sigmoid (ℓ); ELM Multiquadric (ℓ); OP-ELM Sigmoid (ℓ); LPELM Sigmoid (C, ℓ); LPELM Multiquadric (C, ℓ); NLPELM Sigmoid (C, ℓ); NLPELM Multiquadric (C, ℓ).

Wdbc(100?30,469?30)94.88(2à3,20)76.80(200)81.96(200)95.10(100)84.62(214,200)79.94(2à2,200)90.35(211,200)89.94(21,200) Breast-cancer(150?10,533?10)96.25(20,21)87.53(200)80.41(200)96.37(100)95.62(21,200)96.14(27,200)96.40(21,200)96.25(26,200) Cleveland(150?13,147?13)78.91(27,24)64.76(200)65.92(200)72.79(100)80.48(2à2,200)78.52(2à1,200)81.36(2à2,200)82.52(2à3,200) Australian Credit

(150?14,540?14)

84.26(23,2à1)66.50(200)63.83(200)80.19(100)85.33(2à1,200)85.31(21,200)86.19(2à3,200)86.30(2à4,200)

Ionosphere(150?34,201?34)91.54(20,20)74.78(200)79.2(200)87.06(100)90.35(24,200)86.19(26,200)89.05(23,200)90.9(24,200) Liver-disorders(200?6,145?6)64.14(216,21)58.41(200)65.45(200)71.03(100)68.69(2à1,200)71.41(21,200)68.62(21,200)67.66(212,200) Votes(200?16,235?16)94.04(22,22)82.98(200)85.45(200)94.04(100)95.32(24,200)94.21(21,200)94.43(2à1,200)94.43(26,200) Diabetes(500?8,268?8)75.00(26,23)62.69(200)74.55(200)76.17(100)68.71(2à2,200)68.11(210,200)68.13(21,200)72.05(215,200) Splice(1000?60,2175?60)89.66(26,23)80.21(200)83.66(200)76.37(100)78.24(29,200)82.11(23,200)80.25(215,200)84.34(25,200)

Table 3
Performance comparison of NLPELM with ELM having both sigmoid additive node and Gaussian and multiquadric RBF nodes, and SVM using the Gaussian kernel for multiclass classification. The test accuracy is shown for the optimal parameter values, with ℓ = 200 for NLPELM and ELM. The best result is shown in boldface.
Columns: Datasets (train size, test size); # Classes; SVM; ELM Sigmoid (ℓ); ELM Gaussian (ℓ); ELM Multiquadric (ℓ); NLPELM Sigmoid (ℓ); NLPELM Gaussian (ℓ); NLPELM Multiquadric (ℓ).
Iris (80×4, 70×4)  3  97.14  91.28 (200)  92.85 (200)  86.28 (200)  95.71 (200)  94.86 (200)  94.14 (200)

Wine(80?13,98?13)397.9692.22(200)87.75(200)87.74(200)93.16(200)92.55(200)95.20(200) Glass(100?9,114?9)683.8578.85(200)71.84(200)73.33(200)71.23(200)83.60(200)84.04(200) Vehicle(500?18,346?18)476.8879.71(200)77.45(200)79.85(200)80.29(200)78.5(200)80.63(200)

Page-blocks(1000?10,4473?10)595.5394.17(200)93.10(200)93.82(200)95.61(200)94.11(200)89.58(200) Segment(1000?19,1310?19)795.0494.22(200)92.29(200)94.80(200)95.19(200)92.78(200)94.16(200) Satimage(1000?36,5435?36)688.1085.76(200)82.94(200)83.95(200)87.17(200)85.27(200)86.61(200)


[20] G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Extreme learning machine: theory and applications, Neurocomputing 70 (2006) 489-501.
[21] K. Koh, S. Kim, S. Boyd, An interior-point method for large-scale l1-regularized logistic regression, J. Mach. Learn. Res. 8 (2007) 1519-1555.
[22] Y. Lan, C. Soh, G.-B. Huang, Two-stage extreme learning machine for regression, Neurocomputing 73 (2010) 3028-3038.
[23] O.L. Mangasarian, Exact 1-norm support vector machines via unconstrained convex differentiable minimization, J. Mach. Learn. Res. 7 (2006) 1517-1530.
[24] Y. Miche, A. Sorjamaa, P. Bas, O. Simula, C. Jutten, A. Lendasse, OP-ELM: optimally pruned extreme learning machine, IEEE Trans. Neural Networks 21 (2010) 158-162.
[25] Y. Miche, M. van Heeswijk, P. Bas, O. Simula, A. Lendasse, TROP-ELM: a double-regularized ELM using LARS and Tikhonov regularization, Neurocomputing 74 (2011) 2413-2421.
[26] Y. Miche, A. Sorjamaa, A. Lendasse, OP-ELM: theory, experiments and a toolbox, in: Artificial Neural Networks - ICANN 2008, Part I, LNCS 5163, 2008, pp. 145-154.
[27] S. Mukherjee, E. Osuna, F. Girosi, Nonlinear prediction of chaotic time series using support vector machines, in: NNSP'97: Neural Networks for Signal Processing VII: Proceedings of the IEEE Signal Processing Society Workshop, Amelia Island, FL, USA, 1997, pp. 511-520.
[28] K.R. Müller, A.J. Smola, G. Rätsch, B. Schölkopf, J. Kohlmorgen, Using support vector machines for time series prediction, in: B. Schölkopf, C.J.C. Burges, A.J. Smola (Eds.), Advances in Kernel Methods—Support Vector Learning, MIT Press, Cambridge, MA, 1999, pp. 243-254.
[29] P.M. Murphy, D.W. Aha, UCI Repository of Machine Learning Databases, University of California, Irvine, 1992. http://www.ics.uci.edu/~mlearn.
[30] C.R. Rao, S.K. Mitra, Generalized Inverse of Matrices and its Applications, John Wiley, New York, 1971.
[31] S. Shalev-Shwartz, A. Tewari, Stochastic methods for l1-regularized loss minimization, in: Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada, 2009, pp. 929-936.
[32] M. Schmidt, G. Fung, R. Rosales, Fast optimization methods for L1 regularization: a comparative study and two new approaches, in: Proceedings of the 18th European Conference on Machine Learning, Springer-Verlag, Berlin, Heidelberg, 2007, pp. 286-297.
[33] R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. 58 (1996) 267-288.
[34] V.N. Vapnik, The Nature of Statistical Learning Theory, 2nd ed., Springer, New York, 2000.
[35] L. Wang, M.D. Gordon, J. Zhu, Regularized least absolute deviations regression and an efficient algorithm for parameter tuning, in: Proceedings of the Sixth International Conference on Data Mining (ICDM'06), IEEE, 2006, pp. 690-700.
[36] T. Wu, K. Lange, Coordinate descent algorithms for lasso penalized regression, Ann. Appl. Statist. 2 (2008) 224-244.
[37] Y. Yuan, Y. Wang, F. Cao, Optimization approximation solution for regression problem based on extreme learning machine, Neurocomputing 74 (2011) 2475-2482.
[38] L. Zhang, W. Zhou, On the sparseness of 1-norm support vector machines, Neural Networks 23 (2010) 373-385.
[39] J. Zhu, T. Hastie, S. Rosset, R. Tibshirani, 1-norm support vector machines, Adv. Neural Inf. Process. Syst. 16 (2004) 49-56.
[40] Q.-Y. Zhu, A.K. Qin, P.N. Suganthan, G.-B. Huang, Evolutionary extreme learning machine, Pattern Recognit. 38 (2005) 1759-1763.

S. Balasundaram is a Professor at Jawaharlal Nehru University, India. He received his Ph.D. from the Indian Institute of Technology, New Delhi in 1983. From 1983 to 1985 he was a post-doctoral fellow at INRIA, Rocquencourt, France. He joined Jawaharlal Nehru University as an Assistant Professor in 1986. During 2003-2005 he was a visiting faculty member at Eastern Mediterranean University, North Cyprus. His main research includes support vector machine and extreme learning machine methods for classification and regression problems, fuzzy regression and applied optimization.

Deepak Gupta is a Ph.D. student at Jawaharlal Nehru University, India. He received his Master of Computer Applications and M.Tech. degrees from Jawaharlal Nehru University in 2009 and 2011 respectively. His research interests include support vector machines and other data mining techniques.

Kapil is an Assistant Professor at Birla Institute of Technology and Science, Pilani, India. He received his Ph.D. in Computer Science from Jawaharlal Nehru University, New Delhi in 2012. His research interests include support vector machine and extreme learning machine methods along with other data mining techniques.



服务营销案例-星巴克的困境2.0

星巴克的困境 一、服务产品分析 (一) 星巴克的战略与差异化定位 1. 服务发展战略: 为人们提供“第三空间”的享受,倡导将生活与咖啡结合起来,将享受咖啡时的体验交错在人们的生活中; 2. 差异化定位: (二) 服务设计过程产生难点和解决 1. 服务设计过程中的难点: 服务不同于有型的产品,它具有四种主要的特性:不可储存性、无形性、同步性和可变性。 星巴克的战略与差异化 定位 服务设计过程产生难点和解决服务包的构成与服务的 传递

导致人们只能通过语言来描述服务,这就需要星巴克解决服务设计中的两个问题: (1). 过于简单,不全面; (2). 主观性较强,阐述具有偏见 2. 服务设计过程中难点的解决: (1). 服务过程中提供丰富多样的产品,如零售渠道销售咖啡产品、咖啡豆和咖啡粉 等,让用户从多个角度感受到了星巴克的品牌价值和服务主张; (2). 无形的服务通过有形的产品表现出来,如用户在进入星巴克之前,可能会尝试 星巴克的冰淇淋,让用户切实感受到星巴克的优质产品。 (三) 服务包的构成与服务的传递 1. 服务包的构成 (1). 核心服务——提供根本服务 为顾客提供最好的咖啡和周边产品,如罐装咖啡、咖啡粉、咖啡豆和冰淇淋等。 (2). 便利性服务——增加服务可获得性 咖啡粉、罐装咖啡等产品遍布在机场、餐厅和旅馆;(购买前) 星巴克服务场所多位于交通密集、可见度高的场所;(购买前) 整个咖啡购买过程当中,3分钟内的服务承诺;(购买中) 拟引进自主咖啡机,用户更方便地获取咖啡;(购买中) 会员卡制度,为顾客提供方便的;(购买中) 提供享用咖啡的场所,用户可以更方便地品尝咖啡(购买后) (3). 支持性服务——增加服务的互动性 合作者被鼓励和顾客交谈,为顾客提供意想不到的惊喜; 提供点心、苏打水和果汁和咖啡相关的器具; 无线高速上网; 与其他顾客进行互动,周围的环境让顾客愿意留下来 (4). 顾客参与 核心服务 便利性 服务 辅助性 服务 顾客参与

企业eLearning在线学习建设方案

企业e-Learning在线学习建设方案 企业培训是企业提升员工技能、提高员工个人能力素质的有效手段,企业对于培训需求的确定,是企业培训管理者根据企业自身的发展、企业员工岗位的任务要求以及员工的个人能力素质目标来确定的。 新形式下,大多数大型企业为了迅速建设自己的高效团队,纷纷建设企业大学,以培养高素质人员,为了节省成本,都青睐于引入e-Learning机制,为员工提供方便灵活的学习方式。 企业引入e-Learning后,培训效果将取得较大的提高,您的企业将会: 1.充分利用企业内部培训资源,创造反复使用的学习内容 2.降低培训成本,大大节省面对面培训的花费,如教师费用、场地费用、差旅费用 3.降低员工离职所带来的损失 4.提高培训效率,与面授培训相结合,可以最大化的提高培训效率. 经过大量的企业应用后,根据调查结果显示,实施e-Learning后的直接效果: ☉面授培训时间缩减了↓40% ☉差旅的费用下降了↓50% ☉培训的完成率增加了↑2倍 ☉员工培训总数增加了↑25% ☉学习曲线加快↑60% ☉内容保持力提高↑25-60% ☉学习收获增加↑56% ☉连贯性增强↑50-60% ☉培训过程压缩↓70%或更高 ☉内容传递的准确减少了教学过程中偏差↓26% XR-e-Learning企业版是专门为企业实施网络学习(e-Learning)而打造的网络培训和知识管理平台,它是针对合作企业内部人才培养的特点,专门为合作企业单独设计的。包含了实现企业内部知识管理体系的功能,同时可以与企业的其他系统如OA、绩效等管理平台及数据库无缝联接起来,是将传统教育形式(面授、阅读等)与在线网络学习有机结合在一起的一套教育综合管理系统。平台支援企业所独有的:技能管理、学习管理、内容管理、知识管理、教室管理及工作流管理等。尤其是其操作界面打破了传统的表格式管理界面,具有独创的

星巴克环境分析报告

小组成员:李丹胡燕徐钰容韩乐 星巴克环境分析报告 目前国内咖啡消费60%为速溶咖啡,传统冲泡方式的咖啡消费仅占30%。据统计,中国速溶咖啡是40%的年增率,传统咖啡是30%的年增长率,可见,未来中国的消费增长空间极大,中国的咖啡消费市场是一块可口的蛋糕。星巴克看好中国市场的巨大潜力,致力于在不久的将来使中国成为星巴克在美国之外最大的国际市场。自1998年3月在台湾开出第一家店和1999年1月在北京开出大陆第一家店以来,星巴克已在中国大陆、香港、台湾和澳门开设了近 500多家门店,其中约230家在大陆地区。 (一)宏观营销环境: 1, 对于咖啡来说,文化环境最重要。在中国,历来被国民所接受的是茶叶。而茶叶和咖啡,有些水火不容的味道。让习惯喝茶的中国人来普遍地喝咖啡还有很长的路要走。星巴克在以绿茶为主要饮料的国家的初步成功,可以说明其理念能被不同文化背景所接受。 2, 消费者支付能力提高。近年来,中国经济飞速发展,国民生活水平显著提高,消费水平也在与日俱增,为星巴克在中国扩大市场提供了条件.。 3, 中国人口总量巨大,因而营销的市场广阔。同时咖啡没有特别的年龄阶段以及性别的限制,主要是针对职业结构和受教育程度等结构特点,开展营销活动。 4,地理分布对咖啡来说很重要,但其重要逐渐减小。地理分布决定了自然条件,也就在一定程度上决定了人们的生活习惯以及地区的经济。随着经济的发展和咖啡文化的普及,此影响减弱,但也是不可忽视的一方面。一般来说,人口密度大,顾客越集中,营销成本就越底。因此,开在繁华的街道,对星巴克来说,是非常有益的。 (二)微观营销环境: 1, 对于企业内部,星巴克提供非常全面且多样化的职业发展机会。星巴克所有的职位都将提供不同的职业发展机会–以及极佳的培训和福利条件。坚持员工第一原则,对员工大量投资。建立了完善的薪酬福利制度,增加员工福利,同时,对员工进行非常培训,即栽培和辅导训练。使员工对企业的满意度高,流失率少,能够为顾客提供一流的服务水平。而且,星巴克认为,在服务业,最重要的营销管道式分店本身,而不是广告,通过员工的完美服务,使星巴克赢得信任和口碑。

连锁咖啡厅顾客满意度、涉入程度对忠诚度影响之研究—以高雄市星巴克为例

连锁咖啡厅顾客满意度对顾客忠诚度之影响-以高雄市星 巴克为例 The Effects of Customer Satisfaction on Customer Loyalty—An Empirical study of Starbucks Coffee Stores in Kaohsiung City 吴明峻 Ming-Chun Wu 高雄应用科技大学观光管理系四观二甲 学号:1093136106 中文摘要 本研究主要在探讨连锁咖啡厅顾客满意度对顾客忠诚度的影响。 本研究以高雄市为研究地区,并选择8间星巴克连锁咖啡厅的顾客作为研究对象,问卷至2006年1月底回收完毕。 本研究将顾客满意度分为五类,分别是咖啡、餐点、服务、咖啡厅内的设施与气氛和企业形象与整体价值感;将顾客忠诚度分为三类,分别是顾客再购意愿、向他人推荐意愿和价格容忍度,并使用李克特量表来进行测量。 根据过往研究预期得知以下研究结果: 1.人口统计变项与消费型态变项有关。 2.人口统计变项与消费型态变项跟顾客满意度有关。 3.人口统计变项与消费型态变项跟顾客忠诚度有关。 4.顾客满意度对顾客忠诚度相互影响。 关键词:连锁、咖啡厅、顾客满意度、顾客忠诚度 E-mail:hisoka47@https://www.wendangku.net/doc/293884506.html,

一、绪论 1.1研究动机 近年来,国内咖啡消费人口迅速增加,国外知名咖啡连锁品牌相继进入台湾,全都是因为看好国内咖啡消费市场。 在国内较知名的连锁咖啡厅像是星巴克、西雅图极品等。 本研究针对连锁咖啡厅之顾客满意度与顾客忠诚度间关系加以探讨。 1.2研究目的 本研究所要探讨的是顾客满意度对顾客忠诚度的影响,以国内知名的连锁咖啡厅星巴克之顾客为研究对象。 本研究有下列五项研究目的: 1.以星巴克为例,探讨连锁咖啡厅的顾客满意度对顾客忠诚度之影响。 2.以星巴克为例,探讨顾客满意度与顾客忠诚度之间的关系。 3.探讨人口统计变项与消费型态变项是否有相关。 4.探讨人口统计变项与消费型态变项跟顾客满意度是否有相关。 5.探讨人口统计变项与消费型态变项跟顾客忠诚度是否有相关。 二、文献回顾 2.1连锁咖啡厅经营风格分类 根据行政院(1996)所颁布的「中华民国行业标准分类」,咖啡厅是属于九大行业中的商业类之饮食业。而国内咖啡厅由于创业背景、风格以及产品组合等方面有其独特的特质,使得经营型态与风格呈现多元化的风貌。 依照中华民国连锁店协会(1999)对咖啡产业调查指出,台湾目前的咖啡厅可分成以下几类: 2.1.1欧式咖啡

星巴克客户关系管理现状分析及解决方案

星巴克咖啡客户关系管理现状分析及解决方案 内容摘要:通过对星巴克的客户特征、客户满意、客户忠诚、客户保持及客户关系管理现状的分析,设计其客户满意体系、客户保持方案及CRM系统方案,全面了解关于客户关系管理方面的问题。 目录 星巴克咖啡客户关系管理现状分析及解决方案 (1) 1企业基本背景 (2) (1)星巴克概况 (2) (2)星巴克品牌 (4) (3)发展大事记 (5) 2客户特征分析 (5) (1)“星巴克”名字由来和定位 (5) (2)调查研究 (6) A年龄 (6) B教育程度 (6) C职业 (6) 3客户满意、客户忠诚现状分析 (6) (1)企业因素 (7) (2)产品因素 (8) (3)营销与服务体系 (9) (4)沟通因素 (11) (5)客户体验 (11) (6)小结 (12) 4客户保持现状分析 (12) (1)为顾客提供免费商品(购买会员资格后才可享受) (13) (2)客户体验 (13) (3)服务创新 (14) (4)渠道创新 (14) (5)消费教育 (15) (6)口碑营销 (15) 5客户关系管理中存在的问题分析 (15) (1)品牌的迷失。 (15) A 经济下滑,购买力下降。 (15) B扩张无度,加速品牌平淡化。 (15) C 品牌泛化,无异于品牌自宫。 (16)

(2)服务质量下降。 (16) (3)解决方案 (16) 6客户价值识别 (16) (1)客户价值定位 (16) a.企业为客户创造或提供的价值。 (16) b.客户为企业创造的价值。 (16) (2)客户价值的定义 (17) 7客户满意度评价指标体系设计 (17) 8客户保持方案设计 (18) 9CRM系统方案设计(系统功能、及子系统功能) (19) (1)CRM系统定义 (19) (2)CRM系统功能 (19) A接触活动 (19) B业务功能 (19) C数据仓库功能 (19) (3)CRM各子系统功能 (20) 10小结 (22) (1)星巴克成功原因 (22) (2)自己的收获 (22) 1企业基本背景 (1)星巴克概况 星巴克(英文:Starbucks) NASDAQ:SBUX 港交所:4337 星巴克(Starbucks)是美国一家连锁咖啡公司的名称,1971年成立,为全球最大的咖啡连锁店,是世界领先的特种咖啡的零售商,烘焙者和星巴克品牌拥有者。旗下零售产品包括30多款全球顶级的咖啡豆、手工制作的浓缩咖啡和多款咖啡冷热饮料、新鲜美味的各式糕点食品以及丰富多样的咖啡机、咖啡杯等商品。其总部坐落美国华盛顿州西雅图市。此外,公司通过与合资伙伴生产和销售瓶装星冰乐咖啡饮料、冰摇双份浓缩咖啡和冰淇淋,通过营销和分销协议在零售店以外的便利场所生产和销售星巴克咖啡和奶油利口酒,并不断拓展泰舒茶、星巴克音乐光盘等新的产品和品牌。除咖啡外,星巴克亦有茶、馅皮饼及蛋糕等商品。星巴克在全球范围内已经有近12,000间分店遍布北美、南美洲、欧洲、中东及太平洋区。

星巴克服务质量差距模型分析

星巴克服务质量差距模型分析 一、服务质量差距模型 星巴克作为全球最大的咖啡连锁店,其提供的服务一直被作为研究对象进行研究,但是通过查找文献发现,对其基于服务质量差距模型一直空白,而其又拥有较高的研究价值,所以通过对成都地区星巴克部分门店的走访调查以及总结归纳,对星巴克服务质量差距模型进行了简要分析。 二、服务蓝图 服务蓝图是一种有效描述服务传递过程的可视技术。通过服务蓝图技术,将星巴克门店提供给一般消费者服务的全过程记录下来,以便于更好的观察该服务组织的服务质量差距产生的位置。 三、发现差距 通过对成都星巴克总府路王府井店进行观察,并随机对顾客进行了询问,总结出了几条该店在服务上与顾客期望感知之间的差距。并对这些差距对照服务质量差距模型进行了分类汇总

星巴克服务上与顾客期望感知之间的差距分类汇总表 四、差距描述与原因诊断 (一)消费者期望与管理认知之间的差距差距描述: 顾客和星巴克的管理者直接对于门店合适的面积的观点似乎并不一致。顾客期望能够得到一个宽松的,相对不太拥挤的环境。然而,星巴克总府路王府井门店的面积和顾客期望之间是有一定的差距的。 经过调查发现,春熙路商圈内的星巴克门店不少于五家。在与考察的门店相隔一个天桥的对面便有两家星巴克,然而,通过询问前来消费的顾客得知,他们对其他门店的存在并不太清楚。 原因: 1.营销研究导向不充分——类似于星巴克这种服务提供组织,店铺需要铺设在高地租的商圈内,由于客流量大,经常需要在同一个商圈中开设几家门店。可以采取的措施是在店内标示出附近星巴克门店的具体位置,以提醒消费者周围并不是只有这里一家。 2.缺乏向上沟通——在一般的服务传递中,员工没有及时向上级传递这一问题。

复旦大学eLearning平台用户手册

复旦大学eLearning教学平台 用户手册 2010年9月

1. 前言 (3) 2. 教学平台模块简介 (3) 2.1 个人工作空间 (4) 2.1.1 主页 (4) 2.1.2 个人资料 (5) 2.1.3 所属站点 (7) 2.1.4 日程 (9) 2.1.5 资源 (9) 2.1.6 通知 (17) 2.1.7 站点设置 (17) 2.1.8 用户偏好 (20) 2.1.9 账户信息 (22) 2.1.10 帮助 (22) 2.2 教师使用手册 (23) 2.2.1 主页 (23) 2.2.2 课程大纲 (23) 2.2.3 日程 (27) 2.2.4 通知 (27) 2.2.5 资源 (30) 2.2.6 作业 (30) 2.2.7 练习与测验 (35) 2.2.8 成绩册 (45) 2.2.9 投递箱 (47) 2.2.10 网站内容 (48) 2.2.11 花名册 (49) 2.2.12 站点信息 (49) 2.2.13 讨论区 (50) 2.2.14 站内消息 (50) 2.2.15 调查工具 (51) 2.2.16 帮助 (53) 2.3 学生使用手册 (53) 2.3.1 主页 (53) 2.3.2 课程大纲 (53) 2.3.3 日程 (54) 2.3.4 通知 (54) 2.3.5 资源 (54) 2.3.6 作业 (55) 2.3.7 练习与测验 (57) 2.3.8 成绩册 (57) 2.3.9 投递箱 (58) 2.3.10 网站内容 (58) 2.3.11 花名册 (58) 2.3.12 站点信息 (59) 2.3.13 讨论区 (59)

对顾客满意度的现状研究论文

目录 1 绪论 (1) 1.1 国内外现状综述 (1) 1.2 论文研究内容及意义 (2) 2 江动集团的现状及面临的主要问题 (3) 2.1 江动集团的现状 (3) 2.2江动集团面临的的主要问题 (3) 2.3江动集团问题的解决措施 (4) 3 江动集团满意度测评流程及其内容构成 (5) 3.1 顾客满意度的内涵 (5) 3.2 江动集团顾客满意度测评的步骤 (7) 3.3 江动集团顾客满意度测评问卷的设计 (8) 3.3.1 测评问卷设计的思路 (8) 3.3.2 江动集团测评问卷的优化 (9) 3.4 江动集团顾客满意度评价问卷的调查 (14) 3.4.1 确定调查对象 (14) 3.4.2 实施调查的方法 (14) 3.5 江动集团满意度测评的报告 (14) 4 江动集团顾客满意度测评指标体系权重计算及结果分析 (15) 4.1 江动集团对顾客满意度的测评指标体系 (15) 4.1.1 满意度测评指标体系的原则 (15) 4.1.2 江动集团满意度测评指标体系的构成 (15) 4.1.3 测评指标的量化 (17) 4.2 指标体系权重计算及结果分析 (17) 4.2.1 顾客满意度测评指标体系权重的确定 (17) 4.2.2 指标体系权重的计算 (19) 4.2.3 指标体系权重的结果分析 (22) 5 江动集团顾客满意度指数计算及结果分析 (24) 5.1 江动集团顾客满意度指数的计算及结果分析 (24) 5.1.1 江动集团顾客满意度调查表调查结果分析 (24) 5.1.2 江动集团顾客满意度指数计算 (24) 5.2 确定改进对象及方法 (26) 6 总结和展望 (30) 参考文献 (31) 致谢 (32)

南京医科大学elearning网络自主学习平台邀请招标书

南京医科大学网络自主学习平台邀请招标书 招标书编号: 公司: 因教案管理需要,拟采购以下网络软件系统和服务。现发出邀请招标文书,凡符合招标书要求的厂商均可按要求填写招标文件,参与投标。投标有关内容如下: 一、标的说明 网络自主学习平台软件系统,具体要求见附件。 二、招标单位要求 、投标单位必须有独立法人资格要求并响应本次招标技术规格要求。 、付款方式:安装调实验收合格后付,剩余在半年后付清。 、报价方式:人民币价格。 三、标书要求 、标书一式四份,正本一份、副本三份。 、标书内容: ()具有法人资格,国内注册专业生产、经营(投标设备名称)符合参加本次招标的投标人资格。 ()公司情况介绍、经营许可证(产品生产许可证)、税务登记证书、法人授权投标人证明、公司对本次项目承诺书等有效资质证明。 ()报价:人民币价格,金额单位“万元”,保留两位小数,单价和总价应包括安装、调试、售后服务等相关的所有费用。 ()系统详细的技术参数、售后服务承诺、价格;报价含设计、安装、调试费等全部费用,保价表中的配置要求为基本要求,投标的配置需高于或等于配置表中的要求,报价采用一次性报价。 ()服务(包括售后)承诺、执行进度和期限。 ()主要业绩情况说明。 、投标文件须密封并在封口加盖公章。 四、评标方式

招标人在对投标人的实力、信誉、价格、业绩及服务等综合因素进行综合评定的情况下确定中标人。对未中标人,招标小组将不作任何解释。 五、其他要求 投标人请仔细阅读招标文件,并按要求编制和提交投标文件。 、投标标书是合同不可分割的一部分,参与投标即视为对本邀标书要求的接受。 、请投标人务必仔细阅读本招标文件,严格遵照标书内容应标,投标书不符合邀标书的要求即视为废标。 、投标文件发布时间:20XX年1月9日上午~时(北京时间)。投标文件发布地点:南京医科大学教务处。 、投标截止时间::20XX年1月15日时整前(北京时间)。投标人应在规定的时间内将制作的投标文件一式四份密封加盖公章后送达南京医科大学教务处办公室。 联系人:南京医科大学教务处喻荣彬 联系电话: 特此邀请。 南京医科大学 20XX年1月9日

相关文档
相关文档 最新文档