
ELM (Extreme Learning Machine): Academic Talk by Guang-Bin Huang (Transcript)


>> Zhengyou Zhang: Okay. So it's my pleasure to introduce Professor Guang-Bin Huang from Nanyang Technological University (NTU), Singapore. He is going to talk about his long-standing research interest, the extreme learning machine.

He graduated from Northeastern University in China in applied mathematics and has been with NTU for quite a few years. And he's an associate editor of Neurocomputing and also IEEE Transactions on Systems, Man and Cybernetics.

And he's organizing our workshop. You can advertise your conference later. So please.

>> Guang-Bin Huang: Thanks, Dr. Zhang, for inviting me here. It's my honor to give a talk to introduce the extreme learning machine.

This is actually -- the idea was initially initiated in -- actually in 2003. That was the first time we submitted the papers. And then it has become recognized by more and more researchers recently. We just had a workshop in Australia last December, and we are going to have an international symposium on extreme learning machine this coming December in China. It's Dr. Zhang's hometown, or nearby his hometown. So we're hoping you can join.

Okay. So what is the extreme learning machine? Actually, this is a technical talk about this kind of learning method. Such a learning method is different from traditional learning methods: tuning is not required.

So if you wish to know the extreme learning machine, I think it's better to go back and review traditional feedforward neural networks, including support vector machines.

I assume many people consider support vector machines not to belong to neural networks. But actually, in my opinion, [inaudible] they have the same target and the same architecture in some aspects, in some sense.

So after we review feedforward neural networks, then we can go on to introduce what is called the extreme learning machine. Actually, the extreme learning machine is very, very simple. Okay. During my talk I also wish to give a comparison between ELM and least-square SVM. Finally, I wish to show the linkage between ELM and the traditional SVM, so we know what the difference is and what the relationship is. Okay.

So feedforward neural networks have several types of frameworks, architectures. One of the popular ones is the multi-layer feedforward neural network. But then, in theory and also in applications, people found a single hidden layer is enough for us to handle all applications in theory. That means, given any application, we can design a single-hidden layer feedforward network.

This single-hidden layer feedforward network can be used to approximate any continuous target function, and it can be used to classify any disjoint regions.

Okay. So for single-hidden layer feedforward networks, usually we have two popular types of architectures. The first one is the so-called sigmoid type of feedforward network. That means the hidden layer here uses sigmoid-type hidden nodes. Sometimes I call them additive hidden nodes; that means the input to each hidden node is a weighted sum of the network input.

Okay. So of course for g here, usually people use a sigmoid type of function. But you can also write the output of each hidden node as an uppercase G of a_i, b_i, and x, where x is the input of the network and (a_i, b_i) are the hidden parameters -- say, for hidden node i, (a_i, b_i) are the parameters of node i.

All right. So this is the sigmoid type of feedforward network. Of course another one that is very, very popular is the RBF network. In the RBF network, the hidden node output function is an RBF function.

So if we rewrite it in this compact format, we actually have the same so-called output function for the single-hidden layer feedforward network as for the sigmoid-type network. So here we always use the uppercase G.
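To make the unified notation concrete, here is a brief sketch in the notation used in the ELM papers (the exact slide formulas are not in the transcript):

```latex
% Additive (sigmoid-type) hidden node: a_i = input weight vector, b_i = bias
G(\mathbf{a}_i, b_i, \mathbf{x}) = g(\mathbf{a}_i \cdot \mathbf{x} + b_i)
% RBF-type hidden node: a_i = center, b_i = impact factor
G(\mathbf{a}_i, b_i, \mathbf{x}) = g\!\left(b_i \,\lVert \mathbf{x} - \mathbf{a}_i \rVert\right)
```

In both cases the hidden-node output is written with the same uppercase G, which is what lets the two families be treated uniformly later in the talk.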

These two types of networks have been very popular in the past two to three decades. Two groups of researchers worked on these two areas, and they considered them separate, and they used different learning methods for these two types of networks.

But generally, for both types of networks, we have this so-called theorem: given any continuous target function f(x), the output function of this single hidden layer can be as close as desired to this continuous target function, for any given error.

And definitely, in theory, we can find such a network; that is, the output of this network can be as close as desired to this target function f(x).

Of course, in real applications, we do not know the target function f(x). We only have sampling; we can only obtain discrete samples, the training samples, and we wish to learn from these training samples. So we wish to adjust the parameters of the hidden layer and also the weights between the hidden layer and the output layer, and try to find some algorithm to learn these parameters so that the output of the network approximates the target function.

Okay. From the learning point of view, we are given, say, N training samples (x_i, t_i), and we wish the output of the network with respect to the input x_j to equal the target output t_j. Of course, in most cases, the output is not exactly the same as the target, so there is some error. Suppose the output of the network is o_j; then we wish to minimize this cost function. In order to do this, many people actually spent time finding different methods on how to tune the hidden layer parameters a_i, b_i, and also the weights between the hidden layer and the output layer, that is beta_i, which I call the output weights.
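As a compact sketch of the learning problem just described (same notation; L denotes the number of hidden nodes):

```latex
f(\mathbf{x}) = \sum_{i=1}^{L} \beta_i\, G(\mathbf{a}_i, b_i, \mathbf{x}),
\qquad
E = \sum_{j=1}^{N} \lVert \mathbf{o}_j - \mathbf{t}_j \rVert^2,
\quad \mathbf{o}_j = f(\mathbf{x}_j),
```

and the traditional methods tune both the hidden-layer parameters (a_i, b_i) and the output weights beta_i to minimize E.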

All right. So that is the situation for the learning. So that is for approximation.

But how about classification? So in my theory published in year 2000 we say as long as this kind of hidden -- single-hidden layer feedforward network can approximate any target function, this network in theory can be used for us to classify any disjoint regions. So this is a classification case. Yeah.

>>: So there's a very well-known result in neural networks that you probably know which says that in order for the [inaudible] to be valid, the number of units has to be very, very large, and that if you go to infinity you get [inaudible] what's called a Gaussian process.

>> Guang-Bin Huang: Yeah, you're right.

>>: So that gives [inaudible] indication of processing [inaudible].

>> Guang-Bin Huang: Ah. Okay. That is actually a very useful theory. That one was actually used for our further development. That is why we come to the extreme learning machine. That is a guide.

But an infinite number of hidden nodes is usually not required in real applications. In theory, in order to approximate any target function, with the error epsilon reaching zero, in that sense an infinite number of hidden nodes would need to be given. But in real applications we do not need that. I will mention this later.

But that theory is very, very important. Okay. Without -- um-hmm.

>>: Also, recently many people observed that if you have many layers, actually you can [inaudible] single-layer system, even if you have a very large number of hidden units in a single-layer system.

>> Guang-Bin Huang: You're right. So we have another paper -- actually, I didn't mention it here -- where we prove, instead of, say, three hidden layers, just two hidden layers compared with a one-hidden-layer architecture: a two-hidden-layer network usually needs a much smaller number of hidden nodes than a single-hidden-layer one.

>>: [inaudible].

>> Guang-Bin Huang: Yeah. I proved it theoretically and also showed it in simulations. So that means that, from the learning capability point of view, the multi-hidden-layer network looks more powerful. But that is actually not shown here. Okay. That we can discuss later.

Okay. So then, to learn these kinds of networks, in the past two decades most people used gradient-based learning methods. One of the most popular ones is back-propagation, and of course its variants. So many variants: people tweak some parameter and they generate another learning method. Okay.

Another method is used for the RBF network, the least-square method. But this least-square method is somewhat different from ELM, which I will introduce later.

So a single impact factor is used in all hidden nodes. That means all hidden nodes use the same impact factor [inaudible]. Okay. Sometimes it is called sigma. Right?

Okay. So what are the drawbacks of those gradient-based methods? Usually we feel it is very difficult in research: different groups of researchers work on different networks. Actually, intuitively speaking, they have similar architectures, but usually RBF network researchers work on RBF, and feedforward network people work on feedforward networks. They consider them different, so we actually sometimes waste resources.

And also, in all these networks, for users -- actually [inaudible] to users -- there are sometimes so many parameters for the user to tune manually, right? It's case by case. Okay. So it's sometimes inconvenient for non-expert users. Okay.

We also usually face overfitting issues. This is why, if too many hidden nodes are used in the hidden layer, we are afraid of an overfitting problem. Right?

And also there is the local minimum issue. Right? You cannot really get the optimal solution; you usually get a local minimum solution, which is better for that local area but not for the entire application.

Of course it is time-consuming, not only in learning [inaudible] but also in human effort. A human has to spend time finding the so-called proper parameters, the user-specified parameters.

So we wish to overcome all these limitations and constraints of the original learning methods.

Now, let's look at the support vector machine. Is there any relationship between the support vector machine and the traditional feedforward network? Of course, when SVM people talk about SVM, they never talk about neural networks. They say they're separate. So this is why there is a story.

So when I joined [inaudible] in 2004 -- before 2004, SVM papers seldom appeared in neural network conferences. Then in 2004, at the organizers' committee meeting, they asked: why did so many SVM papers come to a neural network conference this year? Okay. So people considered these different.

But what I found is that actually they are very close to each other. Generally speaking, they have the same network architecture. Let's look at the SVM.

So for SVM, of course, let's talk a little bit about the optimization objective. SVM minimizes this formula, this objective function: minimize the norm of the output weights plus the training error, subject to [inaudible] conditions. Right?

But looking at the final solution, the decision function of the SVM is this. But what is this? K here is the kernel of the SVM, and [inaudible] is a parameter we want to find. Okay. Looking at this formula, this is actually exactly a single-hidden layer feedforward network. What is the single hidden layer? It is formed by the hidden nodes with these kernels: K(x, x_1), K(x, x_2), ..., K(x, x_N). This is the hidden layer. Right? The hidden layer with this kernel.

So what are the output weights, then? The output weights are alpha_1 t_1, ..., alpha_i t_i, ..., alpha_N t_N. Those are the output weights; that is beta [inaudible] in the feedforward network. Okay. Yeah. Please.

>>: This is [inaudible].

>> Guang-Bin Huang: Okay. Yeah. This is not the objective of the SVM. But finally it turns out to be this formula. So I am speaking from the architecture point of view: finally they have a similar architecture.
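As a sketch of the correspondence being drawn here (standard SVM decision function; reading it as a single-hidden-layer network is the speaker's interpretation):

```latex
f(\mathbf{x}) = \operatorname{sign}\!\Big( \sum_{i=1}^{N} \alpha_i\, t_i\, K(\mathbf{x}, \mathbf{x}_i) + b \Big)
```

The hidden layer is formed by the N nodes K(x, x_1), ..., K(x, x_N), and the output weights are alpha_1 t_1, ..., alpha_N t_N.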

>>: [inaudible] was called new ways to train neural networks. So that was a connection from the beginning [inaudible].

>> Guang-Bin Huang: Yeah. Yeah. So actually -- yeah. You are talking about the [inaudible] paper published in 1995. So for Vapnik, what was the so-called inspiration for [inaudible]? Vapnik said we actually have to go back to the original neural network.

Okay. So actually in 1962 [inaudible] studied the feedforward network. In those days, people had no idea how to tune the hidden layer. So then [inaudible] said, how about this: consider the last hidden layer. The last hidden layer gives the output mapping function; look at the output of the mapping function.

So then, can we find some way to just define the output function of the last hidden layer? That is the feature mapping function. Then you can [inaudible]. So then Vapnik said, okay, the output function of the last hidden layer can be considered as phi(x_i). But what is phi(x_i)? I have no idea. So then Vapnik's paper said, how about letting phi(x) [inaudible].

So we have this kind of constraint, because this is very important for classification.

Under this constraint they finally get this. So although the hidden layer output mapping function, that is, the SVM feature mapping function phi(x), is [inaudible], we can find the corresponding kernel. So that's how it comes to this stage.

But I should -- I'm [inaudible]. I'm talking from a structural point of view: if we go to this stage, it finally turns out to be the same format. Of course, from here the question is how to find alpha_i t_i. How to find alpha_i t_i is how to find these output [inaudible]. So you can consider the single-hidden layer feedforward network.

The feedforward network with a hidden layer also tries to find these parameters. So SVM and the traditional BP network, in this sense, are just different ways to find the parameters of the single-hidden layer feedforward network.

SVM finds them the SVM way, BP finds them the BP way, and RBF finds them the RBF way. This is why different people find the parameters in different ways. So then the question is: can we unify them? Okay.

So this is one of my so-called research works. Okay. Actually, in order to show the linkage between the SVM and the single-hidden layer feedforward network, I think people had better read these two papers, okay, because these two papers also gave me some ideas, inspired my ideas. They built the linkage between ELM and SVM first. So then I found this linkage also.

Okay. So now let's go on to talk about the extreme learning machine. Of course, the extreme learning machine originally started from neural networks. We come from neural networks. In those days we said, okay, we started from the BP network.

So we found BP is inefficient. But what is the sense, what is the original expectation for neural networks? We try to emulate human thinking. A human can think about something and find a solution very quickly. So the question is: do we have tuning in the human brain? Usually, to me, there is no tuning there in most cases, right? We just find a solution intuitively, or we can learn very fast. Right? Either very fast, or without tuning.

But go back to the original learning methods of neural networks. Whatever method you see, there is always tuning there. Right? You always have to find parameters. Can we have some way to simplify the implementation and the computation involved in these methods?

So then we found -- first, from all this [inaudible], we see that all of them can be simplified into this single-hidden layer feedforward network.

So all have a hidden layer with parameters (a_i, b_i). But the hidden node here may not be a neuron. It can be a kernel; it can even be another subnetwork. But each hidden node has parameters (a_i, b_i). Those are the parameters.

[inaudible] the real formula of each output function of the hidden nodes, right, the details -- we just write them in this compact format. Okay. So then the output function of this network is f(x) = the sum over i of beta_i G(a_i, b_i, x). G(a_i, b_i, x) is actually the output function of the i-th hidden node. It can be an RBF kernel, it can be a so-called sigmoid-type node, or something else; it can even be a non-differentiable [inaudible] node.

Okay. So then you see this hidden layer, the entire hidden layer: what is the output of the entire hidden layer? Suppose the entire hidden layer has L hidden nodes. Then what is the output of the hidden layer? It is h(x), a vector of these L elements, the output functions of these L hidden nodes. Right? It's a vector.

h(x) here is the feature mapping, the so-called hidden layer output mapping. Right?

Okay. So in this case the question is: in the traditional methods, (a_1, b_1), ..., (a_L, b_L) have to be tuned. This is why we have to have the gradient-based methods or the SVM method. Because, you see, once the parameters of the hidden layer have been tuned, the result you obtained in the earlier stage in the output layer, the beta_i here, may not be optimal anymore. You have to adjust the output layer again. And if you adjust the output layer, beta_1 to beta_L, then the parameters in the hidden layer may not be optimal anymore, so you have to adjust them again. This is why you always have to adjust iteratively, right?

So then, luckily, in our theory we find that tuning the hidden layer is actually not required. All the parameters in the hidden layer can be randomly generated. This goes back to your question: this assumes an infinite number of hidden nodes can be used. Of course, we do not actually need this assumption.

Okay. Right? This also goes back to the SVM. SVM goes to the feature space, and supposes the dimension of the feature space is very large -- how large? It can go to infinity. Right? That is why they have the kernel concept there.

So you can say the number of hidden nodes in the hidden layer can be infinite, or as large as possible. Then we have this conclusion: tuning is not required, and all the parameters in the hidden layer can be randomly generated.

This is based on a number of assumptions. Suppose, given an output function G of the hidden layer, there exists a method which can be used, with tuning, to find the proper parameters a_i, b_i.

So that means [inaudible] we may have a tuning method: the BP method, the RBF method, or the SVM method, whatever method. As long as there exists such a tuning method which can be used to train our single-hidden layer feedforward network, then such tuning is not required. Such a method is not required. The hidden layer can instead be randomly generated. That is given in this paper. Okay.

And also another assumption is that G here should be nonlinear piecewise continuous. You can't just use any arbitrary function there. Then -- yeah. Please.

>>: Can you explain the difference between the AI and BI?

>> Guang-Bin Huang: Okay. So these [inaudible] a_i, b_i can sometimes be put together as just one parameter. Okay? But a_i, b_i here usually come from the traditional neural network. You see, in the RBF network you have a center and an impact factor. So for an RBF hidden node, a_i is the center and b_i is the impact factor.

If you have a sigmoid function, a_i is the input weight vector and b_i is the bias of the hidden node. So this [inaudible] comes from the traditional neural network; it's familiar to neural network researchers. Usually they have two parameters. So this is why I write a_i, b_i.

But if you go to the frequency [inaudible], usually you have the position and the frequency, right? The location and the frequency. So usually you have the two.

>>: What is it for SVMs?

>> Guang-Bin Huang: Oh. For SVMs, you talk about the kernel [inaudible]. For the kernel, a_i is the input, that is, the training sample, and b_i is the so-called kernel parameter, that is, sigma. But of course here we talk about random generation. So later on we can build the linkage between SVM and this method. Okay? But of course here we'll talk about the random case first.

Okay. So -- yeah.

>>: [inaudible] any application do we need to tune AI [inaudible]?

>> Guang-Bin Huang: I will address this issue later. Yeah. It is an important question. I will come back. Yeah.

Okay. So since all the hidden node parameters need not be tuned -- actually, "need not be tuned" in the sense that before you see the training data you can randomly generate the parameters first. Okay. Then, say, for this application you can use the already randomly generated nodes. That means the parameters can be generated independently of the training data of your application.

So since a_i, b_i -- all the hidden node parameters -- can be randomly generated, we go here. Suppose you have N training samples. Given any training sample x_j, we wish the output of this network [inaudible] to your target t_j. Right? The left-hand side is the output of the network; we wish this to equal your target. We have L hidden nodes, so here we have L [inaudible]. The only unknown parameters are the beta_i, because we have no idea what beta_i is. That is what we want to find.

The hidden layer can be randomly generated, and t_j is your target. We have N training samples, so from here we know we have N equations with L unknown parameters, right?

So then these N equations can be written in the compact format H beta = T. H is what we call the hidden layer output matrix. Right? It's also the so-called hidden layer mapping, the feature mapping. Okay? We have N training data, and each row represents the output of the hidden layer with respect to one training input x_i.

Okay. T is already given; we want to find this parameter. So for H beta = T, H is already known and T is already known, so we can find the parameter beta easily. Three steps, then.

So the first step: randomly assign the hidden node parameters a_i, b_i. Then calculate the hidden layer output matrix H. All right. Then just apply the pseudo-inverse to get beta. That's all.

So these are three steps. Okay. So the source code can be found here.
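Since the source code is only referenced on the slide, here is a minimal sketch of the three steps in Python (assumptions: sigmoid additive hidden nodes and least-squares output weights via the pseudo-inverse; the function and variable names are illustrative, not the released implementation):

```python
import numpy as np

def elm_train(X, T, L, seed=0):
    """X: (N, d) inputs; T: (N, m) targets; L: number of hidden nodes."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[1], L))    # step 1: random input weights a_i
    b = rng.standard_normal(L)                  # step 1: random biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))      # step 2: hidden-layer output matrix H
    beta = np.linalg.pinv(H) @ T                # step 3: output weights beta = H^+ T
    return A, b, beta

def elm_predict(X, A, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    return H @ beta
```

For classification, T would typically hold one-hot targets and the predicted class is the index of the largest output.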

For this, we find that simple math is enough in many cases. Of course, for this single-hidden layer feedforward network, the hidden node need not be of sigmoid type or RBF type; it can be something else -- [inaudible] hidden nodes, because previously [inaudible] could not be trained directly. So with this method, [inaudible] can now be trained.

But that is the basic idea. So now what relationship -- yeah.

>>: [inaudible] generalized linear models, where you take the input data, you map it to some nonlinear thing, and then you take a linear combination, which looks exactly like this. So how is it related? How is [inaudible]?

>> Guang-Bin Huang: What is a nonlinear --

>>: Generalized linear [inaudible].

>> Guang-Bin Huang: Uh-huh.

>>: Where you basically map the data to a nonlinear form and then take the linear combination [inaudible]. So it looks very similar to this [inaudible] what the difference is.

>> Guang-Bin Huang: Yeah. So many methods actually look similar, but the key point here is that the hidden nodes are all randomly generated, unlike [inaudible] other methods. Those either have the user preset some parameters or tune them in some way, and then get the hidden layer output; then they use the least-square method -- even the original RBF method. So they have to do something before they get there; they do not just randomly generate -- yeah.

>>: When you say that taking a random hidden layer is enough, I don't understand what enough means. I mean, does it mean that you get uniformly good approximation across the input space? I mean, this can't be a distribution-dependent thing. You can't get actual learning from this. This has to be assuming some type of uniform distribution on your input space.

>> Guang-Bin Huang: Okay. Simply speaking, here we say: given any target function, any continuous target function f(x), for such a single-hidden layer feedforward network -- which may not be a neural network, okay -- as long as the number of hidden nodes is large enough, the output of this single-hidden-layer network can be as close as desired to your target function. So, given any target function, any continuous target function.

So we can make the output of the network as close as desired to your target. This is from the learning point of view: in theory we wish to have this universal approximation property.

>>: I think what you want is to get -- to learn the specific distribution from that end, so --

>> Guang-Bin Huang: Oh.

>>: This is some kind of extreme L infinity type of -- or, I mean, you're talking here about a uniform convergence, right, across the entire space, whereas we may actually care about some little part of the space which is governed by the distribution [inaudible]. I mean, that's what [inaudible].

>> Guang-Bin Huang: You're right. So here we talk about the error over the entire [inaudible] space. Of course, if you talk about some so-called sub-[inaudible] space, that could be different. But this problem is actually common to many, many methods, even BP; they also talk about the general case. Even SVM talks about a whole space: given this region, we wish to learn and see how good we are in this entire region instead of at specific so-called locations, small areas. But that is a problem for general learning methods, because a learning method usually says: give us the space, give us this training data, any training data, and we wish to learn the overall performance on any training data.

Of course, sometimes we are also interested in some particular subspace. But that is a different situation then.

>>: I don't think you understood what I said. When we get a training set, you care about the distribution from which that data set came. You don't care about [inaudible] especially has an analysis, so the only learning and generalization [inaudible] SVM talks about is generalizing with respect to the distribution of the training data, not uniformly across the space.

>> Guang-Bin Huang: Okay. You're talking about distribution of the training data. In which sense?

>>: In which sense? You're assuming that your sample is [inaudible] from some distribution. That's the distribution with respect to which you measure risk and with respect to which you want to generalize. You don't really care about doing anything uniformly across the entire space. In essence [inaudible] some infinite-dimensional thing, and all you really care about is the subspace spanned by your data and the distribution over it [inaudible].

>> Guang-Bin Huang: Okay.

>>: Here this is -- this is kind of a -- you know --

>> Guang-Bin Huang: Okay. I will go back to your [inaudible], so why we find it can be so-called even -- so the least-square solution can then be simplified to this stage. Okay. I will come back to it later. Yeah.

So those are the three steps. But [inaudible] let's talk about these three steps. Usually, after we get beta -- beta can be, say, calculated through the pseudo-inverse. You have different ways to calculate the pseudo-inverse; that is one of the [inaudible] methods. So we have the pseudo-inverse of H, which can be computed in this way or in this way. Right?

But from the ridge regression theory point of view, in order to improve the stability and the performance of the pseudo-inverse, we usually wish to add a so-called [inaudible] to the diagonal entries of this part. So then we have this here.

So after we add this term here -- because this is the usual way for us to handle the pseudo-inverse -- there is then a relationship with SVM. Okay.

After we add this, the issue becomes very interesting. If h(x) is known to us -- this is different from SVM; in SVM, h(x) is usually unknown -- then we have this. All right.

If h(x) is unknown, we can also use the same framework: we can use a kernel format. From the first formula, H times H transpose gives us this, the kernel format.

In this case, as long as K is known to us, even if the feature mapping function h(x) is unknown to us, we can get this formula. Okay. So this is another form of ELM, the kernel ELM we mentioned in this paper.
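Here is a sketch of the two solution forms being compared, following the regularized and kernel ELM solutions referred to above (C is the regularization constant; the Gaussian kernel and the parameter gamma are illustrative choices, not dictated by the talk):

```python
import numpy as np

def elm_ridge_beta(H, T, C):
    """When h(x) is known: beta = (I/C + H^T H)^(-1) H^T T."""
    L = H.shape[1]
    return np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ T)

def gaussian_kernel(A, B, gamma):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def kernel_elm_fit(X, T, C, gamma):
    """When h(x) is unknown: solve (I/C + Omega)^(-1) T with Omega_ij = K(x_i, x_j)."""
    Omega = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(np.eye(len(X)) / C + Omega, T)

def kernel_elm_predict(Xnew, X, alpha, gamma):
    """f(x) = [K(x, x_1), ..., K(x, x_N)] alpha."""
    return gaussian_kernel(Xnew, X, gamma) @ alpha
```

The kernel form never needs h(x) explicitly, which is the point of the comparison with least-square SVM that follows.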

So what happens then? You compare these two: this is the solution of ELM, and this is the solution of least-square SVM. For least-square SVM, you see, if you remove the first row and first column of the least-square SVM solution, this part is actually very similar to this part, right, to the ELM solution.

But then [inaudible], of course, here we have K times K and here it is Z times Z, so they have this one. Okay. So it roughly looks similar. But this is actually a simplified version of least-square SVM.

Why this simplified version? [inaudible] in this paper, because we found there may be some flaw in least-square SVM and the SVM series, which I will mention later.

But if you look at least-square SVM, we have the T there. [inaudible] here is actually a kernel matrix; Omega here is a kernel matrix. But in their kernel -- the kernel is related to the feature mapping; it should not be related to the target. In their kernel mapping, however, it actually is related to the target. This is a problem: your feature mapping should come from the input; you should not have feedback from the output. But in their solution they have t_i -- you have the target inside the kernel mapping matrix. Okay. So this is a problem there.
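To make the point concrete, here is a sketch of the two matrices being contrasted (standard LS-SVM classification notation, with t_i the class labels):

```latex
% LS-SVM (classification): the matrix entries carry the targets
\Omega^{\text{LS-SVM}}_{ij} = t_i\, t_j\, K(\mathbf{x}_i, \mathbf{x}_j)
% Kernel ELM: the matrix depends on the inputs only
\Omega^{\text{ELM}}_{ij} = K(\mathbf{x}_i, \mathbf{x}_j)
```

This is the sense in which the LS-SVM feature mapping gets feedback from the output, while the ELM mapping is determined by the input alone.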

But then, from here, we have simplified it by [inaudible] the first row and first column to make it simpler. I will come back to this later.

So this is why this one can actually be made this compact; it may have something which can be improved somewhere. Okay.

So now let's look at the so-called classification performance of ELM. Let's look at the [inaudible] problem. All right. This is the decision boundary found by ELM. And we look at another case, right; this is also the boundary from the so-called ELM method.

So now, going back to your other question: do we need an infinite number of hidden nodes? Usually not. We say, as long as the number of hidden nodes is large enough, then it is okay. In what sense large enough? You know, in all our simulations, even for this case, we only have four training data and 1,000 hidden nodes.

Even here, where you have, say, maybe 80,000 data, we use 1,000 hidden nodes. Even for other cases where we somehow have a very large amount of training data, we only needed 1,000 hidden nodes.

So that means usually 1,000 is enough; 1,000 to 2,000 should be fine for ELM usually. This is what we find -- yes?

>>: Is that an empirical observation [inaudible] theory that actually suggests that that's --

>> Guang-Bin Huang: Oh, no, we do not have a theory. This is just from experimental simulation. And it was found not only by us but also by some other teams from Europe and China. Yeah. So --

>>: [inaudible].

>> Guang-Bin Huang: Ah. Okay. For the moment, from what we found, it looks likely to be independent of the dimension. Later on I will show some cases: some are higher dimension, some are lower dimension, some you could even say are [inaudible]. Also okay. Yeah.

>>: Also, when you said 1,000, do you mean you need 1,000 to match the performance of a [inaudible]? What do you --

>> Guang-Bin Huang: We also compare with SVM, regular SVM later.

>>: But what I'm asking is what performance do you try to get? I mean, I've had two.

>> Guang-Bin Huang: No. We wish to compare with the traditional SVM under the same conditions. So if there is dimension reduction, then we compare both in the same way; if there is no dimension reduction, we also compare the performance without dimension reduction. We wish to see -- if [inaudible], we want to see the classification rate on the testing data. Yeah.

>>: [inaudible].

>> Guang-Bin Huang: Yeah.

>>: With a style of method. So for about a thousand hidden nodes [inaudible].

>> Guang-Bin Huang: Sorry?

>>: If I was [inaudible] comparing the styling methods [inaudible] actual error rate [inaudible] same error, you need a thousand hidden nodes with ELM.

>> Guang-Bin Huang: Yeah.

>>: So what's the [inaudible]? What's the largest problem you can apply this to?

>> Guang-Bin Huang: Okay. So for this problem -- because, you see, ELM's so-called scalability is actually very, very good, because the calculation is very, very simple. You look at it here. This is actually unified for [inaudible]-label classification also.

But this here, similar to SVM, is just for binary classification. So for [inaudible] very large cases, SVM and least-square SVM usually have so-called [inaudible] kinds of methods, for both SVM and least-square SVM.

So usually one has to find a supercomputer for very large cases. But we usually run it on a home computer or just a desktop in the lab.

So we [inaudible] -- the largest one we ran is however many data, with each so-called data instance having, say, 59 input [inaudible]. But for the rest, due to the [inaudible] computing, it is somehow very difficult; we have to wait -- somehow wait for months to get the result. This is a problem.

But for us, in just several minutes we get a result. So somehow we [inaudible] we also had to wait for all the rest.

>>: [inaudible] you reach probably up to however many --

>> Guang-Bin Huang: Yeah, the maximum. I will bet you -- I have one slide to show.

>>: One that you have [inaudible] variability when you try a different set of random parameters.

>> Guang-Bin Huang: For the moment we have a --

>>: [inaudible].

>> Guang-Bin Huang: Yeah, yeah, yeah, that's right. We usually have to -- in order to compare SVM and other methods, not only SVM but also BP -- for us, actually, you know, it is just the number of hidden nodes. That one usually can just be given and then run for all. But for the rest of the methods, you have to tune this method and tune that method. Even given one method, they have to spend a lot -- so it will cost a lot of time for them to compute.

>>: Procedure-wise [inaudible] when you actually use a random generator to get this AB, how many runs do you have to [inaudible]?

>> Guang-Bin Huang: Usually just one.

>>: But you get a different [inaudible].

>> Guang-Bin Huang: For the result here, we -- later I show we can generate many times. We see the average. We can see the average. Yeah.

>>: I see. Do they differ a lot?

>> Guang-Bin Huang: I will show you that. There is some [inaudible]. Very interesting result you will see.

Okay. Let's look at these several data sets. Actually, they are not very large, okay? So let's look at this case: say the number of training data for letter recognition is 13,000, and testing is 6,000, okay, and then the number of input features and classes is here.

So this means the data can be -- the training data and the testing data are reshuffled every time during our trials. "No" means no shuffling, following the simulations done by others: the training data are fixed and the testing data are fixed. Okay. Now, let's look at this case, the lower part first, where SVM, least-square SVM, and ELM all use the Gaussian kernel. With the Gaussian kernel there is no random generation. Look at the earlier solution here: I said if h(x) is really known to you, you can use these two formulas. If h(x) is unknown, similar to SVM -- but for us no tuning is needed, right -- we can use the same format here, the kernel format. This is different from least-square SVM. We can also use a kernel here, because usually here [inaudible] just use one parameter.

Of course, for any method, [inaudible] you definitely have to do something for this one parameter. Right?

So if we talk about the Gaussian kernel methods, you see over here: this is the testing rate for SVM. Okay. This is the training time. This is the least-square method. And you see the ELM.

For all the methods, for all the data sets, ELM achieved better generalization, better testing accuracy. Okay, you see here, it is all better here.

You see the time, with the same Gaussian kernel. The time is [inaudible] all shorter than for SVM and least-square SVM, although least-square SVM is much faster than SVM. Okay. So this is using the kernel format. Right?

And you see the standard deviation here is also zero, because the training data are fixed. Right? The training data are fixed, right? So these three methods have almost the same standard deviation.

>>: [inaudible] Gaussian kernel likely choosing the [inaudible]?

>> Guang-Bin Huang: Yeah. No, this is not. Here we intentionally want to compare SVM, least-square SVM, and ELM when they use the same kernel, so that we compare them in the same environment. Of course, you have to choose the parameter gamma, the impact factor, for all these three methods. Right? So this is one comparison.

But for ELM here, we say the hidden layer output function is unknown to us, yet we can use it in the kernel format, similar to the SVM. Just kernel learning.

But if h(x) is known to us, as in the RBF network, the sigmoid network, the traditional neural networks, then we say we just use one method, say sigmoid hidden nodes.

Then all the parameters are randomly generated, and the number of hidden nodes is set to 1,000. Then what happens? Look here. It's very interesting. You see the classification rate here is also higher, almost better than all the rest.

And then, of course, because they're randomly generated, there is some standard deviation here. Even though the training data are fixed, you still have some kind of deviation, because every time you randomly generate, you get some kind of variation there.

But you see the deviation is not so high. All right. Yeah.

>>: [inaudible] styling methods gap.

>> Guang-Bin Huang: The number of support vectors for this -- actually, this one -- oh, it can be found in the paper. I cannot really remember. But usually --

>>: [inaudible] runtime. It's going to be slower if you have a thousand versus ten.

>> Guang-Bin Huang: [inaudible] so I will go back to that later. So you see the training time. The training time here is much, much shorter than for SVM and for least-square SVM, even shorter than ELM with the Gaussian kernel. Because when you go to the Gaussian kernel, usually we consider a feature space that goes to infinity -- I mean, in theory. But here --

>>: -- is comparing the sigmoid even though [inaudible] do you have general statement as to which one is better in terms of performance versus --

>> Guang-Bin Huang: Okay. In terms of classification -- in terms of stability, the Gaussian kernel is better, because the standard deviation is always zero here. Whenever the training data are fixed, the result is definitely deterministic.

But in terms of training time, the sigmoid, the so-called non-kernel-based version, is better. Because, unlike the kernel case, we need not give too many hidden nodes. Usually 1,000 is enough, and 1,000 being enough is very good for very large applications. If you go to a very large application, the scalability of SVM and least-square SVM would be very hectic, troublesome for us. But then ELM is better. Right? So this is --

>>: So I'm thinking what you are doing now there is to use the random [inaudible] at the top level.

>> Guang-Bin Huang: Yeah, yeah, yeah.

>>: So why is that feature a good one?

>> Guang-Bin Huang: I've already explained to you --

>>: It's not intuitive.

>> Guang-Bin Huang: Yeah, it is not. Even in the beginning, when I thought of this idea, I asked myself: can it work? I couldn't find any natural phenomenon at that stage. But now I have found the natural phenomenon, so I say this is a natural learning scheme. I will go back to it later. It's very interesting. I will go back to it later.

>>: By the way, so the way you use SVM [inaudible] automatically determined, right? [inaudible] will be considered [inaudible] hidden units in your ELM. So do you have idea what is a number of support vectors you get with the SVM?

>> Guang-Bin Huang: Okay. Usually for SVM, what I found is that it is somewhat on the order of the number of training data. For the [inaudible] regression method, usually half of the training data will be used as the support vectors. Go to this page.

We just ran a difficult case, the SinC function. Usually that's a very important benchmark problem to test SVM performance. Right? We have 5,000 training data and 5,000 testing data. To the training data we intentionally add some noise; the testing data are noise-free.

So if we go to SVM -- of course here we are training support vector regression, since it is a regression case -- we have the 5,000 training data here, and nearly half of them are used as support vectors. This is why, even in other cases, we usually say the number of support vectors is comparable to the number of training data. So if we talk about a very large dataset, the number of support vectors will be very huge. And this goes to the runtime issue that was raised by this gentleman, right?

So when you do testing, the testing time is very large for support vector machines and least-square SVM. But with a small number of hidden nodes, ELM can be very fast. Look at this case: [inaudible] fix them as 1,000; here I just chose 20. You see the testing error for this case: ELM is already smaller, better than SVM, than SVR. The training time here is actually 0.1 seconds, while you see SVR takes nearly 20 minutes. So this is -- yeah, please.

>>: If you use, say, a thousand on ELM in this case, would it maintain the same accuracy, or would it overtrain?

>> Guang-Bin Huang: Okay. These are different methods then, because we have different versions [inaudible] under development.

Here, for this case -- because it was probably in the year 2006, quite a long time ago -- we did not include the constant [inaudible] diagonal entry of this, of H, right? We did not include this I over C here. We just used the formula without I over C, this part.

For that version, without that diagonal entry, what you can tune is the number of hidden nodes.

So for the number of hidden nodes, if you go back to see this paper, it is different from the back-propagation method. The back-propagation method is very sensitive to the number of hidden nodes. All right. You can reach the peak performance quickly and then drop down quickly because of overfitting.

[inaudible] ELM usually has a very large advantage: the generalization performance is very stable over a very wide range [inaudible]. Of course, in that sense the number of hidden nodes required in ELM is higher than the number of hidden nodes in BP, because every hidden node in BP has been tuned carefully, but for us they are randomly generated.

But why is it so stable over a very wide range [inaudible]? Because every node is randomly generated. So if you add one more or one less, it usually matters little. In that sense, intuitively, most [inaudible]. This is why the [inaudible] original intention of ELM is so-called tolerance: we should have some kind of redundancy, we should have some kind of error tolerance, right? But for [inaudible] BP, that is not [inaudible] a feature. Yes, please.

>>: In that sense there's no real tuning parameters?

>> Guang-Bin Huang: Yeah. So in this case we just use [inaudible] selection. Say ten hidden nodes is too few, incapable of learning, and 1,000 is too many; then just say 500, and that usually works. When you run simulations, we do not have to carefully tune everything. Yeah. So this is why, even though in this case the number of hidden nodes has to be given, it can usually be given very easily, just using a [inaudible]. Okay. Of course, this is for the [inaudible] case; I won't repeat any more comparisons with BP, ELM, and the support vector machine, okay?

Now, look here at the [inaudible] parameters. BP has to tune the parameters here; of course [inaudible] has to tune the number of learning [inaudible]. Then SVR has [inaudible] parameters. ELM, the traditional one, has to be given the number of hidden nodes.

Of course, after we add the diagonal entry in the formula, the number of hidden nodes can be fixed, say at 1,000, without doing any tuning here.

>>: Now you need to tune C, right?

>> Guang-Bin Huang: You are right. But C --

>>: Yeah, it does. Definitely. That means you're either tuning the number of hidden nodes here -- but of course this is not sensitive --

or tuning C. But C is also not so sensitive compared with SVM. I mean, you have to do tuning, but it's not as sensitive as with SVM. Okay.

>>: So this slide seems to suggest you need to use a much larger number of hidden units than BP.

>> Guang-Bin Huang: You're right. As I just mentioned, the role of each hidden node in BP has to be calculated carefully by BP, whereas ours are just random. So sometimes there is random redundancy there. But redundancy is good; it helps us remain stable in some cases.

>>: So my question is for large problems, for huge problems, will this cause any problem?

>> Guang-Bin Huang: No. You will see here. Not too many even compared -- if you compare with SVR, not too many.

But as I just mentioned, after you add the diagonal entry, the number of hidden nodes can be fixed. Usually we just pick 1,000. Right? In almost all the test cases we have so far, just a [inaudible].

>>: [inaudible] so suppose you compare the same network [inaudible] the same number of parameters, all that, and when you do that you fix the earlier parameters by random numbers [inaudible], and then of course you are not going to get as low an error as if I optimize both. If I fix whatever you have at the top level and I have the freedom to train the other one, I get a better result in training. So [inaudible] about the training result [inaudible].

>> Guang-Bin Huang: Correct. Great question. This depends. It depends on which method we are using to train your network. Say, if we are using BP, then you're stuck in a local minimum. You cannot get to the global optimum.

>>: [inaudible] all I'm saying is that you try [inaudible] I think I would certainly do that: you use whatever method to get [inaudible] a good result, and then you fix [inaudible], and then I fix those and I do back-propagation, BP, to train my a, b, so that I won't be [inaudible] of what you randomly [inaudible]. I actually get [inaudible].

>> Guang-Bin Huang: No, BP will not get a better result in most cases. It may get a better result in some cases. What I'm saying is that BP won't get a better result in most cases in terms of accuracy, okay?

>>: No, I'm saying that given that I used the same parameter that you learned and had, I [inaudible] to adjust other parameters. I mean, have [inaudible] I will just use any [inaudible] I might get some low error. Is that going to give you --

>> Guang-Bin Huang: No, definitely --

>>: The issue will be generalization [inaudible].

>> Guang-Bin Huang: No. Definitely BP will get lower generalization performance. You ask why? Because BP has to tune all the hidden nodes -- while ELM hidden nodes are randomly generated, okay -- in most cases BP has to tune all the parameters. Say we have 1,000 randomly generated hidden nodes. For ELM it works. Then go to BP: overfitting. Because, say, you just have 100 data; if you use 1,000 hidden nodes, then this network definitely cannot learn these 100 data properly, because the number of hidden nodes is larger than the number of training data. It's impossible. Right?

The training error can go to zero, but then the testing error can go -- even go to infinity in some sense [inaudible]; there is overfitting there.

But ELM is randomly generated. So in which sense is ELM better than BP? Of course there are [inaudible] other reasons. One reason is here: when we generate the hidden nodes, we do not give preference to the training data. That means the hidden nodes can be generated without seeing the training data, so they treat the testing data and the training data by the same rule and give them the same priority [inaudible].

But in BP, the hidden nodes are tuned based on the training data. In other words, you give priority, give the advantage, to the training data and give a bias against the testing data. So in this sense BP is actually over-focused on the training data. Right? So the training and testing performance may not be balanced. In this sense BP is not good, okay, and it is similar for other methods. Okay?

Now, look at a larger case; this also appeared probably five or six years ago. So we have -- actually it's a [inaudible] 600,000 data, with 100,000 for training. Actually, later on we also tested for however many. Using these here, SVM spent 12 hours for one trial. That means, supposing we had already found the best parameters, the optimum user-specified parameters, then how many hidden nodes -- how many support vectors?

So here you have 100,000 training data, and about 1/3 of them are used as support vectors. This goes back to that gentleman's point. Right. That means at runtime the testing time will be very huge. So even after you have the trained network, given an application, [inaudible] SVM may respond very slowly compared to ELM. ELM needs only 200 hidden nodes for this case.

Then each training run takes just 1.6 minutes. All right? But look here: that is roughly 12 hours for one trial. And to find the appropriate parameters, you may have to run maybe 200 times to find the best parameters. So usually one would have to spend one month to get SVM completely done. But for ELM, what did I say, just several times, okay? So this one.

Now, go back to here: why is ELM better? The key point of ELM is that the hidden layer need not be tuned. Okay. It can be generated before we see the training data. This is the first key point.

The second key point is that the hidden layer actually should satisfy the universal approximation condition. As I just mentioned, even if h(x) is unknown, we assume h(x) satisfies this condition. If the hidden layer h(x) does not satisfy this condition, then for some given application you cannot approximate -- you cannot solve that application. Then what shall we use? Why would we need to use this method? Right? Okay.

So then we want to minimize these two. Actually, this is similar to SVM, but it comes from neural network theory, probably quite a long time ago. Okay. This was proved by a so-called -- actually, a UC Berkeley professor. The UC Berkeley professor said that if the training error is already small, we should try to minimize the norm of the weights of the whole network.

So then I combine these two: why not minimize the training error as well as minimize the norm of the weights of the whole network? Right? So this is why we come here. In order to address why ELM is better than SVM in most cases, we [inaudible] just compare with least-square SVM. From that we can kind of see the reasons. Okay. Let's go back to the original SVM. The original SVM actually talks about this kind of optimization target subject to [inaudible] conditions. Right?

If we go back to ELM, we also use the [inaudible] conditions. But we know h(x) satisfies the universal approximation condition. So in ELM we have this target function. I mean, this is analyzed from the optimization aspect.

So then we also have this constraint. Note here, this is different from SVM. SVM has a bias b there, right? But in ELM there is no b, because h(x) satisfies universal approximation: given any target, the separating boundary can be supposed to pass through the origin in the feature space.

So without b, we have this dual optimization problem for ELM. We only have this condition. That means [inaudible] to get a solution for ELM, we only [inaudible] the optimal parameters alpha_i in this cube, from 0 to C.
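As a sketch of the formulation being described (this follows the optimization-based ELM papers; the exact slide contents are not in the transcript):

```latex
\min_{\boldsymbol{\beta},\,\boldsymbol{\xi}}\ \ \tfrac{1}{2}\lVert\boldsymbol{\beta}\rVert^{2} + C\sum_{i=1}^{N}\xi_i
\quad \text{s.t.}\quad t_i\, h(\mathbf{x}_i)\boldsymbol{\beta} \ge 1-\xi_i,\ \ \xi_i \ge 0,\ \ i=1,\dots,N.
```

Because there is no bias b, the dual has only the box constraints 0 <= alpha_i <= C, without the SVM equality constraint sum_i alpha_i t_i = 0.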



An introduction to the extreme learning machine

1 The extreme learning machine

Traditional feedforward neural networks adjust their weight parameters with gradient-descent-based iterative algorithms, which has clear drawbacks: 1) learning is slow, so the computational cost is high; 2) the learning rate is hard to choose and the iteration easily gets trapped in local minima; 3) over-training occurs easily, which degrades generalization. These drawbacks have become the bottleneck restricting the wide application of iteratively trained feedforward networks. To address these problems, Huang et al. proposed the extreme learning machine (ELM) algorithm on the basis of Moore-Penrose (MP) generalized-inverse matrix theory. The algorithm computes the output weights of the network analytically in a single step; compared with iterative algorithms, the ELM greatly improves both the generalization ability and the learning speed of the network.

The ELM is trained on a single-hidden-layer feedforward architecture. Let $m$, $M$ and $n$ denote the numbers of nodes in the input, hidden and output layers respectively, let $g(x)$ be the activation function of the hidden neurons, and let $b_i$ be the bias of hidden node $i$. Given $N$ distinct samples $(x_i, t_i)$, $1 \le i \le N$, with $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]^{T} \in \mathbb{R}^{m}$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{in}]^{T} \in \mathbb{R}^{n}$, the ELM network training model is shown in Figure 1.

(Figure 1. The network training model of the extreme learning machine.)

The ELM network model can be expressed mathematically as

$$\sum_{i=1}^{M} \beta_i \, g(\omega_i \cdot x_j + b_i) = o_j, \qquad j = 1, 2, \ldots, N,$$

where $\omega_i = [\omega_{i1}, \omega_{i2}, \ldots, \omega_{im}]$ is the input weight vector connecting the input-layer nodes to the $i$-th hidden node, $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{in}]^{T}$ is the output weight vector connecting the $i$-th hidden node to the output-layer nodes, and $o_j = [o_{j1}, o_{j2}, \ldots, o_{jn}]^{T}$ is the network output.

The cost function $E$ of the ELM can be expressed as

$$E(S, \beta) = \sum_{j=1}^{N} \lVert o_j - t_j \rVert,$$

where $S = \{(\omega_i, b_i) \mid i = 1, 2, \ldots, M\}$ collects the input weights and the hidden-node biases. Huang et al. point out that the training goal of the ELM is to find the optimal $S$ and $\beta$ that minimize the error between the network outputs and the corresponding target values, i.e. $\min_{S, \beta} E(S, \beta)$. This can be written further as

$$\min_{\omega_i, b_i, \beta} E(S, \beta) = \min_{\omega_i, b_i, \beta} \big\lVert H(\omega_1, \ldots, \omega_M, b_1, \ldots, b_M, x_1, \ldots, x_N)\,\beta - T \big\rVert,$$

where $H$ is the hidden-layer output matrix of the network over the samples, $\beta$ is the output weight matrix and $T$ is the target matrix of the sample set, defined respectively as

$$H(\omega_1, \ldots, \omega_M, b_1, \ldots, b_M, x_1, \ldots, x_N) =
\begin{bmatrix}
g(\omega_1 \cdot x_1 + b_1) & \cdots & g(\omega_M \cdot x_1 + b_M) \\
\vdots & & \vdots \\
g(\omega_1 \cdot x_N + b_1) & \cdots & g(\omega_M \cdot x_N + b_M)
\end{bmatrix}_{N \times M},$$

$$\beta = \begin{bmatrix} \beta_1^{T} \\ \vdots \\ \beta_M^{T} \end{bmatrix}_{M \times n},
\qquad
T = \begin{bmatrix} t_1^{T} \\ \vdots \\ t_N^{T} \end{bmatrix}_{N \times n}.$$

The training process of the ELM network thus reduces to a nonlinear optimization problem. When the activation function of the hidden nodes is infinitely differentiable, the input weights and hidden-node biases can be assigned at random; the matrix $H$ then becomes a constant matrix, and the ELM learning process is equivalent to finding the minimum-norm least-squares solution $\hat{\beta}$ of the linear system $H\beta = T$, given by

$$\hat{\beta} = H^{+} T,$$

where $H^{+}$ is the MP generalized inverse of the matrix $H$.
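The one-step solution above maps directly onto a few lines of code. Below is a minimal MATLAB sketch of ELM training under the formulation just given; the function name elm_train_sketch, the sigmoid choice for g, and the variable names are illustrative assumptions rather than anything from the original text, and implicit array expansion (MATLAB R2016b or later) is assumed.

function [IW, B, beta] = elm_train_sketch(X, T, M)
% X: m x N matrix of training inputs, T: n x N matrix of targets, M: number of hidden nodes.
IW = 2*rand(M, size(X, 1)) - 1;     % random input weights omega_i, drawn once and never tuned
B  = rand(M, 1);                    % random hidden biases b_i
H  = 1 ./ (1 + exp(-(IW*X + B)));   % sigmoid hidden outputs; H' is the N x M matrix H of the text
beta = pinv(H') * T';               % beta_hat = H^+ T via the Moore-Penrose pseudoinverse, M x n
end

Because the hidden-layer parameters are never revisited, the entire training cost is one pass to build H plus one pseudo-inverse, which is where the speed advantage of the ELM comes from.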

Related material on the ELM (extreme learning machine)

An easy-to-learn machine learning algorithm: the extreme learning machine (ELM)

1. The concept of the extreme learning machine
The extreme learning machine (ELM), proposed by Huang Guang-Bin, is an algorithm for training single-hidden-layer neural networks. Its most notable feature is that, compared with traditional learning algorithms for neural networks, and in particular for single-hidden-layer feedforward networks (SLFNs), it is much faster while maintaining the learning accuracy.

2. The principle of the extreme learning machine
The ELM is a new type of fast learning algorithm: for a single-hidden-layer network, it randomly initializes the input weights and biases and then obtains the corresponding output weights.

(Taken from Prof. Huang Guang-Bin's slides.)

For a single-hidden-layer neural network (see Figure 1), suppose there are $N$ arbitrary samples $(x_j, t_j)$, where $x_j \in \mathbb{R}^{m}$ and $t_j \in \mathbb{R}^{n}$. A single-hidden-layer network with $M$ hidden nodes can then be expressed as

$$\sum_{i=1}^{M} \beta_i \, g(\omega_i \cdot x_j + b_i) = o_j, \qquad j = 1, \ldots, N,$$

where $g$ is the activation function, $\omega_i$ is the input weight vector, $\beta_i$ is the output weight vector, $b_i$ is the bias of the $i$-th hidden unit, and $\omega_i \cdot x_j$ denotes the inner product of $\omega_i$ and $x_j$.

The learning goal of a single-hidden-layer network is to minimize the output error, which can be expressed as

$$\sum_{j=1}^{N} \lVert o_j - t_j \rVert = 0,$$

i.e. there exist $\beta_i$, $\omega_i$ and $b_i$ such that

$$\sum_{i=1}^{M} \beta_i \, g(\omega_i \cdot x_j + b_i) = t_j, \qquad j = 1, \ldots, N.$$

In matrix form this is

$$H\beta = T,$$

where $H$ is the output of the hidden nodes, $\beta$ is the output weight matrix and $T$ is the desired output. In order to train the single-hidden-layer network, we therefore want to find $\hat{\omega}_i$, $\hat{b}_i$ and $\hat{\beta}$ such that

$$\lVert H(\hat{\omega}_i, \hat{b}_i)\,\hat{\beta} - T \rVert = \min_{\omega_i,\, b_i,\, \beta} \lVert H(\omega_i, b_i)\,\beta - T \rVert,$$

which is equivalent to minimizing the loss function

$$E = \sum_{j=1}^{N} \Big( \sum_{i=1}^{M} \beta_i \, g(\omega_i \cdot x_j + b_i) - t_j \Big)^{2}.$$

Some traditional gradient-descent-based algorithms can be used to solve such a problem, but a basic gradient-based learning algorithm has to adjust all of the parameters during the iterations. In the ELM algorithm, by contrast, once the input weights and the hidden-layer biases have been fixed at random, the hidden-layer output matrix $H$ is uniquely determined. Training the single-hidden-layer network is thereby transformed into solving a linear system, and the output weights can be determined as

$$\hat{\beta} = H^{+} T,$$

where $H^{+}$ is the Moore-Penrose generalized inverse of the matrix $H$. Moreover, it can be proved that the solution $\hat{\beta}$ obtained in this way has the smallest norm among all least-squares solutions and is unique.

3. Experiment
We use the experimental data from "An easy-to-learn machine learning algorithm: logistic regression". (Figure: the original data set.) The effectiveness of the experiment is evaluated by counting the classification error rate, i.e.

error rate = (number of misclassified samples) / (total number of samples).

For such a simple problem, ...

MATLAB code (main program):
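A minimal MATLAB sketch of such a main program is given below. It is illustrative only: elm_error_rate_sketch and the variable names are assumptions that build on the elm_train_sketch shown earlier rather than the blog's original code, and implicit array expansion (MATLAB R2016b or later) is assumed.

function err = elm_error_rate_sketch(X, labels, IW, B, beta)
% X: m x N inputs; labels: 1 x N true classes coded 1..n (matching the rows of the one-hot targets);
% IW, B, beta: the parameters returned by elm_train_sketch.
H = 1 ./ (1 + exp(-(IW*X + B)));                 % hidden-layer outputs, M x N
Y = beta' * H;                                   % raw network outputs, n x N (one row per class)
[~, predicted] = max(Y, [], 1);                  % predicted class = row index of the largest output
err = sum(predicted ~= labels) / numel(labels);  % error rate = misclassified / total
end

Calling elm_train_sketch on a training split and then elm_error_rate_sketch on the held-out split reproduces the counting-based evaluation described above.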

Extreme learning machine (ELM) code

% Unlike traditional learning algorithms, the extreme learning machine (ELM) for
% single-hidden-layer feedforward networks (SLFNs) assigns its input weights without iterative tuning.

%% Clear the environment
clc
clear
close all
format compact
rng('default')

%% Import the data
% [file,path] = uigetfile('*.xlsx','Select One or More Files', 'MultiSelect', 'on');
% filename = [path file];
filename = 'CPSO优化ELM实现分类\最终版本训练集.xlsx';
M = xlsread(filename);
input  = M(:, 1:end-1);   % features: all columns except the last
output = M(:, end);       % class labels: last column

%% Pre-process the data
[inputn, maps] = mapminmax(input', 0, 1);   % scale each feature to [0, 1]
outputn = one_hot(output);                  % convert labels to one-hot form (user-defined helper)
n_samples = size(inputn, 2);
n = randperm(n_samples);                    % random split: 70% training, 30% validation
m = floor(0.7 * n_samples);
Pn_train = inputn(:, n(1:m));
Tn_train = outputn(:, n(1:m));
Pn_valid = inputn(:, n(m+1:end));
Tn_valid = outputn(:, n(m+1:end));

%% Numbers of nodes

inputnum  = size(Pn_train, 1);   % number of input-layer nodes
hiddennum = 160;                 % number of hidden-layer nodes
type = 'sig';                    % hidden-layer activation: 'sig' (alternatives: 'sin', 'hardlim'); the literal 'sig' is passed below

%% Train the ELM and predict on the validation set
tic
[IW, B, LW, TF] = elmtrain(Pn_train, Tn_train, hiddennum, 'sig');
TY2 = elmpredict(Pn_valid, IW, B, LW, TF);
toc

% Classification probabilities: the first row is the probability of class 0 (no landslide),
% the second row the probability of class 1 (landslide); the two rows sum to 1.
prob0 = prob_cal(TY2);           % user-defined helper

% Accuracy on the validation set
[~, J]  = max(Tn_valid);
[~, J1] = max(prob0);
disp('优化前')                    % i.e. "before optimization"
accuracy = sum(J == J1) / length(J)

% Repeat on the training set
TY2 = elmpredict(Pn_train, IW, B, LW, TF);
% Classification probabilities, as above
prob0 = prob_cal(TY2);
% Accuracy and classification results on the training set
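The helper functions elmtrain, elmpredict, one_hot and prob_cal are not part of this listing; elmtrain and elmpredict would follow essentially the same pattern as the elm_train_sketch and elm_error_rate_sketch given earlier. The two remaining helpers are sketched below as one_hot_sketch and prob_cal_sketch, consistent with how they are used above; the bodies are assumptions (in particular, prob_cal is guessed to be a softmax-style normalization), not the authors' actual implementation, and implicit array expansion (MATLAB R2016b or later) is assumed.

function Y = one_hot_sketch(labels)
% labels: N x 1 vector of class labels; returns a (number of classes) x N one-hot matrix,
% one row per class in sorted order (so class 0 maps to row 1, class 1 to row 2, and so on).
classes = unique(labels);
Y = double(classes(:) == labels(:)');   % each column contains a single 1
end

function prob = prob_cal_sketch(Y)
% Y: raw ELM outputs (classes x samples); returns per-column probabilities that sum to 1.
E = exp(Y - max(Y, [], 1));             % subtract the column maximum for numerical stability
prob = E ./ sum(E, 1);                  % softmax over each column
end

With the classes ordered as in one_hot_sketch, the first row of prob_cal_sketch(TY2) plays the role of the probability of class 0 and the second row that of class 1, matching the comments in the script.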
