A Bayesian Approach for Adaptive BCI Classification


1 Fraunhofer FIRST (IDA), Berlin, Germany

2 Technical University Berlin, Berlin, Germany


SUMMARY: In this article, we present an adaptive classifier for BCI based on a mixture of Gaussians (moG) model of the features and a dynamical Bayesian model of the class means. We apply this approach to feedback data from the Berlin Brain-Computer Interface (BBCI). The proposed model can improve the classification performance by compensating for substantial changes of EEG signals between training and feedback sessions as well as for gradual nonstationarity in the feedback sessions.

INTRODUCTION

EEG-based BCI systems are often subject to nonstationarities caused by changes in the subject's mental state during an experiment (e.g. due to fatigue, changes in task involvement, and demands for visual processing). Recently, Shenoy et al. [2] showed that a simple bias recalculation for the classifier obtained from the training data can eliminate the most detrimental effects of nonstationarities during feedback operation. In this paper, we propose a Bayesian version of such adaptive classifiers, in which the class means are treated as random variables and their posterior distributions are approximated sequentially, as in Kalman filters. The proposed method was applied to BBCI data collected from three subjects.


We investigate data from a study of three subjects using the BBCI system, similar to [1] but with very long feedback blocks without breaks. The experiments consisted of a calibration measurement and a feedback period. In the calibration measurement, visual stimuli (L), (R) (for imagined left and right hand movement) and (F) (for imagined foot movement) were presented to the subjects. Based on the recorded signals, subject-specific features for the further analysis were calculated. The most discriminative frequency band for two of the three classes was selected manually by experts, and common spatial patterns (CSP) were calculated. For the data sets we analyzed, 6 (al), 2 (aw) and 4 (VPt) CSP channels were used, respectively. The bandpower of the CSP-projected channels was estimated using windows of 3 seconds length, and finally a linear classifier was trained by linear discriminant analysis (LDA).
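The last two steps of this calibration pipeline (bandpower estimation and LDA training) can be sketched as follows. This is only an illustration under stated assumptions, not the BBCI implementation: the signals are assumed to be already band-pass filtered and CSP-projected, the bandpower is taken as the log-variance of each channel (a common choice that the text does not confirm), and all function names are hypothetical.

```python
import math


def log_bandpower(window):
    """Log-variance per channel of one window.

    `window` is a list of channels, each a list of samples that are assumed
    to be band-pass filtered and CSP-projected already.
    """
    feats = []
    for ch in window:
        mean = sum(ch) / len(ch)
        var = sum((s - mean) ** 2 for s in ch) / len(ch)
        feats.append(math.log(var))
    return feats


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def train_lda(X_pos, X_neg):
    """Two-class LDA with a pooled covariance; returns (weights, bias)."""
    d = len(X_pos[0])

    def mean(X):
        return [sum(x[j] for x in X) / len(X) for j in range(d)]

    mu_p, mu_n = mean(X_pos), mean(X_neg)
    S = [[0.0] * d for _ in range(d)]
    for X, mu in ((X_pos, mu_p), (X_neg, mu_n)):
        for x in X:
            for i in range(d):
                for j in range(d):
                    S[i][j] += (x[i] - mu[i]) * (x[j] - mu[j])
    dof = len(X_pos) + len(X_neg) - 2
    S = [[v / dof for v in row] for row in S]
    # w = S^{-1} (mu_p - mu_n); the bias puts the boundary midway between the means.
    w = solve(S, [a - b for a, b in zip(mu_p, mu_n)])
    bias = -sum(wi * (a + b) / 2.0 for wi, a, b in zip(w, mu_p, mu_n))
    return w, bias


def lda_output(w, bias, x):
    """Real-valued classifier output; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + bias
```

In use, `log_bandpower` would be applied to every calibration window, the resulting feature vectors split by class, and `train_lda` called once on the two sets.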

In the feedback phase, bandpower estimates of the CSP channels were calculated in a similar manner as in the calibration session, but for sliding windows of 1 second length. The real-valued output of the LDA classifier was used to move a cursor horizontally on the screen. The subjects then used this cursor to operate a text input (speller) software.

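We employ the following mixture of Gaussians (moG) model for each class distribution:

$$p(x \mid y=1) := (1-p_p)\,\varphi(x \mid \mu_p, \Sigma_p) + p_p\,\varphi(x \mid m, V),$$

$$p(x \mid y=-1) := (1-p_n)\,\varphi(x \mid \mu_n, \Sigma_n) + p_n\,\varphi(x \mid m, V),$$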

where $\varphi(x \mid \mu, \Sigma)$ is the Gaussian density function with mean $\mu$ and covariance $\Sigma$. The first terms represent typical samples, while the common second term corresponds to outliers with large covariance $V$. Although we concentrate on the binary classification problem, the moG model also enables us to distinguish outlying observations from typical samples. In the training session, we estimate the model parameters, i.e., the means $(\mu_p, \mu_n, m)$ and the covariances $(\Sigma_p, \Sigma_n, V)$ of each Gaussian prototype, their outlier ratios $(p_p, p_n)$, and the class probability $\pi := p(y=1)$, by the EM algorithm with an extra restriction that keeps the covariance $V$ of the outlier component large.
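To make the role of the outlier component concrete, the sketch below evaluates the joint posterior over the class label $y$ and an outlier indicator (0 for a typical sample, 1 for an outlier) under fixed moG parameters. It assumes a scalar feature for brevity, so the covariances become variances; the parameter values in `EXAMPLE` are purely illustrative, not values from the study, and the function names are hypothetical.

```python
import math


def gauss(x, mean, var):
    """Density of a one-dimensional Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


# Illustrative parameters only (hypothetical values).
EXAMPLE = {
    "pi": 0.5,                     # class probability p(y = 1)
    "mu_p": 1.0, "var_p": 0.25,    # typical positive component
    "mu_n": -1.0, "var_n": 0.25,   # typical negative component
    "m": 0.0, "V": 25.0,           # shared outlier component with large variance
    "p_p": 0.05, "p_n": 0.05,      # outlier ratios
}


def joint_posterior(x, params):
    """P(y, outlier flag | x) for y in {+1, -1} and flag in {0, 1}, by Bayes' rule."""
    pi = params["pi"]
    outlier_density = gauss(x, params["m"], params["V"])
    joint = {
        (+1, 0): pi * (1 - params["p_p"]) * gauss(x, params["mu_p"], params["var_p"]),
        (+1, 1): pi * params["p_p"] * outlier_density,
        (-1, 0): (1 - pi) * (1 - params["p_n"]) * gauss(x, params["mu_n"], params["var_n"]),
        (-1, 1): (1 - pi) * params["p_n"] * outlier_density,
    }
    total = sum(joint.values())
    return {key: value / total for key, value in joint.items()}


def outlier_probability(x, params):
    """Posterior probability that x was generated by the outlier component."""
    post = joint_posterior(x, params)
    return post[(+1, 1)] + post[(-1, 1)]
```

A sample far from both class centers is assigned almost entirely to the outlier component, while a sample near a class center is treated as typical; this is the sense in which the model can flag outlying observations.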

To cope with the difference of EEG signals between training and feedback and with the gradual nonstationarity in the feedback session, we assume that the centers of both classes are random variables subject to the dynamical model ($t \geq 1$)

[equations of the dynamical model missing in the source]

The initial means are also assumed to be Gaussian, centered at the estimators from the training session, i.e. $\mu_p(0) \sim N(\mu_p, \Gamma_p)$ and $\mu_n(0) \sim N(\mu_n, \Gamma_n)$, respectively. The covariances of the dynamical model, together with $\Gamma_p$ and $\Gamma_n$, control the speed of adaptation and should be chosen according to the magnitude of the initial covariances. The center $m(t)$ of the outlier class is fixed at the average of the positive and negative classes. The required parameters are determined on the training data.
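A form that is commonly paired with Kalman-style inference is a Gaussian random walk on each class mean. The following is only an assumed sketch of such a model, not the authors' equations: the noise covariances are written as $Q_p$ and $Q_n$, which are hypothetical names, and the last line reads the outlier center as the average of the two class means.

$$\mu_p(t) = \mu_p(t-1) + \varepsilon_p(t), \qquad \varepsilon_p(t) \sim N(0, Q_p),$$

$$\mu_n(t) = \mu_n(t-1) + \varepsilon_n(t), \qquad \varepsilon_n(t) \sim N(0, Q_n),$$

$$m(t) = \tfrac{1}{2}\bigl(\mu_p(t) + \mu_n(t)\bigr).$$

Under this assumption, larger $Q_p$ and $Q_n$ let the class means drift faster between trials, which corresponds to faster adaptation.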

When the samples and labels $D_t = \{x_\tau, y_\tau\}_{\tau=1}^{t}$ up to the $t$-th trial are observed, we infer the posterior distribution $p(\mu_p(t), \mu_n(t) \mid D_t)$ by a sequential scheme, as in the Kalman filter. In contrast to the Kalman filter, however, the posterior is not Gaussian in our moG model. Hence, we approximate it by a single Gaussian distribution with the same mean and covariance. We construct a classifier based on the posterior distribution in order to predict the label of the $(t+1)$-th trial from the inputs. In this study we adopted the classifier based on the posterior probability of the typical positive class minus that of the typical negative class, i.e. $f_t(x) := P(y=1, z=0 \mid x, D_t) - P(y=-1, z=0 \mid x, D_t)$, where the latent variable $z$ equals 1 if the sample is an outlier and 0 otherwise.

RESULTS

In our Bayesian framework, the posterior distributions of the class means $\mu_p(t)$ and $\mu_n(t)$ are approximated by Gaussians. In order to visualize the nonstationarity of the data inferred by the moG model, we plotted the time course of the posterior means in Figure 1, where the horizontal axis is the direction of the original classifier. Time is indicated by gray scale (black to white). At the beginning, because less information about the class means is available, the posterior means can move by a large amount, while the changes get smaller as more trials are performed. For subject aw, although the mean of the positive class again comes closer to the estimator from the training data after feedback learning, that of the negative class seems to stay away from the original estimator. This is the reason why classifier modification in the feedback session can improve the performance.
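Why the posterior means move a lot at first and less later can be reproduced in a one-dimensional sketch of a sequential update with a Gaussian approximation. The code below is an illustration under several assumptions that are not taken from the source: a scalar feature, a random-walk state noise `q`, labeled samples of a single class, an outlier center held fixed at `m_out`, and purely illustrative default values. The two-component posterior (typical versus outlier) is collapsed to one Gaussian with the same mean and variance after each observation.

```python
import math


def gauss(x, mean, var):
    """Density of a one-dimensional Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def adf_step(mu, P, x, var_class=0.25, q=0.001, p_out=0.05, m_out=0.0, V=25.0):
    """One predict/update step for the posterior N(mu, P) of a class mean.

    All default parameter values are hypothetical.
    """
    # Predict: an assumed random walk inflates the variance by q.
    P_pred = P + q

    # Branch 1: x is a typical sample, so a standard Kalman update applies.
    S = P_pred + var_class
    K = P_pred / S
    mu_typ = mu + K * (x - mu)
    P_typ = (1.0 - K) * P_pred
    like_typ = (1.0 - p_out) * gauss(x, mu, S)

    # Branch 2: x is an outlier and carries no information about the mean.
    mu_outl, P_outl = mu, P_pred
    like_outl = p_out * gauss(x, m_out, V)

    # Posterior weights of the two branches.
    w_typ = like_typ / (like_typ + like_outl)
    w_outl = 1.0 - w_typ

    # Moment matching: one Gaussian with the mixture's mean and variance.
    mu_new = w_typ * mu_typ + w_outl * mu_outl
    P_new = (w_typ * (P_typ + (mu_typ - mu_new) ** 2)
             + w_outl * (P_outl + (mu_outl - mu_new) ** 2))
    return mu_new, P_new


if __name__ == "__main__":
    mu, P = 0.0, 1.0  # broad initial belief about the class mean
    for trial in range(1, 11):
        mu, P = adf_step(mu, P, 1.0)  # repeated observation at x = 1
        print(trial, round(mu, 3), round(P, 4))
```

In this sketch the posterior variance shrinks as observations accumulate, so the gain applied to each new sample decreases, and an observation far from the current mean is absorbed almost entirely by the outlier branch and leaves the mean nearly unchanged.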

In Table 1, we compare the classification error in the feedback sessions of our approach (ADB) with the error of the original classifier (ORIG). ADB was much better for subject aw, equal for VPt and worse for al.

Table 1: Comparison of classification errors (window-wise; the feedback error in the BCI task was lower)


22.59.0 VPt


