
Face Illumination Transfer through Edge-preserving Filters

Xiaowu Chen, Mengmeng Chen, Xin Jin* and Qinping Zhao

State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing, China

Abstract

This article proposes a novel image-based method to transfer illumination from a reference face image to a target face image through edge-preserving filters. Our method needs only a single reference image, without any knowledge of the 3D geometry or material information of the target face. We first decompose the lightness layers of the reference and the target images into large-scale and detail layers with a weighted least squares (WLS) filter after face alignment. The large-scale layer of the reference image is then filtered with the guidance of the target image. Adaptive parameter selection schemes for the edge-preserving filters are proposed for these two filtering steps. The final relit result is obtained by replacing the large-scale layer of the target image with that of the reference image. We acquire convincing relit results on numerous target and reference face images with different lighting effects and genders. Comparisons with previous work show that our method is less affected by geometry differences and better preserves the identification structure and skin color of the target face.

1. Introduction

Image-based photo-realistic face relighting without a 3D model has been extensively studied in the computer graphics community and widely used in film production and game entertainment. However, it is still a challenging problem when only a single reference image is available.

Recently, many face relighting methods have been proposed, such as morphable model based methods [17] and quotient image based methods [14, 2]. However, morphable model based methods often need a collection of scanned textured 3D faces, while quotient image based methods require two reference face images: one with lighting effects similar to the target image, and the other with the desired novel lighting effects.

For more convenient use and wider application, our objective is to generate a photo-realistic relighting result of a frontal face image taken under nearly uniform illumination, so as to make the result as similar as possible in lighting effects to that of only a single reference face image under another illumination (see Figure 1).

(*Corresponding author, email: jinxin@…)

Figure 1. The objective. (a) is the target face image, and (b) is the single reference image. (c) is our relit result, which has lighting effects similar to those of (b).

The key to achieving this objective is how to extract the illumination component from a single reference image despite interference from its material and geometry information; this illumination component is then used to relight the target image. Due to the ill-posedness of the problem, current automatic methods of extracting the illumination component from a single image, such as [15], often fail on complex natural images, especially when large lighting contrast exists. Even with the user-assisted approach proposed by Bousseau et al. [1], the separated illumination components still contain not just illumination but also some geometry and material information. This is far from a true illumination component, for which object reflectance would be entirely uniform. Face relighting results would thus be influenced.

Most current single-image face relighting methods, such as [7], are related to face recognition. They are often used on low-resolution face images and aim to remove the light and shadows on the face for recognition rather than to generate convincing relighting results, as surveyed in [16]. Jin et al. [9] have used local lighting contrast features to learn artistic lighting templates from portrait photos. However, the template is designed for classification and numerical aesthetic quality assessment of portrait photos, and is thus not suitable for transferring illumination. For face makeup transfer by a single reference image, Guo and Sim [6] adopt a gradient-based editing method, which adds only large changes in the gradient domain of the reference image to the target image so as to transfer highlight and shading. However, their assumption that large changes are caused by makeup holds only if the illumination of the reference image is approximately uniform.

To the best of our knowledge, the work most similar to ours is [12]. They relight a single image using a single reference image. They decompose images into a large-scale layer (considered illumination dependent) and a small-scale layer (considered illumination independent) in each RGB channel, and then obtain the result by recombining the target small-scale layer and the reference large-scale layer in each channel. However, the color of the target face is not preserved well.

Recently, edge-preserving filters have been an active research topic in computational photography [10, 3, 8]. An edge-preserving filter can smooth an input image while preserving the edges of another image at the same time, and the two input images may or may not be the same. Retinex theory [11] tells us that large-scale variations in an image are caused by illumination variations, and small-scale variations are caused by reflectance variations. Edge-preserving smoothing can separate an image into a large-scale layer and a detail layer, the detail layer retaining the small variations and the large-scale layer the large variations of the original image. For our single-reference face relighting task, we are thus inspired to apply edge-preserving smoothing to approximate the process of decomposing lightness into an illumination independent layer (reflectance) and an illumination dependent layer (shading).

We analyze the performance of edge-preserving filters in our scenario (see Figure 1). As the weighted least squares (WLS) filter [3] shows the best performance in image detail decomposition, we choose the WLS filter for our detail decomposition process. The joint bilateral filter [10] can make the filtered result preserve the structure of the target object well, but may lose much of the shading distribution. The WLS filter can make the filtered result preserve the shading distribution of the reference object well, but may lose the edge structure of the target object. The guided filter [8] presents a straightforward method to smooth a single image with the guidance of another image, under the assumption that the output image is a linear transform of the guidance image in local windows. When the input image and the guidance image are different, it can obtain good results. For our task we observe that the guided filter [8] achieves a better trade-off between shading distribution preservation and edge structure preservation.

In this paper, we present a method for frontal face image relighting based on a single reference face image. The light color is considered to be nearly white. We first decompose images into three layers: color, large-scale, and detail layers. Second, as lighting variations are assumed to be retained in the large-scale layer, we operate only on the large-scale layer. We then apply edge-preserving filters to smooth the large-scale layer of the reference image while preserving the edges of the large-scale layer of the target image. Finally, we obtain a convincing relit result that preserves the identification characteristics of the target face well.

Our main contributions include: (1) a framework for face illumination transfer based on edge-preserving filters, and (2) adaptive parameter selection schemes in the WLS filter and the guided filter for face relighting.

2. Face Illumination Transfer

The workflow of our method is illustrated in Figure 2. The face alignment, layer decomposition and adaptive parameter selection schemes for edge-preserving filtering are described in this section.

2.1. Face Alignment

To transfer the illumination effects from a reference face to a target face, we need to warp the reference face image according to the target face image. We employ the Active Shape Model (ASM [13]) with 104 mark points to identify the mark points on both images. Due to the varied appearance of different faces under various illuminations, current ASM methods tend to fail to locate the mark points accurately, so we obtain rough initial mark points using ASM and then refine their positions interactively. In our experiments, one minute is enough to fix the mark points accurately. We then take the mark points as control points to warp the reference image according to the target image using an affine transform.
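To make this step concrete, here is a minimal sketch of the warping, assuming the 104 refined landmarks are already available as NumPy arrays. The use of scikit-image and a piecewise affine transform is our illustrative choice, not necessarily the authors' implementation:

```python
import numpy as np
from skimage.transform import PiecewiseAffineTransform, warp

def warp_reference(reference, ref_points, tgt_points):
    """Warp the reference image so its mark points land on the target's.

    reference: reference image array; ref_points, tgt_points: (104, 2)
    arrays of (x, y) landmark coordinates on each image.
    """
    tform = PiecewiseAffineTransform()
    # warp() expects the inverse map (output coords -> input coords),
    # so we estimate the transform from target points to reference points.
    tform.estimate(tgt_points, ref_points)
    return warp(reference, tform, output_shape=reference.shape[:2])
```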

2.2. Layer Decomposition

First, we decouple the image into lightness and color; the lighting effects are considered to be mainly retained in lightness. We choose the CIE 1976 (L*, a*, b*) color space, as it separates a color image into lightness and color well: the L* channel contains lightness information (similar to lightness as perceived by humans), and the a* and b* channels contain color information. We employ edge-preserving filters to smooth the lightness layer so as to obtain the large-scale layer, and then use division to obtain the detail layer:

d = l / s,    (1)

where the lightness layer, the large-scale layer and the detail layer are denoted as l, s and d. The detail layer d can be considered illumination independent, and the large-scale layer illumination dependent. We choose to apply the WLS filter to decompose the lightness layer into the large-scale layer and the detail layer.

Figure 2. The workflow of the proposed method. The reference face image is warped according to the shape of the target face. Both the target image and the warped reference image are cropped to the contour of the mark points. The two cropped images are then decomposed into a lightness layer and a color layer, and only the lightness layer is operated on. The two lightness layers are decomposed into large-scale and detail layers using the WLS filter. The reference large-scale layer is filtered with the guidance of the target large-scale layer using the guided filter to form the large-scale layer of the relit result. By compositing the filtered large-scale layer and the target detail layer, the lightness layer of the relit result is obtained. Finally, the relit result is calculated by compositing this lightness layer and the color layer of the target face image.

It is observed that the WLS filter performs well in decomposing lightness into the large-scale layer (similar to shading) and the detail layer (containing reflectance).
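As a sketch of this decomposition, the split into color, large-scale and detail layers (Eq. 1) can be written as follows; `smooth` stands in for the WLS filter, whose linear-system solver is omitted here, and the small epsilon guarding the division is our addition:

```python
import numpy as np
from skimage.color import rgb2lab, lab2rgb

def decompose(rgb, smooth):
    """Split an RGB image into color (a*, b*), large-scale and detail layers.

    smooth: an edge-preserving smoother (e.g. a WLS filter) mapping a
    2-D lightness array to its large-scale layer.
    """
    lab = rgb2lab(rgb)
    l = lab[..., 0]                  # lightness layer (L*)
    color = lab[..., 1:]             # color layers (a*, b*)
    s = smooth(l)                    # large-scale layer
    d = l / np.maximum(s, 1e-6)      # detail layer, Eq. 1: d = l / s
    return color, s, d

def recompose(color, s, d):
    """Invert Eq. 1 (l = s * d) and return to RGB."""
    return lab2rgb(np.dstack([s * d, color]))
```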

2.3. WLS Filter with Adaptive Parameter

The WLS filter in [3] performs the same level of smoothing over the whole image. But when the WLS filter is used for our task, it is expected to perform different levels of smoothing in different regions of the image. Similar to [6], we set different smoothing levels in different regions of the image. The modified version of the energy function of the WLS filter [3] is thus

E = |l - s|^2 + H(\nabla s, \nabla l)    (2)

H(\nabla s, \nabla l) = \sum_p \lambda(p) \left( \frac{(\partial s / \partial x)_p^2}{|\partial l / \partial x|_p^\alpha + \varepsilon} + \frac{(\partial s / \partial y)_p^2}{|\partial l / \partial y|_p^\alpha + \varepsilon} \right),    (3)

where |l - s|^2 is the data term that keeps s as similar as possible to l, and H(\nabla s, \nabla l) is the regularization term that makes s as smooth as possible. The subscript p denotes the spatial location of a pixel. \alpha controls the affinities by non-linearly scaling the gradients; increasing \alpha results in sharper preserved edges. \lambda is the balance factor between the data term and the smoothness term; increasing \lambda produces smoother images.

It is observed that the less flat a region is, the larger the required \lambda. In flat regions, a small \lambda is enough to produce a good separation of the large-scale and detail layers, and most reflectance information can then be retained in the detail layer. However, in regions such as facial hair and eyebrows, a larger \lambda is required to perform a higher level of smoothing, so that reflectance can be better kept in the detail layer.

A simple way to set \lambda over the image is as follows. First, the vertical and horizontal gradients g_x and g_y of the lightness l are calculated, and a threshold t_1 is given. Second, for each pixel p, we count the pixels whose gradient magnitude is larger than the threshold t_1 in the local window w_p of p:

\gamma(p) = \sum_{i \in w_p} \left[ \sqrt{(\partial l / \partial x)_i^2 + (\partial l / \partial y)_i^2} \ge t_1 \right].    (4)

After \gamma is normalized to [0, 1], we set \lambda as follows:

\lambda(p) = \lambda_s + (\lambda_l - \lambda_s) \cdot \gamma(p),    (5)

where \lambda_s and \lambda_l are the smaller and larger \lambda that control the lowest and highest levels of smoothing. In our experiments, \alpha = 1.2, the local window radius is 8, \lambda_s = 1, \lambda_l = 4\lambda_s and t_1 = 0.02.
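A sketch of this adaptive scheme (Eqs. 4-5) with the parameters above; counting strong-gradient pixels through a box mean is our reading of the local-window step, and the lightness is assumed scaled to [0, 1]:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_lambda(l, radius=8, t1=0.02, lam_s=1.0, lam_l=4.0):
    """Spatially varying lambda map for the modified WLS energy."""
    gy, gx = np.gradient(l)
    strong = (np.hypot(gx, gy) >= t1).astype(float)   # indicator in Eq. 4
    # Mean over the (2r+1)^2 window: proportional to the count gamma(p).
    gamma = uniform_filter(strong, size=2 * radius + 1)
    gamma = (gamma - gamma.min()) / (np.ptp(gamma) + 1e-12)  # normalize to [0, 1]
    return lam_s + (lam_l - lam_s) * gamma            # Eq. 5
```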

As shown in Figure 3, by setting \lambda spatially, the WLS filter performs different levels of smoothing, with a higher level of smoothing in regions such as facial hair and eyebrows. A large-scale layer can thus be obtained that retains less detail information in those regions than when the same \lambda is used over the whole image.

Figure 3. (a) Target lightness layer, decomposed into large-scale layer (d) by using the same \lambda over the whole image; (b) is the normalized \gamma, which is used to calculate the spatial \lambda; (c) is the large-scale layer calculated by using the spatial \lambda determined by (b). It can be observed that (c) picks up less detail information than (d) in the regions of facial hair and eyebrows.

2.4. Guided Filter with Adaptive Parameter

Figure 4. (a) A rough contour line is decided by the face mark points; (b) the face structure region is then determined by the contour line; (c) the Canny edge detector is applied to the face structure region; (d) a distance transform is applied to the detected edges: for pixels far from the edges, smaller kernel sizes are used, and for pixels near the edges, larger kernel sizes are used.

The reference large-scale layer is filtered by the guided filter with the guidance of the target large-scale layer. The guided filter [8] is briefly described here. Its key assumption is a local linear model between the guidance I and the filtered output q: q is a linear transform of I in a window w_k centered at pixel k,

q_i = a_k I_i + b_k,  \forall i \in w_k,    (6)

where a_k and b_k are assumed to be constant in w_k. The linear coefficients a_k and b_k are then determined by minimizing the difference between q and the filter input P. The cost function is defined in Eq. 7.

Figure 5. Large-scale layers with different guided filter parameters. (a) Target large-scale layer; (b) reference large-scale layer; (c) a single kernel size r = 3 used over the whole image; (d) a single kernel size r = 18 used over the whole image; (e) adaptive kernel sizes. It can be observed that (c) preserves much more structure of the reference image, retaining much shading information as well as more identification characteristics of the reference face; (d) maintains the structure of the target face well, but blurs the shading information of the reference face; (e) performs a trade-off between preserving the target structure in the face structure region and retaining the shading information of the reference face by using adaptive kernel sizes.

E(a_k, b_k) = \sum_{i \in w_k} \left( (a_k I_i + b_k - P_i)^2 + \varepsilon a_k^2 \right).    (7)

[8] gives the solution for a_k and b_k:

a_k = \frac{\frac{1}{|w|} \sum_{i \in w_k} I_i P_i - \mu_k \bar{P}_k}{\sigma_k^2 + \varepsilon}    (8)

b_k = \bar{P}_k - a_k \mu_k,    (9)

where \mu_k and \sigma_k^2 are the mean and variance of I in w_k, |w| is the number of pixels in w_k, and \bar{P}_k is the mean of P in w_k.

The linear model is then applied to all local windows in the image; a pixel i is involved in all the windows w_k that contain i. In [8], all possible values of a_k and b_k are averaged. We observe that the averaged a_k and b_k are similar to a smoothed version of the a_k and b_k computed directly from the windows centered at each pixel. In fact, when the guidance I and the filter input P are different images, the filtered result with the averaged a_k and b_k and the filtered result with a_k and b_k from the centered window differ little. Thus, we omit the averaging process, and take a_k and b_k from the window centered at pixel i as the representative of all the a_k and b_k from the windows involving pixel i.
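A minimal sketch of this simplified guided filter (Eqs. 6-9, with the averaging step omitted as just described); box means are computed with a uniform filter, and a single radius r is used here, leaving the spatially varying radius of the next step out for clarity:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter_no_average(I, P, r=3, eps=1e-3):
    """Guided filter that keeps a_k, b_k from each pixel's centered window.

    I: guidance (target large-scale layer);
    P: filter input (reference large-scale layer).
    """
    def mean(x):
        return uniform_filter(x, size=2 * r + 1)
    mu, P_bar = mean(I), mean(P)
    var = mean(I * I) - mu * mu          # sigma_k^2
    cov = mean(I * P) - mu * P_bar       # numerator of Eq. 8
    a = cov / (var + eps)                # Eq. 8
    b = P_bar - a * mu                   # Eq. 9
    return a * I + b                     # Eq. 6, without averaging a and b
```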

It is hard to fix a single kernel size (the window radius above) for our task: a large kernel size makes the filtered result preserve the edge structure of the target object but blurs the shading information, while a small kernel size gives the opposite result. As the guided filter is an entirely local method, it can be extended to different kernel sizes in different regions. Edges in the face structure region (such as eyes, eyebrows, nose and mouth) are important, and edges in other regions are less important. We thus set the kernel size to a small value in the non-face-structure region and treat the face structure region carefully, which preserves the structure of the target face better by sacrificing part of the shading distribution in the face structure region.

We extend the guided filter to spatially varying kernel sizes as follows: we first define a mask containing the face structure region, and then treat pixels in the mask region carefully. Our basic idea is to set the kernel size near the edges in the face structure region to a larger value, and a distance transform is applied so that the kernel size changes gradually with the distance from those edges. As shown in Figure 4, the kernel size r in the face structure region is decided as follows. First, the mark points construct a rough contour of the face structure, which determines the face structure region; second, the Canny edge detector is applied to detect the edges in the face structure region of the large-scale layer of the reference face; third, the distance of every pixel from these edges is computed. Finally, the spatial kernel sizes are defined by Eqs. 10 and 11:

dist(p) = \min_q |q - p|,    (10)

where q ranges over the detected edge pixels, and

r(p) = \begin{cases} r_0 + (r_1 - r_0) \cdot (T_d - dist(p)) / T_d, & dist(p) \le T_d \\ r_0, & \text{otherwise.} \end{cases}    (11)

|p_1 - p_2| denotes the Euclidean distance from pixel p_1 to pixel p_2. We set r_1 = 18 and r_0 = 3. T_d is a threshold on dist(p), and we set T_d = 10 in our experiments.
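The kernel size map of Eqs. 10-11 can be sketched as below, assuming a boolean edge mask from the Canny detector restricted to the face structure region (building that mask from the mark-point contour is not shown):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def kernel_size_map(edge_mask, r0=3, r1=18, Td=10.0):
    """Per-pixel guided filter radius r(p) from face-structure edges."""
    # Euclidean distance of each pixel to its nearest edge pixel (Eq. 10).
    dist = distance_transform_edt(~edge_mask)
    r = np.full(dist.shape, float(r0))
    near = dist <= Td
    # Linear ramp from r1 on the edges down to r0 at distance Td (Eq. 11).
    r[near] = r0 + (r1 - r0) * (Td - dist[near]) / Td
    return np.rint(r).astype(int)
```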

As shown in Figure 5, our modified guided filter with spatially varying kernel sizes performs the trade-off between preserving the edges of the target image and preserving the shading distribution of the reference image well.

After the reference large-scale layer is filtered with the guidance of the target large-scale layer using our modified guided filter, the desired target lightness is obtained by compositing the filtered reference large-scale layer and the target detail layer. Finally, we obtain the relit result by incorporating the color information of the target face.
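With the helpers sketched above (`decompose`, `recompose`, `guided_filter_no_average`; all hypothetical names of ours), the whole transfer of Figure 2 reduces to a few lines:

```python
def transfer_illumination(target_rgb, warped_reference_rgb, smooth):
    """End-to-end sketch of the relighting pipeline of Figure 2."""
    color_t, s_t, d_t = decompose(target_rgb, smooth)
    _, s_r, _ = decompose(warped_reference_rgb, smooth)
    # Filter the reference large-scale layer under the target's guidance.
    s_out = guided_filter_no_average(s_t, s_r, r=3)
    # Relit lightness = filtered large-scale layer * target detail layer.
    return recompose(color_t, s_out, d_t)
```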

3. Experiments

Relit results with adaptive parameters. The performance of our relighting method relies mainly on the decomposition of the lightness layer into the large-scale and detail layers using the WLS filter, and on the guided filtering of the reference large-scale layer with the guidance of the target large-scale layer.

To validate the effectiveness of the proposed method, we first check the effect of our modification of the WLS filter parameter choice. As shown in Figure 6, by setting a larger \lambda in the facial hair and eyebrow regions, less reflectance information is retained in the large-scale layer and more in the detail layer, and the relit result can then retain more of the detailed identification characteristics of the target face.

Figure 6. Our relit results with different WLS filter parameters. (a) is the relit result when \lambda = 1 over the whole image; (b) is the relit result when the WLS filter parameter is set spatially by our method; (c) is the relit result when \lambda = 4 over the whole image. It is observed that the larger the value of \lambda, the more details are retained in the final result, but the lighting effect is also weakened. Our strategy ensures that the final relit result preserves the details well and generates the shading effect well at the same time.

Figure 7. Our relit results with different guided filter parameters. (a) reference image, (b) target image, (c) the relit result with the guided filter parameter r = 3 over the whole image, (d) the relit result with the guided filter parameter r set spatially by our method, (e) the relit result with r = 18 over the whole image. Blurring can be seen around the eye structure in (c), the shading contrast is weakened in (e), and our method in (d) preserves the identification structure and retains the shading contrast at the same time.

We then check the effect of our modification of the guided filtering process. As shown in Figure 7, our method performs well in preserving the identification structure and retaining the shading of the reference object at the same time. In Figure 8, we present some experimental results. The results show that our method performs well in illumination transfer between genders, and also between color and grey images. More results are in the supplemental material.

Figure 8. Our illumination transfer results between genders, and also between color and grey images.

Figure 9. Comparison with Li et al. [12]. Our method better retains the color information of the target face, while the method of Li et al. transfers the reference color to the target face. Our method also maintains the identification structure of the target face better.

Comparison with previous methods. We compare our method with previous work. In [2], a single target image under the frontal lighting condition is relit using two reference images (one reference image under the frontal lighting condition, and the other under the desired lighting condition). We use our method to relight a single image using a single reference image. For the sake of a fair comparison, we also relight a single image using a pair of reference images: first, the pair of reference images are divided pixel by pixel to get a quotient image; then the image warping technique is used to warp the reference quotient image according to the target image, and a rough relit result of the target image is acquired by multiplying the target image and the warped quotient image pixel by pixel; this rough relit result is then used as the input of our 2D-2D method.
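For reference, a sketch of this quotient-image preprocessing; `warp_reference` is the alignment helper sketched in Sec. 2.1, and the epsilon guarding the division is our addition:

```python
import numpy as np

def rough_relight(target, ref_lit, ref_frontal, ref_points, tgt_points):
    """Quotient-image initialization used in the two-reference comparison.

    ref_lit / ref_frontal: the reference face under the desired and the
    frontal lighting conditions, pixel-aligned with each other.
    """
    quotient = ref_lit / np.maximum(ref_frontal, 1e-6)  # pixel-wise ratio
    warped_q = warp_reference(quotient, ref_points, tgt_points)
    return target * warped_q  # rough relit result, fed to our method
```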

As shown in Figure 10, the method in [2] stresses that the quotient image of the relit result and the original target image should be locally scaled to the warped reference quotient image; it therefore introduces more reference shading details into the relit result. But the assumption that the target face and the reference face have similar geometry and reflectance usually cannot be fulfilled in practice, and preserving more reference shading details does not necessarily make the relit result more natural and realistic. In contrast, in our method the image is decomposed into three layers and only the large-scale layer of lightness is operated on, which avoids introducing much of the shading variance caused by the geometry difference between the reference and the target faces. In addition, our method with the guided filter stresses that the relit result and the target image preserve a linear model in local windows, which maintains the identification characteristics of the target face better.

Figure 10. Comparisons with Chen et al. [2]. The first row shows 3 reference images from YaleB [4] (note that the method in [2] needs one reference image under the frontal lighting condition and one under the desired lighting condition; the reference images under the frontal lighting condition are not shown here), and the images in the first column are target images. The second row contains results of [2]. The third row contains our results using a single image as the reference. The fourth row contains results of our method using the rough results of the quotient image based method as the reference images.

Figure 9 gives a comparison of our result with [12]. Unlike [12], in our method only the lightness layer is operated on, leaving the color of the relit result unchanged. Furthermore, by using edge-preserving filters, our method introduces fewer shading details caused by the geometry difference between the target and the reference faces.

Figure 11. Illumination transfer between simple objects. ((a) and (b) are from the Amsterdam Library of Object Images [5].)

Relighting general objects. Our method can also be used to relight a general object when the reference object and the target object have similar geometry and reflectance. Figure 11 shows the results of illumination transfer between a ball and an egg. The mark points are specified by hand, and our method is then applied. It can be seen that the relighting result is natural and photo-realistic.

4. Conclusion and Discussions

In this paper, we have presented a novel image-based method for face illumination transfer through edge-preserving filters. We also propose adaptive parameter selection schemes for the WLS filter and the guided filter in the face relighting application. The main advantage of our method is that only a single reference image is required, without any 3D geometry or material information of the target face. Convincing relit results demonstrate that our method is effective and advantageous in preserving the identification structure and skin color of the target face.

Limitation and future work. The ASM we adopt in our study can only locate frontal faces well, so we only test our method with both the target face and the reference face being frontal. The ASM also fails to locate accurate mark points under harsh lighting, and more manual manipulation is thus required. In future work, we would employ more effective methods to reduce the amount of manual correction. In our method, we make a trade-off between preserving the shading and preserving the identification structures in face structure regions; we would introduce a learning based method to detect the shadows in the human face for parameter selection in our relighting method. To overcome the assumption of similar shape between the reference and the target faces, we would explore a scheme for reference face selection in future work.

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (90818003 & 60933006), the National High-Tech (863) and Key Technologies R&D Programs of China (2009AA01Z331 & 2008BAH29B02), and the Specialized Research Fund for the Doctoral Program of Higher Education (20091102110019). We would like to thank Qing Li for helpful suggestions, and Hongyu Wu for data processing.

References

[1] A. Bousseau, S. Paris, and F. Durand. User-assisted intrinsic images. In Proc. ACM SIGGRAPH, 2009.
[2] J. Chen, G. Su, J. He, and S. Ben. Face image relighting using locally constrained global optimization. In Proc. ECCV, 2010.
[3] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski. Edge-preserving decompositions for multi-scale tone and detail manipulation. In Proc. ACM SIGGRAPH, 2008.
[4] A. Georghiades, P. Belhumeur, and D. Kriegman. From few to many: Illumination cone models for face recognition under variable lighting and pose. TPAMI, 23(6):643–660, 2001.
[5] J. M. Geusebroek, G. J. Burghouts, and A. W. M. Smeulders. The Amsterdam Library of Object Images. IJCV, 61(1):103–112, 2005.
[6] D. Guo and T. Sim. Digital face makeup by example. In Proc. CVPR, 2009.
[7] H. Han, S. Shan, L. Qing, X. Chen, and W. Gao. Lighting aware preprocessing for face recognition across varying illumination. In Proc. ECCV, 2010.
[8] K. He, J. Sun, and X. Tang. Guided image filtering. In Proc. ECCV, 2010.
[9] X. Jin, M. Zhao, X. Chen, Q. Zhao, and S.-C. Zhu. Learning artistic lighting template from portrait photographs. In Proc. ECCV, 2010.
[10] S. Paris, P. Kornprobst, J. Tumblin, and F. Durand. Bilateral filtering: Theory and applications. Foundations and Trends in Computer Graphics and Vision, 4(1):1–74, 2009.
[11] E. H. Land and J. J. McCann. Lightness and Retinex theory. Journal of the Optical Society of America, 61, Jan. 1971.
[12] Q. Li, W. Yin, and Z. Deng. Image-based face illumination transferring using logarithmic total variation models. Vis. Comput., 26(1):41–49, 2009.
[13] S. Milborrow and F. Nicolls. Locating facial features with an extended active shape model. In Proc. ECCV, 2008.
[14] P. Peers, N. Tamura, W. Matusik, and P. E. Debevec. Post-production facial performance relighting using reflectance transfer. In Proc. ACM SIGGRAPH, 2007.
[15] L. Shen, P. Tan, and S. Lin. Intrinsic image decomposition with non-local texture cues. In Proc. CVPR, 2008.
[16] V. Štruc and N. Pavešić. Performance Evaluation of Photometric Normalization Techniques for Illumination Invariant Face Recognition. IGI Global, 2010.
[17] Y. Wang, Z. Liu, G. Hua, Z. Wen, Z. Zhang, and D. Samaras. Face re-lighting from a single image under harsh lighting conditions. In Proc. CVPR, 2007.
