

Multi-class supervised classification of electrical borehole wall images using texture features

Matthias Jungmann a,*, Margarete Kopal b,1, Christoph Clauser b, Thomas Berlage a

a Fraunhofer Institute for Applied Information Technology, D-53754 Schloss Birlinghoven, Germany

b Institute for Applied Geophysics and Geothermal Energy, E.ON Energy Research Center, RWTH Aachen University, Mathieustr. 6, D-52074 Aachen, Germany

Article info

Article history:
Received 20 December 2009
Received in revised form 11 July 2010
Accepted 20 August 2010
Available online 10 November 2010

Keywords:

Lithology reconstruction

Micro-resistivity data

Discriminant analysis

Texture features

Multi-class classification

Binary classifier

Abstract

Electrical borehole wall images represent micro-resistivity measurements at the borehole wall. The lithology reconstruction is often based on visual interpretation done by geologists. This analysis is very time-consuming and subjective: different geologists may interpret the data differently. In this work, linear discriminant analysis (LDA) in combination with texture features is used for an automated lithology reconstruction of ODP (Ocean Drilling Program) borehole 1203A drilled during Leg 197. Six rock groups are identified by their textural properties in resistivity data obtained by a Formation MicroScanner (FMS). Although discriminant analysis can be used for multi-class classification, non-optimal decision criteria for certain groups could emerge. For this reason, we use a combination of 2-class (binary) classifiers to increase the overall classification accuracy. The generalization ability of the combined classifiers is evaluated and optimized on a testing dataset, where a classification rate of more than 80% for each of the six rock groups is achieved. The combined, trained classifiers are then applied to the whole dataset, obtaining a statistical reconstruction of the logged formation. Compared to a single multi-class classifier, the combined binary classifiers show better classification results for certain rock groups and more stable results in larger intervals of equal rock type.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Lithology reconstruction from logging data is important for many geological disciplines, in both scientific (e.g. history of the ocean crust, Tarduno et al., 2001) and industrial (e.g. reservoir identification, Akbar et al., 2001) domains. The advantage of logging data is its continuous nature, which supports borehole studies where no core data have been recovered. Here, the concept of lithofacies is used to reconstruct lithology, often based on conventional logs such as porosity, density and natural gamma ray (Serra, 1984). To do so, available logging data are considered in combination with statistical classifiers (Bartetzko et al., 2002; Borsaru et al., 2006) or neural nets (Benaouda et al., 1999; Chang et al., 2000; Yang et al., 2004). Unlike conventional logging tools, the data obtained by the Formation MicroScanner (FMS) are 2-dimensional resistivity images of the borehole wall. In a previous study (Linek et al., 2007) we showed that the textural information of resistivity image logs can be successfully used in combination with LDA to classify lithology. More recent studies show that the classification accuracy for the single rock groups is very sensitive to the texture features used. Furthermore, LDA can produce non-optimal decision functions for certain groups if more than two groups are to be discriminated. In this paper, further texture features are investigated and an extended classification scheme based on weighted combinations of 2-class (binary) classifiers is presented. For each binary classifier an optimal texture feature subset is found within a calibration subset of images. This method improves the discrimination of pillow lava and massive rocks in some intervals where our previous classification method failed (Linek et al., 2007). We successfully applied this method to data from ODP (Ocean Drilling Program) Hole 1203A drilled during Leg 197 in the northwest Pacific Ocean, in which volcaniclastic sediments, pillow and massive basalts were recovered.

2. Methodology

In this study, supervised classification methods combined with texture features are used for an automated identification of six rock groups identified in FMS resistivity data. Fig. 1 summarizes the major rock classes which have been differentiated in core data and their corresponding resistivity images. As illustrated in this figure, rocks exhibit characteristics in the local statistics of gray values, like most natural surfaces. These allow a separation and classification of different objects. Texture analysis methods quantify these properties by mathematical methods (see e.g. Tuceryan and Jain, 1993 for an introduction).

Computers & Geosciences 37 (2011) 541–553
0098-3004/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cageo.2010.08.008
* Corresponding author: M. Jungmann.
1 Present address: Baker Hughes, Drilling & Evaluation Research, Houston Technology Center, 2001 Rankin Road, Houston, TX 77073, United States.

Texture features are calculated in a small neighborhood of the pixel under investigation. The Formation MicroScanner (FMS) consists of four perpendicularly oriented sensor pads with 16 electrodes each, measuring the current needed to sustain a constant potential between formation and guard electrode. This arrangement of electrodes results in a stripe-like image. Each image stripe represents data obtained from one pad and is 16 pixels wide, with each pixel coding the current measured at one electrode. Hence, we use a 16 × 16 pixel sliding window for the texture feature computation of each image stripe. Because the image stripes are so narrow, the sliding window is moved only vertically. Finally, the FMS images are transformed into texture logs, as shown in Fig. 2 for eight different texture features. This figure shows data for a pillow lava interval as typically seen in FMS images recorded by the Ocean Drilling Program to study the oceanic crust.
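The vertical sliding-window step can be sketched as follows. This is only an illustration under stated assumptions: the stripe is taken to be a NumPy array with one row per depth sample and 16 columns, and the names `texture_log` and `feature_fn` are hypothetical and not taken from the paper.

```python
import numpy as np

def texture_log(stripe, feature_fn, window=16):
    """Evaluate feature_fn on every window x window box moved down one pad stripe.

    stripe     : 2-D array, rows = depth samples, columns = electrodes (assumed layout)
    feature_fn : any function mapping a window x window array to a number
    Returns one feature value per vertical window position (a "texture log").
    """
    stripe = np.asarray(stripe)
    rows, width = stripe.shape
    if width != window:
        raise ValueError("stripe width must equal the window size")
    # The stripe is only as wide as the window, so the window moves vertically only.
    return np.array([feature_fn(stripe[top:top + window])
                     for top in range(rows - window + 1)])
```

Any scalar texture feature can be passed as `feature_fn`; a stripe with R rows yields R - 15 log values for the 16-pixel window.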

The texture logs are computed from the first pad of the represented FMS image. For every vertical sampling point, a feature vector $\vec{x}$ is constructed containing the texture features $x_i$ computed at this depth. For classification to succeed, the features need to be distinctive, in the sense that data belonging to the same rock group form a cluster in feature space that is well separated from the clusters formed by the other rock groups. Decision boundaries between the clusters can be found using supervised classification methods. This requires a training dataset manually annotated by a human expert. A new, unlabeled feature vector is assigned to a group depending on the trained boundaries. This chapter describes the texture features and classification methods used and presents our method, which increases the overall classification accuracy for multi-class rock classification.

2.1. Texture features

To cover different local geometric, spectral and statistical gray value properties in the data, five different classes of texture features are calculated:

1. Haralick texture features: One of the defining qualities of textures is the spatial distribution of gray values. Haralick and Shapiro (1992) suggested the use of a gray level co-occurrence matrix (GLCM). Each entry $(i, j)$ in the GLCM corresponds to the number of occurrences of pixel pairs with gray values $i$ and $j$ in the local neighborhood of the pixel under investigation. Different distances and orientations $\vec{d}$ of the pixel pairs are possible. Normally, neighboring pixel pairs ($|\vec{d}| = 1$) are analyzed horizontally, $\vec{d} = (1, 0)$, vertically, $\vec{d} = (0, 1)$, and diagonally, $\vec{d} = (1, 1)$. To get a rotation-invariant GLCM, the mean of the three directional GLCMs can be calculated. In order to estimate the similarity between different GLCMs, Haralick derived statistical expressions from them. We use eight statistical expressions: energy, entropy, contrast, homogeneity, maximal probability, information-correlation, cluster-shade and cluster-prominence. See e.g. Albregtsen (1995) for a detailed description of these features.

2. Statistical geometric features (SGF): These features describe properties of connected regions in a sequence of binary images obtained from a gray-valued image containing $N$ gray levels by using thresholds $1 \le t \le (N - 1)$ in successive stages (Walker, 1996). The binary 0-valued and 1-valued pixels form connected regions of different size, shape and position (Fig. 3). For each threshold $t$, a set of six geometric properties of the connected regions is calculated for the 0-valued and for the 1-valued regions (number of connected regions, mean irregularity of a connected region, averaged displacement from the center of gravity, averaged clump inertia, total clump area, mean clump area). For each of the resulting 12 functions of the threshold $t$, four statistical features are calculated (maximal value, averaged value, sample mean, sample standard deviation). Consequently, 48 features are formed describing geometric properties of the image structures.

Fig. 1. FMS texture facies assigned in ODP Hole 1203A drilled at Detroit Seamount, northwest Pacific, in 2001. Core scan images give a detailed view of the corresponding depth interval (Linek et al., 2007).

3. Zernike moments: Moments are often used for image analysis because in certain combinations they are invariant to rotation, scale and translation of image structures (Hu, 1962). Zernike (1934) introduced a set of complex polynomials $V_{nl}(x, y)$ of order $n$ and repetition $l$ forming a complete orthonormal set over the interior of the unit circle. Zernike moments

$$Z_{nl} = \frac{n + 1}{\pi} \sum_x \sum_y V^{*}_{nl}(x, y)\, f(x, y)$$

are the projection of the image $f(x, y)$ on the set of Zernike functions (Khotanzad and Hong, 1990), where $x^2 + y^2 \le 1$, $0 \le l \le n$ and $n - l$ even. $V^{*}_{nl}(x, y)$ are the complex conjugate Zernike polynomials. No image information is lost if all moments are considered. Moments of high polynomial order $n$ capture fine structures but also the high-frequency noise present in the data. For this reason, Zernike moments up to order $n = 6$ are considered as a trade-off between smaller structures in the images and noise.

Fig. 2. FMS resistivity data are transformed to texture log curves. A small subset of the texture features described in this paper is visualized for FMS data of pillow lava logged in ODP borehole 1203A. The texture logs are calculated for the first pad.

Fig. 3. Binary regions in a gray image for different thresholds $t$. Several statistical features are calculated for these connected regions.

4. Wavelet-based texture features: The wavelet transformation analyses an image at different scales; the image is successively scaled to coarser resolutions by a low-pass filter. The details lost during scaling are captured by a high-pass filter based on a wavelet function. In the 2-dimensional case, three detail images are created at each scale, containing horizontal, vertical and diagonal details that depend on the wavelet function used. The magnitude of a wavelet coefficient is the contribution of the wavelet function to the image structure at this position and scale. Fig. 4 shows the wavelet decomposition of an octagon for two coarser scales using the Daubechies-6 wavelet function (Daubechies, 1992). Following van de Wouwer (1998), Haralick features were calculated from the detail images. Another feature is the sum of the squared coefficient magnitudes, which describes the image information (image energy) present at a certain scale and direction. Furthermore, the magnitude distribution of the coefficients $u$ is analyzed. This distribution can be described by the empirical model $h(u) = k\,e^{-|u|^{b}/a}$, where the fit parameters $a$ and $b$ are used as discriminating features. Different wavelet functions are known in the literature. The Daubechies wavelet function is often used for image analysis, and good results are obtained using the Daubechies-6 wavelet function. Due to the small image width (16 pixels), only one coarser scale is analyzed. In each detail image, 11 texture features are calculated.

5. Histogram distribution features: Some of the texture logs (Fig. 2) show strong fluctuations even for a single rock type. This suggests that the distribution of feature values in a defined window may be more distinctive for a rock group than the values themselves. To verify this, histograms of texture values are created by binning: the minimal and maximal texture feature values in this window are taken and all values are scaled to the number of bins, $\text{bin-nr} = \frac{\text{bins}}{\max - \min}\,(\text{value} - \min)$. The histogram entries are normalized to approximate the probability distribution $p(b)$, which is used to calculate statistical moments (Fig. 5). Because of the scaling, the statistical features are independent of the absolute original data values.
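Two of the feature classes in the list above, the Haralick features (item 1) and the histogram distribution features (item 5), can be illustrated with a short sketch. It rests on several assumptions that are not stated in the paper: the window holds integer gray values in the range 0 to `levels - 1`, the co-occurrence counts are normalized to probabilities before averaging, only five of the eight Haralick expressions are shown in their commonly used textbook form, and the "statistical moments" are taken to be the mean followed by the central moments of order 2 to 4. All function names are hypothetical.

```python
import numpy as np

def glcm(window, levels, offsets=((1, 0), (0, 1), (1, 1))):
    """Mean of the normalized co-occurrence matrices for the given (dx, dy) offsets."""
    window = np.asarray(window)
    rows, cols = window.shape
    mean_p = np.zeros((levels, levels))
    for dx, dy in offsets:
        ref = window[:rows - dy, :cols - dx].ravel()   # reference pixels
        nbr = window[dy:, dx:].ravel()                 # pixels displaced by (dx, dy)
        counts = np.zeros((levels, levels))
        np.add.at(counts, (ref, nbr), 1)               # count gray value pairs (i, j)
        mean_p += counts / counts.sum()
    return mean_p / len(offsets)

def haralick(p):
    """A few common Haralick expressions of a normalized GLCM p."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "energy": float((p ** 2).sum()),
        "entropy": float(-(nonzero * np.log(nonzero)).sum()),
        "contrast": float(((i - j) ** 2 * p).sum()),
        "homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
        "max_probability": float(p.max()),
    }

def histogram_moments(values, bins=5, max_order=4):
    """Bin the values between their min and max, then take moments of p(b)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        idx = np.zeros(values.size, dtype=int)
    else:
        # bin-nr = bins / (max - min) * (value - min); the maximum goes to the last bin
        idx = np.minimum((bins / (hi - lo) * (values - lo)).astype(int), bins - 1)
    p = np.bincount(idx, minlength=bins) / values.size
    b = np.arange(bins)
    mean = float((b * p).sum())
    moments = [mean] + [float(((b - mean) ** k * p).sum())
                        for k in range(2, max_order + 1)]
    return p, moments
```

Because `histogram_moments` works on bin indices rather than on the raw values, its output does not change when the input values are shifted or rescaled, which is the independence from absolute data values mentioned in item 5.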

Experimentally, we found an improvement in the classification rate when using the statistical moments of the Haralick and wavelet texture values rather than the values themselves. Good results are achieved using five bins and statistical moments up to order 4. All texture features used are summarized in Table 1. For every FMS depth point, 228 texture features are calculated in total. Typically, statically as well as dynamically processed image logs are used to identify rock structures and lithologies. Hence, the 228 features are calculated for both, which doubles the number of input features.

Fig. 4. Wavelet decomposition of an octagon up to scale 2 using the Daubechies-6 wavelet function. In the subimages, directional details become visible; in the upper left part, the subsampled image with a quarter of the size of the original image remains.

Fig. 5. Creating histograms of the distribution of texture logs in a moving window. Central statistical moments are derived from the histograms and used as features instead of the texture values themselves.

2.2. Classification methods

We use linear discriminant analysis (LDA) to find the decision boundaries for classification. LDA increases the discriminative power of the feature values by projecting them in such a way that values belonging to one group are packed as tightly as possible (described by the inner-scatter matrix) and values belonging to different groups are separated as far as possible (described by the between-scatter matrix) (Duda et al., 2001). For $n$ classes, the scatter in the high-dimensional feature space is sphered (whitened) by a coordinate transformation (Hastie et al., 2001) and then reduced to an $(n - 1)$-dimensional feature space using eigenvalue decomposition. An unlabeled feature vector is classified by calculating the distances to the projected means of the group clusters, weighted by their variances. The group label of the nearest cluster is assigned to the vector (Bayes decision rule). Even though LDA decreases the dimensionality of the feature space, better classification results can be achieved if redundant and non-distinctive features are removed prior to classification (Jain et al., 2000). Two feature extraction methods are used in this work:

1. Principal component analysis (PCA): PCA is a transformation of the feature vectors to a new basis given by the directions of the greatest scatter of the feature values (Duda et al., 2001). The feature vectors are recombined in such a way that most of the variance of the original data is preserved. The new basis vectors are the eigenvectors of the covariance matrix of the features. The reduction of the feature space is achieved by considering only eigenvectors with large eigenvalues. To select these eigenvectors, the sum of all eigenvalues is normalized to 100%, and eigenvectors are added in order of decreasing eigenvalue until the sum of their eigenvalues exceeds 90%.

2. Stepwise linear discriminant analysis (SLDA): As described earlier, LDA increases the discriminative power of the feature values by analyzing the inner-scatter and between-scatter matrices. SLDA removes single features (dimensions) and adds others in a stepwise manner. By analyzing the resulting change of the inner scatter and the between scatter of the clusters, an optimized feature subset can be extracted. This method can also be used directly for classification (Enslein et al., 1977).
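The PCA-LDA combination described above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the LDA part assumes equal priors and a pooled covariance matrix, so that classification reduces to picking the class mean with the smallest Mahalanobis distance, and it omits the explicit projection to $(n - 1)$ dimensions. The function names are hypothetical.

```python
import numpy as np

def pca_fit(x, keep=0.90):
    """Return the feature mean and the leading eigenvectors of the covariance
    matrix whose eigenvalues together exceed the share `keep` of the total."""
    mean = x.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(x - mean, rowvar=False))
    order = np.argsort(eigvals)[::-1]                  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    share = np.cumsum(eigvals) / eigvals.sum()
    # first position at which the cumulative share is strictly above `keep`
    k = min(int(np.searchsorted(share, keep, side="right")) + 1, len(eigvals))
    return mean, eigvecs[:, :k]

def lda_fit(x, y):
    """Class means plus the inverse of the pooled within-class covariance."""
    classes = np.unique(y)
    means = np.vstack([x[y == c].mean(axis=0) for c in classes])
    scatter = sum((x[y == c] - m).T @ (x[y == c] - m)
                  for c, m in zip(classes, means))
    precision = np.linalg.pinv(scatter / (len(x) - len(classes)))
    return classes, means, precision

def lda_predict(x, model):
    """Assign each row of x to the class with the nearest mean (Mahalanobis)."""
    classes, means, precision = model
    diff = x[:, None, :] - means[None, :, :]
    dist2 = np.einsum("nkd,de,nke->nk", diff, precision, diff)
    return classes[np.argmin(dist2, axis=1)]
```

A typical use would be `mean, basis = pca_fit(train)`, then `model = lda_fit((train - mean) @ basis, labels)` and `lda_predict((new - mean) @ basis, model)`, so that new samples are projected with the basis learned on the training data.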

Fig. 6 shows four examples of classification results using different texture feature sets and the same classification method (PCA-LDA). Each example shows the percentages of samples from a user-annotated rock group that are assigned to the correct group and to wrong groups. In Fig. 6a, the statistical geometric features (SGF) calculated on dynamic FMS data are used for classification. Pillow and resedimented rocks are very poorly classified (<20%). Layered, breccia and vesicular rocks are classified slightly better (<50%). The best result is obtained for the massive rock class, with a classification rate of 70%. In contrast, Fig. 6b shows the results for the SGF features derived from the static image data. The classification rate for all groups except massive rocks increases to between 60% and 70%. With SGF features from static images, however, the massive rocks are misclassified as pillow lava. In Fig. 6c, Zernike moments are calculated on static FMS data. Good results are achieved for layered, resedimented and massive rocks (70–80%), poorer results for the other groups (40–50%). In Fig. 6d, a combination of Haralick, Zernike and wavelet features is used. For all groups a good classification result is achieved (60–85%), but for some groups (e.g. pillow) the results are poorer than with a single texture group alone. These examples show that certain texture groups, or more specifically certain combinations of texture groups, have good discriminative power for separating one rock type from the other groups. At the same time, another rock type may be separated less well by the current texture feature combination than by a different one.

Table 1
Overview of all texture features used in this work.

| Texture feature group | Single features in each group | Further processing of a single feature | Number of features |
| --- | --- | --- | --- |
| Haralick | energy; entropy; contrast; homogeneity; maximal probability; information-correlation; cluster-shade; cluster-prominence | For every single feature, 4 statistical moments are calculated | 32 |
| Zernike | polynomial order n = 1: m = 1; n = 2: m = 0, 2; n = 3: m = 1, 3; n = 4: m = 0, 2, 4; n = 5: m = 1, 3, 5; n = 6: m = 0, 2, 4, 6 | | 16 |
| SGF | number of connected regions; mean irregularity of a conn. region; average displacement from the center; averaged clump inertia; total clump area; mean clump area | Every single feature is calculated for binary 0-valued and 1-valued regions. For every single feature of the binary regions, 4 statistical features are calculated: maximal value, averaged value, sample mean, standard deviation | 48 |
| Wavelet | Haralick energy; Haralick entropy; Haralick contrast; Haralick homogeneity; Haralick maximal probability; Haralick information-correlation; Haralick cluster-shade; Haralick cluster-prominence; wavelet coefficient energy; fit parameter a of parametrical model; fit parameter b of parametrical model | Every single feature is calculated in 3 wavelet subimages. For every single feature in the 3 subimages, 4 statistical moments are calculated | 132 |

A similar behavior is observed for the classification method used. Fig. 7 shows the classification results using all texture features in combination with the SLDA classification method (Fig. 7a) and with PCA reduction followed by the LDA classification method (Fig. 7b). SLDA totally misclassifies the massive rocks. The PCA-LDA method obtains much better results for the massive rocks (45%) but shows a slight decline in accuracy for the other groups.

2.3. Problems concerning multi-class classifiers

LDA uses the between-cluster distance and the within-cluster scatter of the feature values of every group to project the values onto a lower-dimensional dataset, which is used for the classification. LDA assumes that the data of every group cluster are normally distributed and have equal covariances. The performance of the classification can decrease drastically if these conditions are not satisfied (Duda et al., 2001). In a multi-class context, the projection of the values and the reduction of the feature dimension can lead to further problems: LDA maximizes the squared distance between pairs of classes relative to their scatter. If some clusters are far apart, they dominate the eigenvalue decomposition. This means that the optimization is driven by classes that are already well separated. As a consequence, LDA yields a suboptimal classification for pairs of classes with weakly predictive features (Li et al., 2003). This is significant for multi-class rock classification because the textures of some rock groups are similar.

2.4. Improved learning procedure using binary classifiers

In various experiments we analyzed how the classification accuracy changes when different texture feature combinations are used for classification. Furthermore, different combinations of feature reduction and classification methods were tested. We found that certain combinations achieved good classification results on certain groups, while the classification of other groups was worse with the same combination. To reduce these problems, the multi-class classification problem can be mapped to a combination of 2-class (binary) classifiers. Different mapping methods are investigated in the literature (e.g. Dietterich and Bakiri, 1995; Tax and Duin, 2002). We focus on the one-against-one and one-against-rest mapping schemes (Kittler et al., 1998). To further improve the overall classification accuracy, we used a second user-annotated dataset, the testing dataset, to find the best discriminating texture feature subset and the best working classification method. The improved learning procedure is shown in Fig. 8a.

Fig. 6. (a–f) Classification results for different texture features and texture feature combinations.

Fig. 7. Classification results using all texture features. (a) Using the SLDA classifier. (b) Using PCA for a reduction of the feature space dimension and LDA for classification in this subspace.

Fig. 8. (a and b) Improved learning procedure: The multi-class problem is mapped to a number of binary classifiers. For every binary classifier $B_i$, the best performing classifier $C_k$ and texture subset $T_l$ are selected by evaluating a testing dataset. The binary classifiers are combined using a distance measure on the coding matrix and applied to the whole dataset of the borehole.

1. Datasets: At first, the user has to create two annotated datasets:
   - Training dataset: This dataset consists of manually selected FMS intervals of known rock types and is used to derive the decision boundaries for classification.
   - Testing dataset: This dataset also consists of manually selected FMS intervals of known rock types and drives the selection of a classifier/texture feature subset combination.

2. Group mapping: The $n$-class classification problem is mapped to a number of binary problems. In this work two possibilities are analyzed:
   - one-against-one: one group is classified against another group. This is done for all pairs of groups, which leads to $n(n - 1)/2$ binary classification problems.
   - one-against-rest: one group is classified against the combination of all other groups. This leads to $n$ binary classification problems.

3. Selection of texture features and base classifiers: To optimize the classification rate of a certain binary classifier, several combinations of texture features and classification methods were automatically evaluated on the testing dataset. The previously described texture features and classification methods are used. This leads to three classifiers (SLDA-LDA, PCA-LDA, SLDA), called base-classifiers $C_k$.

4. Training of the binary classifiers: The current texture feature set $T_l$ is calculated on the training data and the base-classifier $C_k$ is trained. Together they form the binary classifier $B_i(T_l, C_k)$ for a group mapping $i$.

5. Evaluation for training feedback: The testing dataset is used to find the best generalizing combination of texture features and base-classifier and therefore the best binary classifier $B_i^{\text{best}}$. Steps 3–5 are repeated for all group mappings.

6. Classification of the whole dataset: All trained binary classifiers $B_i(T_l, C_k)$ are applied to classify a new sample feature vector $\vec{x}$. Their responses have to be combined to get an overall classification result for the feature vector. In this work, three methods are tested (see Dietterich and Bakiri, 1995; James, 1998 for further details):
   - Voting is the simplest combining method: the sample $\vec{x}$ is assigned to the class with the highest number of votes from the binary classifiers. This method can only be used with the one-against-one mapping.
   - L1-distance: Most classification methods return a quality value for their decision. LDA returns the probability that a sample vector belongs to a certain group. All results are stored in a vector $\vec{B} = (B_i^{\text{best}})$ with $B_i^{\text{best}} \in [0, 1]$. For each row of the coding matrix, the L1-distance to $\vec{B}$ is calculated. The sample vector $\vec{x}$ gets the group label corresponding to the lowest L1-distance (see Fig. 8b).
   - Centroid: For the L1-distance method, the reference point of a classifier's probability for a certain group is assumed to be 0 or 1. After training, each training sample vector can be classified, and the probability measure returned by the classifier can be used to derive a better model of the group's distribution in the probability space. We test two variants: the mean or the median of the probability values of the base-classifier for a certain group is calculated on the training data. The 0 and 1 entries in the coding matrix in Fig. 8b are then replaced by these mean/median values. For a new unlabeled sample vector $\vec{x}$, the L2-distance of $\vec{B}$ to the mean/median points is calculated, and the sample vector $\vec{x}$ gets the group label corresponding to the lowest L2-distance.

7. Result: Visualization of the classification as an overview lithology.
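The group mapping and the three combining methods from the procedure above can be sketched as follows. The coding matrix of Fig. 8b is not reproduced in the text, so the sketch assumes the simplest one-against-rest form, an identity matrix with a 1 for the group's own classifier and 0 elsewhere. All function names are hypothetical.

```python
from itertools import combinations
import numpy as np

def one_against_one_pairs(n):
    """All n(n - 1)/2 group pairs, one binary problem per pair."""
    return list(combinations(range(n), 2))

def one_against_rest_coding(n):
    """Assumed coding matrix: row g is the ideal response vector for group g."""
    return np.eye(n)

def centroid_coding(train_responses, train_labels, n, use_median=False):
    """Replace the 0/1 entries by the mean (or median) classifier responses
    observed on the training samples of each group."""
    responses = np.asarray(train_responses, dtype=float)
    labels = np.asarray(train_labels)
    reduce = np.median if use_median else np.mean
    return np.vstack([reduce(responses[labels == g], axis=0) for g in range(n)])

def decode(responses, coding, order=1):
    """Group whose coding row is closest to the response vector B.
    order=1 gives the L1-distance rule, order=2 the L2 rule used with centroids."""
    diff = coding - np.asarray(responses, dtype=float)
    return int(np.argmin(np.linalg.norm(diff, ord=order, axis=1)))

def vote(pair_winners, n):
    """Majority vote over the winning group index of each one-against-one problem."""
    votes = np.bincount(np.asarray(pair_winners), minlength=n)
    return int(np.argmax(votes))
```

For the six rock groups of this study, `one_against_one_pairs(6)` yields 15 binary problems, compared with 6 for the one-against-rest mapping.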

3. Application to ODP Hole 1203A

The ODP (Ocean Drilling Program) and the follow-up Integrated Ocean Drilling Program (IODP) are long-term international scientific projects exploring the history of the oceanic basins and crust. ODP Hole 1203A is located at Detroit Seamount in the northwest Pacific Ocean. It was drilled to a depth of 900 m below the seafloor (mbsf) during ODP Leg 197. Volcanic basement was reached at 457 mbsf. On the basis of core specimens (core recovery was low, at 56.6%), six lithological units were distinguished by the Shipboard Scientific Party of Leg 197 (Tarduno et al., 2001).

3.1. Choice of texture facies

In ODP and IODP, downhole logging measurements are made in order to characterize the drilled formation petrophysically. These records play a crucial role in providing data where core sections cannot be obtained. A standard downhole application in ODP/IODP is mapping the micro-resistivity at the borehole wall using the Formation MicroScanner (FMS). This tool delivers electrical images at a sampling interval of 2.5 mm, corresponding to a vertical resolution of about 5 mm (Goldberg, 1997). In our study, light and dark gray levels represent resistive and conductive borehole features, respectively. Following Linek et al. (2007), we defined six lithological texture classes in the FMS images recorded in ODP Hole 1203A, as documented in the core description by the Shipboard Scientific Party of Leg 197 (Table 2).

3.2. Choice of training and testing intervals

To adapt the binary classifiers to natural fluctuations in the textural appearance of the rock groups, testing data are generally selected from depth intervals other than those of the training data. However, due to limited data quality, this cannot be achieved for all rock groups. The caliper data in Fig. 10a show an enlargement of the borehole in intervals corresponding to sedimentary rocks and volcaniclastic interbeds. The quality of the FMS data fluctuates strongly in these intervals (Einaudi et al., 2001). Image artifacts related to so-called borehole washouts are common within these depth intervals, which consequently cannot be used for training or testing. In addition, the four rock groups showing most of the washout artifacts, namely breccia, layered rocks, resedimented rocks and vesicular rocks, are unfortunately also the least represented units in the core recovery and in the lithology reconstructed from log responses. In order to obtain classifiers despite the poor image quality and poor core recovery for these groups, the same intervals have been chosen for training and testing (Table 2). For training, 1000 data points are randomly selected within the image training set. Once the classifiers are determined from the training points, they are tested against the total number of data points for each specific training class. Hence, taking the same image interval for training and testing does not imply that we use identical data points for both steps. It only means that, within the same image interval of at least 3200 data points for 2 m of image data (2.5 mm sampling and four pads), only a small image fraction is selected for training, whereas all data points of these image groups are used to test the binary classifiers. Since only a small fraction of the data points is used for training and the classifiers are applied to a much larger image fraction for testing, the estimate of the classification quality is still valid.

Table 2
Training and testing intervals for classification.

| Rock type | Training interval (mbsf) | Testing interval (mbsf) |
| --- | --- | --- |
| Breccia | 611–614 | 611–614 |
| Massive | 535–539 | 635–640 |
| Layered rock | 567–570 | 567–570 |
| Resedimented rock | 503–507 | 503–507 |
| Pillow lava | 572–574, 621–625 | 465–491, 571–583 |
| Vesicular rock | 888–890 | 888–890 |

3.3. Classification results on the testing data

Fig. 9 gives an overview of the classification results achieved on the testing dataset by several classification methods. The results derived from SLDA and PCA-LDA classification using one base-classifier correspond to Fig. 7. As mentioned before, massive rocks are poorly recognized by SLDA, but applying the PCA-LDA method yields an improvement. The middle of Fig. 9 shows the classification results from the one-against-rest method. No significant changes are achieved for the first five rock types, but this method clearly improves the recognition of massive rocks. Depending on the combination method used, we observe a correct classification rate for massive rocks of about 70%. Besides the improvement for the massive rock type, we observe a higher variance of the classification rate for the resedimented rock class. Other groups show more stable results. Finally, by applying the one-against-one method we observe a further improvement for massive and resedimented rocks. With this method, the combination methods (voting, L1-distance, centroid-mean, centroid-median) show only small variations in the rate of correct classifications. This approach gives the most stable classification results.

4. Classification results for the whole dataset

Fig. 10 illustrates the main lithology reconstruction determined during ODP Leg 197 by the Shipboard Scientific Party from conventional logging data. Next to this lithology column, we show the classification results obtained from textural classification using FMS image data only. We compare results derived from the SLDA, PCA-LDA, one-against-one (centroid) and one-against-rest (centroid) methods. Other group-combining methods yield similar or worse results and are not discussed further. The configuration of the binary classifiers using the one-against-rest group mapping is shown in Table 3a. Intervals where the lithologies differ are marked by capital letters at the right border.

At first view, the fluctuations of assigned rock groups for deeper intervals (below 760 mbsf) are conspicuous. The SLDA, PCA-LDA and one-against-one methods show a similar behavior here. Even though the one-against-one method produces better results on the testing data (Fig. 9), the one-against-rest method gives more reliable results in deeper regions. It apparently has a better generalization ability on data not used during the training stage. In detail, the differences are:

Interval A [ca. 505–515 mbsf]: The lithology log reports volcaniclastic rocks. The SLDA and PCA-LDA methods recognize massive rocks here. The one-against-rest method classifies small parts of the interval as massive rocks but a greater fraction as resedimented rock.

Interval B [ca. 605–610 mbsf]: The lithology log describes this interval as massive rock, but the SLDA method recognizes only a small fraction as this rock type. The method recognizes pillow lava here. The PCA-LDA method classifies this interval correctly as massive rocks, as does the one-against-rest method.

Interval C [ca. 638–643 mbsf]: A similar behavior of the methods as in interval B is observed. The lithology log shows massive rock. The SLDA method classifies this interval as pillow lava. The PCA-LDA method recognizes parts as massive rock and some as pillow lava. The one-against-rest method, however, classifies this interval as massive rock.

Interval D [ca. 657–670 mbsf]: The lithology log describes a big block of massive rocks. The texture-based methods classify this interval completely as pillow lava. The failure of the texture-based methods suggests a change in the texture of massive rocks.

Fig. 9. Classification results for the testing dataset achieved by different classification methods.

M. Jungmann et al. / Computers & Geosciences 37 (2011) 541–553

Fig. 10. Reconstructed lithologies of borehole ODP 1203A by texture-based supervised classification methods: (a) caliper data, (b) lithology log, (c) lithology using the SLDA method, (d) lithology using the PCA-LDA method, (e) lithology using the one-against-one method, (f) lithology using the one-against-rest method.

Table 3

Configuration of the binary classifier for each rock group using the one-against-rest method. For each classifier the optimal texture feature subset, the used FMS data type, the feature reduction and the classification method are shown. (a) First configuration. (b) Changed testing dataset using pillow lava intervals from deeper depths.

(a)

| Binary class. | Texture group | FMS-type | Feature reduction | Learning method |
| 1 (pillow vs rest) | Haralick, Zernike, SGF (192 features) | dyn., static | Identity | LDA |
| 2 (massive vs rest) | SGF, Wavelet (180 features) | dyn. | PCA (28 EV) | LDA |
| 3 (resedim. vs rest) | Zernike (16 features) | static | PCA (6 EV) | LDA |
| 4 (layered vs rest) | Zernike (32 features) | dyn., static | PCA (10 EV) | LDA |
| 5 (breccia vs rest) | Haralick, Zernike, SGF (192 features) | dyn., static | SLDA (109 features) | LDA |
| 6 (vesicular vs rest) | Haralick, Zernike, Wavelet (360 features) | dyn., static | Identity | LDA |

(b)

| Binary class. | Texture group | FMS-type | Feature reduction | Learning method |
| 1 (pillow vs rest) | Haralick, Zernike, SGF (192 features) | dyn., static | Identity | LDA |
| 2 (massive vs rest) | Haralick, Zernike, Wavelet (360 features) | dyn., static | PCA (68 EV) | LDA |
| 3 (resedim. vs rest) | Haralick (32 features) | dyn. | Identity | LDA |
| 4 (layered vs rest) | Zernike (32 features) | dyn., static | PCA (10 EV) | LDA |
| 5 (breccia vs rest) | Haralick, Zernike, SGF (192 features) | dyn., static | Identity | LDA |
| 6 (vesicular vs rest) | Haralick, Zernike, Wavelet (360 features) | dyn., static | Identity | LDA |


Fig. 11a and b compare the training interval of the massive rocks with a sample of interval D. The training data for massive rock show more or less homogeneous gray values with widely spaced fractures. The texture of interval D corresponds rather to pillow lava.

Interval E [ca. 691–695 mbsf]: The lithology log reports pillow lava. The SLDA and PCA-LDA methods recognize fractions of massive rock. The one-against-rest classifier correctly reproduces the lithology log.

Interval F [ca. 726–741 mbsf]: In this interval a similar behavior as in interval E is visible. The lithology log describes pillow lava, the SLDA and PCA-LDA methods detect massive rock, and the one-against-rest method correctly reproduces the lithology log.

Interval G [ca. 760–821 mbsf]: In this interval the lithology log describes pillow lava. The SLDA method fluctuates between breccia, pillow lava and massive rocks. The PCA-LDA method recognizes most of this interval as breccia and only small fractions as pillow lava and massive rocks. The one-against-rest method classifies most of the interval as pillow lava and only small fractions as massive or resedimented rocks.

Interval H [ca. 828–868 mbsf]: In this interval the lithology log reports pillow lava with a small fraction of massive rock. The SLDA and PCA-LDA methods classify most of it as resedimented rock. The one-against-rest method recognizes some parts of this interval as resedimented rock and other parts as pillow lava.

The results show that classifiers trained with pillow lava data taken from shallower depth intervals do not properly classify pillow lavas at greater depths. Fig. 11c shows the textural appearance of pillow lava at the training interval and Fig. 11d shows pillow lava structures at greater depths (from interval H). The comparison emphasizes the differences in pillow lobes.

The pillow lava used for training contains a finer structure. In addition to the FMS data, the reconstructed lithology at a finer resolution is shown. The training data are recognized correctly. Because of the bigger lobe structure of the pillow lava at greater depths, homogeneous regions are recognized as volcaniclastic rocks. Another reason for the poorer classification rate of pillow lava at greater depths could arise from the change of textural energy of the static FMS data. Textural energy is defined as the sum of squared gray values inside a sliding window. The energy texture-log is shown in Fig. 12a. The textural energy for pillow lava at the training interval is about 60 and decreases to about 45 at greater depths. Hence, the misclassification can be explained by features quantifying the absolute gray values.

The question arises whether a change of the testing dataset, which triggers the selection of texture features for a certain base-classifier, could improve the classification rate of pillow lava at greater depths. The testing interval [621–625] mbsf was replaced by [859–861] mbsf. The training dataset remains the same. The lithology obtained using the changed testing dataset is shown in Fig. 12d. The one-against-rest classifier is named one-against-rest(2) below; the classifier using the first testing dataset is named one-against-rest(1). Table 3b shows the changed configuration of the binary classifiers. It is interesting to note that it is not the configuration of the first binary classifier (pillow-against-rest) that has changed, but the texture subsets of the second and third classifiers (massive-against-rest, resedimented-against-rest).
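The definition of textural energy given above (the sum of squared gray values inside a sliding window) can be sketched as follows. This is a minimal illustration only: the window size, the handling of image borders and any normalization behind the reported values of about 60 and 45 are not stated in the text, so the function name `textural_energy`, the square window and the restriction to fully contained window positions are assumptions.

```python
def textural_energy(image, window):
    """Sum of squared gray values inside each window x window block.

    `image` is a list of equally long rows of gray values. The result has one
    entry per window position that fits entirely inside the image.
    """
    rows, cols = len(image), len(image[0])
    if window < 1 or window > rows or window > cols:
        raise ValueError("window must fit inside the image")
    energy = []
    for r in range(rows - window + 1):
        out_row = []
        for c in range(cols - window + 1):
            out_row.append(sum(
                image[r + dr][c + dc] ** 2
                for dr in range(window)
                for dc in range(window)
            ))
        energy.append(out_row)
    return energy


# Invented gray values: a brighter patch on the left, a darker one on the right.
patch = [
    [3, 3, 1, 1],
    [3, 3, 1, 1],
]
print(textural_energy(patch, 2))  # one energy value per window position
```

Both halves of the toy patch are equally homogeneous, yet the energy drops with the absolute gray level. This is the kind of sensitivity to absolute gray values that the paragraph above suggests as a cause of misclassification.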

As expected, the recognition rate of pillow lava increased in interval H. However, the classification results in other intervals also changed:

Interval A1: A slight decline in the recognition rate of pillow lava is visible. Massive rock is found by the one-against-rest(2) classifier.

Interval A2: More massive rock fractions are detected. This result corresponds better to the lithology log.

Fig. 11. Textural reasons for bad classification results: Massive rock inside (a) and outside (b) the training interval. A completely different texture is visible. Pillow lava inside (c) and outside (d) the training interval and the corresponding reconstructed lithologies. A change in the size of the pillow lobe structure is visible.


Interval A3: The one-against-rest(2) classifier shows a decline in the recognition rate of the volcaniclastic rocks.

Interval D: The lithology log reports massive rock. The texture-based classifiers recognize mainly pillow lava. The one-against-rest(2) classifier shows a slight improvement: at the end of the interval massive rock is detected correctly.

Intervals E and F: The lithology log reports pillow lava. Small fractions are recognized as massive rock. The one-against-rest(1) classifier achieved better results here.

Interval G: The one-against-rest(1) classifier already achieved good results in this interval. It reproduces pillow lava to a high degree; small fractions were misclassified as resedimented rock. The one-against-rest(2) classifier further decreases the fractions recognized as resedimented rocks.

For the discussion of the classification results we mainly focused on igneous rocks. Intervals described by the lithology log as volcaniclastic rocks are reconstructed by the texture-based classification methods to a high degree. The methods discriminate between layered and resedimented rocks. The vesicular rock and breccia are only reconstructed at their training intervals.
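The interval-by-interval comparison above is qualitative. One possible way to quantify it, sketched here under stated assumptions, is the fraction of depth samples inside an interval for which a classifier agrees with the lithology log. The function name `interval_agreement` and all depths and labels below are invented for illustration; they are not data from Hole 1203A, and the paper does not report this measure.

```python
def interval_agreement(depths, log_labels, predicted_labels, top, bottom):
    """Fraction of samples in [top, bottom] where prediction equals the log."""
    pairs = [
        (log, pred)
        for depth, log, pred in zip(depths, log_labels, predicted_labels)
        if top <= depth <= bottom
    ]
    if not pairs:
        raise ValueError("no samples inside the requested depth interval")
    return sum(log == pred for log, pred in pairs) / len(pairs)


# Hypothetical samples at 1 m spacing (depths in mbsf); labels are invented.
depths = [604.0, 605.0, 606.0, 607.0, 608.0, 609.0, 610.0, 611.0]
log_labels = ["pillow"] + ["massive"] * 6 + ["pillow"]
predicted = ["pillow", "pillow", "massive", "massive",
             "massive", "massive", "pillow", "pillow"]

rate = interval_agreement(depths, log_labels, predicted, 605.0, 610.0)
print(f"agreement inside the interval: {rate:.2f}")
```

Such a score depends on how precisely the interval boundaries in the lithology log are known, so it would complement rather than replace the visual comparison.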

5. Summary and future work

We successfully applied the method of texture-based supervised classification to the problem of lithology reconstruction from FMS borehole wall images acquired in ODP Hole 1203A. Different texture features like Haralick features, Zernike moments, structural geometric features (SGF) and wavelet-based features were evaluated in combination with different classification methods. We found that certain combinations of texture feature subsets with a certain classification method result in a better classification accuracy for certain rock groups, while other groups were poorly classified. In particular, massive rocks are classified to a high degree as pillow lava due to their similarity in texture profiles.

Fig. 12. Lithology of the one-against-rest classification method optimized with a different testing dataset for pillow lava. (a) Change of the textural energy for the static FMS data, (b) lithology log, (c) lithology using the one-against-rest(1) method (identical to Fig. 10f), (d) lithology using the one-against-rest(2) method with a different testing dataset.


For this reason we mapped the multi-class problem to a combination of binary classifiers. Every binary classifier is optimized by selecting feature subsets and classification methods using a testing dataset. Different schemes for mapping the multi-class classification to a combination of several binary classifiers were evaluated (one-against-one, one-against-rest) with different methods of combining the binary classifiers (voting, L1-distance, centroid). We found that the one-against-rest method reconstructed the logging lithology best. Particularly for deeper intervals we obtained more reliable results compared to a single multi-class classification. In particular, the detection rate of small massive rock intervals was increased. We showed that the application of texture analysis methods permits an automated lithology reconstruction of the borehole wall by analyzing the textural information of the FMS resistivity data.
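The one-against-rest mapping summarized above can be sketched in a few lines: one binary scorer is trained per rock group against all remaining samples, and a new sample is assigned to the group whose scorer responds most strongly. The scorer below is a hypothetical nearest-centroid stand-in used only to keep the example self-contained; it is neither the LDA base-classifier nor the centroid combination of the paper, and the 2-D feature vectors are invented.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def train_one_against_rest(features, labels):
    """Train one 'class vs rest' scorer per class.

    Each scorer is a hypothetical stand-in for a binary classifier such as LDA:
    it returns a positive score when the sample lies closer to the class
    centroid than to the centroid of all remaining samples.
    """
    scorers = {}
    for target in sorted(set(labels)):
        pos = centroid([x for x, y in zip(features, labels) if y == target])
        neg = centroid([x for x, y in zip(features, labels) if y != target])
        scorers[target] = lambda x, pos=pos, neg=neg: dist(x, neg) - dist(x, pos)
    return scorers


def predict(scorers, sample):
    """Assign the class whose 'class vs rest' scorer is most confident."""
    return max(scorers, key=lambda c: scorers[c](sample))


# Invented 2-D texture features for three rock groups.
features = [(0.0, 0.0), (0.2, 0.0), (0.0, 2.0), (0.2, 2.0), (2.0, 0.0), (2.0, 0.2)]
labels = ["pillow", "pillow", "massive", "massive", "breccia", "breccia"]

scorers = train_one_against_rest(features, labels)
print(predict(scorers, (0.1, 1.8)), predict(scorers, (1.9, 0.0)))
```

Because every scorer is independent, each one could in principle use its own feature subset and feature reduction, which is the per-classifier optimization listed in Table 3.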

In this work, only a small subset of possible classification methods is used. We concentrate on statistical methods finding linear decision boundaries in the feature space. In the literature a great number of further methods are analyzed (e.g. support vector machines, neural nets, etc.). Since these methods create even more complex decision functions, an improved classification could be possible. In principle, other learning methods could be integrated into the presented framework. Further improvement could be achieved by an optimization of the texture selection process: at the moment only groups of features are combined (e.g. Haralick features in combination with Zernike moments). Every group consists of a number of single features. A routine that combines and tests these single features could optimize the classification. However, by increasing the number of possible combinations, the computing time during training would increase drastically.

Acknowledgements

This project was funded in part by the German Science Foundation DFG under operating Grant CL121/19-1 to C. Clauser, T. Berlage and R. Pechnig. We used data provided by the Ocean Drilling Program (ODP). The ODP was sponsored by the US National Science Foundation (NSF) and participating countries under management of the Joint Oceanographic Institutions (JOI). We gratefully acknowledge advice from Peter Wisskirchen (FIT Fraunhofer Institute).

