
Mining Control Flow Abnormality for Logic Error Isolation*


Chao Liu    Xifeng Yan    Jiawei Han

Department of Computer Science

University of Illinois at Urbana-Champaign

{chaoliu,xyan,hanj}@cs.uiuc.edu

Abstract

Analyzing the executions of a buggy program is essentially a data mining process: tracing the data generated during program executions may disclose important patterns and outliers that could eventually reveal the location of software errors. In this paper, we investigate program logic errors, which rarely incur memory access violations but generate incorrect outputs. We show that by mining program control flow abnormality, we can isolate many logic errors without knowing the program semantics.

In order to detect the control abnormality, we propose a hypothesis-testing-like approach that statistically contrasts the evaluation probability of condition statements between correct and incorrect executions. Based on this contrast, we develop two algorithms that effectively rank functions with respect to their likelihood of containing the hidden error. We evaluated these two algorithms on a set of standard test programs, and the results clearly indicate their effectiveness.

Keywords: software errors, abnormality, ranking

1  Introduction

Recent years have witnessed a wide spread of data mining techniques into software engineering research, such as software specification extraction [2, 22, 19] and software testing [5, 7, 23]. For example, frequent itemset mining algorithms are used by Livshits et al. to find, from software revision histories, programming patterns that programmers are expected to conform to [22]. A closed frequent pattern mining algorithm, CloSpan [27], is employed by Li et al. in boosting the performance of storage systems [17] and in isolating copy-paste program errors [18]. Besides frequent pattern-based approaches, existing machine learning techniques, such as support vector machines, decision trees, and logistic regression, are also widely adopted for various software engineering tasks [5, 6, 7, 23, 21].

* This work was supported in part by the U.S. National Science Foundation under grants NSF ITR-03-25603, IIS-02-09199, and IIS-03-08215. Any opinions, findings, and conclusions or recommendations expressed here are those of the authors and do not necessarily reflect the views of the funding agencies.

In this paper, we develop a new data mining algorithm that can assist programmers' manual debugging. Although programming language design and software testing have greatly advanced in the past decade, software is still far from error (or bug, fault) free [8, 16]. As a rough estimate, there are usually 1 to 10 errors per thousand lines of code in deployed software [11]. Because debugging is notoriously time-consuming and laborious, we wonder whether data mining techniques can help speed up the process. As an initial exploration, this paper demonstrates the possibility of isolating logic errors through statistical inference.

The isolation task is challenging in that (1) statically, a program, even one of only hundreds of lines of code, is a complex system because of its inherently intertwined data and control dependencies; (2) dynamically, the execution paths can vary greatly with different inputs, which further complicates the analysis; and (3) most importantly, since logic errors are usually tricky, the difference between incorrect and correct executions is not apparent at all. Isolating a logic error is therefore almost like looking for a needle in a haystack. Due to these characteristics, isolating program errors could be a very important playground for data mining research.

From a data mining point of view, isolating logic errors means discovering suspicious buggy regions by screening bulky program execution traces. For example, our method first monitors the program's runtime behaviors, and then tries to discover the region where the behavior of incorrect executions (or runs) diverges from that of correct ones. Different from conventional software engineering methods, this data mining approach assumes no prior knowledge of the program semantics, thus providing a higher level of automation.

Before detailing the concepts of our method, let us first take a brief overview of program errors. Based on their faulty symptoms, we can roughly classify program errors into two categories: memory errors and logic errors. Memory errors usually manifest themselves through memory abnormalities, like segmentation faults and/or memory leaks. Some tools, like Purify, Valgrind, and CCured, are designed for tracking down memory errors. Logic errors, on the other hand, refer to incorrectness in the program logic. Instead of causing programs to abort due to memory violations, programs with logic errors usually exit silently without segmentation faults. The only faulty symptom is malfunction, like generating no outputs or incorrect outputs. Because logic errors do not behave abnormally at the memory access level, off-the-shelf memory monitoring tools, like Purify, have little chance of isolating them. Considering the plethora of tools for memory errors, we focus on isolating logic errors in this paper. To illustrate the challenges, let us first examine an example.

Program 1: Buggy Code, Version 9 of replace

01 void dodash(char delim, char *src, int *i,
02             char *dest, int *j, int maxset)
03 {
04   while (...) {
05     ...
06     if (isalnum(src[*i-1]) && isalnum(src[*i+1])
07         /* && (src[*i-1] <= src[*i+1]) */)
08     {
09       for (k = src[*i-1]+1; k <= src[*i+1]; k++)
10         junk = addstr(k, dest, j, maxset);
11       *i = *i + 1;
12     }
13     ...
14     (*i) = (*i) + 1;
15   }
16 }

Example 1. Program 1 shows a code segment of replace, a program that performs regular expression matching and substitution. The replace program has 512 lines of non-blank C code and consists of 20 functions. In the dodash function, one subclause (commented out between /* and */) that should have been included in the if condition was omitted by the developer. This is a typical "incomplete logic" error. We tested this buggy program on 5,542 test cases and found that 130 of them failed to produce the correct outputs. Moreover, no segmentation faults happened during the executions.

For logic errors like the above, without any memory violations, the developer generally has no clues for debugging. He will probably resort to a conventional debugger, like GDB, for step-by-step tracing, verifying observed values against the expectations in his mind. To make things worse, if the developer is not the original author of the program, which is very likely in reality, as when maintaining legacy code, he may not even be able to trace the execution until he at least roughly understands the program. However, understanding other programmers' code is notoriously troublesome because of unintuitive identifiers, personalized coding styles and, most importantly, the complex control and data dependencies formulated by the original author.

Knowing the difficulty of debugging logic errors, we are interested in finding an automated method that can prioritize code examination for debugging, e.g., guiding programmers to examine the buggy function first. Although it is unlikely that the debugging burden can be totally unloaded from human beings, we find it possible to minimize programmers' workloads through machine intelligence. For instance, by correctly suggesting the possible buggy regions, one can narrow down the search, reduce the debugging cost, and speed up the development process. We follow this moderate approach in this study.

In order to facilitate the analysis of program executions, we need to first encode each execution in a way that discriminates incorrect from correct executions. This is equivalent to a proper extraction of error-sensitive features from program executions. Because the faulty symptom of a logic error is an unexpected execution path, it is tempting to register the exact path for each execution. However, this scheme turns out to be hard to manipulate, as well as expensive to record. Noticing that executions are mainly directed by condition statements (e.g., if, while, for), we finally decided to summarize each execution by the evaluation frequencies of its conditionals. In other words, for each condition statement, we record how many times it is evaluated as true and as false, respectively, in each execution. Although this coding scheme loses much information about the exact execution, it turns out to be easy to manipulate and effective.

Based on the above analysis, we treat each conditional in the buggy program as one distinct feature and try to isolate the logic error by contrasting the behaviors of correct and incorrect executions in terms of these conditionals. Specifically, we regard that the more the evaluation of a conditional in incorrect runs diverges from that in correct ones, the more likely the conditional is bug-relevant. In order to see why this choice can be effective, let us go back to Example 1.

For convenience, we declare two boolean variables A and B as follows:

    A = isalnum(src[*i-1]) && isalnum(src[*i+1]);
    B = src[*i-1] <= src[*i+1];

If the program were written correctly, A ∧ B should have been the guarding condition of the if block, i.e., the control flow enters the if block if and only if A ∧ B is true. However, in the buggy program where B is missing, any control flow that satisfies A will fall into the block. As one may notice, and as we will explain in Section 2, an execution is logically correct until (A ∧ ¬B) evaluates true when the control flow reaches Line 6. Because true evaluations of (A ∧ ¬B) only happen in incorrect runs, and each one contributes to a true evaluation of A, the evaluation distribution of the if statement should differ between correct and incorrect executions. This suggests that if we monitor the program's conditionals, like A here, their evaluations will shed light on the hidden error and can be exploited for error isolation. While this heuristic can be effective, one should not expect it to work for every logic error. After all, even experienced programmers may feel incapable of debugging certain logic errors. Given that few effective ways exist for isolating logic errors, we believe this heuristic is worth a try.

The above example motivates us to infer the potential buggy region from divergent branching probabilities. Heuristically, the simplest approach is to first calculate the average true evaluation probabilities for both correct and incorrect runs, and treat the numeric difference as the measure of divergence. However, simple heuristic methods like this generally suffer from weak performance across various buggy programs. In this paper, we develop a novel and more sophisticated approach to quantifying the divergence of branching probabilities, which features a rationale similar to hypothesis testing [15]. Based on this quantification, we further derive two algorithms that effectively rank functions according to their likelihood of containing the error. For example, both algorithms recognize the dodash function as the most suspicious in Example 1.

In summary, we make the following contributions.

1. We introduce the problem of isolating logic errors, a problem not yet well solved in the software engineering community, into data mining research. We believe that recent advances in data mining can help tackle this tough problem and contribute to software engineering in general.

2. We propose an error isolation technique based on mining control flow abnormality in incorrect runs against correct ones. In particular, we choose to monitor only conditionals, which bears low overhead while maintaining the isolation quality.

3. We develop a principled statistical approach to quantify the bug relevance of each condition statement, and further derive two algorithms to locate the possible buggy functions. As evaluated on a set of standard test programs, both achieve encouraging success.

The remainder of the paper is organized as follows. Section 2 provides several examples that illustrate how our method is motivated. The statistical model and the two derived ranking algorithms are developed in Section 3. We present the analysis of experimental results in Section 4. After a discussion of related work in Section 5, Section 6 concludes this study.

2  Motivating Examples

In this section, we revisit Example 1 and explain in detail the implication of control flow abnormality for logic errors. For clarity in what follows, we denote the program with the subclause (src[*i-1] <= src[*i+1]) commented out as the incorrect (or buggy) program P, and the one without the comment as the correct one, denoted P̄. Because P̄ is certainly not available when debugging P, P̄ is used here only for illustration purposes: it helps show how our method is motivated. As one will see in Section 3, our method collects statistics only from the buggy program P and performs all the analysis on them.

Given the two boolean variables A and B as presented in Section 1, let us consider their evaluation combinations and the corresponding branching actions (either enter or skip the block) in both P and P̄. Figure 1 explicitly lists the actions in P (left) and P̄ (right). Clearly, the left panel shows the actual actions taken in the buggy program P, and the right one lists the expected actions if P had no errors.

         P (actual)                P̄ (expected)
            A       ¬A                A       ¬A
     B    enter    skip        B    enter    skip
    ¬B    enter    skip       ¬B    skip     skip

Figure 1: Branching Actions in P and P̄

Differences between the above two tables reveal that in the buggy program P, unexpected actions take place if and only if A ∧ ¬B evaluates true. Explicitly, when A ∧ ¬B is true, the control flow actually enters the block, while it is expected to skip it. This incorrect control flow will eventually lead to incorrect outputs. Therefore, for the buggy program P, a run is incorrect if and only if there exist true evaluations of A ∧ ¬B at Line 6; otherwise, the execution is correct although the program contains a bug.

While the boolean expression (A ∧ ¬B) = true exactly characterizes the scenario under which incorrect executions take place, there is little chance for an automated tool to spot A ∧ ¬B as bug-relevant. The obvious reason is that while we are debugging the program P, we have no idea of what B is, let alone its combination with A. Because A is observable in P, we are therefore interested in whether the evaluation of A alone will give away the error. If the evaluation of A in incorrect executions significantly diverges from that in correct ones, the if statement at Line 6 may be regarded as bug-relevant, which points to the exact error location.

        correct run                    incorrect run
            A           ¬A                 A            ¬A
     B    n_AB        n_¬AB         B    n′_AB        n′_¬AB
    ¬B    n_A¬B = 0   n_¬A¬B       ¬B    n′_A¬B ≥ 1   n′_¬A¬B

Figure 2: A Correct and an Incorrect Run in P

We therefore contrast how A is evaluated differently between correct and incorrect executions of P. Figure 2 shows the number of true evaluations of the four combinations of A and B in one correct (left) and one incorrect (right) run. The major difference is that in the correct run, A ∧ ¬B never evaluates true (n_A¬B = 0), while n′_A¬B must be nonzero for an execution to be incorrect. Since a true evaluation of A ∧ ¬B implies A = true, we therefore expect that the probability for A to be true differs between correct and incorrect executions. As we tested through the 5,542 test cases, the true evaluation probability is 0.727 in a correct and 0.896 in an incorrect execution on average. This divergence suggests that the error location (i.e., Line 6) does exhibit detectable abnormal behaviors in incorrect executions. Our method, as developed in Section 3, nicely captures this divergence and ranks the dodash function as the most suspicious, isolating this logic error.

The above discussion illustrates a simple but motivating example where we can use branching statistics to capture control flow abnormality. Sometimes, when the error does not occur in a condition statement, control flows may still disclose the error trace. In Program 2, the programmer mistakenly assigns the variable result with i+1 instead of i, which is a typical "off-by-one" logic error.

The error in Program 2 literally has no relation to any condition statement. However, the error is triggered only when the variable junk is nonzero, which is eventually associated with the if statement at Line 5. In this case, although the branching statement is not directly involved in this off-by-one error, the abnormal branching probability in incorrect runs can still reveal the error trace. For example, the makepat function is identified as the most suspicious function by our algorithms.

Program 2: Buggy Code, Version 15 of replace

01 void makepat(char *arg, int start, char delim,
02              char *pat)
03 {
04   ...
05   if (!junk)
06     result = 0;
07   else
08     result = i + 1;  /* off-by-one error */
09                      /* should be result = i */
10   return result;
11 }

Inspired by the above two examples, we recognize that even though the dependency structure of a program is hard to untangle, there do exist simple statistics that capture the abnormality caused by hidden errors. Discovering these abnormalities by mining program executions can provide useful information for error isolation. In the next section, we elaborate on the statistical ranking model that identifies potential buggy regions.

3  Ranking Model

Let T = {t1, t2, ..., tn} be a test suite for a program P. Each test case ti = (di, oi) (1 ≤ i ≤ n) has an input di and the desired output oi. We execute P on each ti and obtain the output oi′ = P(di). We say P passes the test case ti (i.e., "a passing case") if and only if oi′ is identical to oi; otherwise, P fails on ti (i.e., "a failing case"). We thus partition the test suite T into two disjoint subsets Tp and Tf, corresponding to the passing and failing cases respectively:

    Tp = {ti | oi′ = P(di) matches oi},
    Tf = {ti | oi′ = P(di) does not match oi}.

Since program P passes test case ti if and only if P executes correctly on it, we use "correct" and "passing", and "incorrect" and "failing", interchangeably in the following discussion.

Given a buggy program P together with a test suite T = Tp ∪ Tf, our task is to isolate the suspicious error region by contrasting P's runtime behaviors on Tp and Tf.
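To make the partition concrete, the following is a minimal sketch (the Run record and the string-valued outputs are illustrative assumptions, not the paper's actual test harness):

```cpp
#include <string>
#include <utility>
#include <vector>

// Illustrative run record: the desired output o_i and the actual
// output o_i' = P(d_i) observed for one test case.
struct Run {
    std::string expected;  // o_i
    std::string actual;    // o_i' = P(d_i)
};

// Partition a test suite T into passing runs T_p and failing runs T_f
// by comparing each actual output against the desired one.
std::pair<std::vector<Run>, std::vector<Run>>
partition_runs(const std::vector<Run>& suite) {
    std::vector<Run> passing, failing;
    for (const Run& r : suite) {
        if (r.actual == r.expected) passing.push_back(r);
        else failing.push_back(r);
    }
    return {passing, failing};
}
```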

3.1  Feature Preparation

Motivated by the previous examples, we decide to instrument each condition statement in program P to collect evaluation frequencies at runtime. Specifically, we take the entire boolean expression in each condition statement as one distinct boolean feature. For example, isalnum(src[*i-1]) && isalnum(src[*i+1]) in Program 1 is treated as one feature. Since a feature can be evaluated zero to multiple times, as either true or false, in each execution, we define the boolean bias to summarize its exact evaluations.

Definition 1. (Boolean Bias) Let n_t be the number of times a boolean feature B evaluates true, and n_f the number of times it evaluates false, in one execution. Then

    π(B) = (n_t − n_f) / (n_t + n_f)

is the boolean bias of B in this execution.

π(B) varies in the range [−1, 1]. It encodes the distribution of B's value: π(B) equals 1 if B always evaluates true, −1 if B always evaluates false, and lies in between for all other mixtures. If a conditional is never touched during an execution, π(B) is defined to be 0, because there is no evidence favoring either true or false.
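As a minimal sketch (the instrumentation producing the two counts is assumed), the boolean bias of Definition 1 can be computed as:

```cpp
// Boolean bias (Definition 1): pi(B) = (n_t - n_f) / (n_t + n_f),
// where n_t and n_f count the true and false evaluations of a
// boolean feature B in one execution.
double boolean_bias(int n_true, int n_false) {
    int total = n_true + n_false;
    if (total == 0) return 0.0;  // conditional never touched: no evidence
    return static_cast<double>(n_true - n_false) / total;
}
```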

3.2  Methodology Overview

Before discussing our method in detail, we first lay out the main idea in this subsection. Following the convention of statistics, we use uppercase letters for random variables and lowercase letters for their realizations. Moreover, f(X|θ) is the probability density function (pdf) of a distribution (or population) family indexed by the parameter θ.

Let the entire test case space be 𝒯, which conceptually contains all possible input and output pairs. According to the correctness of P on cases from 𝒯, 𝒯 can be partitioned into two disjoint sets 𝒯p and 𝒯f for passing and failing cases. Therefore, T, Tp, and Tf can be thought of as random samples from 𝒯, 𝒯p, and 𝒯f respectively. Let X be the random variable for the boolean bias of a boolean feature B; we use f(X|θp) and f(X|θf) to denote the underlying probability models that generate the boolean bias of B for cases from 𝒯p and 𝒯f respectively.

Definition 2. (Bug Relevance) A boolean feature B is relevant to the program error if its underlying probability model f(X|θf) diverges from f(X|θp).

The above definition relates the probability model f(X|θ) to the hidden program error. The more significantly f(X|θf) diverges from f(X|θp), the more relevant B is to the hidden error. Let L(B) be a similarity function,

(3.1)    L(B) = Sim(f(X|θp), f(X|θf)),

and let s(B), the bug relevance score of B, be defined as

(3.2)    s(B) = −log(L(B)).

Using the above measure, we can rank features according to their relevance to the error. The ranking problem boils down to finding a proper way to quantify the similarity function. This involves two problems: (1) what is a suitable similarity function, and (2) how can we compute it when we do not know the closed forms of f(X|θp) and f(X|θf)? In the following subsections, we examine these two problems in detail.

3.3  Feature Ranking

Because no prior knowledge of the closed form of f(X|θp) is available, we can only characterize it through general parameters, such as the population mean and variance, μp = E(X|θp) and σp² = Var(X|θp); i.e., we take θ = (μ, σ²). While μ and σ² are taken to characterize f(X|θ), we do not regard their estimates as sufficient statistics, and hence we do not impose normality assumptions on f(X|θ). Instead, the difference exhibited through μ and σ² is treated as a measure of the model difference.

Given an independent and identically distributed (i.i.d.) random sample X = (X1, X2, ..., Xn) from f(X|θp), μp and σp² can be estimated by the sample mean X̄ and sample variance Sn:

    μ̂p = X̄ = (1/n) Σ_{i=1}^{n} Xi,

    σ̂p² = Sn = (1/(n−1)) Σ_{i=1}^{n} (Xi − X̄)²,

where Xi represents the boolean bias of B in the i-th passing case. The population mean μf and variance σf² of f(X|θf) can be estimated in a similar way.

Given these estimates, it may be appealing to take the differences of both mean and variance as the bug relevance score. For example, s(B) could be defined as s(B) = α·|μp − μf| + β·|σp² − σf²| (α, β ≥ 0). However, heuristic methods like this usually suffer from the dilemma of parameter settings: no guidance is available for properly setting parameters like α and β in the above formula. In addition, parameters that work perfectly for one program error may not generalize to other errors or programs. Therefore, in the following, we develop a principled statistical method to quantify bug relevance which, as one will see, is parameter-free.
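The two estimators above can be sketched as follows (a plain illustration of the formulas, assuming at least two observations for the variance):

```cpp
#include <vector>

// Sample mean: estimate of the population mean mu_p.
double sample_mean(const std::vector<double>& x) {
    double sum = 0.0;
    for (double v : x) sum += v;
    return sum / x.size();
}

// Unbiased sample variance S_n (note the n - 1 denominator):
// estimate of the population variance sigma_p^2.
double sample_variance(const std::vector<double>& x) {
    double mu = sample_mean(x);
    double ss = 0.0;
    for (double v : x) ss += (v - mu) * (v - mu);
    return ss / (x.size() - 1.0);
}
```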

Our method takes an indirect approach to quantifying the divergence between f(X|θp) and f(X|θf), supported by a rationale similar to that of hypothesis testing [15]. Instead of writing a formula that explicitly specifies the difference, we first propose the null hypothesis H0 that θp = θf (i.e., no divergence exists), and then, under the null hypothesis, derive a statistic Y(X) that conforms to a specific known distribution. Given the realized random sample X = x, if Y(x) corresponds to a small-probability event, the null hypothesis H0 is invalidated, which immediately suggests that f(X|θp) and f(X|θf) are divergent. Moreover, the divergence is proportional to the extent to which the null hypothesis is invalidated.

Specifically, the null hypothesis H0 is

(3.3)    μp = μf and σp = σf.

Let X = (X1, X2, ..., Xm) be an i.i.d. random sample from f(X|θf). Under the null hypothesis, we have E(Xi) = μf = μp and Var(Xi) = σf² = σp². Because Xi ∈ [−1, 1], E(Xi) and Var(Xi) are both finite. According to the Central Limit Theorem, the statistic

(3.4)    Y = (1/m) Σ_{i=1}^{m} Xi

converges to the normal distribution N(μp, σp²/m) as m → +∞.

Let f(Y|θp) be the pdf of N(μp, σp²/m); then the likelihood of θp given the observation of Y is

(3.5)    L(θp|Y) = f(Y|θp).

A smaller likelihood implies that H0 is less likely to hold, which in consequence indicates that a larger divergence exists between f(X|θp) and f(X|θf). Therefore, we can reasonably set the similarity function in Eq. (3.1) to the likelihood function:

(3.6)    L(B) = L(θp|Y).

According to the properties of the normal distribution, the normalized statistic

    Z = (Y − μp) / (σp/√m)

conforms to the standard normal distribution N(0, 1), and

(3.7)    f(Y|θp) = (√m/σp)·φ(Z),

where φ(Z) is the pdf of N(0, 1).

Combining Eqs. (3.2), (3.6), (3.5), and (3.7), we finally have the bug relevance score for boolean feature B:

(3.8)    s(B) = −log(L(B)) = log(σp / (√m·φ(Z))).

According to Eq. (3.8), we can rank all condition statements of the buggy program [20]. However, we regard a ranking of suspicious functions as preferable to one of individual statements, because a highly relevant condition statement is not necessarily the error root. For example, the error in Program 2 does not take place in the "if (!junk)" statement. Moreover, we generally have higher confidence in the quality of function ranking than in that of individual statements, in that function abnormality is assessed by considering all of a function's component features. In the following section, we discuss how to combine the individual scores s(B) into a global score s(F) for each function F.

3.4  Function Ranking

Suppose a function F encompasses k boolean features B1, ..., Bk, and there are m failing test cases. Let Xij denote the boolean bias of the i-th boolean feature in the j-th test case. We tabulate the statistics in the following table.

           t1     t2    ...   tm
    B1    X11    X12    ...   X1m   |  X̄1   Y1   Z1
    B2    X21    X22    ...   X2m   |  X̄2   Y2   Z2
    ...   ...    ...    ...   ...   |  ...  ...  ...
    Bk    Xk1    Xk2    ...   Xkm   |  X̄k   Yk   Zk

In this table, Xi = (Xi1, Xi2, ..., Xim) represents the observed boolean biases from the m failing runs, and Yi and Zi are derived from Xi as in Section 3.3. For each feature Bi, we propose the null hypothesis H0^i: θf^i = θp^i, and obtain

(3.9)    f(Yi|θp^i) = (√m/σp^i)·φ(Zi).

Given that the function F has k features, we denote the hypotheses and parameters in vector form:

    H0 = (H0^1, H0^2, ..., H0^k),
    θp = (θp^1, θp^2, ..., θp^k),
    Y  = (Y1, Y2, ..., Yk).

Under the null hypothesis H0, similar arguments suggest that the bug relevance score s(F) can be chosen as

(3.10)    s(F) = −log(f(Y|θp)),

where f(Y|θp) is the joint pdf of the Yi's (i = 1, 2, ..., k).

However, the above scoring function Eq. (3.10) does not immediately apply, because f(Y|θp) is a multivariate density function. Because neither the closed forms of the f(Yi|θp^i) nor the dependencies among the Yi's are available, it is impossible to calculate the exact value of s(F). Therefore, in the following discussion, we propose two simple ways to approximate it.

3.4.1  CombineRank

One conventional approach to untangling a joint distribution is through an independence assumption. If we assume that the boolean features Bi are mutually independent, the populations of the Yi are also mutually independent. Therefore, we have

(3.11)    f(Y|θp) = Π_{i=1}^{k} f(Yi|θp^i) = Π_{i=1}^{k} (√m/σp^i)·φ(Zi).

Following Eq. (3.10),

(3.12)    s(F) = −log(f(Y|θp)) = Σ_{i=1}^{k} log(σp^i/(√m·φ(Zi))) = Σ_{i=1}^{k} s(Bi).

We name the above scoring schema (Eq. (3.12)) CombineRank, as it sums the bug relevance scores of the individual condition statements within the function. We note that the independence assumption is practically unrealistic, because condition statements are usually intertwined, as with nested loops or if statements inside while loops. However, given that no assumptions are made about the probability densities, one should not expect a device more effective than the independence assumption for decomposing the joint distribution.

From another point of view, CombineRank does make good sense in that abnormal branchings at one condition statement may well trigger abnormal executions of other branches (like an avalanche), and Eq. (3.12) captures this systematic abnormality and encodes it into the final bug relevance score for the function F. As we will see in the experiments, CombineRank works very well in locating buggy functions.

3.4.2  UpperRank

In this subsection, we propose another approach to approximating f(Y|θp), which leads to another scoring schema, UpperRank. The main idea is based on the following apparent inequality:

(3.13)    f(Y|θp) ≤ min_{1≤i≤k} f(Yi|θp^i) = min_{1≤i≤k} (√m/σp^i)·φ(Zi).

Therefore, if we adopt this upper bound as a surrogate for f(Y|θp), s(F) can be derived as

(3.14)    s(F) = −log(min_{1≤i≤k} (√m/σp^i)·φ(Zi)) = max_{1≤i≤k} s(Bi).

This scoring schema, named UpperRank, essentially picks the most suspicious feature as the "representative" of the function. This is meaningful because if one boolean expression is extremely abnormal, the function containing it is very likely to be abnormal. However, since UpperRank uses the upper bound as the approximation to f(Y|θp), its ranking quality is inferior to that of CombineRank when multiple peak abnormalities exist, as illustrated in the following example.
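In sketch form (illustrative, not the paper's implementation), the two schemas differ only in how the per-feature scores s(Bi) are aggregated into s(F):

```cpp
#include <algorithm>
#include <vector>

// CombineRank (Eq. (3.12)): s(F) = sum of s(B_i) over the
// function's boolean features.
double combine_rank_score(const std::vector<double>& feature_scores) {
    double s = 0.0;
    for (double v : feature_scores) s += v;
    return s;
}

// UpperRank (Eq. (3.14)): s(F) = max of s(B_i);
// assumes the function has at least one feature.
double upper_rank_score(const std::vector<double>& feature_scores) {
    return *std::max_element(feature_scores.begin(), feature_scores.end());
}
```

Functions are then sorted by s(F) in descending order, and the developer examines them from the top of the list.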

Figure 3: CombineRank vs. UpperRank

Example 2. Figure 3 visualizes the bug relevance scores of the boolean features in both function dodash and function omatch, calculated on the faulty Version 5 of the replace program. From left to right, the first six bars represent the scores of the six boolean features in dodash, and the following nine are for the features in omatch. In this example, the logic error is inside dodash. As one can see, UpperRank will rank omatch over dodash due to the maximal peak abnormality. However, it might be better to credit the abnormality of every feature in the function ranking, as implemented by CombineRank. Therefore, CombineRank correctly ranks dodash as the most bug-relevant. For this reason, we generally prefer CombineRank to UpperRank, as is also supported by the experiments.

4  Experiment Results

In this section, we evaluate the effectiveness of both CombineRank and UpperRank in isolating logic errors. We implemented the two algorithms in C++ and Matlab and conducted the experiments on a 3.2 GHz Intel Pentium 4 PC with 1 GB physical memory running Fedora Core 2.

    Version  Buggy Function  Fail Runs  CombineRank  UpperRank  Error Description: How to Fix
       1     dodash              68          2            1     change *i to *i-1
       2     dodash              37          1            1     add one if branch
       3     subline            130          3            3     add one && subclause
       4     subline            143          8           11     change i to lastm
       5     dodash             271          1            3     change = to >
       7     in_set_2            83          5            4     change c==ANY to c==EOL
       8     in_set_2            54          3            2     add one || subclause
       9     dodash              30          1            1     add one && subclause
      10     dodash              23          1            2     add one && subclause
      11     dodash              30          1            1     change > to <=
      12     (macro)            309          5            5     change 50 to 100 in #define MAXPAT 50
      13     subline            175          5            5     add one if branch
      14     omatch             137          1            1     add one && subclause
      15     makepat             60          1            1     change i+1 to i in result=i+1
      16     in_set_2            83          5            4     remove one || subclause
      17     esc                 24          1            1     change result=NEWLINE to =ESCAPE
      18     omatch             210          2            3     add one && subclause
      19     change              31          5            8     rewrite the function change
      20     esc                 22          1            1     change result=ENDSTR to =ESCAPE
      21     getline              3         12            5     rewrite the function getline
      22     getccl              19          7            3     move one statement into if branch
      23     esc                 22          1            2     change s[*i] to s[*i+1]
      24     omatch             170          2            4     add one if branch
      25     omatch               3          2            2     change <= to ==
      26     omatch             198          6            6     change j to j+1
      28     in_set_2           142          4            3     remove one || subclause
      29     in_set_2            64          6            5     remove one || subclause
      30     in_set_2           284          1            1     remove one || subclause
      31     omatch             210          2            3     change >= to !=

Error category legend: Incorrect Branch Expression; Misuse of Variables or Constants; Off-By-One; Misc.

Table 1: Summary of Buggy Versions and Ranking Results

4.1Subject Programs

We experimented on a package of standard test pro-grams,called Siemens programs1.This package was originally prepared by Siemens Corp.Research in study of test adequacy criteria[12].The Siemens programs consist of seven programs:print tokens,print tokens2, replace,schedule,schedule2,tcas,and tot info.For each program,the Siemens researchers manually injected multiple errors,obtaining multiple faulty versions,with each version containing one and only one error.Be-cause these injected errors rightly represent common mistakes made in practice,the Siemens programs are widely adopted as a benchmark in software engineering 1A variant is available at https://www.wendangku.net/doc/da8116904.html,/aristo-tle/Tools/subjects.research[12,25,9,10].Because these injected errors are mainly logic faults,we choose them as the benchmark to evaluate our algorithms.

Except tcas(141lines),the size of these programs ranges from292to512lines of C code,excluding blanks. Because debugging tcas is pretty straightforward due to its small size,we focus the experiment study on the other six programs.In Section4.2,we?rst analyze the e?ectiveness of our algorithms on the30faulty versions of the replace program.The replace program deserves detailed examination in that(1)it is the largest and most complex one among the six programs, and(2)it covers the most varieties of logic errors. After the examination of replace,we discuss about the experiments on the other?ve programs in Section4.3.

4.2 Experimental Study on Replace

The replace program contains 32 versions in total, among which Version 0 is error free. Each of the other 31 faulty versions contains one and only one error in comparison with Version 0. In this setting, Version 0 serves as the oracle in labelling whether a particular execution is correct or incorrect.
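Since Version 0 is the oracle, labelling an execution reduces to comparing a faulty version's output against Version 0's output on the same test case. The helper below is a minimal sketch of this labelling step; the function name and the dictionary-based interface are our own illustration, not part of the original test harness.

```python
def label_executions(oracle_outputs, observed_outputs):
    """Label each test case as 'correct' or 'incorrect' by comparing
    a faulty version's output with the oracle's (Version 0's) output.

    Both arguments map a test-case id to the program's captured output
    (e.g., stdout) on that test case.
    """
    labels = {}
    for case_id, expected in oracle_outputs.items():
        actual = observed_outputs.get(case_id)
        labels[case_id] = "correct" if actual == expected else "incorrect"
    return labels

# Toy example: two of the three runs reproduce the oracle's output.
oracle = {1: "abc", 2: "def", 3: "ghi"}
faulty = {1: "abc", 2: "dXf", 3: "ghi"}
print(label_executions(oracle, faulty))
```

In the actual experiments the outputs would come from running each of the 5542 test cases through both binaries; only the comparison logic is shown here.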

Table 1 lists the error characteristics for each faulty version and the ranking results provided by CombineRank and UpperRank. Because we focus on isolating logic errors that do not incur segmentation faults, Version 27 is excluded from examination. In Table 1, the second column lists the name of the buggy function for each version and the third column shows the number of failing runs out of the 5542 test cases. The final ranks of the buggy function provided by CombineRank and UpperRank are presented in the fourth and fifth columns respectively. Taking Version 1 as an example, the buggy function is dodash and the error causes incorrect outputs for 68 out of the 5542 test cases. For this case, CombineRank ranks the dodash function at the second place and UpperRank identifies it as the most suspicious. Finally, the last two columns briefly describe each error and the category the error belongs to (Section 4.2.2). Because we cannot list the buggy code for each error due to the space limit, we concisely describe how to fix each error in the "Error Description" column. Interested readers are encouraged to download the Siemens programs from the public domain for detailed examination.

4.2.1 Overall Effectiveness

The rankings in Table 1 indicate that both CombineRank and UpperRank work very well on the replace program. Both methods rank the buggy function within the top five for 24 of the 30 faulty versions under examination; the exceptions are Versions 4, 19, 21, 22, 26, and 29. This implies that the developer can locate 24 out of the 30 errors if he only examines the top five functions of the ranked list. Although this proposition is drawn across various errors on a specific subject program, we believe that it does shed light on the effectiveness of the isolation algorithms.

In order to quantify the isolation quality of the algorithms, we measure how many errors are located if a programmer examines the suggested ranked list from the top down. For example, because CombineRank ranks the buggy function at the first place for 11 faulty versions, the programmer will locate 11 out of the 30 errors if he only examines the first function of the ranked list for each faulty version. Moreover, because the buggy function of another five versions is ranked at the second place by CombineRank, the programmer can locate 16 out of the 30 errors if he takes the top-2 functions seriously. In this way, Figure 4 plots the number of located errors with respect to the top-k examination.
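The top-k curve in Figure 4 is simply a cumulative count over the per-version ranks of the buggy function. As a sketch (the helper name is ours, and the ranks are transcribed from the CombineRank column of Table 1, one entry per examined version), the counting looks like:

```python
def located_within_top_k(ranks, k):
    """Number of faulty versions whose buggy function is ranked
    within the top-k positions of the suggested list."""
    return sum(1 for r in ranks if r <= k)

# CombineRank's rank of the buggy function for the examined
# replace versions, as read off Table 1.
combine_ranks = [2, 1, 3, 8, 1, 5, 3, 1, 1, 1, 5, 5, 1, 1, 5,
                 1, 2, 5, 1, 12, 7, 1, 2, 2, 6, 4, 6, 1, 2]

print(located_within_top_k(combine_ranks, 1))  # 11 errors located at top-1
print(located_within_top_k(combine_ranks, 2))  # 16 errors located within top-2
```

Sweeping k from 1 to the number of functions reproduces one curve of Figure 4.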

[Figure 4: Isolation Quality: replace — the number of located errors (y-axis) plotted against the top-k functions examined (x-axis), with one curve each for CombineRank, UpperRank, and RandomRank.]

In order to assess the effectiveness of the algorithms, we also plot the curve for random rankings in Figure 4. Because a random ranking puts the buggy function at any place with equal probability, the buggy function has a probability of k/m to be within the top-k of the entire m functions. Furthermore, this also suggests that given n faulty versions, (k/m)*n errors are expected to be located if the programmer only examines the top-k functions. For the replace program under study, where m = 20 and n = 30, only 1.5 errors are expected to be located if the top-1 function is examined. In contrast, UpperRank and CombineRank locate 9 and 11 errors respectively. Furthermore, when the top-2 functions are examined, the numbers for UpperRank and CombineRank rise to 13 and 16 respectively, whereas it is merely 3 for random rankings. Considering the tricky nature of these logic errors, we regard the result shown in Figure 4 as remarkably good, because our method infers the buggy function purely from execution statistics and assumes no more program semantics than a random ranking does. As one will see in Section 4.3, similar results are also observed on the other programs.
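The random-ranking baseline follows directly from the formula above: under a uniform random ranking over m functions, each of the n buggy functions falls within the top-k with probability k/m, so (k/m)*n errors are expected. A minimal sketch (the function name is ours):

```python
def expected_random_hits(k, m, n):
    """Expected number of errors located within the top-k positions
    when the buggy function is ranked uniformly at random among
    m functions, over n faulty versions."""
    return (k / m) * n

# replace program: m = 20 functions, n = 30 faulty versions.
print(expected_random_hits(1, 20, 30))  # 1.5 errors expected at top-1
print(expected_random_hits(2, 20, 30))  # 3.0 errors expected within top-2
```

Comparing these expectations with the 11 and 16 errors CombineRank locates at top-1 and top-2 makes the gap over the random baseline concrete.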

4.2.2 Effectiveness for Various Errors

After the discussion of the overall effectiveness, one may also find it instructive to explore for what kinds of errors our method works well (or not). We thus break down the errors in Table 1 into the following four categories and examine the effectiveness of our method on each of them.

1. Incorrect Branch Expression (IBE): This category generally refers to errors that directly influence program control flows. For example, it includes omitted (e.g., in Versions 3, 9, 10, etc.) or unwanted (e.g., in Version 16) logic subclauses, and misuses of comparators (e.g., in Versions 5, 6, etc.). These errors usually sneak in when the programmer fails to think fully through the program logic.

2. Misuse of Variables or Constants (MVC): Since similar data structures can exist in programs, developers may misuse one variable for another. One typical scenario that introduces this kind of error is the notorious copy-paste: when a programmer copies and pastes one segment of code, he may forget to change certain variables for the destination context. Some tools, like CP-Miner [18], can automatically detect such copy-paste errors.

3. Off-By-One (OBO): A variable is used with a value one more or less than expected. For example, in Version 15, result=i is mistakenly written as result=i+1.

4. Misc. (MIS): Other types of errors, including but not limited to function reformulations (e.g., in Versions 19 and 21) and moving one statement around (e.g., in Version 22).

We label the category that each faulty version belongs to in Table 1. The isolation quality of CombineRank for each category on replace is listed in Table 2. Notice that the third column only lists the number of faulty versions where the buggy function is ranked at the top by CombineRank.

Error Category   Error #   Located (top-1)
IBE                 18            7
MVC                  5            2
OBO                  4            2
MIS                  3            0

Table 2: Isolation Quality by Categories

Table 2 also indicates that CombineRank ranks the buggy function at the top not only for errors that literally reside in condition statements but also for errors that involve variable values. Similar conclusions can also be drawn for UpperRank based on Table 1. This result reaffirms our belief that value-related errors may also incur control flow abnormality, which justifies our choice of branching abnormality for error isolation.

Finally, we note that there do exist some logic errors that are extremely hard for machines to isolate. Considering the error descriptions in Table 1, we believe that isolating these errors is a non-trivial task even for experienced programmers. Therefore, we cannot realistically expect one method to work well for all kinds of errors.

4.2.3 CombineRank vs. UpperRank

In terms of isolation quality, an overview of Figure 4 may suggest that UpperRank outperforms CombineRank, because the curve of UpperRank is above that of CombineRank for a large portion of k's. However, we believe that considering the utility to programmers, the top-5 ranking is less appealing than the top-2 or even the top-1 ranking. From this point of view, CombineRank is apparently better than UpperRank: in terms of top-1, CombineRank identifies 11 errors while UpperRank gets 9, and CombineRank locates 16 within the top-2 whereas UpperRank hits 13. Results from the other subject programs, as examined in Section 4.3, also support this conclusion.

4.3 More Subject Programs

We also tested our algorithms on the other five programs in the Siemens program package. They are two lexical analyzers, print_tokens and print_tokens2; two priority schedulers, schedule and schedule2; and tot_info, which computes statistics of given input data. Although print_tokens and print_tokens2 differ only slightly in name, they are two completely different programs, and so are schedule and schedule2.

Program         Functions   Error #   Located (top-2)
replace             20         30           16
print_tokens        18          7            3
print_tokens2       19         10            7
schedule            18          5            4
schedule2           16         10            2
tot_info             7         23           16

Table 3: Isolation Quality across Programs

Table 3 lists the isolation quality of CombineRank across the six programs. The second column gives the total number of functions in each program. With the third column telling how many errors are under examination, the last column tabulates how many of them are actually located within the top-2. Overall, except on schedule2, CombineRank achieves similar or better quality than it does on replace. In particular, the isolation quality on print_tokens2 and schedule is amazingly good. However, echoing the saying that "no silver bullet exists", CombineRank can also fail to work well on certain programs. For instance, CombineRank only locates 2 out of the 10 errors in schedule2 within the top-1, and fails to work on the other 8 errors.

[Figure 5: Isolation Quality: print_tokens2 — the number of located errors plotted against the top-k functions examined, with curves for CombineRank and UpperRank.]

We also compared the isolation quality between CombineRank and UpperRank on these five programs. The results suggest that the two are generally comparable, and that CombineRank tends to outperform UpperRank at the top of the rank range. Figure 5 plots their comparison on print_tokens2, in which we can see a pattern similar to Figure 4.

5 Related Work

As software becomes increasingly bulky in size, its complexity has exceeded the capability of human understanding. This stimulates the wide application of data mining techniques to software engineering problems. For example, Livshits et al. apply frequent itemset mining algorithms to software revision histories, and relate the mining results to programming patterns that programmers should conform to [22]. Going one step further, Li et al. construct PR-Miner [19], which detects programming rules (PR) more general than those from [22]. PR-Miner first tokenizes the program source code and then applies CloSpan [27] to discover programming rules implicitly embedded in the source code. Different from the above static analysis approaches, whose analysis is based on static entities like source code and revision history, our method in this paper attempts to isolate logic errors from the program's dynamic behavior in executions.

For tracking down program logic errors, there are two conventional methods: software testing [4,13] and model checking [26]. The aim of testing is to find test cases that induce unexpected behaviors. Once a failing case is found, it is the developer's responsibility to trace and eradicate the hidden error. By prioritizing the potentially buggy functions, our method actually complements the necessary human effort in debugging. On the other hand, model checking can detect logic errors based on a well-specified program model. Theoretically, when the model is accurate and complete, model checking can detect all hidden errors, regardless of the time it may take. However, quality models are usually too expensive to construct in practice [26]. In comparison, our method does not rely on any semantic specification from human beings, and automatically infers the error location from program runtime statistics.

There exist some automated debugging tools that do not rely on specifications either. AskIgor [28], which is available online, tries to outline the cause-effect chains of program failures by narrowing down the difference between an incorrect and a correct execution. However, since its analysis relies on memory abnormality, AskIgor does not work with logic errors, as is confirmed by our practice.

Finally, our method also relates to outlier detection [24,1,14] and deviation detection [3] in general. Different from previous methods that directly measure the deviation within a certain metric space, like the Euclidean space, our method quantifies the abnormality through an indirect approach that is supported by a rationale similar to hypothesis testing. Since this method makes no assumption about the metric space structure, nor requires users to set any parameters, it is more principled and relieves users of the common dilemma of parameter settings. To the best of our knowledge, this is the first piece of work that calibrates deviations in a hypothesis-testing-like approach. In addition, this study also demonstrates the power and usefulness of this approach at detecting subtle deviations in complex and intertwined environments, like software.

6 Conclusions

In this paper, we investigate the capability of data mining methods in isolating suspicious buggy regions through abnormality analysis of program control flows. For the first time, through mining software trace data, we are able to locate logic errors without knowing the program semantics. It is observed from our study that the abnormality of program traces from incorrect executions can actually provide inside information that reveals the error location. The statistical approach, together with the two ranking algorithms we developed, can successfully detect and score suspicious buggy regions. A thorough examination of our approach on a set of standard test programs clearly demonstrates its effectiveness.

There are still many unsolved issues related to software error isolation, such as what happens if multiple errors exist in a program and/or only a few test cases are available. Moreover, it could be more systematic and enlightening if the problem of error localization were viewed from a feature selection point of view. These are interesting issues for our future research.

Acknowledgements

We would like to thank the anonymous reviewers for their insightful and constructive suggestions. We also appreciate Zheng Shao, a finalist of the TopCoder Contest, for his participation in our user study.

References

[1] C. C. Aggarwal and P. S. Yu. Outlier detection for high dimensional data. In Proc. of the 2001 ACM SIGMOD Int. Conf. on Management of Data, pages 37–46, 2001.

[2] G. Ammons, R. Bodik, and J. R. Larus. Mining specifications. In Proc. of the 29th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages, pages 4–16, 2002.

[3] A. Arning, R. Agrawal, and P. Raghavan. A linear method for deviation detection in large databases. In Proc. of the ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 1995.

[4] B. Beizer. Software Testing Techniques. International Thomson Computer Press, second edition, 1990.

[5] J. F. Bowring, J. M. Rehg, and M. J. Harrold. Active learning for automatic classification of software behavior. In Proc. of the 2004 Int. Symp. on Software Testing and Analysis, pages 195–205, 2004.

[6] Y. Brun and M. Ernst. Finding latent code errors via machine learning over program executions. In Proc. of the 26th Int. Conf. on Software Engineering, pages 480–490, 2004.

[7] W. Dickinson, D. Leon, and A. Podgurski. Finding failures by cluster analysis of execution profiles. In Proc. of the 23rd Int. Conf. on Software Engineering, pages 339–348, 2001.

[8] European Space Agency. Ariane 5 flight 501 inquiry board report. Available at http://ravel.esrin.esa.it/docs/esa-x-1819eng.pdf.

[9] P. Frankl and O. Iakounenko. Further empirical studies of test effectiveness. In Proc. of the 6th ACM Int. Symp. on Foundations of Software Engineering, pages 153–162, 1998.

[10] M. Harrold, G. Rothermel, R. Wu, and L. Yi. An empirical investigation of program spectra. In Proc. of the ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, pages 83–90, 1998.

[11] G. J. Holzmann. Economics of software verification. In Proc. of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, pages 80–89, 2001.

[12] M. Hutchins, H. Foster, T. Goradia, and T. Ostrand. Experiments on the effectiveness of dataflow- and controlflow-based test adequacy criteria. In Proc. of the 16th Int. Conf. on Software Engineering, pages 191–200, 1994.

[13] C. Kaner, J. Falk, and H. Q. Nguyen. Testing Computer Software. Wiley, second edition, 1999.

[14] E. M. Knorr, R. T. Ng, and V. Tucakov. Distance-based outliers: algorithms and applications. The VLDB Journal, 8(3-4):237–253, 2000.

[15] E. Lehmann. Testing Statistical Hypotheses. Springer, second edition, 1997.

[16] N. Leveson and C. Turner. An investigation of the Therac-25 accidents. IEEE Computer, 26(7):18–41, 1993.

[17] Z. Li, Z. Chen, and Y. Zhou. Mining block correlations to improve storage performance. ACM Trans. on Storage, 1(2):213–245, 2005.

[18] Z. Li, S. Lu, S. Myagmar, and Y. Zhou. CP-Miner: A tool for finding copy-paste and related bugs in operating system code. In Proc. of the 6th Symp. on Operating System Design and Implementation, pages 289–302, 2004.

[19] Z. Li and Y. Zhou. PR-Miner: Automatically extracting implicit programming rules and detecting violations in large software code. In Proc. of the 13th ACM SIGSOFT Symp. on the Foundations of Software Engineering, Sept. 2005.

[20] C. Liu, X. Yan, L. Fei, J. Han, and S. P. Midkiff. SOBER: statistical model-based bug localization. In Proc. of the 10th European Software Engineering Conference held jointly with the 13th ACM SIGSOFT Int. Symp. on Foundations of Software Engineering (ESEC/FSE'05), pages 286–295, 2005.

[21] C. Liu, X. Yan, H. Yu, J. Han, and P. S. Yu. Mining behavior graphs for "backtrace" of noncrashing bugs. In Proc. of the 2005 SIAM Int. Conf. on Data Mining (SDM'05), 2005.

[22] B. Livshits and T. Zimmermann. DynaMine: Finding common error patterns by mining software revision histories. In Proc. of the 13th ACM SIGSOFT Symp. on the Foundations of Software Engineering, Sept. 2005.

[23] A. Podgurski, D. Leon, P. Francis, W. Masri, M. Minch, J. Sun, and B. Wang. Automated support for classifying software failure reports. In Proc. of the 25th Int. Conf. on Software Engineering, pages 465–475, 2003.

[24] S. Ramaswamy, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. In Proc. of the 2000 ACM SIGMOD Int. Conf. on Management of Data, pages 427–438, 2000.

[25] G. Rothermel and M. J. Harrold. Empirical studies of a safe regression test selection technique. IEEE Trans. on Software Engineering, 24(6):401–419, 1998.

[26] W. Visser, K. Havelund, G. Brat, and S. Park. Model checking programs. In Proc. of the 15th IEEE Int. Conf. on Automated Software Engineering, 2000.

[27] X. Yan, J. Han, and R. Afshar. CloSpan: Mining closed sequential patterns in large databases. In Proc. of the 3rd SIAM Int. Conf. on Data Mining, 2003.

[28] A. Zeller. Isolating cause-effect chains from computer programs. In Proc. of the 10th ACM Int. Symp. on the Foundations of Software Engineering, 2002.


PageRank 的基本原理 我在这里先介绍一下PageRank 的最基本形式,这也是布林和佩奇最早发表PageRank 时的思路。 首先,我们来看一下每一个网页的周边结构。每一个网页都有一个“输出链接”(Outlink)的集合。这里,输出链接指的是从当前网页出发所指向的其他页面。比如,从页面A 有一个链接到页面B。那么B 就是A 的输出链接。根据这个定义,可以同样定义“输入链接”(Inlink),指的就是指向当前页面的其他页面。比如,页面C 指向页面A,那么C 就是A 的输入链接。 有了输入链接和输出链接的概念后,下面我们来定义一个页面的PageRank。我们假定每一个页面都有一个值,叫作PageRank,来衡量这个页面的重要程度。这个值是这么定义的,当前页面I 的PageRank 值,是I 的所有输入链接PageRank 值的加权和。 那么,权重是多少呢?对于I 的某一个输入链接J,假设其有N 个输出链接,那么这个权重就是N 分之一。也就是说,J 把自己的PageRank 的N 分之一分给I。从这个意义上来看,I 的PageRank,就是其所有输入链接把他们自身的PageRank 按照他们各自输出链接的比例分配给I。谁的输出链接多,谁分配的就少一些;反之,谁的输出链接少,谁分配的就多一些。这是一个非常形象直观的定义。

CATIA有限元高级划分网格教程

C A T I A有限元高级划 分网格教程 -CAL-FENGHAI.-(YICAI)-Company One1

CATIA有限元高级网格划分教程 盛选禹李明志 1.1进入高级网格划分工作台 (1)打开例题中的文件。 (2)点击主菜单中的【开始】→【分析与模拟】→【Advanced Meshing Tools】(高级网格划分工具),就进入【Advanced Meshing Tools】(高级网格划分工具)工作台,如图1-1所示。进入工作台后,生成一个新的分析文件,并且显示一个【New Analysis Case】(新分析算题)对话框,如图1-2所示。 图1-1【开始】→【分析与模拟】→【Advanced Meshing Tools】(高级网格划分 工具) (3)在【New Analysis Case】(新分析算题)对话框内选择【Static Analysis】(静力分析)选项。如果以后打开该对话框的时候均希望是计算静力分 析,可以把对话框内的【Keep as default starting analysis case】(在开始 时保持为默认选项)勾选。这样,下次进入本工作台时,将自动选择静 力分析。 (4)点击【新分析算题】对话框内的【确定】按钮,关闭对话框。 1.2定义曲面网格划分参数 本节说明如何定义一个曲面零件的网格类型和全局参数。 (1)点击【Meshing Method】(网格划分方法)工具栏内的【高级曲面划分】按钮,如图1-3所示。需要在【Meshing Method】(网格划分方法)工 具栏内点击中间按钮的下拉箭头才能够显示出【高级曲面划分】按钮。

有限元网格划分的基本原则

有限元网格划分的基本原则 杜平安 《机械设计与制造》 划分网格是建立有限元模型的一个重要环节,它要求考虑的问题较多,需要的工作量较大,所划分的网格形式对计算精度和计算规模将产生直接影响。为建立正确、合理的有限元模型,这里介绍划分网格时应考虑的一些基本原则。 1网格数量 网格数量的多少将影响计算结果的精度和计算规模的大小。一般来讲,网格数量增加,计算精度会有所提高,但同时计算规模也会增加,所以在确定网格数量时应权衡两个因数综合考虑。 图1中的曲线1表示结构中的位移随网格数量收敛的一般曲线,曲线2代表计算时间随网格数量的变化。可以看出,网格较少时增加网格数量可以使计算精度明显提高,而计算时间不会有大的增加。当网格数量增加到一定程度后,再继续增加网格时精度提高甚微,而计算时间却有大幅度增加。所以应注意增加网格的经济性。实际应用时可以比较两种网格划分的计算结果,如果两次计算结果相差较大,可以继续增加网格,相反则停止计算。 图1位移精度和计算时间随网格数量的变化 在决定网格数量时应考虑分析数据的类型。在静力分析时,如果仅仅是计算结构的变形,网格数量可以少一些。如果需要计算应力,则在精度要求相同的情况下应取相对较多的网格。同样在响应计算中,计算应力响应所取的网格数应比计算位移响应多。在计算结构固有动力特性时,若仅仅是计算少数低阶模态,可以选择较少的网格,如果计算的模态阶次较高,则应选择较多的网格。在热分析中,结构内部的温度梯度不大,不需要大量的内部单元,这时可划分较少的网格。 2网格疏密 网格疏密是指在结构不同部位采用大小不同的网格,这是为了适应计算数据的分布特点。在计算数据变化梯度较大的部位(如应力集中处),为了较好地反映数据变化规律,需要采用比较密集的网格。而在计算数据变化梯度较小的部位,为减小模型规模,则应划分相对稀疏的网格。这样,整个结构便表现出疏密不同的网格划分形式。 图2是中心带圆孔方板的四分之一模型,其网格反映了疏密不同的划分原则。小圆孔附近存在应力集中,采用了比较密的网格。板的四周应力梯度较小,网格分得较稀。其中图b中网格疏密相差更大,它比图a中的网格少48个,但计算出的孔缘最大应力相差1%,而计算时间却减小了36%。由此可见,采用疏密不同的网格划分,既可以保持相当的计算精度,又可使网格数量减小。因此,网格数量应增加到结构的关键部位,在次要部位增加网格是不必要的,也是不经济的。

相关文档
相关文档 最新文档