2_Duffey_ICONE_19._Osaka2011.ver01.duffey

Fukushima and Macondo are parallel happenings in two major energy industries with similar themes and consequences

Character: extreme events with large public reaction

Risk and Impacts: all underestimated beforehand

Perceived: major disasters harming environment

Consequence: massive damage to reputation and companies

Reaction: new safety requirements and inspections

Enhanced: emergency preparedness

Implications: widespread

We need to be able to explain data, improve safety, reduce risk and make predictions

Some simple questions to pose and try to answer:

• What is the risk of a major accident?

• When technology or designs change, how does safety change?

• What do the past events imply?

• How can predictions be made?

• What risks are tolerable or acceptable?

• How are these risks to be managed?

• How safe are the operating crews?

• What is a cost-effective improvement?

• What about unknowns?

• What is the present knowledge?

• What, if anything, should be done differently in the future?

• How can or should the industry operate?

• …?

Fundamental idea and postulate

• Risk is caused by uncertainty, and the measure of uncertainty is probability

• Modern systems and structural failures do not just involve mechanics, components and statistics

• All modern systems include people whose contribution dominates, thus making failures complex, while barriers will be penetrated

• To understand and predict failures it is essential to include people: their actions, mistakes, skills, decisions, responses, learning and motivation

• Therefore, we must explicitly include learned behavior(s) with increasing experience and risk exposure

• Based on systems outcome data, we developed a unifying emergent theory of learning, thus avoiding excessive complication

• Treat all outcomes as occurring with some uncertainty (probability) and hence predictability

• Also treat rare events, “fat tails” and unknowns as a minimum attainable rate

• Aim is to predict and hence manage future risks and their consequences

UNRESTRICTED / ILLIMITé

Managing Risk:

Elements of a General Emergent Theory

• All failures include the human contribution, and we all (systems and individuals) follow a learning curve

• “Rare” events occur or re-occur on average at about the same maximum interval achieved by all other modern systems (universality of failure)

• It is all about predicting probability, where the “Fat Tail” is due to the human contribution

• Failure predictions, including rare or unknown events, can be described with the same methods and measures used for all existing and known homo-technological systems

• With future (increasing) risk exposure/experience, extrapolations of standard statistical “power laws” and Pareto distributions will grossly underpredict risk (missing unknowns, black swans and the risk plateau)

• The relevant risk exposure and experience measures must be chosen to provide relative predictions of risk (uncertainty), failure and learning trends

What about catastrophic failures: Random? Human? Tolerable? Avoidable? Predictable?

What do such unexpected failures all have in common, apart from costing billions?

All failures include the inseparable human element: we design systems to assumed failure modes, safety margins and accident scenarios, with added safety precautions, and then operate them until some unforeseen failure occurs. Why are we then surprised?

The black balls are observed outcomes: what can we learn from the rare and the unexpected?

Risk is measured by our uncertainty, and the measure of uncertainty is probability

Are there “Tolerable Risk” boundaries?

The Learning Hypothesis

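Humans learn from their mistakes, continually correcting errors and their mental “rules” based on experience, as an inseparable part of the total system.

The rate of decrease of the outcome rate, λ, with experience or risk exposure, ε, is taken as proportional to the rate, so

$-\,d\lambda/d\varepsilon \propto \lambda$

With always a finite minimum rate, λm, and a learning constant, k,

$d\lambda/d\varepsilon = -k(\lambda - \lambda_m)$

Integrating gives the solution to the Minimum Error Rate Equation as a rate that decreases exponentially as

$\lambda(\varepsilon) = \lambda_m + (\lambda_0 - \lambda_m)\exp(-k\varepsilon)$

where λ0 is the rate at the start of the experience being counted (ε = 0).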
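The learning hypothesis above can be sketched numerically. The following is a minimal illustration, assuming the exponential solution λ(ε) = λm + (λ0 − λm) exp(−kε) with experience counted from zero; the function name `mere_rate` and the numerical values of λ0, λm and k are hypothetical and chosen only to show the shape of the curve, not taken from the slides.

```python
import math


def mere_rate(eps, lam0, lam_min, k):
    """Outcome rate after accumulated experience eps (exponential MERE form)."""
    return lam_min + (lam0 - lam_min) * math.exp(-k * eps)


# Hypothetical values, for illustration only (not from the source data).
LAM0, LAM_MIN, K = 1.0, 0.01, 3.0

# Check that the closed form satisfies d(lambda)/d(eps) = -k (lambda - lambda_m)
# with a central finite difference at one experience value.
eps, h = 0.5, 1e-6
slope = (mere_rate(eps + h, LAM0, LAM_MIN, K)
         - mere_rate(eps - h, LAM0, LAM_MIN, K)) / (2 * h)
expected = -K * (mere_rate(eps, LAM0, LAM_MIN, K) - LAM_MIN)

print("finite-difference slope:", slope)
print("-k (lambda - lambda_m): ", expected)
```

The two printed numbers should agree closely, which is just the statement that the exponential form solves the rate equation. The rate starts at λ0 and approaches, but never falls below, λm.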

The Paradoxes of Learning Lessons

• Paradox 1

Without having the events we want to avoid, we cannot learn

• Paradox 2

All events are preventable, but only afterwards

• Paradox 3

All events are acceptable to society, until they actually occur

• Paradox 4

Rare events do not allow prior learning, so surprise us all

Corollary

Systems and societies “behave” and reflect the humans learning, rule revising and error correcting within them, but regrettably having no “perfect learning” affects risk perception

Predicting failure: measure for experience and risk exposure varies with the system

System/Technology | Experience or Risk Exposure | Outcomes

Commercial Aircraft | Flight hours | Fatal crashes and near misses

Offshore Oil Rigs | Production amounts | Spills, fires and explosions

Power Grids | Outage duration | Probability and time of non-recovery

Rocket Launches | Launch count × burn time | Launch failure

Software/Procedures | Testing number or time | Faults and errors

Manufacturing and Market Share | Production or sales quantity | Product cost or price reduction

Commercial Aircraft Near Miss Rates

[Figure: Reported near miss rates, as rate per 100,000 h (IR), versus accumulated experience (MFhrs). Series: US NMAC (1987-1998), Canada Airprox (1989-1998), UK Airprox (1990-1999), and the NMAC learning curve model. Annotated level: 1 per 200,000 h. Data sources: FAA, CAA and TSB.]

The Learning Hypothesis describes the Universal Learning Curve that the Data Show

$E^* = \exp(-3N^*)$

See paper and references for details and list of systems studied
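As a quick illustration of the universal curve quoted above, the sketch below simply evaluates E* = exp(−3N*) at a few points. It assumes that E* is a non-dimensional error rate and N* a non-dimensional accumulated experience, both running from 0 to 1; that reading, and the function name `universal_learning_curve`, are assumptions made for this example rather than definitions given on the slide.

```python
import math


def universal_learning_curve(n_star):
    """E* = exp(-3 N*), the form quoted on the slide."""
    return math.exp(-3.0 * n_star)


# Evaluate at a few non-dimensional experience values (illustrative only).
for n_star in (0.0, 0.25, 0.5, 1.0):
    e_star = universal_learning_curve(n_star)
    print(f"N* = {n_star:4.2f}   E* = {e_star:.3f}")
```

On this reading the non-dimensional rate starts at 1 and has fallen to about 5% of that value by N* = 1, since exp(−3) ≈ 0.05.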

Need surrogate for experience and risk measure

Reflects what we know about our risk exposure and learning

Identical to Laws of Practice, so systems reflect people within them

[Figure: Laws of Practice learning data plotted against non-dimensional practice, t*. Data series: syllables, ball tosses, upside-down writing, typesetting, coding.]

Therefore, Practice = Experience, and repeated trials, t ≡ ε

Hence, external system outcomes reflect individual learning

Learning from experience: knowing the failure rate, the prior probability of failure uses standard reliability definitions

The outcome probability is just the cumulative distribution function (CDF), conventionally written as F(τ), the fraction that fails by τ, so:

$p(\tau) \equiv F(\tau) = 1 - \exp\left(-\int \lambda\, d\tau\right)$

where the failure rate is $\lambda(\tau) = h(\tau) = f(\tau)/R(\tau) = \{1/(1-F)\}\, dF/d\tau$, with $f(\tau) = dF/d\tau$.

Carrying out the integration from an initial experience, ε0, to any interval, ε, we obtain the probability of an outcome as the double exponential:

$p(\tau) = 1 - \exp\{(\lambda - \lambda_0)/k - \lambda_m(\tau - \tau_0)\}$

where, from the minimum error rate equation (the MERE), the failure rate is

$\lambda(\tau) = \lambda_m + (\lambda_0 - \lambda_m)\exp(-k(\tau - \tau_0))$

which reduces to $\exp(-k\tau)$ in the last factor when τ0 = 0.

Now λm is the lowest achievable rate, and λ(τ0) = λ0 at the initial experience, ε0, accumulated up to or at the initial outcome(s), and λ0 = 1/τ for the very first, rare or initial outcome, like an inverse “power law”.

In the usual engineering reliability terminology, for n failures out of N total, the failure probability is p(τ) = 1 − R(τ) = (number of failures)/(total number) = n/N, and the frequency is known only if both n and N are known (generally N is not known).
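The relation between the failure rate and the outcome probability can be checked numerically. The sketch below assumes the exponential MERE rate λ(τ) = λm + (λ0 − λm) exp(−k(τ − τ0)); integrating that rate from τ0 to τ gives (λ0 − λ(τ))/k + λm(τ − τ0), so the closed form used here is p(τ) = 1 − exp{(λ(τ) − λ0)/k − λm(τ − τ0)}. The function names and all numerical values are hypothetical and chosen for illustration only.

```python
import math


def failure_rate(tau, lam0, lam_min, k, tau0=0.0):
    """Exponential MERE failure rate, equal to lam0 at tau = tau0."""
    return lam_min + (lam0 - lam_min) * math.exp(-k * (tau - tau0))


def outcome_probability(tau, lam0, lam_min, k, tau0=0.0):
    """Closed-form CDF obtained by integrating the rate from tau0 to tau."""
    lam = failure_rate(tau, lam0, lam_min, k, tau0)
    return 1.0 - math.exp((lam - lam0) / k - lam_min * (tau - tau0))


def outcome_probability_numeric(tau, lam0, lam_min, k, tau0=0.0, steps=20000):
    """Same CDF from a trapezoidal integral of the failure rate."""
    width = (tau - tau0) / steps
    total = 0.5 * (failure_rate(tau0, lam0, lam_min, k, tau0)
                   + failure_rate(tau, lam0, lam_min, k, tau0))
    for i in range(1, steps):
        total += failure_rate(tau0 + i * width, lam0, lam_min, k, tau0)
    return 1.0 - math.exp(-total * width)


# Hypothetical parameters, for illustration only.
LAM0, LAM_MIN, K, TAU = 0.5, 0.01, 3.0, 2.0

closed = outcome_probability(TAU, LAM0, LAM_MIN, K)
numeric = outcome_probability_numeric(TAU, LAM0, LAM_MIN, K)
print("closed form:", closed)
print("numerical:  ", numeric)
```

Agreement between the two values confirms only that this closed form is the integral of this particular rate expression; it says nothing about whether the chosen parameters fit any real data set.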

The prior learning Human Bathtub

$p(\tau) = 1 - \exp\{(\lambda - \lambda_0)/k - \lambda_m(\tau - \tau_0)\}$

[Figure: the “Human Bathtub” curve. Vertical axis: probability of an organizational failure or an individual error (log scale). Horizontal axis: increasing experience for the homo-technological system (HTS). Annotations along the curve: the initial or first event has a purely random occurrence; we descend the curve by learning from experience, thus reducing the chance or risk; the bathtub bottom or minimum risk is eventually achieved; eventually, when very large experience is attained, we climb up the curve again because we must have an event. © R B Duffey & J W Saull 2004]

Challenge: Predicting failures with little or no data

Now as experience is gained and learning occurs, the failure rate falls to the minimum achievable, λm, and eventually we reach a lifetime probability or service lifetime as the risk exposure measure, τ → T.

For illustrative convenience, we take the maximum lifetime, T, as corresponding to an equal (50:50) chance of failure or survival. This half lifetime is given when p(T) ~ 0.5, where

$p(T) = 1 - \exp(-\lambda_m T)$

The equal chance of failure or survival then occurs when $\exp(-\lambda_m T) = 0.5$, or $\lambda_m T = -\ln 0.5 = 0.693$, that is, at a service half-life or accumulated risk exposure of

$T \approx 0.69/\lambda_m$

The maximum half lifetime, T, or “likely service life” before failure, is therefore inversely proportional to the minimum attainable failure rate per unit of past experience.

So all we have to do is provide a lower-bound estimate for the failure rate, λm; if and as additional failure data are gathered, the attainable failure rate can always be updated to reflect this additional experience and/or risk exposure.

To determine the minimum failure rate, λm, we can adopt the classical approach of using data from analogous systems with human involvement.

Because of the common and dominant human contribution, the failure rate of modern systems is inherently applicable to other similar systems, and can be used as a basis for prediction based on what we already know.
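The half-life estimate on this slide is a one-line calculation, sketched below. It assumes the constant-rate limit p(T) = 1 − exp(−λm T) used above; the value of λm is a hypothetical placeholder, and T comes out in whatever unit of experience or risk exposure λm is expressed per.

```python
import math


def half_life(lam_min):
    """Exposure T at which failure and survival are equally likely."""
    return math.log(2.0) / lam_min


def survival_probability(exposure, lam_min):
    """Chance of no failure up to the given exposure at constant rate lam_min."""
    return math.exp(-lam_min * exposure)


# Hypothetical minimum attainable rate per unit of experience (illustrative).
LAM_MIN = 5e-6

T = half_life(LAM_MIN)
print("half-life T:", T)
print("survival probability at T:", survival_probability(T, LAM_MIN))
```

Because T scales as 1/λm, halving the estimated minimum rate doubles the likely service life, which is why the lower-bound estimate of λm matters so much when little or no failure data exist.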

[Figure: oil spill rates, IR (S/Mtoe). Legend: Tanker Spills (S/Mtoe); Spills >1000 gals (S/Mtoe); MERE fit, IR (S/Mtoe) = 0.018 + 0.4 exp(−accMtoe/21600). Data sources: US Coast Guard Polluting Incident Compendium, 2003; US DOE EIA, US Trade Overview, 2003.]
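The MERE fit quoted in the figure legend can be evaluated directly. The sketch below uses only the coefficients given there, IR = 0.018 + 0.4 exp(−accMtoe/21600); reading 0.018 as the minimum rate λm, 0.418 as the initial rate λ0 and 1/21600 as the learning constant k is an interpretation made for this example, and the function name `spill_rate` is hypothetical.

```python
import math


def spill_rate(acc_mtoe):
    """Fitted spill rate IR (S/Mtoe) at accumulated experience acc_mtoe."""
    return 0.018 + 0.4 * math.exp(-acc_mtoe / 21600.0)


# Evaluate the fit at a few accumulated-experience values (illustrative).
for acc in (0, 21600, 64800, 216000):
    print(f"accMtoe = {acc:7d}   IR = {spill_rate(acc):.4f} S/Mtoe")
```

The fitted rate starts at 0.418 S/Mtoe and levels off toward the 0.018 S/Mtoe floor as experience accumulates, which is the "minimum attainable rate" behaviour the preceding slides describe.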