文档库 最新最全的文档下载
当前位置:文档库 › (2011)Correlated-Input Secure Hash Functions

(2011)Correlated-Input Secure Hash Functions

Correlated-Input Secure Hash Functions

Vipul Goyal1,Adam O’Neill2?,and Vanishree Rao3?

1Microsoft Research,India,vipul@https://www.wendangku.net/doc/c316693354.html,

2University of Texas at Austin,adamo@https://www.wendangku.net/doc/c316693354.html,

3University of California,Los Angeles,vanishri@https://www.wendangku.net/doc/c316693354.html, Abstract.We undertake a general study of hash functions secure under

correlated inputs,meaning that security should be maintained when the

adversary sees hash values of many related high-entropy inputs.Such

a property is satis?ed by a random oracle,and its importance is illus-

trated by study of the“avalanche e?ect,”a well-known heuristic in cryp-

tographic hash function design.One can interpret“security”in di?erent

ways:e.g.,asking for one-wayness or that the hash values look uniformly

and independently random;the latter case can be seen as a generalization

of correlation-robustness introduced by Ishai et al.(CRYPTO2003).We

give speci?c applications of these notions to password-based login and

e?cient search on encrypted data.Our main construction achieves them

(without random oracles)for inputs related by polynomials over the in-

put space(namely Z p),based on corresponding variants of the q-Di?e

Hellman Inversion assumption.Additionally,we show relations between

correlated-input secure hash functions and cryptographic primitives se-

cure under related-key https://www.wendangku.net/doc/c316693354.html,ing our techniques,we are also able to

obtain a host of new results for such related-key attack secure crypto-

graphic primitives.

1Introduction

In practice is often useful to view a cryptographic hash function like a random oracle,as formalized in the random oracle model[6].However,as random oracles do not exist in reality(and indeed,in general the random oracle model may lead to insecure schemes[13]),an important line of research suggested by[13]seeks to formalize various useful properties satis?ed by a random oracle and construct hash functions meeting them under standard assumptions.In this paper,we do so for what we call correlated-input security,meaning that(various notions of)security should be maintained when the adversary sees hash values of many related high-entropy inputs.

The importance of correlated-input security in practice is illustrated by the so-called avalanche e?ect,a well-known heuristic in cryptographic hash function design.(The name“avalanche e?ect”was coined by Feistel[17],although the idea goes back to Shannon’s notion of di?usion[26].)Roughly,the avalanche e?ect states that making any change to an input should result in a drastically ?Work done in part while at Microsoft Research,India.

di?erent hash value.Clearly,such a hash function should satisfy a notion of correlated-input security.Our results help to shed light on whether or not this is feasible from a theoretical perspective.

1.1Notions of Correlated-Input Security

We get di?erent speci?c notions of correlated-input security depending on how we interpret“security.”We?rst discuss the di?erent interpretations we consider and and how we formalize the resulting notions.

Three Notions The?rst and most basic interpretation we consider is“one-wayness.”To formalize one-wayness under correlated-inputs,we consider a hash function H and circuits C1,...,C n,where each C i takes as input some random coins and outputs a point in the domain of H.The adversary is given hash values H(x1),...,H(x n)where each x i is the output of C i(r)for random coins r.Note that each C i is run on the same random coins.Therefore the x i are correlated.4 The adversary’s goal is to output any one of the x https://www.wendangku.net/doc/c316693354.html,rmally,we say that H is one-way under correlated inputs for a class of circuits{C}if for any n and any choice of C1,...,C n from{C},any e?cient adversary succeeds with at most negligible probability.

The next interpretation we consider is“unpredictability.”To formalize unpre-dictabililty under correlated-inputs,we consider a hash function H and circuits C1,...,C n+1,where each C i is as before.Now the adversary is given hash values H(x1),...,H(x n)and tries to output H(x n+1),where each x i is the output of C i(r)as before.The notion is de?ned for a class of circuits{C}analogously to the one-wayness case.It mainly serves as a stepping-stone to our?nal notion.

Finally,the last interpretation we consider is“pseudorandomness.”To for-malize pseudorandomness under correlated-inputs,we consider a hash func-tion H and circuits C1,...,C n+1,each C i is as before.Now the adversary is given hash values H(x1),...,H(x n)as well as a“challenge”value that is either H(x n+1)or a random string of appropriate length,where each x i is the output of C i(r)as before.(This of course requires the circuits to have distinct outputs.) Again,the notion is de?ned for a class of circuits{C}analogously. Discussion We make a few observations about these notions.One is that they are only achievable for a class of circuits{C}such that C(r)for random r has su?cient min-entropy for any C in the class.In fact,it is not hard to show that a random oracle satis?es our notions for the class of all such circuits.However,in the standard model they are in general only achievable by a keyed hash function H.To see this,?x an unkeyed hash function H and consider circuits C1,C2where C1(r)outputs r and C1(r)outputs H(r).Clearly,no?xed H is even one-way under correlated-inputs for these circuits.By a similar argument,the circuits must not depend on the choice of the hash key.We stress that in our notions the hash function key is public.Similar counter-examples show that in general 4For example,the x

’s might agree in most bit positions but vary in the others.It

i

may even be the case that a single input x i completely determines the rest.

pseudorandom generators and functions do not meet our notions(note that the latter has a secret key);we give details in the full version.

Another point is that when considering non-uniform inputs,these notions are non-trivial to achieve even in the case of a single input.However,this can be done;for example[27]considers one-way functions for a non-uniform input and[16]considers pseudorandom generators(that expand the input,which we do not require)for non-uniform seed.Additionally,we note for any a priori bounded number of circuits,pseudorandomness under correlated-input can be met even under information-theoretic indistinguishability(where the key-size depends on the bound).This follows from a generalization of the Leftover Hash Lemma[24, Lemma6.1](which follows[22,Lemma3.2]).On the other hand,the focus of our work is on correlations among an unbounded number of inputs.

1.2Applications

We now discuss some speci?c practical applications for our new notions. Password-based login An application of our notion of one-wayness under correlated-inputs is password-based login.For example,UNIX maintains a“password”?le that,for each user in the system,stores a hash of their password that is com-pared against the hash of a candidate password supplied at login by someone claiming to be this user.The goal is to prevent an adversary with access to the password?le from gaining the ability to impersonate a https://www.wendangku.net/doc/c316693354.html,rmally,it is often said that the property of the hash function needed to ensure this is one-wayness.But the standard notion of one-wayness is obviously insu?cient here. Passwords,while they should contain entropy,are certainly not uniformly ran-dom.Moreover,passwords are typically correlated,both across di?erent users, and across the same user on di?erent systems(and the adversary may recover the password?le for multiple systems).

This issue seems to be largely ignored in prior work.A paper(which we already mentioned)that considers the relevance of one-way functions for high entropy inputs to this application is[27];however,they do not consider multi-ple related inputs and relations among them.Our notion of one-wayness under correlated input seems to be an appropriate security notion for this application.5 E?cient search on encrypted data An application of our notion of pseudoran-domness under correlated-inputs is e?cient search on encrypted data.It is be-coming increasingly common for companies to store large amounts of data re-motely on servers maintained by an untrusted third party.To provide privacy for 5For simplicity,this ignores“salting”the passwords,which can be viewed as consider-ing a randomized hash function.This may make the problem easier for an approach based on the Leftover Hash Lemma(using a similar argument to[24,Lemma6.1]), but as discussed in[27]such an approach is impractical due to its“entropy loss.”

For our approach it does not make the problem signi?cantly easier,and anyway it circumvents the core issue that in practice a(deterministic)cryptographic hash function is assumed to satisfy correlated-input security;indeed,salting passwords is only meant to slow down dictionary attacks.

the client,the data should be encrypted.However,we still want to allow search

on the data without retrieving and decrypting the entire database.Techniques

like public-key encryption with keyword search[11]make search possible,but it

takes linear time in the database size.On the other hand,practitioners require

search time to be comparable to that for unencrypted data.

This problem was?rst studied from a cryptographic perspective by Bellare

et al.[2],who introduced deterministic encryption and the more general concept

of e?ciently searchable encryption(ESE)as a solution.The basic idea is to

attach a hash of each keyword to an encrypted?le.Keywords are obviously not

uncorrelated,and thus our notions are natural to apply.In fact,if a hash function

meets our notion of pseudorandomness under correlated-inputs,then it can be

“bootstrapped”to hide all partial information information by encrypting a hash

value under the one-way trapdoor permutation based deterministic encryption

scheme of[4](actually,a one-way permutation without a trapdoor su?ces here,

since we do not need to decrypt).

1.3Our Construction and its Security

Next we turn to whether our security de?nitions can be achieved and under what

cryptographic assumptions.

Our Construction We propose the following construction:Letting G be a group

of prime order p,the hash key is a random generator g∈G and random c∈Z p,

and the evaluation of the hash on input x∈Z p is g1/(x+c)where1/(x+c)

denotes the inverse of x+c modulo p.

Security Analysis We show that this construction is secure under each of our

three notions of security assuming(appropriate variants of)the q-Di?e-Hellman

inversion assumption(q-DHI).Roughly speaking,this assumption says that given gα,gα2,...,gαq,it is hard to compute g1/α.An assumption of this form

was?rst introduced by Boneh and Boyen[9],who considered it in groups with a

a bilinear map(pairing)e and asked that it be hard to compute(or distinguish

from random)e(g,g)1/x instead of g1/x.However,since our hash function is de-

terministic and thus automatically publically veri?able,we do not need bilinear

maps here.

The class of circuits we consider in our proofs are the ones that are(ef-

?ciently)representable by a polynomial over Z p(the input space of the hash

function).In other words,each x i=p i(x,y,z,...)where p i is a polynomial over Z p the x,y,z,...are randomly chosen but?xed for all i.In our main theorems, we treat the case of univariate polynomials p i over Z p,but it is easy to extend

our proofs to multivariate polynomials as well.This is quite a broad class,in par-

ticular just considering univariate polynomials and taking p1to be the identity

polynomial covers the well-known attacks on RSA[15].

First,we show that our inversion hash is one-way under correlated inputs

for this class of circuits assuming the q-strong discrete logarithm(q-SDL)as-

sumption(which is weaker than q-DHI in that it need only be hard to compute

αitself).6For unpredictability and psuedorandomness,we require an additional assumption that each input to the hash function is individually uniform.7How-ever we stress that given other inputs,an input may even be fully computable. For this class of polynomials,we show the inversion hash is unpredictable un-der correlated inputs assuming https://www.wendangku.net/doc/c316693354.html,ing standard hardcore bit techniques this already gives a construction with small output length achieving pseudo-randomness under correlated inputs with the same assumption.However,we directly show pseudorandomness under correlated inputs of our construction for the same class of polynomials assuming the decisional version of q-DHI. Discussion One may notice the similarity of our construction to the Boneh-Boyen short signature scheme[10].Our proof techniques build on those intro-duced in[10],but note that,as opposed to the latter,we focus on a setting where there is no secret key,and we use the q-DHI assumption instead of the q-SDH assumption of[10].Indeed,our proofs use some new ideas.In particular,the role of c in our reductions for unpredictability and pseudorandomness is completely di?erent from that of the message in[10].

We also note that our security proofs are under a notion of“selective”security where the circuits that sample the inputs do not depend on the public hash key. As we mentioned,in general this restriction is inherent.However,for restricted classes of circuits(such as arithmetic circuits we consider)which are not able to e?ciently compute the hash function in question,it may be possible to achieve an adaptive notion of security even by using an unkeyed hash function.We discuss a positive result for this case below.

Finally,we note that our construction as de?ned is not compressing.How-ever,once we obtain a construction meeting any of our notions,it is easy to obtain one which is also compressing.In the case of one-wayness we can apply a collision-resistant hash to the output,and in the case of pseudorandomness we can truncate the output.It can be shown that the resulting(compressing)hash function retains correlated-input security.

1.4Relations to Related-Key Attacks

Security under related-key attacks(RKA),?rst formalized by[5]in the context of pseudorandom functions/permutations,is a well-established notion that,like correlated-input security,asks for security to be maintained under related val-ues of a“secret”input.We explore elations between pseudorandomness under correlated-inputs secure hash(which we simply call CI-secure hash below)and RKA-security of various cryptographic primitives.

6In fact,for this result the c component of the hash key can be any?xed element in Z p(for instance,set c=0).

7This translates to the requirement that the polynomials individually have uniform output on uniform input(e.g.,this is the case for permutation polynomials).By making non-standard assumptions,it may be possible to drop this restriction.How-ever,considering individually uniform but correlated inputs to the hash function is natural,and we focus on results under standard assumptions.

Equivalence to One-Input RKA-Scure wPRF We?rst observe that a CI-secure hash is in some sense equivalent to what we call a“one-input RKA-secure weak pseudorandom function(wPRF).”To de?ne such a wPRF F,the adversary is given an input-output pair x,F(K,x)for random x and may query for for other outputs of the form F(?(K),x)for relations?of its choosing.(Note that the same x is re-used each time.)Following[20],we note that as compared to a CI-secure hash,the role of the key and the input are simply“switched.”8 Note that a one-input RKA-secure wPRF is implied by an RKA-secure PRF; in fact,if we start with an RKA-secure PRF(rather than wPRF),the resulting CI-secure hash does not need a public key.The latter is signi?cant because[3] gives RKA-secure PRFs,in particular one under the decisional Di?e-Hellman assumption for adaptively chosen,group-induced relations(i.e.,multiplication by a constant).We thus obtain an unkeyed adaptively-secure CI-secure hash for the corresponding class of circuits.

A General Transformation for RKA Security Addtionally,we propose a trans-formation to“bootstrap”any cryptographic primitive to one that is RKA-secure: simply hash the coins used to generate the secret key for the former.(This can be seen as replacing a RO in this transformation with a CI-secure hash.)Note,how-ever,that in the case the CI-secure hash has a public key,an authentic version of the latter is then needed by any algorithm that uses the secret key(e.g.,the signing algorithm for a signature scheme),which may not always be practical. Additionally,while our main construction of CI-secure hash is selectively secure, leading to selective security under RKA(i.e.,relations chosen before seeing the public parameters).However,by using our techniques in a non-blackbox way, we can sometimes achieve adaptive security instead and without any public key. In particular,we show how to do this for RKA-secure symmetric encryption,a primitive introduced concurrently to our work in[1];9.In this case,we are even able to handle the entire class of(non-zero)polynomial relations.

1.5Other Related Work

Our work is related to several other lines of research,as we now discuss. Realizing Random Oracles As we mentioned,our work can be seen as extending a research agenda proposed by Canetti,Halevi,and Goldriech[13],in which one identi?es and realizes useful properties of a random oracle(RO).Indeed, by using the techniques of[2,Theorem5.1]one can show that a RO meets all the security de?nitions we consider(which is why we have sought realizations under standard cryptographic assumptions).Other useful properties of a RO that have undergone a similar treatment include perfect one-wayness[12,14] and non-malleability[7]We note that none of these works address security under multiple correlated inputs.

8Note,however,that CI-input security is more general than RKA-security in that it considers inputs sampled from a common“history.”

9We obtained this result after seeing[1],but the rest of our work was concurrent.

In fact,a signi?cant prior work of Ishai et al.[21,De?nition1]considered a notion of“correlation-robustness”for pseudorandom hash functions,with the motivation of instantiatng the RO in their oblivious transfer protocols.Their notion is more restrictive than our notion of pseudorandomness under correlated-input,as the former is de?ned only for hash functions that output a single bit and considers inputs obtained by computing the exclusive-or’s of a“master”random input s with public random values.

Finally,we note that while realizing the“avalanche e?ect”satis?ed by a RO forms a major motivation for considering correlated-input security,it is not the only way the latter could be formalized.In particular,it talks only about the change in the output behavior relative to any change to an unknown input.The notion of“(multiple input)correlation intractability”due to[13]is a possible formalization the e?ect without this restriction.On the other hand,the latter notion seems harder to work with and more di?cult to achieve.

Deterministic Encryption Our security notions can be viewed as relaxations or variants of the notions of privacy proposed for deterministic encryption(DE) in[2,8,4,24](that seek to hide partial information)in the case of hash functions rather than encryption schemes.Indeed,the results of of[2,8,4,24]show that in some sense the“hard part”of realizing DE without random oracles is dealing with correlations among the inputs.Our work studies this issue at a more basic level,asking whether it is feasible even without supporting decryption and for weaker security notions like one-wayness.

Related Security Notions Recently,Rosen and Segev[25]introduced the notion of correlated-product secure trapdoor functions(TDFs).Correlated-product se-curity was later considered for hash functions in[20].Correlated-product security di?ers from our notions in that the former refers to security when related inputs are evaluated under independent instances of the function;in other words,there does not exist a single function which is evaluated on related inputs(as is the case for correlated-input security).Indeed,our techniques are quite di?erent and unrelated to those in[25,20].

The recent work of Goldenberg and Liskov[19]also considers a form of correlated-input security(which they call“related-secret”security)for various primitives.While their work has some similarities to ours(for example,they con-sider“related-secret”one-way functions),there are some important di?erences. Namely,they focus on hardcore bits and pseudorandom functions rather than hash functions,and they follow the de?nitional framework for RKA-security in-troduced in[5].As mentioned above,this de?nitional framework is less general than ours.Additionally,their results are mainly negative.

2Preliminaries

Notations.Let x$←?X denote the operation of selecting a random element x from X.Let by x←y denote the assignment of a value y to x.Let|X|denote the size of the set X,and|x|denote the length of the string x.Letλdenote the security parameter.Let FF(K,D,R)be the set of all families of functions

F:K×D?→R.For sets X,Y,let Fun(X,Y)be the set of all functions mapping X to Y.For brevity,we say that an algorithm outputs a set/function as a shorthand to mean that it outputs their descriptions.

Complexity Assumptions.We?rst state our complexity assumptions,namely q-DHI[9,10],which is weaker than q-BDHI[9]as well as q-SDH[10],and we introduce what we call the q-Strong Discrete Logarithm(q-SDL)assumption which is weaker than all of q-DHI,q-BDHI,and q-SDH assumptions.

Let GrpGen be a PPT algorithm that takes as input the security parameter 1λand outputs parameters for some cyclic multiplicative group G,including the group order p which is a poly(λ)-bit integer,a generator g,and an e?cient algorithm(e.g.,circuit)for multiplication(and thus also exponentiation).We denote it as(G,p,g)←GrpGen(1λ).

q-Strong Discrete Logarithm(q-SDL)Problem The q-SDL problem in G is de-?ned as follows:given a(q+1)-tuple(g,g x,g x2,...,g x q)∈(G?)q+1for some unknown x∈Z?p,output x.

An algorithm A solves the q-SDL problem in the group G with advantage?if SDL Adv A,q:=Pr[A(g,g x,g x2,...,g x q)=x]≥?

where the probability is over the random choice of generator g∈G?,the random choice of x∈Z?p,and the random bits consumed by A.

De?nition1.We say that the(q,t,?)-SDL assumption holds in G(or Grp-Gen satis?es the(q,t,?)-SDL assumption)if no probabilistic t-time algorithm has advantage at least?in solving the q-SDL problem in G.

It is easy to see that the1-SDL assumption is equivalent to the standard Discrete Logarithm assumption.

q-Di?e-Hellman Inversion(q-DHI)Problem The q-DHI problem[9,10]in G is de?ned as follows:given a(q+1)-tuple(g,g x,g x2,...,g x q)∈(G?)q+1for some unknown x∈Z?p,output g1x∈G.

An algorithm A solves the q-DHI problem in the group G with advantage?if DHI Adv A,q:=Pr[A(g,g x,g x2,...,g x q)=g1x]≥?

where the probability is over the random choice of generator g∈G?,the random choice of x∈Z?p,and the random bits consumed by A.

De?nition2.We say that the(q,t,?)-DHI assumption holds in G(or Grp-Gen satis?es the(q,t,?)-DHI assumption)if no probabilistic t-time algorithm has advantage at least?in solving the q-DHI problem in G.

q-DHI Problem can be equivalently stated as follows[10]:given a(q+2)-tuple(g,g x,g x2,...,g x q,c)∈(G?)q+1×Z p\{x}for some unknown x∈Z?p, output g1x+c∈G.This de?nition also clearly points out the distinction between the q-DHI problem and the q-SDH problem:in case of the q-DHI problem,the value of c is prescribed in the problem instance itself,whereas in case of the q-SDH problem,the solver is free to choose c∈Z p and output a pair,(c,g1x+c). Obviously,the q-DHI assumption is weaker than the q-SDH assumption.

Decisional q-Di?e-Hellman Inversion(Decisional q-DHI)Problem The deci-

sional q-DHI problem in G is de?ned as follows:given a(q+1)-tuple(g,g x,g x2,...,g x q)∈(G?)q+1for some unknown x∈Z?p,distinguish between g1x and a random ele-ment R$←?G.

An algorithm A solves the decisional q-DHI problem in G with advantage?if

DDHI Adv A,q:=|Pr[A(g,g x,g x2,...,g x q,g1x)=1]

?Pr[A(g,g x,g x2,...,g x q,R)=1]|≥?

where the probability is over the random choice of generator g∈G?,the random choice of x∈Z?p,the random choice of R∈G?,and the random bits consumed

by A.The distribution on the left is referred to as P DDHI and the distribution

on the right as R DDHI.

De?nition3.We say that the decisional(q,t,?)-DHI assumption holds in G

(or GrpGen satis?es the decisional(q,t,?)-DHI assumption)if no probabilistic

t-time algorithm has advantage at?in solving the decisional q-DHI problem in

G.

3Our Model:Correlated Input Security

In this section,we de?ne our new notion of security for cryptographic hash func-tions.We would be interested in preserving various properties of hash functions

(like one-wayness and pseudo-randomness)when the function maybe evaluated

on a tuple of inputs which maybe be correlated in an arbitrary way.Standard notions of security do not provide any guarantee in such a setting.In the full version,we discuss examples of functions which are secure in the standard sense

but may be completely insecure when evaluated on multiple inputs which are correlated.

Before we go further,we?rst discuss how we represent correlations among a

tuple of inputs(m1,...,m n).In general,such an input tuple maybe generated by

a polynomial-size sampler circuit Samp.In other words,Samp takes a random

tape r(of appropriate length)as input such that(m1,...,m n)←Samp(r).Note

such a sampler circuit can generate the input tuple for any type of polynomial-

time computable correlations.Equivalently,one can generate the(correlated)

tuple of inputs using a tuple of polynomial-size circuits(C1,...,C n)when ini-tialized on the same random tape.In other words,?x a random string r and set

m i←C i(r).It is easy to see that both these sampling procedures are equivalent.

For the rest of the paper,we shall stick to the latter mode of using a tuple of circuits(for convenience,as will be clear later on).

Also,it will be understood that the range of every circuit considered is a subset of the input-space(or keyspace)in question.We refer to an adversary

as{C}-restricted,if its correlated-input circuits are restricted to the class of circuits{C}.

We?rst de?ne the syntax for a general function family(or a hash function

family if the functions are compressing).We will then move on to formalize the

various security properties such a function family might satisfy.

De?nition4(Function Family).A family of deterministic functions H is

speci?ed by a PPT algorithm Gen.The algorithm Gen,given input1λ,outputs a

parameter set I h,domain D h,and range R h,and outputs c∈I h as a description

of a function h c:D h?→R h.The sizes of the domain and range sets are each exponential in the security parameter.

Now,we shall discuss our?rst notion of security called correlated-input one-

https://www.wendangku.net/doc/c316693354.html,rmally,we consider a function h(·)such that given(h(m1),...,h(m n)),

where inputs(m1,...,m n)maybe correlated,it is hard for any PPT adversary

to output any valid preimage m i.This can be viewed as a generalization of the

standard notion of one-way functions.We allow the adversary to specify the

correlations to the challenger by giving a tuple of circuits(C1,...,C n),where

each circuit is from a class of correlated-input circuits{C}.Note that,for this

de?nition to be satis?able,each circuit C i,∈{C}individually should have high

min-entropy output distribution for uniform random input distribution.10Thus,

we quantify only over such circuits in our de?nition.We discuss it under both

selective and adaptive security notions.More details follow.

In the following we shall only formalize the selective notion,while we refer

the reader to the full version for de?nitions of the adaptive notions.

The Selective Correlated-Input Inverting experiment Exp sCI?inv

A,H,{C}.For a family of

deterministic functions H,an adversary A,and a family of e?ciently-computable correlated-input circuits{C},we de?ne the following game between a challenger and the adversary A.

–Setup Phase1.Challenger runs the Gen algorithm of H for a security parameter input1λand gets h c:D h?→R h.Challenger gives D h to A.

–Query Phase.A chooses a positive integer n(=poly(λ)),and gives to the challenger n circuits{C i}i∈[n]?{C}.

–Setup Phase2.Challenger gives h c(·)to A and chooses r,a uniform random string of appropriate length.

–Response Phase.?i∈[n],challenger responds via h c(C i(r)).

–Invert Phase.A outputs(?k,?y)for?k∈[n]and?y∈D h.

The output of the experiment is de?ned to be1if h c(?y)=h c(C?

k (r))and0

otherwise.

We de?ne the advantage of an adversary A in the above game as:

Adv sCI?inv

A,H,{C}(λ)=Pr[Exp sCI?inv

A,H,{C}

=1]

The probability is over the random bits used by the challenger and the adversary.

10This requirement is similar to one in the standard notion of one-way functions.If the input does not have su?cient min-entropy,it is easy to see that an adversary can guess a preimage and succeed with noticeable probability.

De?nition5.A family of functions H is said to be selective correlated-input one-way with respect to a family of correlated-input circuits{C},if for all A∈PPT there exists a negligible function negl,such that:

Adv sCI?inv

A,H,{C}

(λ)≤negl(λ)

We now consider two more correlated-input security notions where we talk about the unpredictability of the output as opposed to that of the https://www.wendangku.net/doc/c316693354.html,rmally, we consider a function h c:D h?→R h with the following properties.Consider a tuple of correlated inputs(m1,...,m n+1).The adversary is given the function outputs(h c(m1),...,h c(m n))and it tries to compute h c(m n+1).In the?rst security notion called correlated-input unpredictability(CI-unpredictability),we require that it should be hard for the adversary to output h c(m n+1).In the next notion called correlated-input pseudorandomness(CI-pseudorandomness), we require that the adversary should not be able to distinguish h c(m n+1)from a random element in R h,given(h c(m1),...,h c(m n)).It is easy to show that this notion of CI-pseudorandomness is equivalent to a notion where an adversary gets either(h c(m1),...,h c(m n+1))or n+1independent random elements in R h and is required to distinguish the two cases.Note that,for any of these notions to be satis?able,besides the requirement that each circuit C i∈{C}individually should have high min-entropy output distribution for uniform random input distribution,we also require that,for every two distinct circuits C i and C j in {C},and for a uniform random input r of appropriate length,C i(r)=C j(r) happens only with negligible probability over the choice of r.

In trying to give more power to the adversary(thus making our de?nition stronger),we allow the adversary to specify the correlation by giving a tuple of circuits(C1,...,C n+1),where C i∈{C},to the challenger,as before.In addition,for the selective case,for a tuple of inputs(m1,...,m n+1),we allow the adversary to get outputs on n adaptively chosen input indices of its choice before trying to predict the remaining output.11More details follow.

The de?nitions of the selective CI-pseudorandomness and adaptive notions of all the three notions are given in the full version.

The Selective Correlated-Input Predicting experiment Exp sCI?pred

A,H,{C}.For a family of

deterministic functions H,an adversary A,and a family of e?ciently-computable correlated-input circuits{C},we de?ne the following game between a challenger and the adversary A.

–Setup Phase1.Challenger runs the Gen algorithm of H for a security parameter input1λand gets h c:D h?→R h.Challenger gives D h to A.

11By incurring a security loss of a factor of n,this de?nition can actually be shown to be equivalent to a weaker de?nition where the adversary is required to predict the output speci?cally on input m n+1?xed after it presents its queries but before it reveives the responses.However,working directly with this de?nition might lead to better concrete security guarantees in the real world.

–Query Phase .A chooses a positive integer n (=poly (λ)),and gives to the challenger n +1distinct circuits {C i }i ∈[n +1]?{C }.–Setup Phase 2.Challenger gives h c (·)to A and chooses r ,a uniform random string of appropriate length.

–Partially Adaptive Query -Response Phase .A presents n queries,where an i th query is k i ∈[n +1].Challenger responds to it via h c (C k i (r )).–Predict Phase .The adversary outputs ?y ∈R h .

Let k n +1∈[n +1]be such that k n +1=k i ?i ∈[n ].The output of the experiment is de?ned to be 1if ?y =h c (C k n +1(r ))and 0otherwise.We de?ne the advantage of an adversary A in the above game as:

Adv sCI ?pred A ,H ,{C }(λ)=Pr[Exp sCI ?pred A ,H ,{C }=1]

The probability is over the random bits used by the challenger and the adversary.De?nition 6.A family of functions H is said to be selective correlated-input unpredictable with respect to a family of correlated-input circuits {C },if for all A ∈PPT there exists a negligible function negl ,such that:

Adv sCI ?pred A ,H ,{C }(λ)≤negl (λ)

4Proposed Construction

In the sequel,we give the construction of our function and prove that it is correlated-input secure for a class of polynomials over Z p (where p is a prime number)in the sense of each of the three selective security models de?ned above.

Our construction is given in Figure 1.Gen (1λ).Run GrpGen :(G ,p,g )←GrpGen (1λ),where p is a prime number.

Gen uniformly samples a random element c from Z p and a random generator

g of group G .It outputs g ,c and a function h :Z p ?→G de?ned by,

h (m ):=g 1m +c

for any m ∈Z p (where 1

m +c is computed mod p ).

Fig.1.Our Construction

Our proposed function is extremely simple and e?cient to compute.The cost of computation is dominated by a single exponentiation operation.The construc-tion can be seen as similar to a short signature scheme by Boneh and Boyen [10].Our main novelty can be seen in the proofs of security.Indeed,interestingly,our

proofs show that the original signature scheme of Boneh and Boyen is secure even if an adversary is allowed to obtain messages signed by various correlated secret signing keys,where the correlations are from a set of polynomials over Z p.We refer the reader to the full version for a detailed description of this implication.

4.1Analysis of the above Construction.

We prove that our construction is selectively secure against a class of correlations computable by polynomials over Z p.In what follows,we introduce some more notations that will be used in the rest of the paper and then discuss the three security games with correlated-input circuits computing polynomials over Z p. Notations Let deg[f(X)]denote the degree of a polynomial f(X)over Z p.If the output distribution of a polynomial is uniform in Z p(or,in other words,if the range of the polynomial is Z p itself),then we refer to the polynomial as uniform-output polynomial.On the other hand,if the output distribution of a polynomial has high min-entropy in Z p,then we refer to the polynomial as high-min-entropy-output polynomial(any non-zero polynomial of degree polynomial in the security parameter has a range of size exponential in the security parameter).We shall only consider polynomials of degree is at least1and polynomial inλ.We only state our theorems in the following and give the proofs in the full version.

4.2Selective Correlated-Input one-wayness

Theorem1.Suppose that(q,t′,?)-SDL assumption holds in G.Let{C}be a set of non-zero polynomials over Z p.Then,for H as in Figure1,there exists no

probabilistic t-time adversary A for which Adv sCI?inv

A,H,{C}(λ)is at least?provided

that

d≤q and t≤t′?Θ(nqτ)

where d=d(λ)upper bounds the sum of the degrees of the polynomials that A queries upon andτis the maximum time for an exponentiation in G and Z p.

4.3Selective Correlated-Input Unpredictability

Theorem2.Suppose that(q,t′,?′)-DHI assumption holds in G.Let{C}be a set of uniform-output polynomials over Z p.Then,for H as in Figure1,there

exists no probabilistic t-time adversary A for which Adv sCI?pred

A,H,{C}is at least?

provided that

d≤q+1,?≥2(n+1)?′and t≤t′?Θ(nqτ)

where d=d(λ)upper bounds the sum of the degrees the polynomials that A queries upon andτis the maximum time for an exponentiation in G and Z p.

The proof for the above theorem and for CI-pseudorandomness is given in the full version.

5Relations between CI-Security and RKA-Security

We conclude by examining relations beween correlated-input secure hash func-

tions and security under related-key attacks,whose formal treatement was ini-

tiated by[5].The latter asks that security of a cryptographic primitive(e.g.,

a pseudorandom function)maintains security when used with related secret

keys.In this section,we only consider our notion of pseudorandomness under

correlated-inputs,so“CI-security”below refers by default to this notion.

5.1Relations to RKA-Secure Weak PRFs

We start by showing an equivalence between CI-secure hash functions and some

form of RKA-security for weak PRFs we introduce.The idea,following[20],is

to“switch”the input and key for these primitives.

RKA-Secure Weak PRFs Recall that weak PRFs[23],as opposed to normal

ones,handle only random inputs.In de?ning RKA-security for this primitive,

a modeling choice we need to consider is whether,when the adversary queries

for a value of the function under a related key,a new random input is chosen

(or the previous random-input re-used).We give general de?nitions that capture

the possibilities.(However,for our results we only use a notion where the same

random input is re-used.)We also consider both“selective”and“adaptive”

security;in the former,the adversary chooses the relations applied to the secret

key before receiving any responses.

In de?ning RKA security for various primitives,we use a non-standard for-

malization based on our framework of circuits,which in this case sample keys.

This is for ease of comparison to our notion of CI-security.For example,by {C}-RKA-PRF we refer to an RKA-PRF where the secret keys are sampled ac-cording to circuits in{C}(executed on a common random input).Note that,in

this framework,RKA-security corresponds to a special case of CI-security where

the?rst circuit samples a random key K and and the remaining circuits operate

only on K(and not the coins used to sample it)to produce a related key.(The

latter is su?cient in the context of RKA-security.)

We also note that a PRF function family is speci?ed by an e?cient proba-

bilistic parameter-generation algorithm Gen prf which takes as input a security

parameter1λand outputs the description of a function including a description

of its keyspace,domain and range.However,for simplicity of exposition,we only

consider a single PRF function in most part of the following discussion as long

as there is no ambiguity.

De?nition7(q input?{C}-aRKA-wPRF.).Let F:K×D?→R be an e?-ciently computable function.Let{C}?Fun(K,K)be a set of RKD circuits.F is said to be q input?{C}-aRKA-wPRF,if,?A∈PPT:

Adv aRKA?wP RF

A,F,{C},q input (λ):=|Pr[k$←?K:A O weak

F(k,·)

(·,·)?→1]?Pr[k$←?K,G$←?

FF(K,D,R):A O weak

G(k,·)

(·,·)?→1]|

is negligible inλ,where the related-key-wprf oracle O weak

f(k,·)

(·)takes as input

(index i,C i)∈([q input],{C}),and outputs(x index

i ,f(C i(k),x index

i

)),where

x index

i

$

←?D.

The de?nition for selective version appears in the full version.

The Equivalence In the following,for a class of circuits{C}we show an equiv-alence between a one-input RKA-Secure wPRF for{C}and a CI-secure hash function for{C}.(The equivalence holds respectively in the cases of selective and adaptive security notions.)First,we construct a family of functions{F} from a family of functions H and show that if H is a{C}-aCI-pseudorandom function family(resp.{C}-sCI-pseudorandom function family)then{F}is a

1?{C}-aRKA-wPRF(resp.1?{C}-sRKA-wPRF).

Let H be a family of functions speci?ed by Gen.A family of functions{F}is

de?ned by the following parameter-generation algorithm:

Gen prf(1λ).Gen prf runs Gen(1λ)that outputs the description of a parameter

set D,c∈D,and a function h c:K?→R.Gen prf outputs K,D and R for

keyspace,domain and range,respectively,of a function F de?ned by,

F(k,x):=h x(k)

for any k∈K and x∈D.

Fig.2.Construction of1?{C}-(s/a)RKA-wPRF Family from{C}-(s/a)CI-pseudorandom Function Family

Theorem3.{C}-aCI-pseudorandom function family(resp.{C}-sCI-pseudorandom function family)implies1?{C}-aRKA-wPRF(resp.1?{C}-sRKA-wPRF).

The proof is given in the full version.

In the full version,we also give a construction of1?{C}-aRKA-wPRF(resp.

1?{C}-sRKA-wPRF)from{C}-aCI-pseudorandom function family(resp.{C}-

sCI-pseudorandom function family).

New CI-Secure Hash Functions As an application of the above equivalence,we obtain new CI-secure hash functions.In particular,note that an adaptive one-input RKA-secure wPRF for a class of circuits{C}is trivially implied by an RKA-Secure PRF for{C}.(Here,the latter is de?ned as expected,namely as in prior work except cast in our framework of circuits;see the full version for more details.)We can therefore use the recent constructions of RKA-Secure PRFs

by Bellare and Cash[3]to obtain adaptive CI-secure hash https://www.wendangku.net/doc/c316693354.html,ly,

the latter are secure under the standard DDH assumption for class of circuits computing multiplication by a group element or under exponential-hardness of DDH for addition by a group element.

Though the resulting CI-secure hash functions are secure for much weaker classes of relations as compared to our main construction,they are remarkable

in that they are both adaptively secure and do not need a public key.(They do

not even need any randomly-generated global parameters,as the constructions

of[3]work in a?xed group.)The latter is because in the case that we start with

an RKA-secure PRF(rather than wPRF),our construction of CI-secure hash

can be modi?ed by applying the PRF to any?xed value in the domain of the latter(still using the input as the key).

5.2CI-secure Functions Imply Other RKA-secure Primitives

In this section,we discuss a more general technique for building RKA-secure cryptographic primitives from a CI-secure hash function.The basic idea is to hash the coins used to generate keys for the the former,using a CI-secure hash function.

Informally,letΨbe a scheme for a cryptographic primitive.Let KeyGen be

a PPT algorithm forΨ,and let l(λ)be the length of the random string used

by it.Our transformation uses a{C}-pseudorandom function family H speci?ed

by Gen,and the transformation involves modifying KeyGen(1λ;r)where r$←?{0,1}l(λ)to KeyGen(1λ;h(r′))where r′$←?{0,1}t(λ)and(h:{0,1}t(λ)?→{0,1}l(λ))←Gen(1λ).The resulting scheme is then expected to be“{C}-RKA-secure”besides preserving the security properties of the underlying untrans-formed scheme.

More concretely,we exemplify the above technique for digital signatures.We give our formalization of RKA-security for signatures in the full version.

In what follows,we show that{C}-aCI-pseudorandom function family(resp. {C}-sCI-pseudorandom function family)implies{C}-aRKA-unforgeable scheme (resp.{C}-sRKA-unforgeable scheme).The transformation is given in Figure3.

Let H be a function family speci?ed by Gen.LetΣ′=(KeyGen′,Sign′,Verify′)

be a signature scheme and let l(λ)be the length of the randomness used in

the KeyGen′.The signature schemeΣ=(KeyGen,Sign,Verify)is de?ned by:

–KeyGen(1λ):(h:{0,1}t(λ)?→{0,1}l(λ))←Gen(1λ);sk$←?{0,1}t(λ);

(sk′,pk′)←KeyGen′(1λ;h(sk));output sk as the secret key,and

pk:=(h,pk′)as the public key.

–Sign(sk,m):Run KeyGen′(1λ,h(sk))to obtain sk′.The signature on

message m is set asσ←Sign′(sk′,m).

–Verify(pk,m,σ):Output valid if Verify′(pk′,m,σ)=valid and output invalid otherwise.

Fig.3.Construction of RKA-secure Signature Scheme from CI-secure Pseudorandom Functions

Theorem4.{C}-aCI-pseudorandom function family(resp.{C}-sCI-pseudorandom function family)implies{C}-aRKA-unforgeable scheme(resp.{C}-sRKA-unforgeable scheme).

The proof is given in the full version.

Discussion In the case that the starting CI-secure hash function has a public key,the above transformation results in a cryptographic primitive for which al-gorithms operating on the secret key also need to access an authentic public key. In some scenarios,e.g.smart cards,this may not always be practical.Moreover, our main construction of CI-secure hash is only selectively-secure,resulting in a selectively RKA-secure the cryptographic primitive.On the other hand,it is sometimes possible to use our techniques(in a“non-blackbox”way)to design an RKA-secure scheme without a public key and that is adaptively secure.In particular,we show this for RKA-secure symmetric-key encryption recently in-troduced in[1]in the full version.We also mention that using the CI-secure hash functions derived from the Bellare-Cash RKA-secure PRFs[3]avoid these issues,but for a much weaker class of relations.

Relation to Tampering Attacks We also note that RKA-security for a crypto-graphic primitive can also be used to to protect against tampering attacks[18], where,for instance,the secret key stored by a smart card is tampered with and its behavior is observed while it acts using the tampered secret,with an objec-tive of gaining advantage against the security of the functionality of the smart card when using the original secret.However,as discussed in[1],security against tampering attacks is easier to achieve in general,through some kind of“sanity check”on the secret key(for instance,by including a signature on the secret key as a part of the public key,which is veri?ed by any algorithm using the former); although,as discussed above,this approach may not always be practical.This does not work for RKA-security,since we actually want related secret keys to function like independently generated ones.

Acknolwedgements

We thank the anonymous reviewers of TCC2011for detailed comments and suggestions.The second author thanks David Cash for discussions about related-key security,and was supported in part by Brent Waters’grants NSF CNS-0915361and CNS-0952692.

References

1.Applebaum,B.,Harnik,D.,Ishai,Y.:Semantic security under related-key at-

tacks and applications.Cryptology ePrint Archive,Report2010/544(2010), https://www.wendangku.net/doc/c316693354.html,/

2.Bellare,M.,Boldyreva,A.,O’Neill,A.:Deterministic and e?ciently searchable

encryption.In:CRYPTO.pp.535–552(2007)

3.Bellare,M.,Cash,D.:Pseudorandom functions and permutations provably secure

against related-key attacks.In:CRYPTO.pp.666–684(2010)

4.Bellare,M.,Fischlin,M.,O’Neill,A.,Ristenpart,T.:Deterministic encryption:

De?nitional equivalences and constructions without random oracles.In:CRYPTO.

pp.360–378(2008)

5.Bellare,M.,Kohno,T.:A theoretical treatment of related-key attacks:Rka-prps,

rka-prfs,and applications.In:EUROCRYPT.pp.491–506(2003)

6.Bellare,M.,Rogaway,P.:Random oracles are practical:A paradigm for design-

ing e?cient protocols.In:ACM Conference on Computer and Communications Security.pp.62–73(1993)

7.Boldyreva,A.,Cash,D.,Fischlin,M.,Warinschi,B.:Foundations of non-malleable

hash and one-way functions.In:ASIACRYPT.pp.524–541(2009)

8.Boldyreva,A.,Fehr,S.,O’Neill,A.:On notions of security for deterministic en-

cryption,and e?cient constructions without random oracles.In:CRYPTO.pp.

335–359(2008)

9.Boneh.,D.,Boyen,X.:E?cient Selective-ID Secure Identity Based Encryption

Without Random Oracles.In:Advances in Cryptology–Eurocrypt.LNCS,vol.

3027,pp.223–238.Springer(2004)

10.Boneh,D.,Boyen,X.:Short signatures without random oracles.In:Cachin,C.,

Camenisch,J.(eds.)EUROCRYPT.Lecture Notes in Computer Science,vol.3027, pp.56–73.Springer(2004)

11.Boneh,D.,Crescenzo,G.D.,Ostrovsky,R.,Persiano,G.:Public key encryption

with keyword search.In:EUROCRYPT.pp.506–522(2004)

12.Canetti,R.:Towards realizing random oracles:Hash functions that hide all partial

information.In:CRYPTO.pp.455–469(1997)

13.Canetti,R.,Goldreich,O.,Halevi,S.:The random oracle methodology,revisited.

J.ACM51(4),557–594(2004)

14.Canetti,R.,Micciancio,D.,Reingold,O.:Perfectly one-way probabilistic hash

functions(preliminary version).In:STOC.pp.131–140(1998)

15.Coppersmith,D.,Franklin,M.K.,Patarin,J.,Reiter,M.K.:Low-exponent rsa with

related messages.In:EUROCRYPT.pp.1–9(1996)

16.Dziembowski,S.,Pietrzak,K.:Leakage-resilient cryptography.In:FOCS.pp.293–

302(2008)

17.Feistel,H.:Cryptography and computer privacy.Scienti?c American228(5)(1973)

18.Gennaro,R.,Lysyanskaya, A.,Malkin,T.,Micali,S.,Rabin,T.:Algorithmic

tamper-proof(atp)security:Theoretical foundations for security against hardware tampering.In:TCC.pp.258–277(2004)

19.Goldenberg,D.,Liskov,M.:On related-secret pseudorandomness.In:TCC.pp.

255–272(2010)

20.Hemenway,B.,Lu,S.,Ostrovsky,R.:Correlated product security from any one-way

function and the new notion of decisional correlated product security.Cryptology ePrint Archive,Report2010/100(2010),https://www.wendangku.net/doc/c316693354.html,/

21.Ishai,Y.,Kilian,J.,Nissim,K.,Petrank,E.:Extending oblivious transfers e?-

ciently.In:CRYPTO.pp.145–161(2003)

22.Kiltz, E.,Pietrzak,K.,Stam,M.,Yung,M.:A new randomness extraction

paradigm for hybrid encryption.In:EUROCRYPT.pp.590–609(2009)

23.Naor,M.,Reingold,O.:Synthesizers and their application to the parallel construc-

tion of pseudo-random https://www.wendangku.net/doc/c316693354.html,put.Syst.Sci.58(2),336–375(1999) 24.O’Neill, A.:Deterministic public-key encryption revisited.Cryptology ePrint

Archive,Report2010/533(2010),https://www.wendangku.net/doc/c316693354.html,/

25.Rosen,A.,Segev,G.:Chosen-ciphertext security via correlated products.In:TCC.

pp.419–436(2009)

26.Shannon,C.E.:Communication theory of secrecy systems.Bell System Technical

Journal28(4),656–715(1949)

27.Wagner,D.,Goldberg,I.:Proofs of security for the unix password hashing algo-

rithm.In:ASIACRYPT.pp.560–572(2000)

哈希表应用

附件4: 北京理工大学珠海学院 课程设计任务书 2010 ~2011学年第二学期 学生姓名:专业班级: 指导教师:工作部门: 一、课程设计题目 哈希表应用 二、课程设计内容(含技术指标) 【问题描述】 利用哈希表进行存储。 【任务要求】 任务要求:针对一组数据进行初始化哈希表,可以进行显示哈希表,查找元素,插入元素,删除元素,退出程序操作。 设计思想:哈希函数用除留余数法构造,用线性探测再散列处理冲突。 设计目的:实现哈希表的综合操作 简体中文控制台界面:用户可以进行创建哈希表,显示哈希表,查找元素,插入元素,删除元素。 显示元素:显示已经创建的哈希表。 查找元素:查找哈希表中的元素,分为查找成功和查找不成功。 插入元素:在哈希表中,插入一个元素,分为插入成功和失败。 删除元素:在已有的数据中,删除一个元素。 退出系统:退出程序。 【测试数据】 自行设定,注意边界等特殊情况。

三、进度安排 1.初步设计:写出初步设计思路,进行修改完善,并进行初步设计。 2.详细设计:根据确定的设计思想,进一步完善初步设计内容,按要求编写出数据结构类型定义、各算法程序、主函数。编译分析调试错误。 3.测试分析:设计几组数据进行测试分析,查找存在的设计缺陷,完善程序。 4.报告撰写:根据上面设计过程和结果,按照要求写出设计报告。 5.答辩考核验收:教师按组(人)检查验收,并提出相关问题,以便检验设计完成情况。 四、基本要求 1.在设计时,要严格按照题意要求独立进行设计,不能随意更改。若确因条件所限,必须要改变课题要求时,应在征得指导教师同意的前提下进行。 2.在设计完成后,应当场运行和答辩,由指导教师验收,只有在验收合格后才能算设计部分的结束。 3.设计结束后要写出课程设计报告,以作为整个课程设计评分的书面依据和存档材料。设计报告以规定格式的电子文档书写、打印并装订,报告格式严格按照模板要求撰写,排版及图、表要清楚、工整。 从总体来说,所设计的程序应该全部符合要求,问题模型、求解算法以及存储结构清晰;具有友好、清晰的界面;设计要包括所需要的辅助程序,如必要的数据输入、输出、显示和错误检测功能;操作使用要简便;程序的整体结构及局部结构要合理;设计报告要符合规范。 课程负责人签名: 年月日

Hash表的构建和冲突解决

哈希表概念及构建方法 一、哈希表的概念及作用 一般的线性表,树中,记录在结构中的相对位置是随机的,即和记录的关键字之间不存在确定的关系,因此,在结构中查找记录时需进行一系列和关键字的比较。这一类查找方法建立在“比较“的基础上,查找的效率依赖于查找过程中所进行的比较次数。 理想的情况是能直接找到需要的记录,因此必须在记录的存储位置和它的关键字之间建立一个确定的对应关系f,使每个关键字和结构中一个唯一的存储位置相对应。 哈希表最常见的例子是以学生学号为关键字的成绩表,1号学生的记录位置在第一条,10号学生的记录位置在第10条... 如果我们以学生姓名为关键字,如何建立查找表,使得根据姓名可以直接找到相应记录呢? a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 1 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 2 2 1 2 2 2 3 24 25 26 刘丽刘宏英吴军吴小艳李秋梅陈伟... 姓名中各字拼音首字母ll lhy wj wxy lqm cw ... 用所有首字母编号值相加求 和 24 46 33 72 42 26 ... 最小值可能为3 最大值可能为78 可放75个学生 用上述得到的数值作为对应记录在表中的位置,得到下表:

成绩一成绩二... 3 ... ...... 24 刘丽82 95 25 ... 26 陈伟 ... ... 33 吴军 ... ... 42 李秋梅 ... ... 46 刘宏英 ... ... 72 吴小艳 ... ... 78 ... 上面这张表即哈希表。 如果将来要查李秋梅的成绩,可以用上述方法求出该记录所在位置: 李秋梅:lqm 12+17+13=42 取表中第42条记录即可。 问题:如果两个同学分别叫刘丽刘兰该如何处理这两条记录? 这个问题是哈希表不可避免的,即冲突现象:对不同的关键字可能得到同一哈希地址。 二、哈希表的构造方法 1、直接定址法

该程序实现的哈希表构造哈希函数的方法为除留余数法(

一、该程序实现的哈希表:构造哈希函数的方法为除留余数法(函数modhash),处理哈希冲突的方法为链地址法。 二、对哈希表的操作:插入(函数hash_table_insert)、移除(函数hash_table_remove)、 查找(函数hash_table_lookup)、整个哈希表的释放(函数hash_table_delete)、 整个哈希表的输出(函数hash_table_print)。 三、哈希表的最大长度可以由HASHMAXLEN设置(我设为1000)。 四、输入哈希表的名称拼音字符是长度为10—20(长度可由STR_MAX_LEN和STR_MIN_LEN)的小写字母组成。这些名字字符串是我用函数rand_str随机产生的。 五、名称拼音字符(关键字)到关键字值的转换方法:先把名称的拼音字符转换对应的ASCII,累加后作为关键字值。我是用函数str_to_key实现的。 六、异常情况包括: 1、在对哈希表进行插入操作时,若哈希表的实际长度超过了哈希表的最大长度,我就输出“out of hash table memory!”,然后直接跳出插入子函数,不进行插入操作。 2、在对哈希表进行插入操作时,若插入的元素在哈希表中已经存在,我就输出“******already exists !”,然后直接跳出插入子函数,不进行插入操作。 3、在对哈希表进行查找操作时,若查到则返回其地址,若没查到则返回空地址。 4、在对哈希表进行移除操作时,对同义词元素的删除,分为表头和表中两种情况处理。 七、开发平台:DEV-C++,用c语言实现。 在哈希表程序中我比较注重整个代码风格,希望能形成很好的代码风格!如果有什么可以改进的,希望老师能跟我说说!

哈希表冲突处理方法浅析

龙源期刊网 https://www.wendangku.net/doc/c316693354.html, 哈希表冲突处理方法浅析 作者:叶军伟 来源:《科技视界》2014年第06期 【摘要】哈希表的理想情况是无需比较一次存取便能找到所查的记录,但是在实际应用中,哈希表通常存在冲突的情况,这就需要反复查找处理冲突。各种处理冲突的方法都有其适用范围及优缺点,需要根据实际情况灵活的选择适当的冲突处理方法。 【关键词】哈希表;冲突;处理方法 0 引言 在哈希表中,哈希函数的设置是非常灵活的,只要能使任一关键字由此所得的哈希地址都分布在哈希表允许的范围内就可以了。因此常常会出现不同的关键字值对应到同一个存储地址的现象,这就叫冲突。即关键字key1≠key2,但H(key1)= H(key2)。 适当的选择分布均匀的哈希函数能有效地减少冲突的发生,但是不能不免冲突。发生冲突后,必须解决,也即必须寻找下一个可用的地址。因此哈希表的建立通常为如下步骤:第一步,取出一个数据元素的关键字key,根据哈希函数计算其在哈希表中的存储地址D,若地址为D的存储空间还没有被占用,则将该数据元素存入,否则发生冲突,执行下一步;第二 步,根据规定的冲突处理方法,计算关键字为key的数据元素的下一个存储地址,若该地址的存储空间没有被占用,则存入,否则继续执行第二步,直到找出一个空闲的存储空间为止。由此可见,如何处理冲突是哈希表不可缺少的部分。 1 开放定址法 这是应用最为广泛的一种冲突处理方法。其公式描述为:Hi=(H(key)+di) MOD L i=1,2,…,k(k 其中:H(key)为哈希函数,L为哈希表的表长,di为增量序列。 根据增量序列取值方法的有三种:(1)线性探测再散列di=1,2,3,…,m-1;(2)二次探测再散列di=12,-12,22,-22,32,...,k2,(k 用线性探测再散列处理冲突可以保证做到,只要哈希表未满,总能找到不发生冲突的地址,但是容易发生二次聚集的情况,即在处理同义词的冲突过程中又添加了非同义词的冲突,效率不高。比如当哈希表中k,k+1,k+2位置上已存放有数据时,下一个哈希地址为k, k+1,k+2和k+3的数据都将填入k+3的位置,这样原本不冲突的哈希地址在经过冲突处理后,反而发生冲突,这种现象对查找不利。

哈希表基本操作

一,哈希表(Hashtable)简述 在.NET Framework中,Hashtable是System.Collections命名空间提供的一个容器,用于处理和表现类似key/value的键值对,其中key通常可用来快速查找,同时key是区分大小写;value用于存储对应于key的值。Hashtable中key/value键值对均为object 类型,所以Hashtable可以支持任何类型的key/value键值对. 二,哈希表的简单操作 在哈希表中添加一个key/value键值对:HashtableObject.Add(key,value); 在哈希表中去除某个key/value键值对:HashtableObject.Remove(key); 从哈希表中移除所有元素:HashtableObject.Clear(); 判断哈希表是否包含特定键key:HashtableObject.Contains(key); 下面控制台程序将包含以上所有操作: using System; using System.Collections; //使用Hashtable时,必须引入这个命名空间 class hashtable { public static void Main() { Hashtable ht=new Hashtable(); //创建一个Hashtable实例 ht.Add("E","e");//添加key/value键值对 ht.Add("A","a"); ht.Add("C","c"); ht.Add("B","b"); string s=(string)ht["A"]; if(ht.Contains("E")) //判断哈希表是否包含特定键,其返回值为true或false Console.WriteLine("the E key:exist"); ht.Remove("C");//移除一个key/value键值对 Console.WriteLine(ht["A"]);//此处输出a ht.Clear();//移除所有元素 Console.WriteLine(ht["A"]); //此处将不会有任何输出 } } 三,遍历哈希表 遍历哈希表需要用到DictionaryEntry Object,代码如下: for(DictionaryEntry de in ht) //ht为一个Hashtable实例 { Console.WriteLine(de.Key);//de.Key对应于key/value键值对key Console.WriteLine(de.Value);//de.Key对应于key/value键值对value

散列表(哈希表)

1. 引言 哈希表(Hash Table)的应用近两年才在NOI(全国青少年信息学奥林匹克竞赛)中出现,作为一种高效的数据结构,它正在竞赛中发挥着越来越重要的作用。 哈希表最大的优点,就是把数据的存储和查找消耗的时间大大降低,几乎可以看成是常数时间;而代价仅仅是消耗比较多的内存。然而在当前可利用内存越来越多的情况下,用空间换时间的做法是值得的。另外,编码比较容易也是它的特点之一。 哈希表又叫做散列表,分为“开散列” 和“闭散列”。考虑到竞赛时多数人通常避免使用动态存储结构,本文中的“哈希表”仅指“闭散列”,关于其他方面读者可参阅其他书籍。 2. 基础操作 2.1 基本原理 我们使用一个下标范围比较大的数组来存储元素。可以设计一个函数(哈希函数,也叫做散列函数),使得每个元素的关键字都与一个函数值(即数组下标)相对应,于是用这个数组单元来存储这个元素;也可以简单的理解为,按照关键字为每一个元素“分类”,然后将这个元素存储在相应“类”所对应的地方。 但是,不能够保证每个元素的关键字与函数值是一一对应的,因此极有可能出现对于不同的元素,却计算出了相同的函数值,这样就产生了“冲突”,换句话说,就是把不同的元素分在了相同的“类”之中。后面我们将看到一种解决“冲突”的简便做法。 总的来说,“直接定址”与“解决冲突”是哈希表的两大特点。 2.2 函数构造 构造函数的常用方法(下面为了叙述简洁,设h(k) 表示关键字为k 的元素所对应的函数值): a) 除余法: 选择一个适当的正整数p ,令h(k ) = k mod p ,这里,p 如果选取的是比较大

的素数,效果比较好。而且此法非常容易实现,因此是最常用的方法。 b) 数字选择法: 如果关键字的位数比较多,超过长整型范围而无法直接运算,可以选择其中数字分布比较均匀的若干位,所组成的新的值作为关键字或者直接作为函数值。 2.3 冲突处理 线性重新散列技术易于实现且可以较好的达到目的。令数组元素个数为S ,则当h(k)已经存储了元素的时候,依次探查(h(k)+i) mod S , i=1,2,3…… ,直到找到空的存储单元为止(或者从头到尾扫描一圈仍未发现空单元,这就是哈希表已经满了,发生了错误。当然这是可以通过扩大数组范围避免的)。 2.4 支持运算 哈希表支持的运算主要有:初始化(makenull)、哈希函数值的运算(h(x))、插入元素(i nsert)、查找元素(member)。设插入的元素的关键字为x ,A 为存储的数组。初始化比较容易,例如: const empty=maxlongint; // 用非常大的整数代表这个位置没有存储元素 p=9997; // 表的大小 procedure makenull; var i:integer; begin for i:=0 to p-1 do A[i]:=empty; End; 哈希函数值的运算根据函数的不同而变化,例如除余法的一个例子:

Delphi 使用哈希表 (键值对 key)

Delphi 使用哈希表(键值对key) 以往在软件开发中经常需要用哈希表保存一些数据结构,C#下的哈希表可以快速检索数据,其实Delphi也提供了对哈希表的支持,下面我就将我在用Delphi开发中使用Hash表的方法写出来,希望对大家有一定的帮助! 在Borland Delphi中有一个THashedStringlist类,使用这个类可以实现Hash表的操作.使用这个类需要引用IniFiles单元. 例如:我们定义的数据结构是: 以下是引用片段: MyHashTest = record Key:Integer; Name:String[20]; Sex:Boolean; Age:Integer; end; PTest = ^MyHashTest ; 1:创建Hash表. ScHash:=THashedStringlist.Create; 2:将数据结构加入Hash表中. var

Index:Integer; p_Test:PTest; Index:=ScHash.IndexOf(IntToStr(p_Test.Key)); if Index=-1 then begin ScHash.AddObject(IntToStr(p_Test.Key),TObject(Integer( p_Test))); end; 在加入Hash表的时候,首先我们检查看这个Key是否在Hash表中,如果Index=-1则说明此Key不在Hash表中,则我们将这个结构指针加入到Hash表中. 3:将数据结构从Hash表中删除. 以下是引用片段: var Index:Integer; t_Object: TObject; Index:=ScHash.IndexOf(IntToStr(p_Test.Key)); if Index -1 then begin t_Object:=ScHash.Objects[Index]; ScHash.Delete(Index);

哈希表

一.问题描述 1问题描述 针对某个集体(比如你所在的班级)中的“人名”设计一个哈希表,使得平均查找长度不超过R,完成相应的建表和查表程序。 2.基本要求 假设人名为中国人姓名的汉语拼音形式。待填入哈希表的人名共有30个,取平均查找长度的上限为2。哈希函数用除留余数法构造,用伪随机探测再散列发处理冲突。 二. 需求分析 (1)针对某个集体中的人名设计一个哈希表,使得平均查找长度不超过R,完成相应的建立和查表程序。 (2)人名为汉语拼音形式,最长不超过19个字符(如:庄双双zhuangshuangshuang)。 (3)假设待填入哈希表的人名有30个,平均查找长度的上限为2。哈希表用除留余数法构造,用伪随机探测在散列法处理冲突。 (4)在输入人名过程中能自动识别非法输入,并给与非法输入的反馈信息要求重新输入。 (5)查找成功时,显示姓名及关键字,并计算和输出查找成功的平均查找长度 三.程序设计 1 .存储结构设计 typedef struct { char *py; //名字的拼音 int k; //拼音所对应的整数 }NAME; typedef struct //哈希表 { char *py; //名字的拼音 int k; //拼音所对应的整数 int si; //查找长度 }HASH; 2 .主要算法设计

(1)姓名(结构体数组)初始化 名字以拼音的形式够成字符串,将字符串的各个字符所对应的ASCII码相加,所得的整数做为哈希表的关键字。 void InitNameList() { char *f; int r,s0,i; NameList[0].py="chenliang";//陈亮 NameList[1].py="chenyuanhao";//陈元浩 NameList[2].py="chengwenliang";//程文亮 NameList[3].py="dinglei";//丁磊 NameList[4].py="fenghanzao";//冯汉枣 NameList[5].py="fuzongkai";//付宗楷 NameList[6].py="hujingbin";//胡劲斌 NameList[7].py="huangjianwu";//黄建武 NameList[8].py="lailaifa";//赖来发 NameList[9].py="lijiahao";//李嘉豪 NameList[10].py="liangxiaocong";//梁晓聪 NameList[11].py="linchunhua";//林春华 NameList[12].py="liujianhui";//刘建辉 NameList[13].py="luzhijian";//卢志健 NameList[14].py="luonan";//罗楠 NameList[15].py="quegaoxiang";//阙高翔 NameList[16].py="sugan";//苏淦 NameList[17].py="suzhiqiang";//苏志强 NameList[18].py="taojiayang";//陶嘉阳 NameList[19].py="wujiawen";//吴嘉文 NameList[20].py="xiaozhuoming";//肖卓明 NameList[21].py="xujinfeng"; //许金峰 NameList[22].py="yanghaichun";//杨海春 NameList[23].py="yeweixiong";//叶维雄 NameList[24].py="zengwei";//曾玮 NameList[25].py="zhengyongbin";//郑雍斌 NameList[26].py="zhongminghua";//钟明华 NameList[27].py="chenliyan";//陈利燕 NameList[28].py="liuxiaohui";//刘晓慧 NameList[29].py="panjinmei";//潘金梅 for(i=0;i

哈希表的操作

哈希表操作 一目的 1.巩固和加深对哈希表的创建、查找、插入等方法理论知识的理解。 2.掌握建立哈希表的办法,本实验是采用的是除留余数法创建。 3.掌握哈希表解决冲突的办法,本实验用的是线性探测再散列的方法。 4.巩固对程序模块化设计的要求。 二需求分析 1.对于哈希表的基本操作首先是要创建一个哈希表,哈希表的创建思想是由哈希函 数得到,本实验就采用了除留余数法创建哈希表。 2.创建好哈希表就需要在哈希表中插入元素,本实验是需要插入单词,所以需要调 用string函数库,通过每个单词的地址数来进行下一步的查找计划。当插入单词地址已经存在时,就产生了冲突,因此需要采用线性探测再散列的方式来解决冲突。 3.当哈希表插入单词完成之后便可以显示哈希表的存储情况,因此需要输出整个哈 希表。 4.要想计算平均查找长度首先要对哈希表中的元素进行查找,当所有单词查找结 束,查找长度也得出。 5.要实现上诉需求,程序需要采用模块化进行设计。 三概要设计 1.基本操作: void Initwordlist(int n) 初始化哈希表 操作结果:以字符形式插入单词,将字符串的各个字符所对应的ASCII码相加,所得的整数做为哈希表的关键字。

void Createhashlist(int n) 创建哈希表,并插入单词 操作结果: (1)用除留余数法构建哈希函数; (2)用线性探测再散列处理冲突。 void find() 查找哈希表中的单词 操作结果:在哈希表中进行查找,输出查找的结果和关键字,并计算和输出查找成功的平均查找长度。 void printhash() 显示哈希表 操作结果:显示哈希表的存储情况:位置%d\t\t关键字%-6d\t\t单词%s\n。 float average() 操作结果:计算出平均查找长度。 void menu() 菜单函数设计 操作结果:显示格式: 1向哈希表中插入单词(<15); 2查找哈希表中的单词; 3显示哈希表的存储情况; 4计算哈希表的平均查找长度; 5退出程序。 int main() 主程序设计 操作结果:通过调用各个函数操作得到结果。

链地址法解决Hash冲突

实验5 链地址法解决Hash 冲突 一、需求分析 1、 输入的必须是数字。 2、演示程序以用户和计算机的对话方式执行,即在计算机显示“提示信息”后之后,由用户在键盘上输入演示程序中规定的运算命令;相应的输入数据和运算结果显示在其后。 3、程序执行的命令包括: (1)输入哈希表的长度、余数等和数据,初始化后创建哈希表。 (2)输出哈希表。 (3)计算平均查找长度 4、测试数据 数据:{ 47, 7, 29, 11, 16, 92, 22, 8, 3, 50, 37, 89 } 哈希函数为:Hash(key)=key mod 11。 二、概要设计 在建立哈希表的过程中,用链地址解决哈希冲突,思想史将具有相同哈希地址的记录链成一个单链表,m 个哈希地址就设m 个单链表,然后用一个数组将m 个单链表的表头指针存储起来,形成一个动态的结构。 在计算平均查找长度时经分析我发现,如果设Hash 表连出的单链表长度为n;则长度为1的出现1次,长度2出现1次.......所以对总的查找长度贡献了n*(n+1)/2次,所以我们在哈希表中设置l 表示单链表的长度,那么总的查找长度就是 l*(l+1)/2(从i=0,到i= 表长度-1) 本程序的设计思路是: 1.建立Hash 表 2.输出哈希地址 3.计算平均查找长度 三、详细设计 int main 主函 void Init_table 初始化Hash V oidcreat_table 创建Hash void print 输出 计算ASL 结束

1、定义头文件 #include #define MAX 30 using namespace std; 2、元素类型、节点类型和辅助变量 typedef struct node { int key;//关键字 int flag;// 有数据的标志 int l;//每个结点引出的链表长度 node *next; }Hashtable[MAX]; Hashtable table; 3,初始化Hash void Init_table(Hashtable &list)//初始化Hash表 { for(int i=0;i>length;//从键盘输入长度cout<<"输入Hash函数的余数"<>mod;//输入余数 cout<<"输入数据的个数"<>account;//输入数据的个数 int date;//临时变量用来输入数据 for(int i=0;i>date; int n=date%mod; if(list[n].flag==0)//该位置为空则插入到该地址,flag变为1 { list[n].key=date; list[n].flag=1; } else//否则链地址解决冲突 { node *p;p=new node;//定义指针变量p并开辟地址 p->key=date;

处理冲突的方法2

处理冲突的方法 通常有两类方法处理冲突:开放定址(Open Addressing)法和拉链(Chaining)法。前者是将所有结点均存放在散列表T[0..m-1]中;后者通常是将互为同义词的结点链成一个单链表,而将此链表的头指针放在散列表T[0..m-1]中。 1、开放定址法 (1)开放地址法解决冲突的方法 用开放定址法解决冲突的做法是:当冲突发生时,使用某种探查(亦称探测)技术在散列表中形成一个探查(测)序列。沿此序列逐个单元地查找,直到找到给定的关键字,或者碰到一个开放的地址(即该地址单元为空)为止(若要插入,在探查到开放的地址,则可将待插入的新结点存人该地址单元)。查找时探查到开放的地址则表明表中无待查的关键字,即查找失败。 注意: ①用开放定址法建立散列表时,建表前须将表中所有单元(更严格地说,是指单元中存储的关键字)置空。 ②空单元的表示与具体的应用相关。 【例】关键字均为非负数时,可用"-1"来表示空单元,而关键字为字符串时,空单元应是空串。 总之:应该用一个不会出现的关键字来表示空单元。 (2)开放地址法的一般形式 开放定址法的一般形式为:hi=(h(key)+di)%m 1≤i≤m-1 其中: ①h(key)为散列函数,di为增量序列,m为表长。 ②h(key)是初始的探查位置,后续的探查位置依次是hl,h2,…,hm-1,即h(key),hl,h2,…,hm-1形成了一个探查序列。 ③若令开放地址一般形式的i从0开始,并令d0=0,则h0=h(key),则有: hi=(h(key)+di)%m 0≤i≤m-1 探查序列可简记为hi(0≤i≤m-1)。 (3)开放地址法堆装填因子的要求 开放定址法要求散列表的装填因子α≤l,实用中取α为0.5到0.9之间的某个值为宜。 (4)形成探测序列的方法 按照形成探查序列的方法不同,可将开放定址法区分为线性探查法、二次探查法、双重散列法等。 ①线性探查法(Linear Probing) 该方法的基本思想是: 将散列表T[0..m-1]看成是一个循环向量,若初始探查的地址为d(即h(key)=d),则最长的探查序列为:d,d+l,d+2,…,m-1,0,1,…,d-1 即:探查时从地址d开始,首先探查T[d],然后依次探查T[d+1],…,直到T[m-1],此后又循环到T[0],T[1],…,直到探查到T[d-1]为止。 探查过程终止于三种情况: (1)若当前探查的单元为空,则表示查找失败(若是插入则将key写入其中); (2)若当前探查的单元中含有key,则查找成功,但对于插入意味着失败;

数据结构与算法-散列表查找操作实验

广州XX学院 数据结构与算法实验报告 专业班级计科181 实验日期2019.12.10 姓名XX 学号20181533 实验名称实验6.散列表查找操作指导教师曾岫 一、实验目的 1.熟悉散列查找方法和特点。 2.掌握散列查找解决冲突的方法。 3.用C语言完成算法和程序设计并上机调试通过; 4.撰写实验报告,给出算法思路或流程图和具体实现(源程序)、算法分析结果(包括时间复杂度、空间复杂度以及算法优化设想)、输入数据及程序运行结果(必要时给出多种可能的输入数据和运行结果)。 二、实验要求 1、程序要求包含头文件以及main函数 2、实验中所设计的函数(算法)需要满足实验的要求 3、程序的编译、运行要成功通过 4、运行的结果正确,且有相应的提示 三、实验环境 WIND7、VC++6.0或C与C++程序设计学习与实验系统 四、实验内容 1.闭散列表查找的实现 2.开散列表查找的实现 五、源代码及算法分析 1.闭散列表查找的实现 #include "stdio.h" #define MAXSIZE 20 /* 假定的最大存储单元数 */ typedef int keytype; /* 以整型为元素类型 */ typedef keytype HTable[MAXSIZE]; /* 定义散列表数组 */ /* 用线性探测再散列法处理冲突建立散列表 */ void creHT(HTable HT,int m,int p) /* 创建长度为m的散列表,p为除留余数中的除数 */ { int i,n=0; keytype x;

for(i=0;im) break; /* n记录散列表中的元素个数 */ i=x%p; /* 计算散列地址 */ while(HT[i]!=-1) /* 线性探测 */ i=(i+1)%m; HT[i]=x; /* 将元素存入空闲单元 */ scanf("%d",&x); } } /* 输出散列表 */ void list(HTable HT,int m) /* 输出长度为m的散列表 */ { int i; for(i=0;i

经典Hash实现(采用拉链法处理冲突)

class StorePageMap { /** * The table, resized as necessary. Length MUST Always be a power of two. */ private Entry[] table; /** * The number of key-value mappings contained in this identity hash map. */ private int size; /** * The next size value at which to resize (capacity * load factor)。 * @serial 不败战神:https://www.wendangku.net/doc/c316693354.html, */ private int threshold; /** * Constructs an empty HashMap with the default initial capacity * (16) and the default load factor (0.75)。 */ StorePageMap() { threshold = 12; table = new Entry[17]; } /** * Returns the number of key-value mappings in this map. * * @return the number of key-value mappings in this map.

*/ 天骄无双:https://www.wendangku.net/doc/c316693354.html, final int size() { return size; } /** * Returns true if this map contains no key-value mappings. * * @return true if this map contains no key-value mappings. */ final boolean isEmpty() { return size == 0; } /** * Returns the first StorePage for the given key. */ final TableStorePage get(long key) { int i = (int)(key % table.length); Entry e = table[i]; 人皇:https://www.wendangku.net/doc/c316693354.html, while (true) { if (e == null) return null; if (e.key == key) return e.value; e = e.next; } }

hash散列表的解决冲突——开放定址法(线性探索再散列法)

以下是列举收集来的三个题目,三个题目是同一个意思, 一,利用线性探测法构造散列表(用除余法来得出散列地址,用开放地址法解决同义词问题)题目:已知一组关键字为(26,36,41,38,44,15,68,12,06,51),用除余法构造散列函数,用线性探查法解决冲突构造这组关键字的散列表。 解答:为了减少冲突,通常令装填因子α

数据结构课程设计-哈希表建立职工管理系统

一.问题描述 职工的姓名通过哈希表建表,基于哈希表的查找,建立一个简单的职工管理 系统。可以利用自己的班集体中的“人名”设计一个哈希表,使得平均查找长度 不超过R,完成相应的建表和查表程序。对将班上同学看做职工进行管理,包括 插入、删除、查找、排序等功能。 二.基本要求 假设人名为中国姓名的汉语拼音形式。待填入哈希表的人名共有30个,取 平均查找长度的上限为2。哈希函数用除留余数法构照,用链表法处理冲突。 职工对象包括姓名、性别、出生年月、工作年月、学历、职务、住址、电话 等信息。 (1)新增一名职工:将新增职工对象按姓名以字典方式职工管理文件中。 (2)删除一名职工:从职工管理文件中删除一名职工对象。 (3)查询:从职工管理文件中查询符合某些条件的职工。 (4)修改:检索某个职工对象,对其某些属性进行修改。 (5)排序:按某种需要对职工对象文件进行排序。 三.数据结构的设计 哈希表(Hash table,也叫散列表),是根据关键码值(Key value)而直接进 行访问的数据结构。也就是说,它通过把关键码值映射到表中一个位置来访问记录,加快查找的速度。这个映射函数叫做散列函数,存放记录的数组叫做哈希表。 1.哈希表的构造方法多种多样,实际工作中需视不同的情况采用不同的哈希 函数,经常在构建一个哈希表选择构造方法时需要考虑的因素有:计算哈希函 数所需时间、关键字的长度、哈希表的大小、关键字的分布情况、记录的查找频 率等。本次题目要求采用除留余数法构造哈希函数,一般方法如下:取关键字被某个不大于散列表表长m的数p除后所得的余数为散列地址。 即 H(key) = key MOD p,p<=m 不仅可以对关键字直接取模,也可在折叠、平方取中等运算之后取模。对p 的选择很重要,一般取素数或m,若p选的不好,容易产生同义词。由于职工主 要以姓名为标志,于是将职工姓名的汉语拼音所对应的ASCII码进行相加,所得 到的和作为关键字;取p等于表长30,则关键字除以30以后所得到的余数便是 哈希地址。 2.哈希表是种数据结构,它可以提供快速的插入操作和查找操作。哈希表也 有一些缺点它是基于数组的,数组创建后难于扩展某些哈希表被基本填满时,性 能下降得非常严重。这个问题是哈希表不可避免的,即冲突现象:对不同的关键 字可能得到同一哈希地址。对于如何解决冲突现象,也有喝多方法,本次题目要 求用连地址法处理冲突。一般方法如下: 将所有关键字为同义词的记录存储在同一线性链表中。假设某哈希函数产生 哈希地址的区间在【0,m-1】上,则设立一个指针型变量 Chain ChainHash[m]; 其每个分量的初始状态都是空指针。凡哈希地址为i的记录都插入到头指针为ChainHash[i]的链表中。在链表中的插入位置可以再表头或表尾,也可以再中间,以保持同义词在同一线性链表中按关键字有序。 3.哈希表的查找过程和建表过程基本一致。给定K值,根据建表时设定的哈

哈希查找_数据结构实验报告

南昌航空大学实验报告 课程名称:数据结构实验名称:实验九查找 班级:学生姓名:学号: 指导教师评定:签名: 题目:编程实现哈希表的造表和查找算法。 要求:用除留余数法构造哈希函数,用二次探测再散列解决冲突。 一、需求分析 1.用户可以根据自己的需求输入一个顺序表(哈希表) 2.通过用除留余数法构造哈希函数,并用开放地址的二次探测再散列解决冲突。 3.在经过排序后显示该哈希表。 4.程序执行的命令包括: (1)创建哈希表(2)输出哈希表(3)二次探测再散列解决冲突 二、概要设计 ⒈为实现上述算法,需要顺序表的抽象数据类型: ADT Hash { 数据对象D:D是具有相同特征的数据元素的集合。各数据元素均含有类型相同,可唯一标识数据元素的关键字。 数据关系R:数据元素同属一个集合。 基本操作P: Creathash(&h) 操作结果:构造一个具有n个数据元素的哈希查找表h。 destroyhash(&h) 初始条件:哈希查找表h存在。 操作结果:销毁哈希查找表h。 displayhash(h) 初始条件:哈希查找表h存在。 操作结果:显示哈希查找表h。 hash(h,&k) 初始条件:哈希查找表h存在。 操作结果:通过除留余数法得到地址用k返回。 hash2 (i,&k) 初始条件:哈希查找表h存在存在,i是除留余数法得到的地址。 操作结果:返回二次探测再散列解决冲突得到的地址k。 search (h,key) 初始条件:哈希查找表h存在。 操作结果:查找表h中的key,若查找成功,返回其地址,否则返回

-1 insert (&h,key) 初始条件:哈希查找表h存在。 操作结果:若表h中没有key,则在h中插入key。 search1(h, key,&p) 初始条件:哈希查找表h存在。 操作结果:在表h中查找key,若没有,则返回p的插入的地址,否 则返回-1。 }ADT Hash 2. 本程序有三个模块: ⑴主程序模块 main(){ 初始化; { 接受命令; 显示结果; } } ⑵创建hash表的模块:主要建立一个哈希表; ⑶解决冲突模块:利用开放地址的二次探测再散列解决冲突; (4)输出哈希表模块:显示已创建哈希表。 三、详细设计 ⒈元素类型,结点类型 typedef struct { int key; }keytype; typedef struct { keytype elem[100]; int length; /*当前的长度*/ int size; /*哈希表的总长*/ }hashtable; /*全局变量*/ int a=0,b=0; /*哈希函数*/ 2.对抽象数据类型中的部分基本操作的伪码算法如下: /*哈希函数*/ int hash(hashtable *h,int k) { return k%h->size; }

相关文档
相关文档 最新文档