
Improving the Efficiency of a Wide-Coverage CCG Parser

Bojan Djordjevic and James R. Curran
School of Information Technologies

University of Sydney

NSW 2006, Australia
{bojan,james}@it.usyd.edu.au

Stephen Clark

Computing Laboratory

Oxford University, Wolfson Building, Parks Road, Oxford OX1 3QD, UK

stephen.clark@comlab.ox.ac.uk

Abstract

The C&C CCG parser is a highly efficient linguistically motivated parser. The efficiency is achieved using a tightly-integrated supertagger, which assigns CCG lexical categories to words in a sentence. The integration allows the parser to request more categories if it cannot find a spanning analysis. We present several enhancements to the CKY chart parsing algorithm used by the parser. The first proposal is chart repair, which allows the chart to be efficiently updated by adding lexical categories individually, and we evaluate several strategies for adding these categories. The second proposal is to add constraints to the chart which require certain spans to be constituents. Finally, we propose partial beam search to further reduce the search space. Overall, the parsing speed is improved by over 35% with negligible loss of accuracy or coverage.

1 Introduction

A recent theme in parsing research has been the application of statistical methods to linguistically motivated grammars, for example LFG (Kaplan et al., 2004; Cahill et al., 2004), HPSG (Toutanova et al., 2002; Malouf and van Noord, 2004), TAG (Sarkar and Joshi, 2003) and CCG (Hockenmaier and Steedman, 2002; Clark and Curran, 2004b). The attraction of linguistically motivated parsers is the potential to produce rich output, in particular the predicate-argument structure representing the underlying meaning of a sentence. The disadvantage of such parsers is that they are typically not very efficient, parsing a few sentences per second on commodity hardware (Kaplan et al., 2004). The C&C CCG parser (Clark and Curran, 2004b) is an order of magnitude faster, but is still limited to around 25 sentences per second.

The key to efficient CCG parsing is a finite-state supertagger which performs much of the parsing work (Bangalore and Joshi, 1999). CCG is a lexicalised grammar formalism, in which elementary syntactic structures, in CCG's case lexical categories expressing subcategorisation information, are assigned to the words in a sentence. CCG supertagging can be performed accurately and efficiently by a Maximum Entropy tagger (Clark and Curran, 2004a). Since the lexical categories contain so much grammatical information, assigning them with low average ambiguity leaves the parser, which combines them together, with much less work to do at parse time. Hence Bangalore and Joshi (1999), in the context of LTAG parsing, refer to supertagging as almost parsing.

Clark and Curran (2004a) presents a novel method of integrating the supertagger and parser: initially only a small number of categories, on average, is assigned to each word, and the parser attempts to find a spanning analysis using the CKY chart-parsing algorithm. If one cannot be found, the parser requests more categories from the supertagger and builds the chart again from scratch. This process repeats until the parser is able to build a chart containing a spanning analysis.¹

¹ Tsuruoka and Tsujii (2004) investigate a similar idea in the context of the CKY algorithm for a PCFG.

The supertagging accuracy is high enough that the parser fails to find a spanning analysis using the initial category assignment in approximately 4% of Wall Street Journal sentences (Clark and Curran, 2007). However, parsing this 4%, which largely consists of the longer sentences, is disproportionately expensive.

This paper describes several modifications to the C&C parser which improve parsing efficiency without reducing accuracy or coverage by reducing the impact of the longer sentences. The first involves chart repair, where the CKY chart is repaired when extra lexical categories are added (according to the scheme described above), instead of being rebuilt from scratch. This allows an even tighter integration of the supertagger, in that the parser is able to request individual categories. We explore methods for choosing which individual categories to add, resulting in an 11% speed improvement.

The next modification involves parsing with constraints, so that certain spans are required to be constituents. This reduces the search space considerably by eliminating a large number of constituents which cross the boundaries of these spans. The best set of constraints results in a 10% speed improvement over the original parser. These constraints are general enough that they could be applied to any constituency-based parser. Finally, we experiment with several beam strategies to reduce the search space, finding that a partial beam which operates on part of the chart is most effective, giving a further 6.1% efficiency improvement.

The chart repair and constraints interact in an interesting, and unexpected, manner when combined, giving a 35.7% speed improvement overall without any loss in accuracy or coverage. This speed improvement is particularly impressive because it involves techniques which only apply to 4% of Wall Street Journal sentences.

2 The CCG Parser

Clark and Curran (2004b) describes the CCG parser. The grammar used by the parser is extracted from CCGbank, a CCG version of the Penn Treebank (Hockenmaier, 2003). The grammar consists of 425 lexical categories plus a small number of combinatory rules which combine the categories (Steedman, 2000). A Maximum Entropy supertagger first assigns lexical categories to the words in a sentence, which are then combined by the parser using the combinatory rules. A log-linear model scores the alternative parses. We use the normal-form model, which assigns probabilities to single derivations based on the normal-form derivations in CCGbank. The features in the model are defined over local parts of the derivation and include word-word dependencies. A packed chart representation allows efficient decoding, with the Viterbi algorithm finding the most probable derivation.

The supertagger uses a log-linear model to define a distribution over the lexical category set for each word and the previous two categories (Ratnaparkhi, 1996), and the forward-backward algorithm efficiently sums over all histories to give a distribution for each word. These distributions are then used to assign a set of lexical categories to each word (Curran et al., 2006). The number of categories in each set is determined by a parameter β: all categories are assigned whose forward-backward probabilities are within β of the highest probability category (Curran et al., 2006). If the parser cannot then find a spanning analysis, the value of β is reduced, so that more lexical categories are assigned, and the parser tries again. This process repeats until an analysis spanning the whole sentence is found.
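As an illustration of this multi-tagging step, the following Python sketch keeps every category whose forward-backward probability is within a factor β of the best category for its word. The function and variable names are ours, not the C&C implementation (which is written in C++).

    def assign_categories(word_distributions, beta):
        # word_distributions[i] maps each candidate category for word i to its
        # forward-backward probability; beta is the cutoff (0 < beta <= 1).
        assignments = []
        for dist in word_distributions:
            best = max(dist.values())
            # keep every category within a factor beta of the most probable one
            assignments.append({cat for cat, p in dist.items() if p >= beta * best})
        return assignments

Lowering β enlarges each category set, which is what happens when the parser fails to find a spanning analysis and asks the supertagger for more categories.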

In our previous work, when the parser was unable to find a spanning analysis, the chart was destroyed and then rebuilt from scratch with more lexical categories assigned to each word. However, this rebuilding process is wasteful because the new chart is always a superset of the old one and could be created by just updating the previous chart. We describe the chart repair process in Section 3, which allows additional categories to be assigned to an existing chart and the CKY algorithm run over just those parts of the chart which require modification.

2.1 Chart Parsing

The parser uses the CKY chart parsing algorithm (Kasami, 1965; Younger, 1967) described in Steedman (2000). The CKY algorithm applies naturally to CCG since the grammar is binary. It builds the chart bottom-up, starting with the lexical categories spanning single words, incrementally increasing the span until the whole sentence is covered. Since the constituents are built in order of span size, at every stage all the sub-constituents which could be used to create a particular new constituent are already present in the chart.

The charts are packed by grouping together equivalent chart entries, which allows a large number of derivations to be represented efficiently. Entries are equivalent when they interact in the same manner with both the generation of subsequent parse structure and the statistical parse selection. In practice, this means that equivalent entries have the same span; form the same structures, i.e. the remaining derivation plus dependencies, in any subsequent parsing; and generate the same features in any subsequent parsing.

The Viterbi algorithm is used to find the most probable derivation from a packed chart. For each equivalence class of individual entries, we record the entry at the root of the subderivation which has the highest score for the class. The equivalence classes are defined so that any other individual entry cannot be part of the highest scoring derivation for the sentence. The highest-scoring subderivations can be calculated recursively using the highest-scoring equivalence classes that were combined to create the individual entry.
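A minimal sketch of this decoding step is given below. The equivalence-class structure (entries, children, local scores) is hypothetical and stands in for the parser's packed-chart representation.

    def best_derivation(eq_class, memo):
        # Return (score, entry) for the highest-scoring subderivation rooted at
        # this equivalence class; log-linear scores add along a derivation.
        if eq_class in memo:
            return memo[eq_class]
        best_score, best_entry = float("-inf"), None
        for entry in eq_class.entries:        # alternative ways of building this class
            score = entry.local_score         # score for the features at this step
            for child in entry.children:      # child equivalence classes (one or two)
                score += best_derivation(child, memo)[0]
            if score > best_score:
                best_score, best_entry = score, entry
        memo[eq_class] = (best_score, best_entry)
        return memo[eq_class]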

Given a sentence of n words, we define pos ∈ {0, ..., n−1} to be the starting position of an entry in the chart (represented by a CCG category) and span ∈ {1, ..., n} its length. Let cell(pos, span) be the set of categories which span the sentence from pos to pos + span. These will be combinations of categories in cell(pos, k) and cell(pos + k, span − k) for all k ∈ {1, ..., span − 1}. The chart is a two-dimensional array indexed by pos and span. The valid (pos, span) pairs correspond to pos + span ≤ n, that is, to spans that do not extend beyond the end of the sentence. The squares represent valid cells in Figure 1. The span from position 3 with length 4, i.e. cell(3, 4), is marked with a diamond in Figure 2.
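This chart indexing translates directly into the standard CKY loop. The sketch below is illustrative only; combine stands in for the CCG combinatory rules and is not the parser's actual interface.

    from collections import defaultdict

    def cky(lexical_cats, combine):
        # lexical_cats[i] is the set of categories assigned to word i;
        # combine(a, b) returns the categories derivable from a and b.
        n = len(lexical_cats)
        cell = defaultdict(set)
        for pos in range(n):
            cell[(pos, 1)] = set(lexical_cats[pos])   # span-1 cells hold lexical categories
        for span in range(2, n + 1):
            for pos in range(n - span + 1):           # valid cells satisfy pos + span <= n
                for k in range(1, span):              # children (pos, k) and (pos + k, span - k)
                    for a in cell[(pos, k)]:
                        for b in cell[(pos + k, span - k)]:
                            cell[(pos, span)].update(combine(a, b))
        return cell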

3 Chart Repair

The parser interacts with the supertagger by decreasing the value of the β parameter when a spanning analysis cannot be found for a sentence. This has the effect of adding more lexical categories to the chart.

Figure 1: Cells affected by chart repair.

Instead of rebuilding the chart from scratch when new categories are added, it can be repaired by modifying only the cells that are affected by the new categories. Consider the case where a single lexical category is added to the i-th word in an n-word sentence: the new category can only affect the cells that satisfy pos ≤ i and pos + span > i. These cells are shown in Figure 1 for the word at position 3.

The number of affected cells is (n − pos)(pos + 1), and so the average over the sentence is approximately (1/n) ∫₀^(n−1) (n − p)(p + 1) dp ≈ n²/6 cells. The total number of cells in the chart is n(n + 1)/2. The chart can therefore be repaired bottom-up, in CKY order, by updating a third of the cells on average.

Additional lexical categories for a word are inserted into the corresponding cell in the bottom row, with the additional categories being marked as new. For each cell C in the second row, each pair of cells A and B is considered whose spans combine to create the span of C. In the original CKY, all categories from A are combined with all categories from B. In chart repair, categories are only combined if at least one of them is new, because otherwise the result is already in C. The categories added to C are marked, and the process is repeated for all affected cells in CKY order.
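The following sketch outlines the repair pass over the same simplified chart representation used in the CKY sketch above. Only cells whose span covers the updated word are revisited, and a pair of children is recombined only if at least one of them carries a newly added category. Names and data structures are ours, not the parser's.

    def repair(cell, n, i, new_cats, combine):
        # cell[(pos, span)] is a set of categories (as in the CKY sketch above);
        # word i receives new lexical categories; 'new' records additions per cell.
        new = {(i, 1): set(new_cats) - cell[(i, 1)]}
        cell[(i, 1)] |= new[(i, 1)]
        for span in range(2, n + 1):
            for pos in range(n - span + 1):
                if not (pos <= i < pos + span):   # cell does not cover word i: untouched
                    continue
                added = set()
                for k in range(1, span):
                    left, right = (pos, k), (pos + k, span - k)
                    for a in cell[left]:
                        for b in cell[right]:
                            # old-old pairs were already combined in the previous attempt
                            if a not in new.get(left, ()) and b not in new.get(right, ()):
                                continue
                            added.update(combine(a, b))
                new[(pos, span)] = added - cell[(pos, span)]
                cell[(pos, span)] |= new[(pos, span)]
        return cell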

Chart repair speeds up parsing for two reasons. First, it reuses previous computations and eliminates wasteful rebuilding of the chart. Second, it allows lexical categories to be added to the chart one at a time until a spanning derivation is found. In the original approach extra categories were added in bulk by changing the β level, which significantly increased the average ambiguity. Chart repair allows the minimum amount of ambiguity to be added for a spanning derivation to be found.

Figure 2: Cells affected by adding a constraint.

The C&C parser has a predefined limit on the number of categories in the chart. If this is exceeded before a spanning analysis is found then the parser fails on the sentence. Our new strategy allows a chart containing a spanning analysis to be built with the minimum number of categories possible. This means that some sentences can now be parsed that would previously have exceeded the limit, slightly increasing coverage.

3.1 Category selection

The order in which lexical categories are added to the chart will impact on parsing speed and accuracy, and so we evaluate several alternatives. The first ordering (βVALUE) is by decreasing β value, where the β value is the ratio between the probability of the most likely category and the probability of the given category for that word.² The second ordering (PROB) is by decreasing category probability as assigned by the supertagger using the forward-backward algorithm.

² We are overloading the use of β for convenience. Here, β refers to the variable ratio dependent on the particular category, whereas the β value used in supertagging is a cutoff applied to the variable ratio.

We also investigated ordering categories using information from the chart. Examining the sentences which required chart repair showed that, when a word is missing the correct category, the cells affected by that word's cell (as defined in Section 3) are often empty. The CHART ordering uses this observation to select the next lexical category to assign. It selects the word corresponding to the cell with the highest number of empty affected cells, and then adds the highest probability category not in the chart for that word. Finally, we included a RANDOM ordering baseline for comparison purposes.
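A sketch of the CHART ordering heuristic, again over the simplified chart used above; the supertagger distributions and the record of already-assigned categories are hypothetical inputs.

    def empty_affected_cells(cell, n, i):
        # number of empty cells whose span covers word i
        return sum(1 for span in range(1, n + 1)
                     for pos in range(n - span + 1)
                     if pos <= i < pos + span and not cell[(pos, span)])

    def next_category(cell, n, assigned, distributions):
        # CHART ordering: pick the word with the most empty affected cells, then
        # its highest-probability category that is not already in the chart.
        word = max(range(n), key=lambda i: empty_affected_cells(cell, n, i))
        for cat, _ in sorted(distributions[word].items(), key=lambda kv: -kv[1]):
            if cat not in assigned[word]:
                return word, cat
        return None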

4 Constraints

The set of possible derivations can be constrained if we know in advance that a particular span is required to be the yield of a single constituent in the correct parse. A constraint on span p reduces the search space because p must be the yield of a single cell. This means that cells with yields that cross the boundary of p cannot be part of a correct derivation, and do not need to be considered (the grey cells in Figure 2). In addition, if a cell yields p as a prefix or suffix (the hashed cells in Figure 2) then it also has constraints on how it can be created.

Figure 2 shows an example constraint requiring words 3–6 to be a constituent, which corresponds to p = cell(3, 4). Consider cell(3, 7): it yields words 3–9 and so contains p as a prefix. Normally it can be created by combining cell(3, 1) with cell(4, 6), or cell(3, 2) with cell(5, 5), and so on up to cell(3, 6) with cell(9, 1). However, the first three combinations are not allowed because the second child crosses the boundary of p. This gives a lower limit on the span of the left child. Similarly, if p is a suffix of the span of a cell then there is a lower limit on the span of the right child.

As the example demonstrates, a single constraint can eliminate many combinations, reducing the search space significantly and thus improving parsing efficiency.
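The effect of a constraint can be captured by a simple crossing test. The sketch below checks whether a cell, or a proposed split point, is compatible with a constrained span; it is illustrative code, not the parser's implementation.

    def crosses(pos, span, c_pos, c_span):
        # True if the cell (pos, span) crosses the boundary of the constrained span
        start, end = pos, pos + span
        c_start, c_end = c_pos, c_pos + c_span
        return (start < c_start < end < c_end) or (c_start < start < c_end < end)

    def allowed_splits(pos, span, c_pos, c_span):
        # split points k whose children (pos, k) and (pos + k, span - k) both
        # respect the boundary of the constrained span
        return [k for k in range(1, span)
                if not crosses(pos, k, c_pos, c_span)
                and not crosses(pos + k, span - k, c_pos, c_span)]

For the example above, allowed_splits(3, 7, 3, 4) returns [4, 5, 6], ruling out the first three combinations.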

4.1 Creating Constraints

How can we know in advance that the correct derivation must yield specific spans, since this appears to require knowledge of the parse itself? We have explored constraints derived from shallow parsing and from the raw sentence. Our results demonstrate that simple constraints can reduce parsing time significantly without loss of coverage or accuracy.

Chunk tags were used to create constraints. We experimented with both gold standard chunks from the Penn Treebank and also chunker output from the C&C chunk tagger. The tagger is very similar to the Maximum Entropy POS tagger described in Curran and Clark (2003). Only NP chunks were used because the accuracy of the tagger for other chunks is lower. The Penn Treebank chunks required modification because CCGbank analyses some constructions differently. We also created longer NPs by concatenating adjacent base NPs, for example in the case of possessives.

A number of punctuation constraints were used and had a significant impact, especially for longer sentences. There are a number of punctuation rules in CCGbank which absorb a punctuation mark by combining it with a category and returning a category of the same type. These rules are very productive, combining with many constituent types. However, in CCGbank the sentence-final punctuation is always attached at the root. A constraint on the first n−1 words was added to force the parser to only attach the sentence-final punctuation once the rest of the sentence has been parsed.

Constraints are placed around parenthesised and quoted phrases that usually form constituents before attaching elsewhere. Constraints are also placed around phrases bound by colons, semicolons, or hyphens. These constraints are especially effective for long sentences with many clauses separated by semicolons, reducing the sentence to a number of smaller units, which significantly improves parsing efficiency.
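The sketch below illustrates how such constraints might be derived from the token sequence alone. The bracket and quote inventory and the treatment of final punctuation are simplifications for illustration, not the parser's exact rules.

    def punctuation_constraints(tokens):
        # Return a list of (pos, span) pairs that are required to be constituents.
        n = len(tokens)
        constraints = []
        if n > 1 and tokens[-1] in {".", "?", "!"}:
            constraints.append((0, n - 1))        # everything before final punctuation
        pairs = {"(": ")", "``": "''", "-LRB-": "-RRB-"}   # Penn Treebank bracket tokens
        stack = []
        for i, tok in enumerate(tokens):
            if tok in pairs:
                stack.append((tok, i))
            elif stack and tok == pairs[stack[-1][0]]:
                _, start = stack.pop()
                constraints.append((start, i - start + 1))   # parenthesised or quoted phrase
            # phrases bounded by colons, semicolons or hyphens would be handled analogously
        return constraints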

In some instances, adding constraints can be harmful to parsing efficiency and/or accuracy. Lack of precision in the constraints can come from noisy output from NLP components, e.g. the chunker, or from rules which are not always applicable, e.g. punctuation constraints. We find that the punctuation constraints are particularly effective, while the gold standard chunks are required to gain any benefit from the NP constraints. Adding constraints also has the potential to increase coverage, because the reduced search space means that longer sentences can be parsed without exceeding the pre-defined limits on chart size.

5 Selective Beam Search

Beam search involves greedy elimination of low probability partial derivations before they can form complete derivations. It is used in many parsers to reduce the search space, for example Collins (2003). We use a variable width beam where all categories c in a particular cell C that satisfy score(c) < max{score(x) | x ∈ C} − B, for some beam cutoff B, are removed. The category scores score(c) are log probabilities.

In the C&C parser, the entire packed chart is constructed first and then the spanning derivations are marked. Only the partial derivations that form part of spanning derivations are scored to select the best parse, which is a small fraction of the categories in the chart. Because the categories are scored with a complex statistical model with a large number of features, the time spent calculating scores is significant. We found that applying a beam to every cell during the construction of the chart was more expensive than not using the beam at all. When the beam was made harsh enough to be worthwhile, it reduced accuracy and coverage significantly.

We propose selective beam search, where the beam is only applied to spans of particular lengths. The shorter spans are most important to cull because there are many more of them and removing them has the largest impact in terms of reducing the search space. However, the supertagger already acts like a beam at the lexical category level and the parser model has fewer features at this level, so the beam may be more accurate for longer spans. We therefore expect the beam to be most effective for spans of intermediate length.
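A sketch of the selective beam, applied cell by cell during chart construction. The span range and the cutoff B are illustrative settings, not the values tuned in the experiments.

    def selective_beam(categories, score, B, span, min_span=3, max_span=4):
        # Prune a cell only if its span length falls in the selected range;
        # categories more than B below the best log score in the cell are removed.
        if not (min_span <= span <= max_span) or not categories:
            return categories
        best = max(score(c) for c in categories)
        return {c for c in categories if score(c) >= best - B}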

6 Experiments

The parser was trained on CCGbank sections 02-21 and section 00 was used for development. The performance is measured in terms of coverage, F-score and parsing time. The F-score is for labelled dependencies compared against the predicate-argument dependencies in CCGbank. The time reported includes loading the grammar and statistical model, which takes around 5 seconds, and parsing the 1913 sentences in section 00.

The failure rate (the opposite of coverage) is broken down into sentences with length ≤ 40 and > 40, because longer sentences are more difficult to parse and the C&C parser already has very high coverage on shorter sentences. There are 1784 1-40 word sentences and 129 41+ word sentences. The average length and standard deviation in the 41+ set are 50.8 and 31.5 respectively.

All experiments used gold standard POS tags. Original and REPAIR do not use constraints. The NP(GOLD) experiments use Penn Treebank gold standard NP chunks to determine an upper bound on the utility of chunk constraints. The times reported for NP(C&C) using the C&C chunker include the time to load the chunker model and run the chunker (around 1.3 seconds). PUNCT adds all of the punctuation constraints.

Finally, the best system was compared against the original parser on section 23, which has 2257 sentences of length 1-40 and 153 of length 41+. The maximum length is only 65, which explains the high coverage for the 41+ section.

6.1 Chart Repair Results

The results in Table 1 show that chart repair gives an immediate 11.1% improvement in speed and a small 0.21% improvement in accuracy. 96.1% of sentences do not require chart repair because they are successfully parsed using the initial set of lexical categories supplied by the supertagger. Hence, 11% is a significant improvement for less than 4% of the sentences.

We believe the accuracy was improved (on top of the efficiency) because of the way the repair process adds new categories. Adding categories individually allows the parser to be influenced by the probabilities which the supertagger assigns, which are not directly modelled in the parser. If we were to add this information from the supertagger into the parser statistical model directly, we would expect almost no accuracy difference between the original method and chart repair.

METHOD     secs   %      F-SCORE   CATS
RANDOM     70.2   -16.2  86.57     23.1
βVALUE     60.4   —      86.66     15.7
PROB       60.1   0.5    86.65     14.3
CHART      57.2   5.3    86.61      7.0

Table 2: Category ordering for chart repair.

Table 2 shows the impact of different category ordering approaches for chart repair (with PUNCT constraints). The most effective approach is to use the information from the chart about the proportion of empty cells, which adds half as many categories on average as the β value and probability based approaches. All of our approaches significantly outperform randomly selecting extra categories. The CHART category ordering is used for the remaining experiments.

6.2 Constraints Results

The results in Table 1 show that, without chart repair, using gold standard noun phrases does not improve efficiency, while using noun phrases identified by the C&C chunker decreases speed by 10.8%. They both also slightly reduce parsing accuracy. The cost of restarting the parsing process with the constraints removed outweighed the savings from the reduced search space. This is unsurprising because the chunk data was not obtained from CCGbank and the chunker is not accurate enough for the constraints to improve parsing efficiency. The most frequent inconsistencies between CCGbank and chunks extracted from the Penn Treebank were fixed in a preprocessing step as explained in Section 4.1, but the less frequent constructions are still problematic.

The best results for parsing with constraints (without repair) were with both punctuation and gold standard noun phrase constraints, with a 20.5% improvement in speed and 0.42% in coverage, but an F-score penalty of 0.3%. This demonstrates the possible efficiency gain with a perfect chunker: the corresponding results with the C&C chunker are still worse than without constraints. The best results without a decrease in accuracy use only punctuation constraints, with a 10.4% increase in speed and 0.37% in coverage. The punctuation constraints also have the advantage of being simple to implement.

METHOD                    secs   %      F-SCORE   COVER   FAIL n≤40   FAIL n>40
Original                  88.3   —      86.54     98.85   0.392       11.63
REPAIR                    78.5   11.1   86.75     99.01   0.336       10.08
NP(GOLD)                  88.4   -0.1   86.27     99.06   0.224       10.85
NP(C&C)                   97.8   -10.8  86.31     99.16   0.224        9.30
PUNCT                     79.1   10.4   86.56     99.22   0.168        9.30
NP(GOLD)+PUNCT            69.8   20.5   86.24     99.27   0.168        8.53
NP(C&C)+PUNCT             97.0   -9.9   86.31     99.16   0.168       10.08
NP(GOLD)+REPAIR           65.0   26.4   86.04     99.37   0.224        6.20
NP(C&C)+REPAIR            77.5   12.2   86.35     99.37   0.224        6.20
PUNCT+REPAIR              57.2   35.2   86.61     99.48   0.168        5.43
NP(GOLD)+PUNCT+REPAIR     48.2   45.4   86.14     99.48   0.168        5.43
NP(C&C)+PUNCT+REPAIR      63.2   28.4   86.43     99.53   0.163        3.88

Table 1: Parsing performance on section 00 with constraints and chart repair. The FAIL columns give the parse failure rate (%) for sentences of length ≤ 40 and > 40.

METHOD               secs   %      F-SCORE   COVER   FAIL n≤40   FAIL n>40
Original             88.3   —      86.54     98.85   0.392       11.63
PUNCT                79.1   10.4   86.56     99.22   0.168        9.30
REPAIR               78.5   11.1   86.75     99.01   0.336       10.08
PUNCT+REPAIR         57.2   35.2   86.61     99.48   0.168        5.43
PUNCT+REPAIR+BEAM    52.4   40.7   86.56     99.48   0.168        5.43

Table 3: Best performance on section 00.

The best overall efficiency gain was obtained when punctuation and gold standard noun phrases were used with chart repair, with a 45.4% improvement in speed and 0.63% in coverage, and a 0.4% drop in accuracy. The best results without a drop in accuracy were with only punctuation constraints and chart repair, with improvements of 35.2% in speed and 0.63% in coverage. Coverage on both short and long sentences is improved: the best results show a 43% and 67% decrease in failure rate for sentence lengths in the ranges 1-40 and 41+ respectively.

6.3 Partial Beam Results

We found that using the selective beam on 1–2 word spans had negligible impact on speed and accuracy. Using the beam on 3–4 word spans had the most impact without an accuracy penalty, improving efficiency by another ~5%. Experiments with the selective beam on longer spans continued to improve efficiency, but with a much greater penalty in F-score, e.g. a further ~5% at a cost of 0.5% F-score for 3–6 word spans. However, we are interested in efficiency improvements with negligible cost to accuracy.

6.4 Overall Results

Table 3 summarises the results for section 00. The chart repair and punctuation constraints individually increase parsing efficiency by around 10%. However, the most interesting result is that in combination they increase efficiency by over 35%. This is because the cost of rebuilding the chart when the constraints are incorrect has been significantly reduced by chart repair. Finally, the use of the selective beam gives a modest improvement of 5.5%. The overall efficiency gain on section 00 is 40.7% with an additional 0.5% coverage, halving the number of both short and long sentences that fail to be parsed.

Table 4 shows the performance of the punctuation constraints, chart repair and selective beam system on section 23. The results are consistent with section 00, showing a 30.9% improvement in speed and 0.29% in coverage, with accuracy staying at roughly the same level. The results show a consistent 35-40% reduction in parsing time and a 40-65% reduction in parse failure rate.

7 Conclusion

METHOD               secs   %      F-SCORE   COVER   FAIL n≤40   FAIL n>40
Original             91.3   —      86.92     99.29   0.621       1.961
PUNCT+REPAIR+BEAM    58.7   35.7   86.82     99.58   0.399       0.654

Table 4: Best performance on section 23.

We have introduced several modifications to CKY parsing for CCG that significantly increase parsing efficiency without an accuracy or coverage penalty.

Chart repair improves efficiency by reusing the chart from previous parse attempts. This allows us to further tighten the parser-supertagger integration by adding one lexical category at a time until a spanning derivation is found. We have also explored several approaches to selecting which category to add next. We intend to further explore strategies for determining which category to add next when a parse fails. This includes combining chart and probability based orderings. Chart repair alone gives an 11.1% efficiency improvement.

Constraints improve efficiency by avoiding the construction of sub-derivations that will not be used. They have a significant impact on parsing speed and coverage without reducing the accuracy, provided the constraints are identified with sufficient precision. Punctuation constraints give a 10.4% improvement, but NP constraints require higher precision NP chunking than is currently available for CCGbank.

Constraints and chart repair both manipulate the chart for more efficient parsing. Adding categories one at a time using chart repair is almost a form of agenda-based parsing. We intend to explore other methods for pruning the space and agenda-based parsing, in particular A* parsing (Klein and Manning, 2003), which will allow only the most probable parts of the chart to be built, improving efficiency while still ensuring the optimal derivation is found.

When all of our modifications are used, parsing speed increases by 35-40% and the failure rate decreases by 40-65%, both for sentences of length 1-40 and 41+, with a negligible accuracy penalty. The result is an even faster state-of-the-art wide-coverage CCG parser.

Acknowledgements

We would like to thank the anonymous reviewers for their feedback. James Curran was funded under ARC Discovery grants DP0453131 and DP0665973.

References

Srinivas Bangalore and Aravind Joshi. 1999. Supertagging: An approach to almost parsing. Computational Linguistics, 25(2):237–265.

A. Cahill, M. Burke, R. O'Donovan, J. van Genabith, and A. Way. 2004. Long-distance dependency resolution in automatically acquired wide-coverage PCFG-based LFG approximations. In Proceedings of the 42nd Meeting of the ACL, pages 320–327, Barcelona, Spain.

Stephen Clark and James R. Curran. 2004a. The importance of supertagging for wide-coverage CCG parsing. In Proceedings of the 20th International Conference on Computational Linguistics, pages 282–288, Geneva, Switzerland, 23–27 August.

Stephen Clark and James R. Curran. 2004b. Parsing the WSJ using CCG and log-linear models. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, pages 104–111, Barcelona, Spain, 21–26 July.

Stephen Clark and James R. Curran. 2007. Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics. (to appear).

Michael Collins. 2003. Head-driven statistical models for natural language parsing. Computational Linguistics, 29(4):589–637.

James R. Curran and Stephen Clark. 2003. Investigating GIS and smoothing for maximum entropy taggers. In Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics, pages 91–98, Budapest, Hungary, 12–17 April.

James R. Curran, Stephen Clark, and David Vadas. 2006. Multi-tagging for lexicalized-grammar parsing. In Proceedings of the Joint Conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics, pages 697–704, Sydney, Australia, 17–21 July.

Julia Hockenmaier and Mark Steedman. 2002. Generative models for statistical parsing with Combinatory Categorial Grammar. In Proceedings of the 40th Meeting of the ACL, pages 335–342, Philadelphia, PA.

Julia Hockenmaier. 2003. Data and Models for Statistical Parsing with Combinatory Categorial Grammar. Ph.D. thesis, University of Edinburgh.

Ron Kaplan, Stefan Riezler, Tracy H. King, John T. Maxwell III, Alexander Vasserman, and Richard Crouch. 2004. Speed and accuracy in shallow and deep stochastic parsing. In Proceedings of the Human Language Technology Conference and the 4th Meeting of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL'04), Boston, MA.

J. Kasami. 1965. An efficient recognition and syntax analysis algorithm for context-free languages. Technical Report AFCRL-65-758, Air Force Cambridge Research Laboratory, Bedford, MA.

Dan Klein and Christopher D. Manning. 2003. A* parsing: Fast exact Viterbi parse selection. In Proceedings of Human Language Technology and the North American Chapter of the Association for Computational Linguistics Conference, pages 119–126, Edmonton, Canada.

Robert Malouf and Gertjan van Noord. 2004. Wide coverage parsing with stochastic attribute value grammars. In Proceedings of the IJCNLP-04 Workshop: Beyond shallow analyses - Formalisms and statistical modeling for deep analyses, Hainan Island, China.

Adwait Ratnaparkhi. 1996. A maximum entropy part-of-speech tagger. In Proceedings of the EMNLP Conference, pages 133–142, Philadelphia, PA.

Anoop Sarkar and Aravind Joshi. 2003. Tree-adjoining grammars and its application to statistical parsing. In Rens Bod, Remko Scha, and Khalil Sima'an, editors, Data-oriented parsing. CSLI.

Mark Steedman. 2000. The Syntactic Process. The MIT Press, Cambridge, MA.

Kristina Toutanova, Christopher Manning, Stuart Shieber, Dan Flickinger, and Stephan Oepen. 2002. Parse disambiguation for a rich HPSG grammar. In Proceedings of the First Workshop on Treebanks and Linguistic Theories, pages 253–263, Sozopol, Bulgaria.

Yoshimasa Tsuruoka and Jun'ichi Tsujii. 2004. Iterative CKY parsing for probabilistic context-free grammars. In Proceedings of the IJCNLP conference, pages 52–60, Hainan Island, China.

D. Younger. 1967. Recognition and parsing of context-free languages in time n³. Information and Control, 10(2):189–208.
