April 2013 - The Plant Journal - [Rice] - QTL-seq

TECHNICAL ADVANCE/RESOURCE

QTL-seq: rapid mapping of quantitative trait loci in rice by whole genome resequencing of DNA from two bulked populations

Hiroki Takagi 1,2, Akira Abe 2,3, Kentaro Yoshida 1, Shunichi Kosugi 1, Satoshi Natsume 1, Chikako Mitsuoka 1, Aiko Uemura 1, Hiroe Utsushi 1, Muluneh Tamiru 1, Shohei Takuno 4, Hideki Innan 5, Liliana M. Cano 6, Sophien Kamoun 6 and Ryohei Terauchi 1,*

1 Iwate Biotechnology Research Center, Kitakami, Iwate, 024-0003, Japan
2 United Graduate School of Iwate University, Morioka, Iwate, 020-8550, Japan
3 Iwate Agricultural Research Center, Kitakami, Iwate, 024-0003, Japan
4 Department of Plant Sciences, University of California, Davis, CA 95616, USA
5 Graduate University for Advanced Studies, Hayama, Japan
6 The Sainsbury Laboratory, Norwich Research Park, Norwich, UK

Received 7 September 2012; revised 13 December 2012; accepted 20 December 2012; published online 5 January 2013.
* For correspondence (e-mail terauchi@ibrc.or.jp).

SUMMARY

The majority of agronomically important crop traits are quantitative, meaning that they are controlled by multiple genes each with a small effect (quantitative trait loci, QTLs). Mapping and isolation of QTLs is important for efficient crop breeding by marker-assisted selection (MAS) and for a better understanding of the molecular mechanisms underlying the traits. However, since it requires the development and selection of DNA markers for linkage analysis, QTL analysis has been time-consuming and labor-intensive. Here we report the rapid identification of plant QTLs by whole-genome resequencing of DNAs from two populations each composed of 20–50 individuals showing extreme opposite trait values for a given phenotype in a segregating progeny. We propose to name this approach QTL-seq as applied to plant species. We applied QTL-seq to rice recombinant inbred lines and F2 populations and successfully identified QTLs for important agronomic traits, such as partial resistance to the fungal rice blast disease and seedling vigor. A simulation study showed that QTL-seq is able to detect QTLs over wide ranges of experimental variables, and the method can be generally applied in population genomics studies to rapidly identify genomic regions that underwent artificial or natural selective sweeps.

Keywords: quantitative trait loci, breeding, whole genome sequencing, next generation sequencer, selective sweep, technical advance.

INTRODUCTION

The world’s population has already exceeded 7 billion and is still growing, while the amount of land suitable for agriculture is decreasing due to a variety of factors such as rapid climate change. Therefore there is a great demand for efficient crop improvement to increase yield without further expanding farmland and damaging the environment (Godfray, 2010; David et al., 2011).

In crop plants, multiple genes each with a relatively minor effect control the majority of agronomically important traits. These genes are called quantitative trait loci (QTLs) (Falconer and Mackay, 1996). Identification of QTLs is an important task in plant breeding. Once a QTL controlling a favorable trait is mapped with closely linked DNA markers, it is introduced into an elite cultivar by crossing the recurrent elite parent to the donor plant. Following each backcross, the progeny inheriting the desirable QTL are selected by using tightly linked DNA markers, a process known as marker-assisted selection (MAS; Ashikari and Matsuoka, 2006). Marker-assisted selection reduces the effort and time needed for phenotype evaluation of the progeny during successive rounds of selection, and also improves introgression breeding.

© 2013 The Authors
The Plant Journal © 2013 Blackwell Publishing Ltd
The Plant Journal (2013) 74, 174–183 doi: 10.1111/tpj.12105

Traditionally QTLs have been identified by linkage analysis of progeny derived from a cross between parents showing contrasting phenotypes for a trait of interest. To perform linkage analysis, DNA markers capable of discriminating parental genomes are required. Due to this requirement, parents for crosses are selected from genetically distantly related cultivars. This entails that parents may differ in many QTLs controlling a given phenotype, complicating the isolation of individual loci. On the other hand, whenever closely related parents are used, identification of sufficient DNA markers for linkage analysis becomes a limiting step.

Bulked-segregant analysis (BSA) is an elegant method to identify DNA markers tightly linked to the causal gene for a given phenotype (Giovannoni et al., 1991; Michelmore et al., 1991). Following a cross between parental lines showing contrasting phenotypes, the resulting F2 progeny are scored for segregation of the phenotype. Two bulked DNA samples are generated from the progeny showing contrasting phenotypes, and DNA markers exhibiting differences between the two bulks are screened. In the original reports, DNAs bulked from F2 progeny were screened with restriction fragment length polymorphisms (RFLPs) and random amplified polymorphic DNA (RAPD) markers to identify the markers linked to the traits of interest. Later, BSA was applied to identify QTLs (Mansur et al., 1993; Darvasi and Soller, 1994), an approach sometimes called ‘selective DNA pooling’. However, in these analyses, the availability of DNA markers was the main factor limiting the effectiveness of the methods. Furthermore, genotyping of each marker for the two bulked DNAs is still time-consuming and costly.

Recent development of whole genome sequencing has accelerated the analysis of QTLs in yeast, a model organism with a relatively small genome size (12.5 Mb). Ehrenreich et al. (2010) made a cross between two diploid yeast strains and obtained a large number of haploid progeny. They then applied BSA to select two populations with extreme phenotypes, and genotyped the bulked DNA with a single nucleotide polymorphism (SNP) microarray and whole genome sequencing, which successfully identified the location of QTLs involved in resistance to various chemical compounds. The proposed method is called X-QTL since an extremely large number of progeny were used in each bulk. Similar applications of whole-genome sequencing to BSA have been reported in yeast, with successful identification of QTLs for xylose utilization (Wenger et al., 2010), heat tolerance (Parts et al., 2011), and ethanol tolerance (Swinnen et al., 2012). However, the application of whole genome sequencing to BSA for identifying QTLs in plants, which have much larger genome sizes than yeast, has not been reported to date.

In this paper, we report plant QTL identification using whole-genome resequencing of two DNA bulks of progeny (each with 20–50 individuals) showing extreme phenotypic values by next-generation sequencing (NGS) technology. Since this approach has a wide applicability in QTL identification in plant species, including crops, we propose to name the method QTL-seq as specifically applied to plant species. Because it does not require DNA marker development and genotyping, the most time-consuming and costly procedure needed for the conventional QTL analysis, QTL-seq allows the rapid identification of QTLs.

RESULTS

Principle of QTL-seq

The principle of QTL-seq is shown in Figure 1 and is explained by taking rice as an example. QTL-seq combines bulked-segregant analysis (Giovannoni et al., 1991; Michelmore et al., 1991; Mansur et al., 1993; Darvasi and Soller, 1994) and whole-genome resequencing for rapid identification of the genomic regions that differ between the two parents used in a genetic cross and also contribute to the higher and lower values of the traits of interest among the resulting progeny. For QTL mapping using QTL-seq, we first generate a mapping population by crossing two cultivars showing contrasting phenotypes for the traits of interest. In Figure 1, we assume that we are interested in plant height and that cultivars A and B have a low and high stature, respectively. Different kinds of mapping populations can be used for QTL-seq depending on the traits to be studied. Recombinant inbred lines (RILs) and doubled haploids (DH) show a high degree of homozygosity, and individuals in each line can be regarded as proxy clones that allow replicated measurements of the phenotype, and thus are suitable for detecting QTLs of minor effects. The advantage of using an F2 population is the short time required for its generation. However, no replicated measurements are possible for each genotype. As a result, the approach is not suitable for detecting minor effect QTLs. After the progeny of a mapping population are measured for the focused trait, we score segregation of the phenotype. If multiple QTLs are involved in the trait variation, the frequency distribution of measured values will be close to the normal (Gaussian) distribution (Figure 1a). Here, we focused on the progeny showing extreme phenotypes, i.e. those exhibiting the highest and the lowest extreme values. We sampled DNA from 10 to 20 individuals from each extremity and bulked them to generate ‘Highest’ bulk and ‘Lowest’ bulk. Each of the bulked DNAs was applied to whole genome resequencing with a > 6× genome coverage. We expect the bulked DNA to contain genomes from both parents in a 1:1 ratio for the majority of genomic regions. However, we should detect unequal representation of the genomes from the two parents in the genomic regions harboring QTLs for the phenotypic difference between ‘Highest’ and ‘Lowest’ bulks.
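The bulking step can be illustrated with a minimal Python sketch. It assumes the phenotype scores are already available as a mapping from individual identifier to trait value; the function name `select_bulks`, the identifiers and the simulated heights are hypothetical and are not part of the published pipeline.

```python
import random


def select_bulks(phenotypes, bulk_size):
    """Return (lowest_ids, highest_ids) chosen from the two tails of the trait distribution.

    phenotypes: dict mapping an individual identifier to its measured trait value.
    bulk_size:  number of individuals to place in each bulk.
    """
    if 2 * bulk_size > len(phenotypes):
        raise ValueError("bulks would overlap: population too small")
    # Rank individuals from the lowest to the highest trait value.
    ranked = sorted(phenotypes, key=phenotypes.get)
    return ranked[:bulk_size], ranked[-bulk_size:]


# Hypothetical example: 200 progeny with roughly normally distributed plant height.
rng = random.Random(0)
heights = {f"F2_{i:03d}": rng.gauss(100.0, 10.0) for i in range(200)}
lowest, highest = select_bulks(heights, bulk_size=20)
print(len(lowest), len(highest))
```

DNA from the individuals in `lowest` and `highest` would then be pooled in equal amounts to form the ‘Lowest’ and ‘Highest’ bulks.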


To examine the relative amount of the genomes derived from the two parents, we evaluated the proportion of short reads corresponding to each of the two parental genomes that can be discriminated by single nucleotide polymorphisms (SNPs) available between the two. After aligning the sequence data to the reference sequence of either of the two parents, we counted the number (k) of short reads harboring SNPs that are different from the reference sequence. We defined the proportion of k in the total short reads (n) covering a particular genomic position (= k/n) as the SNP-index (Figure 1b; Abe et al., 2012a). The SNP-index is 0 if all the short reads contain genomic fragments from the parent that was used as a reference sequence. The SNP-index is 1 if all the short reads represent the genome from the other parent. A SNP-index of 0.5 means an equal contribution of both parents’ genomes to the bulked progeny. Accordingly, the SNP-index is calculated for all the SNPs detected between the two parents, and the relationship between SNP-index and SNP position in the genome is graphically represented (Figure 1c). We carried out this procedure separately for the ‘Highest’ and ‘Lowest’ bulk sequences.
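As a small illustration of this definition, the following Python sketch computes the SNP-index k/n from read counts at one position. The function name `snp_index` is a hypothetical helper; the two calls reproduce the worked example given in the legend of Figure 1(b).

```python
def snp_index(alt_reads, total_reads):
    """SNP-index = k / n, where k reads differ from the reference among n covering reads."""
    if total_reads <= 0:
        raise ValueError("position is not covered by any read")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("k must lie between 0 and n")
    return alt_reads / total_reads


# Ten reads cover the site; four carry the non-reference base.
print(snp_index(4, 10))
# All ten reads carry the non-reference base.
print(snp_index(10, 10))
```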

In practice, SNPs with SNP-index < 0.3 in both bulked sequences are filtered out during SNP calling because they cannot be discriminated from spurious SNPs caused by sequencing or alignment errors. However, if SNPs with a SNP-index of 0.3 or greater are present in only one of the two bulks, we consider them as real SNPs and assume their presence in the other bulk as well. In this case, we make use of the SNP-index value of the other bulked DNA even if it is < 0.3 (see Experimental Procedures). By taking an average of SNP-indices of SNPs located in a given genomic interval, sliding window analysis can be applied to facilitate visualization of the graphs. We expect the SNP-index graphs of ‘Highest’ and ‘Lowest’ bulks to be identical for the genomic regions that are not relevant to the phenotypic difference between the two. However, the genomic regions harboring QTLs that contribute to the difference in the phenotype between the two bulks should exhibit unequal contributions from the two parental genomes. Furthermore, SNP-indices of these regions for ‘Highest’ and ‘Lowest’ bulks would appear as mirror images with respect to the line of SNP-index = 0.5. Such regions are expected to have a high probability of containing QTLs responsible for the trait difference between the ‘Highest’ and ‘Lowest’ bulks. Comparison of the two graphs is important to discern the QTLs from the genomic regions showing segregation distortion caused by reasons other than the imposed artificial selection (e.g. meiotic drive), which result in departure of the SNP-index from 0.5 in both bulks in the same direction. It is therefore convenient to combine the two graphs for ‘Highest’ and ‘Lowest’ bulks by subtracting the SNP-index value of the latter from the former to generate the graph of Δ(SNP-index) (Figure 1c). In this graph, Δ(SNP-index) = 1 if the bulked DNA comprises only parent B genome, Δ(SNP-index) = −1 if it is of parent A genome only and Δ(SNP-index) = 0 if both bulks have the same SNP-indices at the genomic regions.

Figure 1. A simplified scheme of QTL-seq as applied to rice.
(a) Two inbred cultivars with contrasting phenotypes are crossed to generate F2 progeny that are segregating for the trait value. In this example, parent A has low stature while parent B has high stature. Since multiple quantitative trait loci (QTLs) control plant height, frequency of plant height among the F2 progeny follows the normal distribution. We select multiple progeny with highest and lowest stature, and bulk their DNAs to make ‘Highest’ and ‘Lowest’ bulk, respectively. These DNA bulks are applied to whole genome resequencing and aligned to the reference sequence of cultivar A to calculate the single nucleotide polymorphism (SNP)-index.
(b) Definition of SNP-index. Short reads generated by whole-genome sequencing are aligned to the reference sequence. If 10 short reads cover a given nucleotide position, the coverage of the site is 10. Among the 10 reads, if four contain a SNP different from the reference nucleotide, the SNP-index is defined as 0.4. On the other hand, if all the reads harbor a SNP different from the reference, the SNP-index is 1.0.
(c) Examples of SNP-index plot. The QTL can be identified as peaks or valleys of the SNP-index plot. Each spot corresponds to a SNP, and the x-axis corresponds to the chromosomal position. Lines are average values of SNP-index or Δ(SNP-index) drawn by sliding window analysis. Top: SNP-index plot of ‘Highest’ bulk. Middle: SNP-index plot of ‘Lowest’ bulk. Bottom: a plot of Δ(SNP-index).

QTL-seq applied to RILs: detection of QTLs controlling partial resistance to rice blast in Nortai

We applied QTL-seq for the detection of QTLs involved in partial resistance of the rice cultivar Nortai against the fungal pathogen Magnaporthe oryzae, the causal agent of rice blast disease. Resistance of Nortai to M. oryzae race 037.1 does not seem to be mediated by typical R-genes; the hypersensitive response cannot be clearly distinguished and the trait is quantitative and difficult to measure. We crossed Nortai to the cultivar Hitomebore, which is highly susceptible to race 037.1, and obtained F2 progeny (Figure 2a). Each F2 progeny was established as a line and brought to the F7 generation by a single-seed descent method to generate a total of 241 RILs.

Using the 241 RILs, we carried out an M. oryzae inoculation assay to assess the resistance of the progeny. Susceptibility of the progeny was measured and categorized into seven classes from class 4 (resistant) to class 10 (highly susceptible; Figure 2b). The inoculation assay was conducted four times over 4 years to ensure the correct scoring of each RIL (Figure S1 in Supporting Information). The average score of Nortai and Hitomebore in four trials was 4.75 and 6.38, respectively. The frequency distribution of RILs falling into different classes is close to the normal distribution, suggesting that multiple genes control the partial resistance of RILs (Figure 2c). We defined 20 RILs consistently showing high resistance (classes 4 and 5) as Resistant (R-) progeny, and an additional set of 20 RILs consistently showing high susceptibility (classes 8, 9 and 10) as Susceptible (S-) progeny. Genomic DNA of R-progeny was bulked in an equal ratio to generate R-bulk DNA, and that of S-progeny was bulked to generate S-bulk DNA.

Figure 2. QTL-seq applied to rice recombinant inbred lines (RILs) identifies quantitative trait loci (QTLs) conferring partial blast resistance to the Nortai cultivar.
(a) Phenotype of two rice cultivars Nortai and Hitomebore 2 weeks after inoculation with a compatible race (race 037.1) of blast fungus. Nortai shows partial resistance whereas Hitomebore is susceptible.
(b) Scores (4, highly resistant, to 10, highly susceptible) assigned to different levels of partial resistance in RILs derived from a cross between Nortai and Hitomebore.
(c) Frequency distribution of partial resistance levels of 241 RILs. The x-axis corresponds to the level of partial resistance as given in (b). N and H indicate the average resistance level of Nortai and Hitomebore, respectively. The DNAs of RILs with resistance levels 4 and 5 were bulked to make resistance (R-) bulk, and those of levels 8–10 were bulked to make susceptible (S-) bulk.
(d) Single nucleotide polymorphism (SNP)-index plots of R-bulk (top) and S-bulk (next to the top), Δ(SNP-index) plot (next to the bottom) of chromosome 6 with statistical confidence intervals under the null hypothesis of no QTLs (gray, P < 0.1; green, P < 0.05; pink, P < 0.01) and log of odds (LOD) score plot of partial resistance QTLs as obtained by classical QTL analysis of 241 RILs (bottom).

Each DNA bulk was subjected to whole-genome resequencing using an Illumina GAIIx sequencer. We obtained a total of 57.9 and 62.4 million sequence reads (each of 75 bp) from the DNA bulks of R-progeny and S-progeny, respectively (Table S1). These reads were aligned to the reference sequence of the Hitomebore cultivar using BWA software (Li and Durbin, 2009). The average read depth was > 6.88× in both bulked DNAs (Table S1). A total of 161,563 SNPs were identified between the Nortai and Hitomebore genomes (Table S2), and the SNP-index was calculated for each SNP (Figure 1b; Abe et al., 2012a). Graphs showing relationships between SNP-index and genomic positions are given in Figures 2(d) and S2. We found highly contrasting patterns of SNP-index graphs for R-bulk and S-bulk in the region between 2.39 and 4.39 Mb on chromosome 6 as shown in Figure 2(d). The resistant RILs mainly had Nortai-type genomic segments in the 2.39 to 4.39 Mb region of chromosome 6, whereas susceptible RILs had Hitomebore-type genome in the same region, indicating that there is a major QTL differentiating Nortai and Hitomebore partial resistance located at this genomic region. Combining the information from the two graphs for R-bulk and S-bulk, we made a graph of Δ(SNP-index) whereby Δ(SNP-index) = (SNP-index of R-bulk) − (SNP-index of S-bulk) (Figures 2d and S2). This revealed that most of the genomic regions show Δ(SNP-index) = 0, but some genomic regions exhibit positive or negative values of Δ(SNP-index). These may correspond to QTLs governing the difference between the R- and S-progeny. We calculated statistical confidence intervals of Δ(SNP-index) for all the SNP positions with given read depths under the null hypothesis of no QTLs, and plotted them along with Δ(SNP-index) (Experimental Procedures; Figures 2d and S2). The chance that Δ(SNP-index) becomes higher than 0.79, as observed for the chromosomal region of 2.39–4.39 Mb, is P < 0.01 under the null hypothesis.
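The Δ(SNP-index) calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' pipeline: it assumes the per-SNP indices of the two bulks are already available, applies the 0.3 filter described under ‘Principle of QTL-seq’, and smooths the result over a fixed number of consecutive SNPs. The window definition, the function names and the demonstration values are assumptions made for this sketch.

```python
from statistics import mean

MIN_INDEX = 0.3  # SNPs below this value in BOTH bulks are treated as spurious


def delta_snp_index(snps):
    """Return [(position, delta)] with delta = index_r - index_s for retained SNPs.

    snps: iterable of (position, index_r, index_s) tuples, one per SNP.
    """
    kept = []
    for position, index_r, index_s in snps:
        if index_r < MIN_INDEX and index_s < MIN_INDEX:
            continue  # indistinguishable from sequencing or alignment errors
        kept.append((position, index_r - index_s))
    return kept


def sliding_mean(deltas, window=10):
    """Average delta over `window` consecutive SNPs; returns [(mean_position, mean_delta)]."""
    smoothed = []
    for start in range(len(deltas) - window + 1):
        chunk = deltas[start:start + window]
        smoothed.append((mean(p for p, _ in chunk), mean(d for _, d in chunk)))
    return smoothed


# Hypothetical values: (position in bp, SNP-index of R-bulk, SNP-index of S-bulk).
demo = [(100, 0.9, 0.1), (200, 0.2, 0.1), (300, 0.5, 0.5), (400, 0.25, 0.6)]
deltas = delta_snp_index(demo)
print(deltas)
print(sliding_mean(deltas, window=2))
```

The second demonstration SNP is dropped because its index is below 0.3 in both bulks, whereas the fourth is kept because it reaches 0.3 in one of them.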

To verify the candidate QTLs detected by the QTL-seq method, we applied traditional QTL analysis to the same 241 RILs using SNP markers. A total of 425 SNP markers covering the genome were used for genotyping the 241 RILs by the Golden Gate Assay on the iSCAN platform (Illumina). The data were analyzed by QTL-CARTOGRAPHER (Basten et al., 2005) and LOD (log of odds) scores of linkage between SNP markers and the trait were obtained (Figure 2d). The highest LOD score (LOD = 7.38) was observed in the interval between SNP markers located at 0 and 4.9 Mb. This interval corresponded to the genomic region identified by the QTL-seq method (Figure 2d). This result demonstrates that QTL-seq allows the rapid detection of QTLs using RILs. We named the interval 2.39–4.39 Mb on chromosome 6 of Nortai qPi-nor1(t) as the location of the partial resistance in Nortai.

Using RILs derived from two other cross-combinations of rice cultivars, we applied QTL-seq and identified the peaks of the Δ(SNP-index) plot presumably corresponding to a major QTL: a Δ(SNP-index) peak for lower grain amylose content in the cultivar Iwate96 as compared with Hitomebore (Figure S3), although this peak was not statistically significant (0.05

Application of QTL-seq to F2 progeny

We further examined the possibility of applying QTL-seq to an F2 population, which is much easier to generate than RILs of advanced generations. A japonica type cultivar Dunghan Shali is known to have strong seedling vigor compared with Hitomebore (Figure 3a). We have recently fine-mapped a major QTL, qPHS3-2, on chromosome 3 that confers the seedling vigor in Dunghan Shali using conventional QTL analysis of RILs of the F7 generation derived from a cross between Dunghan Shali and a japonica cultivar Kakehashi (Abe et al., 2012b). The QTL most likely corresponds to OsGA20ox1, a gene involved in gibberellin (GA) biosynthesis (Abe et al., 2012b; Yano et al., 2012). Using Dunghan Shali, we addressed whether QTL-seq can detect qPHS3-2 in the F2 progeny derived from a cross between Dunghan Shali and Hitomebore. After crossing Dunghan Shali to Hitomebore, we obtained F2 progeny. Selfed seeds of a total of 531 F2 individuals were scored for their seedling height after 14 days of imbibition in water at 25°C. The variation in seedling height followed a normal distribution, indicative of the involvement of multiple genes in determining this character (Figure 3b). Two DNA bulks were prepared, the 50 tallest individuals as ‘H-bulk’ and the 50 shortest individuals as ‘L-bulk’, and were used for QTL-seq analysis (Figures 3c and S2). By examining the Δ(SNP-index) plot, we identified two genomic positions exhibiting the highest Δ(SNP-index) values: the region on chromosome 3 from 36.21 to 37.31 Mb with Δ(SNP-index) = 0.61 (statistical significance under the null hypothesis: P < 0.01) and the region on chromosome 1 from 39.08 to 41.08 Mb with Δ(SNP-index) = 0.67 (P < 0.05). The former position corresponded exactly to the reported qPHS3-2, most probably the locus of OsGA20ox1. Likewise, the latter position was also previously detected as a minor QTL (qPHS-1) (Abe et al., 2012b). This result demonstrates that QTLs identified by conventional QTL mapping using RILs of the F7 generation could be successfully recovered by QTL-seq using the F2 generation.

Simulation of QTL-seq

As shown above, QTL-seq successfully identified genomic regions controlling quantitative traits in the examples of rice RILs and F2 families. More generally, we are interested in how experimental variables affect the performance of QTL-seq to faithfully detect QTLs. To this end, we carried out a computer simulation of QTL-seq by changing variables such as (i) the contribution of QTLs to phenotypic variation, (ii) the percentage of individuals to be selected, (iii) the read depth, and (iv) the dominance effect of the QTL locus on phenotype. We assumed that the rice genome size is roughly 360 Mb and the recombination rate is 4 cM/Mb. We also postulated that 150,000 SNPs between the two parents are distributed with equal intervals (i.e. a SNP every 2.4 kb). For the QTL-seq process, it is assumed that from among all progeny individuals of F2 or F7 generations, we select p% each of progeny with opposite extreme trait values to make ‘Highest’ and ‘Lowest’ bulks, and we sample n random alleles from each bulk to represent the read depth. Using these n alleles, we calculated Δ(SNP-index). Since we routinely take an average of Δ(SNP-index) of 10 consecutive SNPs to obtain a sliding window value m, we evaluated the behavior of m by simulations. With 10,000 replications of the simulation, we found that the 99% cutoff value of |m| for SNPs that are not selected (null distribution) in F2 depends on the percentage of individuals in the bulk (p) and the read depth of the focused region (Figure 4a, left). The intervals of values of m become narrower as the coverage and the percentage of individuals in each bulk (p) increase (for our application to rice, the 99% cutoff of |m| would be 0.29 given n = 10 and p = 0.15). We also applied the same simulation to RILs of the F7 generation (Figure 4a, right). The null distribution of |m| is wider than that for an F2 population.
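The idea behind the null distribution can be illustrated with a simplified Monte Carlo sketch in Python. It is not the simulation used in the paper: it treats a single unlinked SNP in an F2 population rather than the average m over 10 linked SNPs, so its cutoffs are expected to be wider than those in Figure 4(a). The function names, the bulk size and the seed are assumptions made for this sketch.

```python
import random


def simulate_null_delta(bulk_size, depth, rng):
    """One replicate of delta(SNP-index) at a SNP that is not linked to any QTL (F2)."""

    def bulk_index():
        # Each F2 individual carries two alleles, each from parent B with probability 0.5.
        b_alleles = sum(rng.random() < 0.5 for _ in range(2 * bulk_size))
        freq = b_alleles / (2 * bulk_size)
        # Sequencing samples `depth` alleles at random from the pooled DNA.
        k = sum(rng.random() < freq for _ in range(depth))
        return k / depth

    return bulk_index() - bulk_index()


def null_cutoff(bulk_size, depth, replicates=10000, quantile=0.99, seed=1):
    """Approximate `quantile` cutoff of |delta| under the null hypothesis of no QTL."""
    rng = random.Random(seed)
    values = sorted(
        abs(simulate_null_delta(bulk_size, depth, rng)) for _ in range(replicates)
    )
    return values[min(replicates - 1, int(quantile * replicates))]


# Cutoffs for a hypothetical bulk of 20 individuals at several read depths.
for depth in (5, 10, 20, 50):
    print(depth, null_cutoff(bulk_size=20, depth=depth, replicates=2000))
```

Even in this reduced form, the cutoff shrinks as the read depth grows, which is the qualitative behavior reported for m.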

Figure 3. QTL-seq applied to rice F2 progeny identifies quantitative trait loci (QTLs) involved in seedling vigor.
(a) Seedlings of Hitomebore and Dunghan Shali 10 days after water imbibition. Dunghan Shali shows higher seedling vigor compared with Hitomebore.
(b) Frequency distribution of seedling height in 531 F2 progeny 14 days after water imbibition. H and D indicate the average seedling height of Hitomebore and Dunghan Shali, respectively. We selected 50 F2 progeny shorter than 18 cm to make Low (L-) bulk and 50 progeny taller than 24 cm to make High (H-) bulk, and applied them to QTL-seq using the Hitomebore reference genome sequence.
(c) Results of QTL-seq for chromosome 3 (left) and 1 (right). The Δ(SNP-index) plot (top) with statistical confidence intervals under the null hypothesis of no QTL (gray, P < 0.1; green, P < 0.05; pink, P < 0.01) and log of odds (LOD) score plot of QTL controlling plant height as obtained by classical QTL analysis of 250 recombinant inbred lines of the F7 generation (bottom).

We next explored the power to detect a QTL. In practice, we place a QTL in the simulated genome assuming that the relative contribution of the QTL to the total phenotype variation is given by Qp. Then, we evaluated the power as the proportion of the simulation replications with |m| around the QTL larger than the 99% cutoff value obtained earlier. In this power simulation, bi-allelic states were allowed at the focal QTL, and for the dominance effect we considered two cases, codominance and complete dominance. For Qp, two values (0.05 and 0.1) were used. Note that Qp is the relative contribution of this QTL to the total phenotype variation, which includes everything other than the genetic effect of the focal QTL (that is, the environmental factors and the genetic contributions from other QTLs that are not specified here). Figure 4(b) and (c) shows the results for the cases of codominance and complete dominance, respectively. We found that a larger read depth increases the power in all cases. The power is higher when Qp = 0.1 than when Qp = 0.05. It appears that higher power is expected when the QTL allele is codominant (additive) as compared with complete dominance, and there is an optimum value for the percentage of individuals in each bulk (p). When the value is small, the power is low, probably because there would be too much sampling variance. As p increases, the power increases, but it starts to decrease when the sample size is so large that many individuals with intermediate phenotype are included in the bulking. We found higher power in F7 RILs than in F2 populations, and thus concluded that in QTL-seq application to the F2 populations, p = 0.15 and n = 20 would be a reasonable choice, and that the method has reasonable power to detect QTLs with a relative contribution of roughly 10% (Qp = 0.1).
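A reduced version of the power calculation can be sketched as follows. This Python sketch is an illustration under stated assumptions, not the authors' simulation: it considers only the codominant case in an F2 population, evaluates a single SNP located exactly at the QTL instead of the window average m, and scales the phenotype to unit variance so that the additive effect is sqrt(2·Qp). The function names and the population size are hypothetical.

```python
import math
import random


def bulk_delta(pop_size, frac, depth, qp, rng):
    """One replicate of delta(SNP-index) at a single additive QTL in an F2 population."""
    effect = math.sqrt(2.0 * qp)    # per-allele effect; Var(allele count) = 0.5 in F2
    noise_sd = math.sqrt(1.0 - qp)  # everything other than the focal QTL
    progeny = []
    for _ in range(pop_size):
        b_alleles = (rng.random() < 0.5) + (rng.random() < 0.5)  # 0, 1 or 2
        phenotype = effect * b_alleles + rng.gauss(0.0, noise_sd)
        progeny.append((phenotype, b_alleles))
    progeny.sort()  # ascending phenotype
    size = max(1, int(frac * pop_size))

    def index(bulk):
        freq = sum(b for _, b in bulk) / (2 * len(bulk))
        return sum(rng.random() < freq for _ in range(depth)) / depth

    return index(progeny[-size:]) - index(progeny[:size])


def power(pop_size, frac, depth, qp, replicates=300, seed=2):
    """Fraction of replicates whose |delta| exceeds the simulated 99% null cutoff."""
    rng = random.Random(seed)
    # Null distribution: same procedure with a QTL of zero effect.
    null = sorted(
        abs(bulk_delta(pop_size, frac, depth, 0.0, rng)) for _ in range(replicates)
    )
    cutoff = null[min(replicates - 1, int(0.99 * replicates))]
    hits = sum(
        abs(bulk_delta(pop_size, frac, depth, qp, rng)) > cutoff
        for _ in range(replicates)
    )
    return hits / replicates


for qp in (0.05, 0.1):
    print(qp, power(pop_size=500, frac=0.15, depth=20, qp=qp))
```

Because only one SNP is evaluated, the absolute power values from this sketch are not comparable with Figure 4; the sketch is meant only to show how Qp, the selected fraction and the read depth enter the calculation.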

DISCUSSION

Two types of genetic variations, the ones derived from artificial mutagenesis and those naturally occurring in landraces and wild crop relatives, have been used in plant breeding. Mutant lines generated by artificial mutagenesis are valuable for isolating agronomically important genes. To this end, we have recently developed MutMap, an efficient method to identify the causal mutation of a given phenotype by whole genome resequencing of the bulked DNA of progeny showing mutant phenotype (Abe et al., 2012a). Although MutMap is a powerful technique, crop breeding has mostly depended on genetic variations available among different cultivars and species in what is called QTL breeding. This is in part because naturally occurring variants harbor a potentially larger repertoire of useful alleles than the artificially generated mutants due to the larger number of mutations accumulated over a long time in nature. Therefore, analysis of the QTL variations among natural variants is important for enhancing breeding by isolating useful alleles of the genes controlling agronomically important traits (Yano, 2001). However, conventional QTL analysis is a laborious process requiring the development of DNA markers and the generation of a large number of advanced generation progeny. Here we demonstrated the successful application of whole-genome resequencing for detecting rice QTLs for agronomically important traits, including partial resistance and seedling vigor, using RILs and F2 populations, respectively. The major advantage of QTL-seq is that it does not necessitate DNA marker development and marker genotyping for mapping purposes. The SNPs available between the parental lines serve as such markers, thus reducing the cost and time required for marker development and genotyping. Furthermore, the use of SNP-index allows accurate evaluation of the frequencies of parental alleles in a subset of progeny at a given genomic position. These two key attributes make QTL-seq an attractive method for quick and cost-effective identification of QTLs.

Figure 4. Simulation reveals the capability of QTL-seq to detect quantitative trait loci (QTLs) in a wide range of values of experimental variables.
(a) Ninety-nine per cent intervals of the null distribution of m statistics [average value of Δ(SNP-index) of 10 consecutive single nucleotide polymorphisms (SNPs)]. The x- and y-axes represent the percentage of individuals in each bulk (p) and the m value, respectively. The results for F2 progeny (left) and F7 recombinant inbred line progeny (right) are shown. Different read depths (50×, 20×, 10× and 5×) are indicated by different colors (inset).
(b), (c) The power of QTL-seq for detecting QTLs in the cases of codominance (b) and dominance (c). Two values of the QTL effect [Qp = 0.1 (top) and Qp = 0.05 (bottom)], as well as two types of populations (F2 and F7) were tested.

Bulked-segregant analysis was first applied to facilitate the linkage analysis of discrete characters in F2 populations (Giovannoni et al., 1991; Michelmore et al., 1991). In these studies, F2 progeny showing two discrete characters were isolated, and DNA from the F2 individuals was pooled to make two DNA bulks corresponding to the two character types. After a battery of DNA markers including RAPD markers (Williams et al., 1990) were tested for these two bulked DNAs, markers showing differences between the two DNA bulks were selected to represent the DNA markers linked to the gene(s) responsible for the difference in the characters. This original bulked-segregant method was later extended to QTL analysis. After RILs or F2 were scored for the phenotypes, progeny showing extreme opposite phenotypes were selected, and these DNAs were separately bulked to find DNA markers showing linkage with the phenotypic differences (‘selective DNA pooling’; Mansur et al., 1993; Darvasi and Soller, 1994). This latter method is in principle similar to QTL-seq, but requires DNA marker development and testing of bulked DNA with each marker, both time-consuming and labor-intensive processes that are circumvented by QTL-seq. Consequently, QTL-seq is much more rapidly performed. QTL-seq also allows an accurate quantitative evaluation of the genomic contribution from the two parents to the bulked DNAs by using SNP-index, whereas the conventional method has to rely on analog assessment of marker states, e.g. the relative strength of intensity of DNA amplicons after PCR amplification of the markers. Therefore, we believe that QTL-seq is quicker and has a much higher power than the previous methods used for QTL identification. Applications of whole genome sequencing to two DNA bulks of progeny with extreme phenotypes have been reported in yeast (X-QTL; Ehrenreich et al., 2010; Wenger et al., 2010; Parts et al., 2011; Swinnen et al., 2012), and its statistical property applied to yeast was also addressed (Magwene et al., 2011). QTL-seq has the same principle as these methods. However, QTL-seq is the first application of a similar method in plant species with a much larger genome size (rice, 380 Mb), and we demonstrated that it can be carried out with a significantly smaller number of progeny (20–50) in each bulk than the methods previously reported in yeast. In crop species, bulking of an extremely large number of progeny is not practical.

QTL-seq applied to seedling vigor in rice demonstrated that this method successfully identifies QTL in an F2 generation, which is a much earlier generation than the F7 that we used for conventional QTL analysis based on RILs. Our simulation analysis showed that if the phenotypic effect of the focal QTL accounts for more than 10% of the entire variation and if the read depth is 20 or more, its genomic position may be readily detected by QTL-seq even in the F2 generation. We also demonstrated that QTL-seq is applicable to progeny obtained from crosses made between genetically closely related cultivars. The rice cultivars Nortai, Hitomebore and Dunghan Shali used in the current experiments all belong to japonica species, and DNA polymorphisms among them are low, making DNA marker development difficult in the conventional scheme of QTL analysis.

We envisage that QTL-seq can be applied to any population for detecting genomic regions that underwent artificial or natural selection. For instance, suppose that a population of a species is distributed over a certain environmental gradient (high temperature versus low temperature). We could then make two DNA bulks: one from multiple individuals from high temperature zones and the other from low temperature zones. Sequence reads from these two DNA bulks are compared to a reference sequence, and Δ(SNP-index) is calculated for all the genomic regions. The regions showing higher Δ(SNP-index) than the background genome should point to the regions responsible for the adaptation of the population to high/low temperature. In this regard, QTL-seq could be perceived as a general method for detecting genomic regions showing signatures of a recent selective sweep by whole-genome resequencing of DNAs from two groups of individuals that underwent recent artificial or natural selection in opposite directions.

In view of the recent rapid development in sequencing technology, we foresee that methods that make use of whole-genome sequencing-based techniques, including QTL-seq, MutMap (Abe et al., 2012a), SHOREmap (Schneeberger et al., 2009), NGM (Austin et al., 2011) and others (Mokry et al., 2011; Trick et al., 2012), will dramatically accelerate crop improvement in a cost-effective manner. These and other related technologies that take full advantage of the rapidly declining cost of genome sequencing are expected to contribute significantly to the on-going efforts aimed at addressing the world food security problem by reducing breeding time.

EXPERIMENTAL PROCEDURES

Evaluation of partial resistance of RILs

To evaluate the partial resistance of RILs to leaf blast disease we conducted upland nursery trials in 2006 and 2011 at Iwate Agricultural Research Center in Kitakami, Iwate, Japan. Overall, a total of four independent inoculation assays were carried out. For each RIL, about 200 seeds were sown in single 40-cm-long rows that were spaced 10 cm apart. For use as inoculum, seedlings of the highly susceptible cultivar Moukoto were grown on both sides of each block of RIL rows. Nitrogen was applied at the rate of 20 kg per 1000 m² as a basal fertilizer. Disease severity was visually scored according to the procedure of Asaga (1981).

© 2013 The Authors

The Plant Journal © 2013 Blackwell Publishing Ltd, The Plant Journal, (2013), 74, 174–183

Whole-genome sequence of bulked DNA

Bulked DNA samples were prepared by mixing an equal ratio of DNA extracted from 100 mg of fresh rice leaves as previously described (Abe et al., 2012a). The library for Illumina sequencing was constructed from 5 μg of DNA sample and sequenced by 76 cycles on an Illumina Genome Analyzer IIx as described in Abe et al. (2012a). Short reads in which more than 10% of the sequenced nucleotides exhibited a phred quality score of <30 were excluded from the analysis that followed.
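
The read-level quality filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `keep_read` is hypothetical, and the Phred+33 quality encoding is an assumption (pipelines of that period often used Phred+64, in which case `offset=64` should be passed).

```python
def keep_read(quality: str, min_phred: int = 30,
              max_low_fraction: float = 0.10, offset: int = 33) -> bool:
    """Return True if the read passes the filter.

    A read is excluded when more than `max_low_fraction` of its bases
    have a Phred score below `min_phred`. `offset` is the ASCII offset
    of the quality encoding (assumed Phred+33 here).
    """
    if not quality:
        return False  # an empty quality string carries no usable bases
    low = sum(1 for ch in quality if ord(ch) - offset < min_phred)
    return low / len(quality) <= max_low_fraction


# A 76-cycle read with 7 low-quality bases (about 9%) is kept,
# one with 8 low-quality bases (about 10.5%) is dropped.
good = "I" * 69 + "#" * 7   # 'I' = Phred 40, '#' = Phred 2 under Phred+33
bad = "I" * 68 + "#" * 8
print(keep_read(good), keep_read(bad))
```

The threshold is applied to the fraction of bases, so the same function works for reads of any length.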

Alignment of short reads to the reference sequence and sliding window analysis

To identify the QTL, we aligned the short reads obtained from the two DNA bulks to the reference genome of the cultivar Hitomebore (DDBJ Sequence Read Archive, DRA000809) using BWA software (Li and Durbin, 2009) [Correction added on 13 March 2013 after original online publication: DRA000499 was changed to DRA000809]. Alignment files were converted to SAM/BAM files using SAMtools (Li et al., 2009), and applied to the SNP-calling filter 'Coval' that we previously developed (SK et al., in preparation; Abe et al., 2012a) to increase SNP-calling accuracy. SNP-index was calculated for all the SNP positions. We excluded SNP positions with SNP-index of <0.3 and read depth <7 from the two sequences, as these may represent spurious SNPs called due to sequencing and/or alignment errors. However, if SNPs with SNP-index ≥0.3 were present in only one of the sequences obtained from the two DNA bulks (bulk A), we considered them as real SNPs and assumed their presence in the other bulk (bulk B) too. In this case we used the SNP-index value of the bulk B DNA even if it was <0.3. For positions in the genome where all the short reads match the reference sequence, we assigned a SNP-index of 0. Sliding window analysis was applied to SNP-index plots with a 2 Mb window size and 10 kb increment. We calculated the average SNP-index of the SNPs located in the window (m) and used it for the sliding window plot. If the number of SNPs within the 2 Mb window was <10, we skipped the interval for the plot. Use of m for sliding window analysis after taking the average of 10 SNP-indices was important to reduce the noise in the plot (Figure S5).
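
The SNP-index filter and sliding-window averaging described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the names `Snp`, `keep_position` and `sliding_window` are hypothetical, the filter is read as "both bulks need a depth of at least 7, and at least one bulk needs a SNP-index of at least 0.3", and the sign convention for Δ(SNP-index) (bulk B minus bulk A) is chosen arbitrarily.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Snp:
    pos: int      # position on the chromosome (bp)
    alt_a: int    # bulk A reads carrying the non-reference base
    depth_a: int  # bulk A read depth at this position
    alt_b: int    # bulk B reads carrying the non-reference base
    depth_b: int  # bulk B read depth at this position


def snp_index(alt: int, depth: int) -> float:
    """Fraction of reads differing from the reference (0 if all reads match)."""
    return alt / depth if depth > 0 else 0.0


def keep_position(s: Snp, min_index: float = 0.3, min_depth: int = 7) -> bool:
    """Assumed reading of the filter: enough depth in both bulks, and a
    SNP-index >= min_index in at least one bulk."""
    if s.depth_a < min_depth or s.depth_b < min_depth:
        return False
    best = max(snp_index(s.alt_a, s.depth_a), snp_index(s.alt_b, s.depth_b))
    return best >= min_index


def sliding_window(snps: List[Snp], chrom_len: int, window: int = 2_000_000,
                   step: int = 10_000, min_snps: int = 10
                   ) -> List[Tuple[int, float, float, float]]:
    """Return (window start, mean index A, mean index B, delta) per window."""
    kept = sorted((s for s in snps if keep_position(s)), key=lambda s: s.pos)
    positions = [s.pos for s in kept]
    rows = []
    for start in range(0, max(chrom_len - window, 0) + 1, step):
        lo = bisect_left(positions, start)
        hi = bisect_left(positions, start + window)
        if hi - lo < min_snps:
            continue  # too few SNPs: skip this interval in the plot
        in_win = kept[lo:hi]
        mean_a = sum(snp_index(s.alt_a, s.depth_a) for s in in_win) / len(in_win)
        mean_b = sum(snp_index(s.alt_b, s.depth_b) for s in in_win) / len(in_win)
        rows.append((start, mean_a, mean_b, mean_b - mean_a))
    return rows
```

The per-window means correspond to m in the text; plotting them against the window start gives the SNP-index and Δ(SNP-index) curves for one chromosome.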

To generate confidence intervals of the SNP-index value under the null hypothesis of no QTL, we carried out computer simulation. We first made two bulks of progeny with a given number of individuals by random sampling. From each bulk, a given number of alleles corresponding to the read depth were sampled. We then calculated SNP-index for each bulk, and derived Δ(SNP-index). This process was repeated 10 000 times for each read depth and confidence intervals were generated (Figure S6). These intervals were plotted for all the genomic regions that have variable read depths.

The SNP genotyping of RILs

We used the Illumina GA IIx sequencer to obtain the Nortai genome sequence, which was compared with the Hitomebore whole-genome sequence to identify SNPs via a filter pipeline. The identified SNPs were applied to the Illumina Assay Design Tool to design the Oligo Pool Assay (OPA) for the GoldenGate Genotyping Assay (Illumina). The DNA was extracted from 50 mg of fresh rice leaves using the DNeasy 96 Plant Kit (Qiagen) and was quantified using the Quant-iT PicoGreen dsDNA Reagent and Kits (Invitrogen). The designed OPA and 250 ng of DNA were used for the preparation of bead chips according to the protocol for the GoldenGate Genotyping Assay. The bead chips were scanned by iSCAN and the data were analyzed by GenomeStudio (Illumina).

Computer simulation

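The null-hypothesis resampling used for the confidence intervals of Δ(SNP-index) (two random bulks, alleles sampled with replacement to the read depth, many replicates) can be sketched as follows. This is a simplified single-SNP illustration, not the authors' simulation: it ignores linkage, recombination and the window average m, treats RILs as fully homozygous, and the names `simulate_null_delta` and `confidence_interval` are hypothetical.

```python
import random
from typing import List, Tuple


def simulate_null_delta(bulk_size: int, depth: int, generation: str = "F2",
                        reps: int = 10_000, seed: int = 0) -> List[float]:
    """Delta(SNP-index) values at one unlinked SNP under the no-QTL null."""
    rng = random.Random(seed)

    def bulk_allele_freq() -> float:
        alt = 0  # count of non-reference alleles among the bulked individuals
        for _ in range(bulk_size):
            if generation == "F2":
                # each of the two alleles comes from either parent with p = 0.5
                alt += (rng.random() < 0.5) + (rng.random() < 0.5)
            else:
                # simplification: an inbred line is homozygous for one parent
                alt += 2 if rng.random() < 0.5 else 0
        return alt / (2 * bulk_size)

    def sampled_index(freq: float) -> float:
        # draw `depth` alleles from the bulk pool with replacement
        return sum(rng.random() < freq for _ in range(depth)) / depth

    return [sampled_index(bulk_allele_freq()) - sampled_index(bulk_allele_freq())
            for _ in range(reps)]


def confidence_interval(deltas: List[float], level: float = 0.95
                        ) -> Tuple[float, float]:
    """Empirical two-sided interval from the simulated null distribution."""
    ordered = sorted(deltas)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    low = ordered[int(tail * n)]
    high = ordered[min(n - 1, int((1.0 - tail) * n))]
    return low, high


deltas = simulate_null_delta(bulk_size=20, depth=20, reps=2_000, seed=1)
print(confidence_interval(deltas, level=0.95))
```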
In order to obtain the null distribution of m, we simulated the RIL construction process according to the single seed descent (SSD) method. We set the genomic parameters to be roughly consistent with rice. That is, the genome size is set to about 360 Mb and the recombination rate to 4 cM Mb⁻¹. We postulated that 150 000 SNPs between the two parents are distributed at equal intervals (i.e. a SNP every 2.4 kb). The number of individuals in a progeny is assumed to be N = 200. The breeding process was continued to the F7 generation, and the QTL-seq process was applied to the F2 and F7 generations independently. In the QTL-seq process it is assumed that from among all progeny individuals of the F2 or F7 generation we select p% each of progeny with opposite extreme trait values to make 'Highest' and 'Lowest' bulks. Each bulk is sequenced to depth n, so that the SNP data we obtain will be a random set of n alleles from Np individuals, where replacement is allowed. We simulated 10 000 replications of this process, from which the null distributions of m for the F2 and F7 generations were obtained. We were also interested in the distribution of m in the region encompassing the SNP that is responsible for the focal phenotype. For this purpose, we modified the simulation such that a QTL is placed in the simulated genome. At the QTL, there were two alleles, and the genetic contribution of this QTL relative to the total phenotypic variation was given by Qp. Although this model includes only one QTL, it does not mean that there is only one QTL in the genome. The remaining contribution with proportion 1 – Qp represents all factors including the genetic contributions of other multiple QTLs and environmental variables. With 10 000 replications of the simulations under this simple model, we obtained the distribution of m under various parameter sets, from which the power to detect QTLs was computed as the proportion of replications with m outside the 99% cutoff values (see above).

ACKNOWLEDGEMENTS

This study was supported by the Program for Promotion of Basic Research Activities for Innovative Biosciences (PROBRAIN), the Ministry of Agriculture, Forestry, and Fisheries of Japan (Genomics for Agricultural Innovation PMI-0010) and Grant-in-aid for Scientific Research from the Ministry of Education, Cultures, Sports and Technology, Japan to HK and RT (Grant-in-Aid for Scientific Research on Innovative Areas 23113009) and JSPS KAKENHI to RT (Grant No. 24248004). We thank Shigeru Kuroda for general support of the work.

SUPPORTING INFORMATION

Additional Supporting Information may be found in the online version of this article.

Figure S1. Frequency distributions of partial resistance levels of the 241 recombinant inbred lines derived from a cross between Nortai and Hitomebore over four independent trials carried out in 2006 and 2011.

Figure S2. Single nucleotide polymorphism (SNP)-index and Δ(SNP-index) plots for 12 chromosomes of rice bulked DNA.

Figure S3. QTL-seq applied to recombinant inbred lines derived from a cross between the cultivar 'Iwate96' and 'Hitomebore' segregating in grain amylose content.

Figure S4. QTL-seq applied to recombinant inbred lines derived from a cross between the cultivar 'Arroz da Terra' and 'Iwatekko' segregating in germination rate at low temperature condition.

Figure S5. Flow chart of QTL-seq analysis.

Figure S6. Simulation test for deriving confidence intervals of Δ(SNP-index) under the null hypothesis (no quantitative trait locus).

Table S1. Summary of Illumina GAIIx sequencing for Nortai × Hitomebore recombinant inbred lines and Hitomebore × Dunghan Shali F2.

Table S2. Number of single nucleotide polymorphisms detected between Hitomebore and 13 rice cultivars.

REFERENCES

Abe, A., Kosugi, S., Yoshida, K. et al. (2012a) Genome sequencing reveals agronomically-important loci in rice from mutant populations. Nat. Biotechnol. 30, 174–178.

Abe, A., Takagi, H., Fujibe, T., Aya, K., Kojima, M., Sakakibara, H., Uemura, A., Matsuoka, M. and Terauchi, R. (2012b) OsGA20ox1, a candidate gene for a major QTL controlling seedling vigor in rice. Theor. Appl. Genet. 125, 647–657.

Asaga, K. (1981) A procedure for evaluating field resistance to blast in rice varieties. J. Cent. Agric. Exp. Stn. 35, 51–138.

Ashikari, M. and Matsuoka, M. (2006) Identification, isolation and pyramiding of quantitative trait loci for rice breeding. Trends Plant Sci. 11, 344–350.

Austin, R.S., Vidaurre, D., Stamatiou, G. et al. (2011) Next-generation mapping of Arabidopsis genes. Plant J. 67, 715–725.

Basten, C.J., Weir, B.S. and Zeng, Z.B. (2005) QTL Cartographer, Version 1.17: A Reference Manual and Tutorial for QTL Mapping. Raleigh: North Carolina State University.

Darvasi, A. and Soller, M. (1994) Selective DNA pooling for determination of linkage between a molecular marker and a quantitative trait locus. Genetics, 138, 1365–1373.

David, T., Christian, B., Jason, H. and Belinda, L.B. (2011) Global food demand and the sustainable intensification of agriculture. Proc. Natl Acad. Sci. USA, 108, 20260–20264.

Ehrenreich, M.I., Torabi, N., Jia, Y., Kent, J., Martis, S., Shapiro, A.J., Gresham, D., Caudy, A.A. and Kruglyak, L. (2010) Dissection of genetically complex traits with extremely large pools of yeast segregants. Nature, 446, 1039–1042.

Falconer, D.S. and Mackay, T.F.C. (1996) Introduction to Quantitative Genetics, 4th edn. London: Prentice Hall.

Fujino, K., Sekiguchi, H., Matsuda, Y., Sugimoto, K., Ono, K. and Yano, M. (2008) Molecular identification of a major quantitative trait locus, qLTG3–1, controlling low-temperature germinability in rice. Proc. Natl Acad. Sci. USA, 105, 12623–12628.

Giovannoni, J.J., Wing, R.A., Ganal, M.W. and Tanksley, S.D. (1991) Isolation of molecular markers from specific chromosome intervals using DNA pools from existing mapping populations. Nucleic Acids Res. 19, 6553–6558.

Godfray, H.C.J. (2010) Food security: The challenge of feeding 9 billion people. Science, 327, 812–818.

Li, H. and Durbin, R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754–1760.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G. and Durbin, R.; 1000 Genome Project Data Processing Subgroup (2009) The sequence alignment/map format and samtools. Bioinformatics, 25, 2078–2079.

Magwene, P.M., Willis, J.H. and Kelly, J.K. (2011) The statistics of bulk segregant analysis using next generation sequencing. PLoS Comput. Biol. 7, e1002255.

Mansur, L.M., Orf, J. and Lark, K.G. (1993) Determining the linkage of quantitative trait loci to RFLP markers using extreme phenotypes of recombinant inbreds of soybean (Glycine max L. Merr.). Theor. Appl. Genet. 86, 914–918.

Michelmore, R.W., Paran, I. and Kesseli, R.V. (1991) Identification of markers linked to disease resistance genes by bulked segregant analysis: A rapid method to detect markers in specific genomic regions by using segregating populations. Proc. Natl Acad. Sci. USA, 88, 9828–9832.

Miura, K., Lin, S.Y., Yano, M. and Nagamine, T. (2001) Mapping Quantitative Trait Loci Controlling Low Temperature Germinability in Rice (Oryza sativa L.). Breed. Sci. 51, 293–299.

Mokry, M., Nijman, I.J., van Dijken, A., Benjamins, R., Heidstra, R., Scheres, B. and Cuppen, E. (2011) Identification of factors required for meristem function in Arabidopsis using a novel next generation sequencing fast forward genetics approach. BMC Genomics, 12, 256.

Parts, L., Cubillos, A.F., Warringer, J. et al. (2011) Revealing the genetic structure of a trait by sequencing a population under selection. Genome Res. 21, 1131–1138.

Schneeberger, K., Ossowski, S., Lanz, C., Juul, T., Petersen, A.H., Nielsen, K.L., Jorgensen, J.-E., Weigel, D. and Andersen, S.U. (2009) SHOREmap: simultaneous mapping and mutation identification by deep sequencing. Nat. Methods, 6, 550–551.

Swinnen, S., Schaerlaekens, K., Pais, T. et al. (2012) Identification of novel causative genes determining the complex trait of high ethanol tolerance in yeast using pooled-segregant whole-genome sequence analysis. Genome Res. 22, 975–984.

Trick, M., Adamski, N.M., Mugford, S.G., Jiang, C.-C., Febrer, M. and Uauy, C. (2012) Combining SNP discovery from next-generation sequencing data with bulked segregant analysis (BSA) to fine-map genes in polyploidy wheat. BMC Plant Biol. 12, 14.

Wenger, W.J., Schwartz, K. and Sherlock, G. (2010) Bulk segregant analysis by high-throughput sequencing reveals a novel xylose utilization gene from Saccharomyces cerevisiae. PLoS Genet. 6, e1000942.

Williams, J.G.K., Kubelik, A.R., Livak, K.J., Rafalski, J.A. and Tingey, S.V. (1990) DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18, 6531–6535.

Yano, M. (2001) Genetic and molecular dissection of naturally occurring variation. Curr. Opin. Plant Biol. 4, 130–135.

Yano, K., Takashi, T., Nagamatsu, S., Kojima, M., Sakakibara, H., Kitano, H., Matsuoka, M. and Aya, K. (2012) Efficacy of microarray profiling data combined with QTL mapping for the identification of a QTL gene controlling the initial growth rate in rice. Plant Cell Physiol. 53, 729–739.
