Transcriptome profiling reveals mosaic genomic origins of modern cultivated barley

Fei Dai a, Zhong-Hua Chen a,b, Xiaolei Wang a, Zefeng Li a, Gulei Jin a, Dezhi Wu a, Shengguan Cai a, Ning Wang c, Feibo Wu a, Eviatar Nevo d,1, and Guoping Zhang a,1

a Department of Agronomy, Zhejiang Key Laboratory of Crop Germplasm, Zhejiang University, Hangzhou 310058, China;

b School of Science and Health, University of Western Sydney, Richmond, NSW 2753, Australia;

c Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan; and

d Institute of Evolution, University of Haifa, Mount Carmel, Haifa 31905, Israel

Contributed by Eviatar Nevo, July 29, 2014 (sent for review March 2, 2014)

The domestication of cultivated barley has been used as a model system for studying the origins and early spread of agrarian culture. Our previous results indicated that the Tibetan Plateau and its vicinity is one of the centers of domestication of cultivated barley. Here we reveal multiple origins of domesticated barley using transcriptome profiling of cultivated and wild-barley genotypes. Approximately 48 Gb of clean transcript sequences in 12 Hordeum spontaneum and 9 Hordeum vulgare accessions were generated. We reported 12,530 de novo assembled transcripts in all of the 21 samples. Population structure analysis showed that Tibetan hulless barley (qingke) might have existed in the early stage of domestication. Based on the large number of unique genomic regions showing the similarity between cultivated and wild-barley groups, we propose that the genomic origin of modern cultivated barley is derived from wild-barley genotypes in the Fertile Crescent (mainly in chromosomes 1H, 2H, and 3H) and Tibet (mainly in chromosomes 4H, 5H, 6H, and 7H). This study indicates that the domestication of barley may have occurred over time in geographically distinct regions.

evolution | genetic diversity | genomic similarity | RNA-Seq | single nucleotide variants

Domestication of crops is the outcome of complex independent or combined processes of artificial and natural selection that lead to plants adapted to cultivation and human consumption (1, 2). Wild barley (Hordeum spontaneum L.), the progenitor of cultivated barley (H. vulgare L.), is one of the founder crops of the Old World for Neolithic food production (3), and it harbors a myriad of mutations favorable for its adaptation to harsh environments. Hence, wild barley could provide natural sources of genetic diversity for plant abiotic and biotic stress tolerance (3, 4). Understanding the domestication process of cultivated barley will therefore be helpful for exploiting elite genetic resources in wild barley and breaking the current bottleneck in modern barley breeding caused by narrower genetic diversity (1, 4).

The Near East Fertile Crescent is commonly recognized as a major evolutionary center of wild barley and of the domestication of cultivated forms (3–5). Many reports indicate that cultivated barley was first domesticated about 10,000 y ago in the Near East Fertile Crescent (3, 5). However, whether barley has a monophyletic or polyphyletic origin remains contentious (1, 5–9). Morrell and Clegg, based on the resequencing data from 18 loci, proposed that barley has been domesticated at least twice, once in the Near East Fertile Crescent and then 1,500–3,000 km farther east (7). Genotyping of chloroplast microsatellite markers has also suggested that barley has been domesticated more than once, on each occasion in a different geographical region (10). Unlike wheat and other founder crops, wild barley is naturally distributed over a wide area, from the Near East to Central Asia and the Tibetan Plateau (5, 11, 12). There is increasing evidence to support the theory that cultivated barley is of polyphyletic origin (7, 9, 13, 14).

Barley is an annual diploid grass species with a large haploid genome of 5.1 Gb and a high abundance of repetitive elements (15). RNA sequencing (RNA-Seq) is a high-throughput technology for transcriptome profiling that uses deep-sequencing protocols for rapid characterization of transcript sequences and gene expression (16). This technique is effective for detecting not only differentially expressed genes, but also sequence variants and new transcripts (17). Recent advances in the characterization and quantification of transcriptomes with RNA-Seq have been made in rice (18), maize (19), and barley (20). In view of the large genome of barley and the wide genetic diversity of wild barley (4, 14, 21), RNA-Seq has become a very effective and powerful technology for generating comprehensive transcriptome profiles (22).

Our previous studies have shown significant genetic differences between wild barley from the Near East and Tibet (14). However, no research has been conducted using comparative transcriptomics to distinguish between cultivated barley and different wild-barley populations. We hypothesized that the genome segments of cultivated barley should show certain similarity with its ancestral wild barley, and using RNA-Seq we should be able to determine the genomic origin of cultivated barley. Specifically, we selected some representative wild-barley genotypes from the Near East and Tibet, and representative worldwide selections of cultivated barley genotypes, conducted their transcriptome profiling, and investigated the genomic origin of modern cultivated barley.

Author contributions: F.D., E.N., and G.Z. designed research; F.D., X.W., G.J., D.W., S.C., and F.W. performed research; F.D., Z.-H.C., X.W., Z.L., G.J., D.W., and N.W. analyzed data; and F.D., Z.-H.C., E.N., and G.Z. wrote the paper.

The authors declare no conflict of interest.

Freely available online through the PNAS open access option.

Data deposition: The sequences reported in this paper have been deposited in the National Center for Biotechnology Information Sequence Read Archive, www.ncbi.nlm.nih.gov (accession nos. SAMN02483491–SAMN02483511).

1 To whom correspondence may be addressed. Email: nevo@research.haifa.ac.il or zhanggp@[…].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1414335111/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1414335111 | PNAS Early Edition

Results

RNA-Seq Performance and Data Analysis. We performed an RNA-Seq analysis on the samples collected at the seedling stage of 12 wild (Table S1) and 9 cultivated barley genotypes (Table S2) using the Illumina HiSeq 2000 platform. Overall, paired-end sequencing (100 bp) of the transcriptome yielded over 635 million raw reads (59.16 Gb) for the 21 libraries (Table S3). After removing all of the adaptor sequences, empty reads, and low-quality reads, there were around 534 million clean reads remaining (Table S3), with a mean of 2.63 and 1.82 Gb for wild and cultivated barley, respectively. Approximately 81.9–85.5% of the clean reads were mapped to the current whole-genome shotgun (WGS) contigs of cv. Morex (15), yielding 120,837 transcripts with a mean length of 2,336 bp (Table S3). In total, 61,611 genes were identified for both wild and cultivated barley, in which 16,993 and 9,443 transcripts were unique for wild and cultivated barley, respectively.

The remaining reads were aligned to the barley full-length cDNA (fl-cDNA) (23), resulting in 3,861 transcripts and 3,276 genes in the 21 samples. To discover previously unrecognized transcripts, unmapped RNA-Seq reads were de novo assembled, resulting in the assembly of 12,530 de novo transcripts (Fig. S1A). Around 74.0% of the de novo assembled transcripts were found in all of the samples. However, only 18.6% of these transcripts were detected in wild-barley accessions from the Near East and Tibet (Fig. S1A). These results indicated that wild barley has a much larger transcript diversity than cultivated barley.

Discovery of Exon Single Nucleotide Variants and Indels and Their Mapping to the Synthetic Assembly of the Barley Genome. Single nucleotide variant (SNV) and indel detection in exons was performed using the dataset referring to the current WGS contigs of cv. Morex and its assembly. After aligning the reads in each sample against the WGS contigs of cv. Morex, we identified a total of 247,611 SNVs and 9,084 indels in 31,084 genes, including 191,534 SNVs and 6,744 indels in 20,682 genes mapped to seven barley chromosomes (Table S4). A large number of SNVs were detected in wild barley from the Near East, ranging from 46,538 to 58,502, and in wild barley from Tibet, ranging from 38,704 to 52,007 (Table S3). In contrast, there were on average only 27,968 SNVs in cultivated barley, indicating that the gene pool of cultivated barley is dramatically smaller than that of wild barley.

Moreover, we built a synthetic assembly of the barley genome based on the WGS contig assemblies of cv. Morex, Barke, and Bowman. The 1.7-Gb (1,334,625 contigs) synthetic assembly of the barley genome (Dataset S1), which includes approximately one-third of the barley genome, has provided a detailed insight into the physical distribution of genes. By using the five accessions (B1K-04-12, Bowman, Barke, Igri, and Haruna Nijo) reported by the International Barley Genome Sequencing Consortium (15), we were able to analyze the genomic contribution of different wild-barley populations to the modern cultivars.

In total, 111,121 SNVs with no missing data were detected in all of the 26 samples (Table S4), including 4,099 SNVs showing multiple variations in a single site. Because sites with missing data or multiple variations could lead to unreliable inferences for any sample, we used only the remaining 107,022 SNVs with no missing data in the 26 samples to assess the unique SNVs in cultivated barley (Fig. S1B) and in the two main wild-barley populations. Surprisingly, there were only 16,977 SNVs detected in all of the cultivated and wild barley in this study (Fig. S1B). The numbers of unique SNVs for wild barley from the Near East and Tibet were 37,998 and 14,724, respectively. However, the 13 cultivated barley genotypes harbored only 12,163 unique SNVs (Fig. S1B).
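The filtering step described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the data layout (a reference base plus one allele call per sample), the function names, and the toy genotypes are all assumptions made for the example. A site is kept only when every sample has a call and exactly one non-reference variant is present, and retained sites are then tallied by the combination of groups that carry the variant, which is the kind of count a Venn diagram such as Fig. S1B summarizes.

```python
from typing import Dict, FrozenSet, List, Optional, Sequence, Tuple

# One site = (reference base, one allele call per sample; None marks a missing call).
# This layout is an assumption for illustration only.
Site = Tuple[str, List[Optional[str]]]


def keep_site(ref: str, calls: Sequence[Optional[str]]) -> bool:
    """Keep a site only if every sample has a call and exactly one
    non-reference allele is present (no missing data, no multi-variant site)."""
    if any(c is None for c in calls):
        return False
    alternates = set(calls) - {ref}
    return len(alternates) == 1


def groups_carrying_variant(ref: str, calls: Sequence[Optional[str]],
                            groups: Sequence[str]) -> FrozenSet[str]:
    """Return the set of groups in which at least one sample differs from ref."""
    return frozenset(g for c, g in zip(calls, groups) if c != ref)


def venn_counts(sites: Sequence[Site],
                groups: Sequence[str]) -> Dict[FrozenSet[str], int]:
    """Count retained SNVs by the combination of groups that carry them."""
    counts: Dict[FrozenSet[str], int] = {}
    for ref, calls in sites:
        if keep_site(ref, calls):
            key = groups_carrying_variant(ref, calls, groups)
            counts[key] = counts.get(key, 0) + 1
    return counts


# Toy data (hypothetical): four samples in three groups.
GROUPS = ["Cultivar", "Cultivar", "Wb-NE", "Wb-T"]
SITES: List[Site] = [
    ("A", ["G", "A", "A", "A"]),   # variant seen only in a cultivar
    ("C", ["C", "C", "T", "T"]),   # variant shared by both wild groups
    ("G", ["G", None, "A", "A"]),  # missing call: dropped
    ("T", ["A", "C", "T", "T"]),   # two different variants: dropped
    ("A", ["G", "G", "G", "G"]),   # variant present in every group
]
COUNTS = venn_counts(SITES, GROUPS)
```

In this toy input, three of the five sites survive the filter, and each falls into a different group combination.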

Furthermore, we created a dataset containing 92,776 SNVs with no missing data, which were anchored to the synthetic assembly of the barley genome in all of the 26 genotypes, for the population structure analysis and genomic similarity test. All of these SNVs were randomly distributed on the seven chromosomes, with a mean of 13,254 SNVs or 1,678 genes in each chromosome (Table S4).

Population Structure and Grouping of Cultivated and Wild Barley. We constructed a neighbor-joining tree and conducted a population structure analysis and a principal component analysis (PCA) (Fig. 1) based on the dataset of 92,776 SNVs with no missing data (Table S4). Three major groups could be clearly observed from the neighbor-joining tree: the cultivars except Tibetan hulless barley (qingke), the wild barley from the Near East, and the wild barley and qingke from the Tibetan Plateau (Fig. 1A). According to the first and second eigenvectors of the PCA, all of the samples could be divided into cultivar and wild-barley groups, with the exception of qingke, which demonstrated a very close link to the Tibetan wild barley (Fig. 1B). Again, the results showed that the cultivars had much less genetic diversity than the wild genotypes.

Furthermore, we performed a population structure analysis to estimate individual ancestry and admixture proportions, assuming the existence of certain populations. By analyzing the number of clusters (K), we found a clear evolutionary divergence between cultivated and wild barley from the Near East and Tibet with K = 3 (Fig. 1C). Qingke was clearly attributed to the population of Tibetan wild barley. When K = 4, the cultivars from East Asia fell into a new subgroup within the cultivated barley population (Fig. 1C). With K values of 5 and 6, some Israel and Tibet wild barley could be separated, respectively. Based on the results of population structure inference, we divided all of the wild-barley accessions into three groups: the wild barley from Tibet (Wb-T, including XZ2, XZ12, XZ15, XZ21, XZ174, and XZ181), the wild barley from the Near East group 1 (Wb-NE1, including ECI-6-0, Tabigha-T-0, and Tabigha-B-63), and the wild barley from the Near East group 2 (Wb-NE2, including ECI-2-0, Iran-6-26, Turkey-19-24, and B1K-04-12). All of the cultivated barley, except qingke, were classified into one group (Cultivar) to minimize the effect of genetic drift; this group may be considered representative of modern cultivated barley worldwide.

Genomic Origin of Modern Cultivated Barley. We used the same dataset containing 92,776 SNVs with no missing data to analyze genomic similarity between modern cultivated barley (Cultivar) and Wb-T, Wb-NE1, and Wb-NE2, respectively. The average heterozygosity of Cultivar, Wb-NE1, Wb-NE2, and Wb-T was 0.135, 0.091, 0.198, and 0.161, respectively. To avoid the occurrence of multiple variants at one site, we combined each group of barley accessions as a gene pool according to the methods described by Rubin et al. (24). We used 300-kb windows sliding with a 150-kb overlap to detect the potential similarity of genomic regions, which gave the best coverage along the synthetic assembly of the barley genome between Cultivar and the three wild-barley groups. As a result, 600 windows with high similarity, containing a total of 21,880 SNVs, met the selection criteria; these windows account for ~8.7% of the synthetic assembly of the barley genome. Among these windows, only 51 similar windows were detected between Cultivar and Wb-NE1. However, the number increased dramatically to 289 and 384 between Cultivar and Wb-NE2, and Wb-T, respectively. Meanwhile, a tight relationship between Wb-NE2 and Wb-T was found based on a total of 298 similar windows (Fig. S2). However, the numbers of similar windows between Wb-NE1 and Wb-NE2 and between Wb-NE1 and Wb-T were only 101 and 31, respectively. These results indicate that the domestication process might have happened in a certain population of wild barley from the Fertile Crescent.


Moreover, the Circos diagrams demonstrated a tight genomic relationship between cultivated barley and wild-barley groups (Fig. 2A). There were only 20 unique genomic regions between Cultivar and Wb-NE1 (Table S5). In contrast, 212 and 300 unique genomic regions were detected between Cultivar and Wb-NE2 and between Cultivar and Wb-T, respectively, showing a mosaic distribution in each of the chromosomes (Fig. 2A). Meanwhile, the length of unique genomic regions was summarized and the genomic similarity rate of Cultivar to each wild group was calculated based on the length of unique genomic regions (Fig. 2B). High genomic similarity rates between Cultivar and Wb-NE2, at 59.5%, 51.9%, and 49.6%, were found in chromosomes 1H, 2H, and 3H, respectively. In contrast, high genomic similarity rates between Cultivar and Wb-T, at 55.1%, 65.3%, 49.1%, and 55.2%, were detected only in chromosomes 4H, 5H, 6H, and 7H, respectively.
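Because the 300-kb windows overlap by 150 kb, summing window lengths directly would count shared stretches twice. The sketch below shows one way to obtain a total length "excluding the overlap of slide windows" by merging overlapping intervals first. The function names are hypothetical, and the denominator of the similarity rate is passed in as a parameter because the text does not state which total length the authors divided by.

```python
from typing import Iterable, Tuple

Window = Tuple[int, int]  # (start, end) in bp, end exclusive


def merged_length(windows: Iterable[Window]) -> int:
    """Total length covered by the windows, counting overlaps only once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(windows):
        if cur_end is None or start > cur_end:
            # The current run is finished: add it and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching window: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def similarity_rate(windows: Iterable[Window], denominator_bp: int) -> float:
    """Percentage of denominator_bp covered by the merged windows.
    The choice of denominator is an assumption left to the caller."""
    if denominator_bp <= 0:
        raise ValueError("denominator_bp must be positive")
    return 100.0 * merged_length(windows) / denominator_bp


# Two overlapping windows and one isolated window (toy coordinates).
EXAMPLE = [(0, 300_000), (150_000, 450_000), (900_000, 1_200_000)]
COVERED = merged_length(EXAMPLE)
```

Here the first two windows merge into a single 450-kb run, so the covered length is 750 kb rather than the 900 kb obtained by adding the three window lengths.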

Fig. 1. Phylogenetic tree, PCA, and population structure analysis of wild and cultivated barley. The analyses were based on 92,776 SNVs randomly distributed in the seven barley chromosomes. (A) Phylogenetic tree was constructed using the neighbor-joining method with 1,000 bootstraps. (B) PCA1, the first principal component; PCA2, the second principal component; green circles indicate barley cultivars (except qingke), which are shown as a green branch in A. (C) Each color denotes one population in the population structure analysis. Each vertical bar represents one accession in which the percentages of contribution from the ancestral populations are indicated by the lengths of colored segments. The number of clusters (K) was set from 2 to 6.


To validate the above results, we also conducted a case study on five additional cultivars (Cultivar-additional: cv. Borwina, Kindred, Vogelsanger Gold, Steptoe, and Harrington) with exome capture data (25). After aligning the reads in each sample against the WGS contigs of cv. Morex, and matching with our existing dataset of 92,776 SNVs, 54,290 SNVs were detected with no missing data in all of the five additional cultivars. Cluster analysis, using the 54,290 SNVs, produced a neighbor-joining tree (Fig. S3A) similar to Fig. 1A. Therefore, we conducted a genomic similarity analysis between Cultivar-additional and each wild-barley group. A total of 583 windows with high similarity were detected, where 277 and 392 similar windows were found between Cultivar-additional and Wb-NE2, and Wb-T, respectively. Consistent with Fig. 2A, the unique genomic regions in the Circos diagrams also showed a mosaic distribution in each chromosome (Fig. S3B).

Discussion

Genome sequencing has provided many new insights into the domestication of barley. However, because of the large genome of barley (14), RNA-Seq has quickly become an alternative and effective method to study the domestication of barley. We have generated comprehensive transcriptome profiles for detecting the exonic genetic diversity of wild and cultivated barley (Tables S1–S3). The results show that only 42.2% of the SNVs were detected in 13 representative cultivated barley accessions from Tibet, East Asia, and Europe (Fig. S1B), including 11.3% of SNVs unique to the cultivated barley; some of them might be false-positives caused by missing data of wild barley. Hence, we suggest that the gene pool of the cultivated barley is derived from wild barley from both the Near East and Tibet. This suggestion is strongly supported by the fact that 8,177 (7.6%) and 7,913 (7.4%) of the SNVs detected in cultivated barley originated from the wild barley from the Near East and Tibet, respectively (Fig. S1B). In contrast, we identified 57.8% of SNVs as unique to the 13 wild-barley accessions (Fig. S1B), indicating that more than half of the genetic diversity in exons has been lost during barley domestication or modern breeding. This finding also indicates that wild barley has a much larger gene pool than cultivated barley and contains more potentially useful genetic resources for barley improvement (4, 14, 21).

Moreover, some genetic studies have demonstrated that wild barley consists of several genetically differentiated populations (3, 4). Our results showed that the wild-barley accessions from Iran and Turkey were closely related to the Tibetan wild barley, whereas some Israeli wild-barley accessions showed large genetic and evolutionary distances from the rest of the wild barley (Fig. 1 A and B). Based on the results of population structure, we divided all of the wild-barley accessions into three groups: Wb-T accessions from the Tibetan Plateau and its vicinity (11, 12); Wb-NE1 accessions from Israel (26–28); and four Wb-NE2 accessions: B1K-04-12 (21) and ECI-2-0 (27) from Israel, Iran-6-26 from Gawdar, Iran (29), and Turkey-19-24 from Bitlis, Turkey (30) (Table S1). The results suggested that the wild barley from the Near East has wide genetic diversity (21, 26–30) and could be divided into two or more subgroups. They are also consistent with the report that incipient sympatric speciation might begin to occur between the two opposite slopes of “Evolution Canyon” at Mount Carmel, Israel, despite their geographical proximity (31).

Domestication of crops and expansion of agriculture have fundamentally reshaped the course of human history and that of many other living organisms (1, 32–34). The introduction and establishment of barley cultivation on the Plateau provided a staple food for the livelihood of pioneer Tibetan settlers and for the subsequent rapid population growth on the Tibetan Plateau (35). Hulless barley (qingke) has been inferred to have a single origin of domestication according to the nud gene (36). Our results show that qingke is different from modern cultivated barley at the genomic level (Fig. 1 A and B), but has a close genetic relationship with wild barley, especially the Tibetan wild barley (Fig. 1B). Thus, we propose that qingke probably existed in an early stage of domestication. However, the genomic similarity between modern cultivated barley and qingke cannot be assessed efficiently using the current data because of limited samples. It is also difficult to exclude the possibility of gene flow between altitude-adapted Tibetan wild barley and introduced domesticated barley, which might have allowed early Tibetan farmers to develop varieties that were well adapted to local climatic conditions on the Plateau (37). Thus, qingke was not included in the genomic similarity analysis.

It has been reported that about half of the sequenced loci from wild barley exhibit significant differentiation between the eastern and western portions of the species range (7, 38). Based on the released WGS contig assemblies of cv. Morex, Barke, and Bowman (15), and the synthetic assembly of the barley genome (Dataset S1), we were able to estimate the genomic contribution of different wild-barley populations to the representatives of modern cultivars (Tables S1 and S2). Interestingly, the origin of the cultivated barley genome showed much higher portions from chromosomes 1H–3H in Wb-NE2, and from chromosomes 4H–7H in Wb-T, respectively, according to the length of unique windows between Cultivar and wild-barley groups (Fig. 2). This finding was not a coincidence, as supported by the results obtained in many recent studies. Some genes related to spike morphology and flowering time, the main traits of early domestication, have been cloned from chromosome 2H: genes such as Ppd-H1 (39), vrs1 (40), HvAP2 (41, 42), and HvCEN (43). On the other hand, VRN-H3, another early domesticated gene conferring a spring growth habit (44), has been cloned from chromosome 7H, which coincides with the spring type of Tibetan wild barley (12). Most importantly, tough (nonbrittle) rachis, one of the earliest domesticated traits, encoded by two closely linked complementary genes, has been mapped to chromosome 3H, supporting the concept of polyphyletic domestication of cultivated barley (6, 9). Similarly, our data show that the difference in genomic contribution between the two wild-barley populations was smaller in chromosome 3H than in other chromosomes (Fig. 2).

Accordingly, it may be concluded that the genome of modern cultivated barley originated from two major wild-barley populations, one from the Near East Fertile Crescent and the other from the Tibetan Plateau, with different contributions on each chromosome. Our case study on five cultivated barley accessions also verified this finding. Because of the unavoidable gene flow between early domesticated genotypes and their wild ancestors (45, 46), the domestication process of barley is more complex than previously expected. The current data cannot exclude the possibility that the Tibetan wild-barley genome was merged into cultivated barley during a second domestication following the first domestication that occurred in the Fertile Crescent. Hence, further evidence is required through a comprehensive genomic investigation of wild and cultivated barley from the Near East Fertile Crescent and Tibetan Plateau.

Fig. 2. Genomic similarity between cultivated and wild barley. (A) Circos diagram shows the seven chromosomes (1H–7H) of barley in each of the four groups: green, cultivated barley (Cultivar); yellow, wild barley from the Near East group 1 (Wb-NE1); blue, wild barley from the Near East group 2 (Wb-NE2); red, the wild barley from Tibet (Wb-T). The number on each chromosome indicates genomic position on the synthetic assembly of the barley genome (Mb). The similar blocks are connected with lines and each line represents one unique window (300 kb) of the genome with highest similarity between Cultivar and the three wild-barley groups. (B) Genomic similarities (%) between Cultivar and three wild-barley groups based on the total length of unique windows, excluding the overlap of slide windows.

Chromosome   1H     2H     3H     4H     5H     6H     7H     Total
Wb-T         33.8   30.2   36.4   55.1   65.3   49.1   55.2   48.8
Wb-NE1        0.0    4.7    2.5    7.1    1.1    4.1    4.9    3.6
Wb-NE2       59.5   51.9   49.6   29.6   24.4   32.5   26.5   36.1

Materials and Methods

Barley Plants. We used 12 wild (H. spontaneum) (Table S1) and 9 cultivated barley plants (Table S2) to conduct an RNA-Seq analysis. Six wild-barley accessions from the Near East were collected and previously characterized by Nevo et al. (26–28). Six wild-barley genotypes came from the collection assembled by Xu since the 1960s across the extensive area of the Tibetan Plateau (11, 12), and were evaluated by Dai et al. (14, 47) and Qiu et al. (48). Among the cultivated barley genotypes used in this study, Padanggamu, Beiqing 5, and Himala 2 were hulless, also called qingke, and cultivated on the Tibetan Plateau and its vicinity. Three cultivars from East Asia and three cultivars from Europe were reported in our previous work (14). Moreover, for comparative studies, one wild barley (B1K-04-12) and four cultivated barley (Bowman, Barke, Igri, and Haruna Nijo) with WGS sequence were selected as references; these have been investigated by the International Barley Genome Sequencing Consortium (15) (Tables S1 and S2).

Phylogenetic, Population Structure, and PCA Analysis. A dataset of 92,776 SNVs with no missing data anchored to the synthetic assembly of the barley genome was used to conduct phylogenetic analysis, population structure analysis, and PCA. The phylogenetic tree of the 26 accessions was constructed using the MEGA 5.1 program (49) with the neighbor-joining method (1,000 bootstraps). FRAPPE (50) was used to investigate the population structure based on a maximum-likelihood method, with 10,000 iterations, and the number of clusters (K) was set from 2 to 6. Moreover, we performed a PCA using EIGENSOFT (51).
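Neighbor-joining operates on a matrix of pairwise distances between accessions. The text does not say which distance model was selected in MEGA, so the sketch below assumes the simplest one, the p-distance (the proportion of SNV sites at which two accessions differ), purely for illustration. The accession names are taken from the paper, but the allele strings are invented toy data and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of sites at which two equal-length allele strings differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(genotypes: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Symmetric pairwise p-distances, keyed by (accession, accession)."""
    matrix: Dict[Tuple[str, str], float] = {}
    for (n1, s1), (n2, s2) in combinations(sorted(genotypes.items()), 2):
        d = p_distance(s1, s2)
        matrix[(n1, n2)] = d
        matrix[(n2, n1)] = d
    return matrix


# Toy alleles at five SNV sites (invented values, not real genotypes).
TOY = {"XZ2": "AAGTC", "XZ12": "AAGTT", "Igri": "GCGTT"}
DIST = distance_matrix(TOY)
```

In this toy matrix the two XZ accessions are closest to each other, so a neighbor-joining step would pair them first.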

Genomic Similarity Analysis. The same dataset was used for genomic similarity analysis as an addition to the phylogenetic analysis. Each type of SNV with a known site referring to the synthetic assembly of the barley genome was counted for four groups: Cultivar, Wb-T, Wb-NE1, and Wb-NE2. Therefore, barley gene pools for the four groups were constructed according to Rubin et al. (24). Briefly, for all sites with two or more variants, the minority variants were treated as errors rather than using the reference base. If the variants in a certain site had the same frequency, they were selected randomly. Finally, we constructed four (Cultivar, Wb-T, Wb-NE1, and Wb-NE2) barley gene pools, where all of the SNVs were linearly distributed on the seven chromosomes of the synthetic assembly of the barley genome. Moreover, average heterozygosity for the four barley groups was estimated according to Nei (52).
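The gene-pool rule above (keep the most frequent variant at each site, break ties at random) and a per-group heterozygosity estimate can be sketched as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, a seeded random generator is used only to make the tie-break reproducible, and heterozygosity is computed as the gene diversity 1 − Σ p_i² averaged over sites, which is an assumption because the text does not state which of Nei's estimators (for example, with or without a sample-size correction) was applied.

```python
import random
from collections import Counter
from typing import List, Optional, Sequence


def pool_allele(calls: Sequence[str], rng: random.Random) -> str:
    """Most frequent allele at one site; ties are broken at random."""
    counts = Counter(calls)
    top = max(counts.values())
    candidates = sorted(a for a, n in counts.items() if n == top)
    return candidates[0] if len(candidates) == 1 else rng.choice(candidates)


def gene_pool(sites: Sequence[Sequence[str]],
              rng: Optional[random.Random] = None) -> List[str]:
    """One consensus allele per site for a group of accessions."""
    rng = rng or random.Random(0)  # fixed seed: an illustrative choice
    return [pool_allele(calls, rng) for calls in sites]


def gene_diversity(calls: Sequence[str]) -> float:
    """1 - sum of squared allele frequencies at one site (assumed estimator)."""
    n = len(calls)
    return 1.0 - sum((c / n) ** 2 for c in Counter(calls).values())


def mean_gene_diversity(sites: Sequence[Sequence[str]]) -> float:
    """Average gene diversity over all sites of one group."""
    return sum(gene_diversity(s) for s in sites) / len(sites)


# Toy group of four accessions at three sites (invented alleles).
TOY_SITES = [["A", "A", "A", "G"], ["C", "C", "C", "C"], ["T", "T", "G", "G"]]
POOL = gene_pool(TOY_SITES)
```

For the toy group, the first two pool alleles are fixed by majority ("A" and "C"), while the third site is a tie and therefore depends on the random draw.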

To maximize the genomic coverage and maintain high accuracy, we used 300-kb windows sliding with a 150-kb overlap along the synthetic assembly of the barley genome to study the genomic similarity. Large windows usually have poor coverage and small windows do not contain enough SNVs, leading to inaccuracy in the similarity analysis. The number of SNVs was counted for each window, and windows with ≤20 SNVs were removed. We selected the windows with no missing data in all of the three pairs between Cultivar and Wb-T, Wb-NE1, or Wb-NE2 along the synthetic assembly of the barley genome for further analysis. The number of identical SNVs in each window was counted in all of the three pairs. Then, the similarity of each window was calculated in all of the three pairs as the number of identical SNVs divided by the total number of SNVs in the window. Windows with similarity ≥95% were kept. Finally, unique windows with the highest similarity among the three pairs were selected for the visualization of synthetic relationships using Circos (53).
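The window procedure just described can be written out as a short sketch. The thresholds (300-kb windows, 150-kb step, more than 20 SNVs, similarity ≥95%) come from the text; everything else is an assumption made for illustration, including the function names, the input layout (sorted SNV positions with one gene-pool allele per position), the decision to ignore a final partial window, and the reading of "unique windows with the highest similarity" as windows in which exactly one wild group has the strictly highest passing similarity.

```python
from typing import Dict, Sequence

WINDOW = 300_000        # window size in bp (from the text)
STEP = 150_000          # slide step in bp (from the text)
MAX_SPARSE = 20         # windows with this many SNVs or fewer are removed
MIN_SIMILARITY = 0.95   # similarity threshold (from the text)


def window_similarity(positions: Sequence[int], cultivar: Sequence[str],
                      wild: Sequence[str], chrom_len: int) -> Dict[int, float]:
    """Similarity (identical SNVs / total SNVs) per window start position."""
    sims: Dict[int, float] = {}
    last_start = max(chrom_len - WINDOW, 0)
    for start in range(0, last_start + 1, STEP):
        idx = [i for i, p in enumerate(positions) if start <= p < start + WINDOW]
        if len(idx) <= MAX_SPARSE:
            continue  # too few SNVs for a reliable estimate
        identical = sum(1 for i in idx if cultivar[i] == wild[i])
        sims[start] = identical / len(idx)
    return sims


def best_matches(sims_by_group: Dict[str, Dict[int, float]]) -> Dict[int, str]:
    """For each window present in every pair, the single wild group with the
    highest similarity at or above the threshold (ties are skipped)."""
    if not sims_by_group:
        return {}
    shared = set.intersection(*(set(s) for s in sims_by_group.values()))
    best: Dict[int, str] = {}
    for start in sorted(shared):
        passing = {g: s[start] for g, s in sims_by_group.items()
                   if s[start] >= MIN_SIMILARITY}
        if not passing:
            continue
        top = max(passing.values())
        winners = [g for g, v in passing.items() if v == top]
        if len(winners) == 1:
            best[start] = winners[0]
    return best


# Toy chromosome: 30 SNVs, one every 1 kb, all inside the first window.
POSITIONS = [i * 1000 for i in range(30)]
CULTIVAR = ["A"] * 30
WB_T = ["A"] * 29 + ["G"]          # 29 of 30 SNVs identical to Cultivar
WB_NE2 = ["A"] * 15 + ["G"] * 15   # 15 of 30 SNVs identical to Cultivar
SIMS = {
    "Wb-T": window_similarity(POSITIONS, CULTIVAR, WB_T, 450_000),
    "Wb-NE2": window_similarity(POSITIONS, CULTIVAR, WB_NE2, 450_000),
}
ASSIGNED = best_matches(SIMS)
```

In the toy example only the window starting at position 0 has enough SNVs, and it is assigned to Wb-T because that pair alone clears the 95% threshold.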

Additional experimental details on sample and cDNA library preparation, deep sequencing, read mapping and de novo assembly, and SNV or indel calling can be found in SI Materials and Methods.

ACKNOWLEDGMENTS. We thank Hangzhou Guhe Information and Technology Co., Ltd, for help in sequencing and bioinformatics analysis, Profs. A. Beiles and A. Beattie for critical reading of the manuscript, and Prof. D. F. Sun for providing seeds of Tibetan wild barley. This study was supported by the Natural Science Foundation of China (31330055, 31201166, 31301246, and 31171544).

1. Brown TA, Jones MK, Powell W, Allaby RG (2009) The complex origins of domesticated crops in the Fertile Crescent. Trends Ecol Evol 24(2):103–109.

2. Allaby RG, Fuller DQ, Brown TA (2008) The genetic expectations of a protracted model for the origins of domesticated crops. Proc Natl Acad Sci USA 105(37):13982–13986.

3. Zohary D, Hopf M, Weiss E (2012) Domestication of Plants in the Old World: The Origin and Spread of Domesticated Plants in Southwest Asia, Europe, and the Mediterranean Basin (Oxford Univ Press, Oxford).

4. Nevo E (2006) Genome evolution of wild cereal diversity and prospects for crop improvement. Plant Genet Resour 4(1):36–46.

5. Harlan JR, Zohary D (1966) Distribution of wild wheats and barley. Science 153(3740):1074–1080.

6. Zohary D (1999) Monophyletic vs. polyphyletic origin of the crops on which agriculture was founded in the Near East. Genet Resour Crop Evol 46(2):133–142.

7. Morrell PL, Clegg MT (2007) Genetic evidence for a second domestication of barley (Hordeum vulgare) east of the Fertile Crescent. Proc Natl Acad Sci USA 104(9):3289–3294.

8. Nevo E, Korol AB, Beiles A, Fahima T (2002) Evolution of Wild Emmer and Wheat Improvement. Population Genetics, Genetic Resources, and Genome Organization of Wheat’s Progenitor, Triticum dicoccoides (Springer, Berlin).

9. Azhaguvel P, Komatsuda T (2007) A phylogenetic analysis based on nucleotide sequence of a marker linked to the brittle rachis locus indicates a diphyletic origin of barley. Ann Bot (Lond) 100(5):1009–1015.

10. Molina-Cano JL, et al. (2005) Chloroplast DNA microsatellite analysis supports a polyphyletic origin for barley. Theor Appl Genet 110(4):613–619.

11. Xu TW (1975) On the origin and phylogeny of cultivated barley with preference to the discovery of Ganze wild two-rowed barley Hordeum spontaneum C. Koch. Acta Genet Sin 2(2):129–137.

12. Xu TW (1982) Origin and evolution of cultivated barley in China. Acta Genet Sin 9:440–446.

13. von Bothmer R, Komatsuda T (2011) in Barley: Production, Improvement, and Uses, ed Ullrich SE (Wiley-Blackwell, Chichester, UK), pp 14–62.

14. Dai F, et al. (2012) Tibet is one of the centers of domestication of cultivated barley. Proc Natl Acad Sci USA 109(42):16969–16973.

15. Mayer KF, et al.; International Barley Genome Sequencing Consortium (2012) A physical, genetic and functional sequence assembly of the barley genome. Nature 491(7426):711–716.

16. Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: A revolutionary tool for transcriptomics. Nat Rev Genet 10(1):57–63.

17. Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y (2008) RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Res 18(9):1509–1517.

18. Mizuno H, et al. (2010) Massive parallel sequencing of mRNA in identification of unannotated salinity stress-inducible transcripts in rice (Oryza sativa L.). BMC Genomics 11:683–695.

19. Swanson-Wagner R, et al. (2012) Reshaping of the maize transcriptome by domestication. Proc Natl Acad Sci USA 109(29):11878–11883.

20. Kohl S, et al. (2012) A putative role for amino acid permeases in sink-source communication of barley tissues uncovered by RNA-seq. BMC Plant Biol 12:154.

21. Hübner S, et al. (2009) Strong correlation of wild barley (Hordeum spontaneum) population structure with temperature and precipitation variation. Mol Ecol 18(7):1523–1536.


22.Martin JA,Wang Z(2011)Next-generation transcriptome assembly.Nat Rev Genet

12(10):671–682.

23. Matsumoto T, et al. (2011) Comprehensive sequence analysis of 24,783 barley full-length cDNAs derived from 12 clone libraries. Plant Physiol 156(1):20–28.

24. Rubin CJ, et al. (2010) Whole-genome resequencing reveals loci under selection during chicken domestication. Nature 464(7288):587–591.

25. Mascher M, et al. (2013) Barley whole exome capture: A tool for genomic research in the genus Hordeum and beyond. Plant J 76(3):494–505.

26. Nevo E, Zohary D, Brown AHD, Haber M (1979) Genetic diversity and environmental associations of wild barley, Hordeum spontaneum, in Israel. Evolution 33(3):815–833.

27. Nevo E, Apelbaum-Elkaher I, Garty J, Beiles A (1997) Natural selection causes microscale allozyme diversity in wild barley and a lichen at “Evolution Canyon” Mt. Carmel, Israel. Heredity 78(4):373–382.

28. Nevo E, Brown AHD, Zohary D, Storch N, Beiles A (1981) Microgeographic edaphic differentiation in allozyme polymorphisms of wild barley (Hordeum spontaneum, Poaceae). Plant Syst Evol 138(3-4):287–292.

29. Nevo E, Beiles A, Kaplan D, Storch N, Zohary D (1986a) Genetic diversity and environmental associations of wild barley, Hordeum spontaneum (Poaceae), in Iran. Plant Syst Evol 153(3-4):141–164.

30. Nevo E, Zohary D, Beiles A, Kaplan D, Storch N (1986b) Genetic diversity and environmental associations of wild barley, Hordeum spontaneum, in Turkey. Genetica 68(3):203–213.

31. Parnas T (2006) Evidence for incipient sympatric speciation in wild barley, Hordeum spontaneum, at “Evolution Canyon”, Mount Carmel, Israel, based on hybridization and physiological and genetic diversity estimates. Master Thesis (Univ of Haifa, Haifa, Israel).

32. Purugganan MD, Fuller DQ (2009) The nature of selection during plant domestication. Nature 457(7231):843–848.

33. Riehl S, Zeidi M, Conard NJ (2013) Emergence of agriculture in the foothills of the Zagros Mountains of Iran. Science 341(6141):65–67.

34. Skoglund P, et al. (2012) Origins and genetic legacy of Neolithic farmers and hunter-gatherers in Europe. Science 336(6080):466–469.

35. Qi X, et al. (2013) Genetic evidence of paleolithic colonization and neolithic expansion of modern humans on the Tibetan plateau. Mol Biol Evol 30(8):1761–1778.

36. Taketa S, et al. (2008) Barley grain with adhering hulls is controlled by an ERF family transcription factor gene regulating a lipid biosynthesis pathway. Proc Natl Acad Sci USA 105(10):4062–4067.

37. Guedes J, et al. (2014) Moving agriculture onto the Tibetan Plateau: The archaeobotanical evidence. Archaeol Anthropol Sci, 10.1007/s12520-013-0153-4.

38. Morrell PL, Lundy KE, Clegg MT (2003) Distinct geographic patterns of genetic diversity are maintained in wild barley (Hordeum vulgare ssp. spontaneum) despite migration. Proc Natl Acad Sci USA 100(19):10812–10817.

39. Turner A, Beales J, Faure S, Dunford RP, Laurie DA (2005) The pseudo-response regulator Ppd-H1 provides adaptation to photoperiod in barley. Science 310(5750):1031–1034.

40. Komatsuda T, et al. (2007) Six-rowed barley originated from a mutation in a homeodomain-leucine zipper I-class homeobox gene. Proc Natl Acad Sci USA 104(4):1424–1429.

41. Houston K, et al. (2013) Variation in the interaction between alleles of HvAPETALA2 and microRNA172 determines the density of grains on the barley inflorescence. Proc Natl Acad Sci USA 110(41):16675–16680.

42. Nair SK, et al. (2010) Cleistogamous flowering in barley arises from the suppression of microRNA-guided HvAP2 mRNA cleavage. Proc Natl Acad Sci USA 107(1):490–495.

43. Comadran J, et al. (2012) Natural variation in a homolog of Antirrhinum CENTRORADIALIS contributed to spring growth habit and environmental adaptation in cultivated barley. Nat Genet 44(12):1388–1392.

44. Yan L, et al. (2006) The wheat and barley vernalization gene VRN3 is an orthologue of FT. Proc Natl Acad Sci USA 103(51):19581–19586.

45. Ellstrand NC, Prentice HC, Hancock JF (1999) Gene flow and introgression from domesticated plants into their wild relatives. Annu Rev Ecol Syst 30:539–563.

46. Hübner S, et al. (2012) Islands and streams: Clusters and gene flow in wild barley populations from the Levant. Mol Ecol 21(5):1115–1129.

47. Dai F, et al. (2010) Differences in phytase activity and phytic acid content between cultivated and Tibetan annual wild barleys. J Agric Food Chem 58(22):11821–11824.

48. Qiu L, et al. (2011) Evaluation of salinity tolerance and analysis of allelic function of HvHKT1 and HvHKT2 in Tibetan wild barley. Theor Appl Genet 122(4):695–703.

49. Tamura K, et al. (2011) MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 28(10):2731–2739.

50. Tang H, Peng J, Wang P, Risch NJ (2005) Estimation of individual admixture: Analytical and study design considerations. Genet Epidemiol 28(4):289–301.

51. Price AL, et al. (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38(8):904–909.

52. Nei M (1978) Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 89(3):583–590.

53. Krzywinski M, et al. (2009) Circos: An information aesthetic for comparative genomics. Genome Res 19(9):1639–1645.

www.pnas.org/cgi/doi/10.1073/pnas.1414335111