SAMOVA 1.0

SAMOVA 1.0 implements an approach to define groups of populations that are geographically homogeneous and maximally differentiated from each other. As a by-product, it also identifies genetic barriers between these groups. The method is based on a simulated annealing procedure that aims to maximize the proportion of total genetic variance due to differences between groups of populations (SAMOVA, Spatial Analysis of MOlecular VAriance). The method is described in Dupanloup, Schneider and Excoffier (2002).
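In AMOVA terms (Excoffier et al. 1992), the proportion of total genetic variance due to differences between groups is the fixation index usually written F_CT. As a reminder, and assuming the usual AMOVA notation (these symbols do not appear elsewhere in this page): \sigma_a^2 is the variance component among groups, \sigma_b^2 the component among populations within groups, and \sigma_c^2 the component within populations.

```latex
\sigma_T^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2,
\qquad
F_{CT} = \frac{\sigma_a^2}{\sigma_T^2}
```

For a given number of groups, the simulated annealing procedure searches for the grouping of populations that makes this ratio as large as possible.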

SAMOVA 1.0 runs on Windows. There is no Linux or Mac version yet.

Input files

SAMOVA 1.0 needs two input files. The first one (*.geo) contains the geographic coordinates of the sampling localities of your populations and must have the .geo extension. The second one (*.arp) is an Arlequin input file containing the genetic data sampled in your populations. The Arlequin file must have the SAME NAME as the geographical file, with the .arp extension instead of .geo. The order of the populations in the two input files MUST BE THE SAME.

Important notice: SAMOVA 1.0 does not work if two sampling localities have the same geographical coordinates.

The geographical input file must be structured as follows. Each line corresponds to a population and must contain five fields separated by a tab character:

an integer number corresponding to the line in the file

the name of your population within quotes

the longitude of your sampling point

the latitude of your sampling point

an integer (for example, 1).

Example of geographic file INPUTFILE.GEO :

1 "Sample 1" 31.23 31.03 1

2 "Sample 2" 10.13 36.5 1

3 "Sample 3" 15.01 41.55 1

4 "Sample 4" 23.2 55.51 1
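The rules above (five tab-separated fields, a quoted population name, no two localities with the same coordinates) can be checked before launching SAMOVA. The following is a minimal sketch, not part of SAMOVA itself: the function name parse_geo and the demo coordinates are hypothetical, and the check that the first field equals the line position is an assumption based on the field description above.

```python
def parse_geo(text):
    """Parse the content of a .geo file and check the documented rules.

    Returns a list of (index, name, longitude, latitude, last_field) tuples.
    Raises ValueError on the first line that breaks a rule.
    """
    records = []
    seen = {}  # (lon, lat) -> line number where these coordinates first appeared
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 5:
            raise ValueError(
                f"line {line_no}: expected 5 tab-separated fields, got {len(fields)}"
            )
        index, name, lon, lat, last = (f.strip() for f in fields)
        if len(name) < 2 or not (name.startswith('"') and name.endswith('"')):
            raise ValueError(f"line {line_no}: population name must be within quotes")
        # Assumption: the first field is the running position of the population.
        if int(index) != len(records) + 1:
            raise ValueError(f"line {line_no}: first field should be {len(records) + 1}")
        coords = (float(lon), float(lat))
        if coords in seen:
            raise ValueError(
                f"line {line_no}: same coordinates as line {seen[coords]}"
            )
        seen[coords] = line_no
        records.append((int(index), name[1:-1], coords[0], coords[1], int(last)))
    return records


# Illustrative content only; the values are not real sampling data.
demo = '1\t"Sample 1"\t31.23\t31.03\t1\n2\t"Sample 2"\t15.01\t41.55\t1\n'
for record in parse_geo(demo):
    print(record)
```

A file that uses spaces instead of tabs is rejected by the field-count check, which matches the separator issue described under Known issues.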

Example of Arlequin project file (for the same populations listed in the geographical input file) INPUTFILE.ARP :

#AMOVA analysis

Title="A New Sample File Designed To Compute AMOVA"

NbSamples=4

GenotypicData=0

DataType=DNA

LocusSeparator=WHITESPACE

MissingData='?'

[Data]

[[Samples]]

SampleName="Sample 1"

SampleSize=2

SampleData= {

Hap1 1 AAAAAAAAAAAAAATTAAAA

Hap2 1 AAAAAACCAAAAAATTAAAA

}

SampleName="Sample 2"

SampleSize=2

SampleData= {

Hap3 1 TTTTTTTAAAAAAATTAAAA

Hap4 1 AAAAAACCAAAAAATTAAAA

}

SampleName="Sample 3"

SampleSize=2

SampleData= {

Hap5 1 AAAAAGGGAAAAAATTAAAA

Hap6 1 AAAAAACCAAAAGATTAAAA

}

SampleName="Sample 4"

SampleSize=2

SampleData= {

Hap7 1 AAAAAAAAAAGGGATTAAAA

Hap8 1 AAAAAACCAAAAAATTAAAA

}
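Because the populations must appear in the same order in both files, a quick consistency check can save a failed run. The sketch below is not part of SAMOVA or Arlequin; the function names are hypothetical, and it assumes that the quoted names in the .geo file are identical to the SampleName values in the .arp file, as they are in the example files.

```python
import re


def geo_names(geo_text):
    """Return the population names of a .geo file, in file order."""
    names = []
    for line_no, line in enumerate(geo_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f".geo line {line_no}: fields must be tab-separated")
        names.append(fields[1].strip().strip('"'))
    return names


def arp_names(arp_text):
    """Return the SampleName values of an Arlequin project, in file order."""
    return re.findall(r'^\s*SampleName\s*=\s*"([^"]*)"', arp_text, flags=re.MULTILINE)


def check_same_order(geo_text, arp_text):
    """Raise ValueError if the two files do not list the same populations in the same order."""
    from_geo, from_arp = geo_names(geo_text), arp_names(arp_text)
    declared = re.search(r"^\s*NbSamples\s*=\s*(\d+)", arp_text, flags=re.MULTILINE)
    if declared and int(declared.group(1)) != len(from_arp):
        raise ValueError("NbSamples does not match the number of SampleName entries")
    if from_geo != from_arp:
        raise ValueError(f"population order differs: {from_geo} vs {from_arp}")
    return from_geo


# Illustrative two-population inputs (values are not real data).
geo_demo = '1\t"Sample 1"\t31.23\t31.03\t1\n2\t"Sample 2"\t15.01\t41.55\t1\n'
arp_demo = 'NbSamples=2\nSampleName="Sample 1"\nSampleName="Sample 2"\n'
print(check_same_order(geo_demo, arp_demo))
```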

Running

SAMOVA needs:

the name of the input files, without extension (for example: inputdata; in this case, the directory containing the program MUST contain the two input files used by SAMOVA, and these files MUST be called inputdata.geo and inputdata.arp)

the number K of groups of populations you wish to define (the final structure defined by SAMOVA will contain K groups)

the number of simulated annealing processes you wish to perform (100 seems a good choice)

the type of molecular distance between haplotypes you want to compute (SAMOVA, like AMOVA, is based on a matrix of distances between the haplotypes observed in the whole set of samples). With this option, you can choose between pairwise differences between haplotypes (for DNA data) or the sum of squared size differences between haplotypes (for microsatellite data).
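To make the two distance options concrete, here is a minimal sketch of how such distances can be computed. It is an illustration rather than SAMOVA's own code: the function names are hypothetical, missing data ('?') are not treated specially, and the microsatellite haplotypes are assumed to be lists of allele sizes (repeat numbers), one per locus.

```python
def pairwise_differences(seq_a, seq_b):
    """Number of positions at which two aligned DNA haplotypes differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("haplotypes must have the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def sum_squared_size_differences(sizes_a, sizes_b):
    """Sum over loci of the squared difference in allele size (microsatellites)."""
    if len(sizes_a) != len(sizes_b):
        raise ValueError("haplotypes must have the same number of loci")
    return sum((x - y) ** 2 for x, y in zip(sizes_a, sizes_b))


def distance_matrix(haplotypes, distance):
    """Symmetric matrix of distances between all pairs of haplotypes."""
    return [[distance(a, b) for b in haplotypes] for a in haplotypes]


# Hap1, Hap2 and Hap3 from the example Arlequin file, written without internal spaces.
hap1 = "AAAAAAAAAAAAAATTAAAA"
hap2 = "AAAAAACCAAAAAATTAAAA"
hap3 = "TTTTTTTAAAAAAATTAAAA"
print(distance_matrix([hap1, hap2, hap3], pairwise_differences))
# prints [[0, 2, 7], [2, 0, 8], [7, 8, 0]]

# Hypothetical microsatellite haplotypes: allele sizes at three loci.
print(sum_squared_size_differences([12, 15, 9], [10, 15, 12]))
# prints 13
```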

When the SAMOVA window disappears from your screen, the computations are finished. The running time depends on the number of populations you have and the number of simulated annealing processes you wish to perform.

Output files

SAMOVA creates a set of output files:

SAMOVA_results_arlequin.txt: the genetic structure defined by SAMOVA, together with the fixation indices corresponding to this group structure and their significance levels, evaluated by 1,000 permutations of populations among groups.

SAMOVA2.log: this file contains all the steps done by SAMOVA 2.0 and, in case of problems, the location of the problems.

SAMOVA_finalstructure.arp: an Arlequin project file created by appending the genetic structure defined by SAMOVA to the input Arlequin project file.

SAMOVA_results.ps: this file (EPS) can be read with GSview for Windows; it contains a map of the sampling points and the barriers between the groups of populations defined by SAMOVA.

Arlequin.log: this file is generated during the computation of the fixation indices corresponding to the genetic structure defined by SAMOVA. It contains all the run-time WARNINGS and ERRORS encountered during these computations.

Known issues

SAMOVA 1.0 was developed on Windows XP. It might encounter problems when running on later versions of Windows. If it does, run SAMOVA 1.0 in compatibility mode for Windows XP. To do so:

right-click on the icon of SAMOVA 1.0

click Properties

in the Properties dialog box, click the Compatibility tab

select the "Run this program in compatibility mode for" option and choose Windows XP

Tab characters MUST be used as separators in the .geo files. If you use spaces instead, SAMOVA 1.0 will not run properly.

References

Dupanloup, I., Schneider, S., Excoffier, L. (2002) A simulated annealing approach to define the genetic structure of populations. Molecular Ecology 11(12):2571-81.

See also:

Excoffier, L., Smouse, P., Quattro, J.M. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131: 479-491.

Excoffier, L., Lischer, H.E.L. (2010) Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources 10: 564-567.

Isabelle Dupanloup, CMPG, Institute of Ecology and Evolution, University of Bern