Contents

Programs other than those which must be run within Stata were written for a Unix environment. On Microsoft Windows platforms, the easiest way to build and run them is to use Cygwin, which provides a Unix-like environment and includes the necessary compilers.

David Clayton

Name                        Last modified       Size
DGCgenetics.txt             05-Jun-2008 17:40   415
DGCgenetics_1.0.tar.gz      05-Jun-2008 17:40   1.1M
DGCgenetics_1.0.zip         05-Jun-2008 17:40   2.7M
DGCgenetics_1.2.tar.gz      12-Jun-2009 15:38   2.0M
DGCgenetics_1.2.zip         12-Jun-2009 15:38   2.0M
INSTALL                     05-Jun-2008 17:40   1.2K
LICENSE                     05-Jun-2008 17:40   1.0K
gh2stat-1.2.tgz             05-Jun-2008 17:40   25K
gh2stat-1.2.zip             05-Jun-2008 17:40   28K
gh2stat.txt                 05-Jun-2008 17:40   4.0K
ped2spl-1.4.tar.gz          05-Jun-2008 17:40   15K
ped2spl-1.4.zip             05-Jun-2008 17:40   16K
ped2spl.txt                 05-Jun-2008 17:40   2.0K
snphap-1.3.1.tar.gz         05-Jun-2008 17:40   22K
snphap-1.3.1.zip            05-Jun-2008 17:40   24K
snphap.txt                  05-Jun-2008 17:40   16K
splink-1.09.tar.gz          05-Jun-2008 17:40   234K
splink-1.09.zip             05-Jun-2008 17:40   236K
splink.txt                  05-Jun-2008 17:40   29K
stata/                      05-Jun-2008 17:40   -
transmit-2.5.4.tar.gz       05-Jun-2008 17:40   31K
transmit-2.5.4.zip          05-Jun-2008 17:40   34K
transmit.txt                05-Jun-2008 17:40   16K

Software and course materials

This directory contains the C and C++ source files for SPLINK, TRANSMIT, 
PED2SPL, GH2STAT, and SNPHAP. It also contains programs for analysis of 
genetic association studies for the "Stata" statistical package with some 
exercises, and some miscellaneous programs in the R language.

Currently, the programs available are:

   * SNPHAP, a program for estimating frequencies of haplotypes of large
     numbers of diallelic markers from unphased genotype data from 
     unrelated subjects 
        o Uses the EM algorithm with "trimming" of improbable assignments
        o Allows for missing data at some loci
        o Can search for multiple solutions using random starting points
        o Includes a Monte Carlo IP algorithm for multiple imputation, allowing
          uncertainty of solutions to be explored 
   * STATA packages
        o programs and exercises for genetic association studies
        o programs for choosing "haplotype tagging" SNPs
        o program for IBD regression (linkage analysis with covariates)
        o programs for Monte Carlo permutation testing
        o course materials for Advanced Statistical Modelling Course in the 
         Erasmus Summer Programme, 2002
   * DGCgenetics, an R package for analysis of genetic association studies,
     including some practical exercises
        o Some extensions to the R "genetics" package
        o Several test data sets
        o Miscellaneous teaching exercises
     (this used to be called dgc.genetics)
   * snpMatrix, an R package for analysis of genome-wide association studies,
     is no longer distributed from this site. It is now part of Bioconductor
     (http://www.bioconductor.org/)
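
To illustrate the kind of estimation SNPHAP performs, here is a minimal
Python sketch of the EM algorithm for haplotype frequencies at two diallelic
markers, using unphased genotypes from unrelated subjects. It is not SNPHAP's
code: the function names and the genotype coding (number of copies of
allele "1" at each locus) are assumptions made for this example, and it omits
trimming, missing data, random restarts and multiple imputation.

```python
from collections import Counter

# The four possible haplotypes at two diallelic loci (alleles coded 0/1).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Return the unordered haplotype pairs consistent with a genotype.

    A genotype is a tuple giving the count (0, 1 or 2) of allele 1 at
    each locus. Only the double heterozygote (1, 1) is phase-ambiguous.
    """
    pairs = []
    for i, h1 in enumerate(HAPLOTYPES):
        for h2 in HAPLOTYPES[i:]:
            if all(a + b == g for a, b, g in zip(h1, h2, genotype)):
                pairs.append((h1, h2))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM (illustrative sketch only)."""
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    # Start from equal frequencies for all haplotypes.
    freq = {h: 1.0 / len(HAPLOTYPES) for h in HAPLOTYPES}
    for _ in range(n_iter):
        expected = {h: 0.0 for h in HAPLOTYPES}
        for genotype, n in counts.items():
            pairs = compatible_pairs(genotype)
            # E-step: probability of each phase assignment under
            # random mating (factor 2 for heterozygous pairs).
            weights = [(1 if h1 == h2 else 2) * freq[h1] * freq[h2]
                       for h1, h2 in pairs]
            total = sum(weights)
            if total == 0.0:
                continue  # genotype impossible under current estimates
            for (h1, h2), w in zip(pairs, weights):
                posterior = w / total
                expected[h1] += n * posterior
                expected[h2] += n * posterior
        # M-step: new frequencies are expected counts per chromosome.
        new_freq = {h: expected[h] / n_chrom for h in HAPLOTYPES}
        change = max(abs(new_freq[h] - freq[h]) for h in HAPLOTYPES)
        freq = new_freq
        if change < tol:
            break
    return freq
```

For example, em_haplotype_frequencies([(0, 0), (1, 1), (2, 2)]) returns a
dictionary of estimated frequencies for the four haplotypes. With many
markers the number of possible haplotypes grows rapidly, which is why a
practical program has to discard improbable assignments as it goes.
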

For license and copying details, please see the LICENSE file.

Wider use of these programs is bound to throw up errors.

The files listed below - which are compressed groups of files - have had
their names and line-endings tweaked to suit either UNIX or PC users.

   * *.tar.gz files are targeted at UNIX users
   * *.zip files are targeted at PC users

Please see the INSTALL file for further details.

No longer maintained, but remaining on the site:

   * GH2STAT, a program for processing IBD distribution files dumped by 
     genehunter, for later analysis in statistics packages
        o writes a record for each pair of pedigree members considered
        o pairs may be restricted to be both affected and/or to have IBD
          status which is a priori uncertain
        o covariate and other data from the genehunter pedigree input file
          may also be included in the output file
        o output file is formatted for input into SAS, Splus or R, Stata, or
          spreadsheet programs
   * PED2SPL, a program for interfacing standard LINKAGE input files to
     SPLINK and TRANSMIT, with marker selection.
        o selects specified markers from a larger record
        o old "binary" coding of marker genotypes handled as well as the
          more modern numeric coding
   * SPLINK, a program for sib pair linkage analysis.
        o Maximum likelihood subject to "possible triangle" restriction
        o Marker haplotypes based on several closely linked markers
        o Haplotype frequencies are estimated from the data
   * TRANSMIT, a program for transmission disequilibrium testing.
        o Marker haplotypes based on several closely linked markers
        o Parental genotype and/or haplotype phase may be missing
     (There are a number of known problems. In particular, the code 
      for X loci is flawed)
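
For readers unfamiliar with transmission disequilibrium testing, the
simplest form of the test, for one diallelic marker with both parents
genotyped, can be sketched in Python as below. This is the basic
McNemar-type statistic only, not the method implemented in TRANSMIT, which
deals with haplotypes of several markers and with missing parental data.
The function name and arguments are assumptions made for this example.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """Basic TDT for one allele at a diallelic marker (sketch only).

    transmitted   -- times heterozygous parents passed the allele
                     to an affected child
    untransmitted -- times heterozygous parents did not pass it on
    Returns the chi-squared statistic (1 df) and its p-value.
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no transmissions from heterozygous parents")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Upper tail of chi-squared on 1 df, via the complementary error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Only transmissions from heterozygous parents are counted, since homozygous
parents carry no information about preferential transmission.
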

Removed:

   * TDTHAP, a package for TDT with extended haplotypes in the "R" language.



David Clayton, david.clayton@cimr.cam.ac.uk