SNP and indel discovery and genotyping in next-generation sequencing data

Gilks, William (2016) SNP and indel discovery and genotyping in next-generation sequencing data. [Dataset]

Full text not available from this repository.

Abstract

Code, logs and data for discovery and genotyping of SNPs and indels in the D. melanogaster genome, using GATK HaplotypeCaller. Code is in the zipped folder named code.zip. Run logs for this code are in the zipped folder named logs.zip. The unfiltered vcf genotypes file is named lhm_rg_HC_2015-09-15.vcf.gz. The filtered vcf genotypes file is named f1.lhm_rg_HC_raw.vcf.gz. The vcf submitted to NCBI dbSNP (filtered, and with indels >50bp and variants with null alternate alleles both removed) is named dbSNP.lhm_rg_HC_raw.vcf.gz. The folder local_reference.zip contains the reference assembly files against which genotypes were called, and includes the code used to format the data prior to use. Also included are genotype data from the two in-house reference line samples sequenced (BDGP6+ISO1 mito/dm6, Bloomington Drosophila Stock Center no. 2057).
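The two extra filters described for the dbSNP file (removing indels >50bp and variants with null alternate alleles) can be illustrated with a minimal Python sketch. This is not the code from code.zip. The function names are hypothetical, and two points are assumptions: that a "null" alternate allele means an ALT value of "." or "*", and that a record is dropped if any one of its alternate alleles fails a filter.

```python
def keep_for_dbsnp(ref, alt_field, max_indel_len=50):
    """Return True if a record passes both filters (hypothetical helper)."""
    alts = alt_field.split(",")
    # Assumption: "null alternate allele" means a missing ('.') or
    # spanning-deletion ('*') ALT value.
    if any(a in (".", "*", "") for a in alts):
        return False
    # Indel length is taken as the difference in length between REF and ALT.
    if any(abs(len(a) - len(ref)) > max_indel_len for a in alts):
        return False
    return True


def filter_vcf_lines(lines, max_indel_len=50):
    """Yield header lines unchanged and only those records that pass."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alt = fields[3], fields[4]  # VCF columns 4 (REF) and 5 (ALT)
        if keep_for_dbsnp(ref, alt, max_indel_len):
            yield line
```

Under these assumptions, the generator could be applied to lines read from f1.lhm_rg_HC_raw.vcf.gz with gzip.open(path, "rt").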

Samples are 220 Sussex-LHM hemiclones, and 2 RG. The first run did not include chromosome 4 and the mitochondrial genome, so these were genotyped separately, and then added to the rest of the results.

The link for the NCBI dbSNP record is currently https://www.ncbi.nlm.nih.gov/projects/SNP/snp_viewBatch.cgi?sbid=1062461 and the submitter handle is MORROW_EBE_SUSSEX.

At the time of writing, the NCBI D. melanogaster build is still being updated, and therefore ss identifiers, but not rs identifiers, are available.

The pre-print manuscript for this data is available on biorxiv: "Whole genome resequencing of a laboratory-adapted Drosophila melanogaster population sample" http://biorxiv.org/content/early/2016/10/17/081554 doi: http://dx.doi.org/10.1101/081554

Item Type: Dataset
Additional Information: DOI:10.5281/zenodo.159272
Keywords: Genetics, data visualisation, R, ggplot, genomics, mutation, single-nucleotide polymorphism, Drosophila.
Schools and Departments: School of Life Sciences > Evolution, Behaviour and Environment
Subjects: Q Science > QH Natural history > QH0301 Biology > QH0426 Genetics
Q Science > QH Natural history > QH0301 Biology > QH0426 Genetics > QH0438.4 Special aspects of the subject as a whole, A-Z
Q Science > QH Natural history > QH0301 Biology > QH0426 Genetics > QH0447 Genes. Alleles. Genome
Q Science > QH Natural history > QH0301 Biology > QH0426 Genetics > QH0470.A-Z Experimental organisms, A-Z
Q Science > QH Natural history > QH0301 Biology > QH0426 Genetics > QH0470.A-Z Experimental organisms, A-Z > QH0470.D7 Drosophila
Q Science > QH Natural history > QH0301 Biology > QH0426 Genetics > QH0460 Mutations
Depositing User: William Gilks
Date Deposited: 15 Nov 2016 13:14
Last Modified: 15 Nov 2016 13:14
URI: http://srodev.sussex.ac.uk/id/eprint/65473
Project Name: 2Sexes_1Genome: Sex-specific genetic effects on fitness and human disease
Sussex Project Number: G0781
Funder: EUROPEAN UNION
Funder Ref: 2011-STG280632