August 5, 2010

More accurate SNP identification using population sequence data: the SNIP-seq method

Researchers at the Scripps Research Institute in La Jolla, CA, have devised a cutting-edge program called SNIP-seq that identifies SNPs and assigns genotypes to individuals. The program is designed to use population sequence data, meaning data in which a number of samples or individuals (n ≥ 20) have been sequenced across the same genomic regions. Beyond identifying the SNPs themselves, SNIP-seq assigns each sample a genotype at each SNP.
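To make the idea of leaning on population data a bit more concrete, here is a toy Python sketch. It is not the SNIP-seq algorithm; it is only an illustration under some loud assumptions: each sample is reduced to a hypothetical `(alt_reads, depth)` pair at one site, genotypes are scored with a simple binomial model, the variant is rare enough that most individuals are homozygous reference, and the per-site error rate is guessed from the median alternate-allele fraction across samples. All function names and the `MIN_ERROR`/`MAX_ERROR` constants are made up for this example.

```python
from math import comb
from statistics import median

MIN_ERROR = 0.01  # assumed floor for the per-site error rate
MAX_ERROR = 0.25  # assumed cap, keeps the error model distinct from a heterozygote


def binom_pmf(k, n, p):
    """Probability of k alternate reads out of n when each read is alternate with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def call_genotype(alt, depth, error_rate):
    """Return the most likely genotype (0, 1 or 2 alternate alleles) for one sample."""
    expected_alt_fraction = {0: error_rate, 1: 0.5, 2: 1.0 - error_rate}
    likelihood = {
        g: binom_pmf(alt, depth, p) for g, p in expected_alt_fraction.items()
    }
    return max(likelihood, key=likelihood.get)


def site_error_rate(samples):
    """Guess a site-specific error rate from the whole population.

    Assumes a rare variant, so the median sample is homozygous reference and
    its alternate-read fraction reflects sequencing error at this site.
    """
    fractions = [alt / depth for alt, depth in samples if depth > 0]
    return min(max(median(fractions), MIN_ERROR), MAX_ERROR)


def call_site(samples):
    """Return (is_snp, genotypes) for one site across all samples."""
    error_rate = site_error_rate(samples)
    genotypes = [call_genotype(alt, depth, error_rate) for alt, depth in samples]
    return any(g > 0 for g in genotypes), genotypes


# 18 reference samples, one heterozygote, one homozygous-alternate sample.
true_snp = [(0, 30)] * 18 + [(14, 30), (30, 30)]
# Every sample shows the same low level of alternate reads: a systematic error.
artifact = [(5, 30)] * 20

print(call_site(true_snp))
print(call_site(artifact))
```

In this toy setup, a sample with 5 alternate reads out of 30 looks heterozygous if you judge it alone with a fixed 1% error rate. Once the other 19 samples show the same pattern, the site-level error estimate rises and the site is no longer called as a SNP, while the genuine variant in `true_snp` keeps its heterozygous and homozygous calls.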

The method is reported to be highly accurate and to reduce the rate of false positives caused by sequencing errors. Vikas Bansal, the first author of the paper in Genome Research (abstract), explained the motivation for developing this technology:
...You have a lot of tools for aligning the short reads generated by the next-gen sequencing platforms to a reference genome and you also have tools for identifying SNPs, but when you have population sequence data, you can leverage the fact that you have multiple individuals' sequences across the same genomic regions to improve both the accuracy of SNPs and the genotype calling.
The researchers evaluated the accuracy of their method in a really cool way: they used sequence data from 48 individuals covering a 200 kb region on chromosome 9p21 (the location of genes that add to a person's risk of developing coronary artery disease and diabetes). The SNIP-seq method proved accurate at detecting variants and filtered out false SNPs. Even cooler, the researchers "stumbled" across novel SNPs in this chromosomal region, which they later validated using pooled sequencing data and confirmed using Sanger sequencing.

This new, more accurate method (false-positive rate ~2%, down from ~5%) can help us re-sequence genomic regions known to be associated with disease, and therefore detect rare variants that might contribute to disease progression. Earlier, less accurate methods that reported false-positive SNPs potentially impeded disease research; SNIP-seq should help distinguish false SNPs from real ones. This breakthrough use of population sequencing data will hopefully lead to more accurate study of disease-causing genetic variants and to viable therapeutic/pharmacogenomic targets.
