University of New Hampshire, Durham, NH, 03824, United States
Designed for both paired-end and single-end reads, GBS-SNP-CROP is an open-source pipeline that maximizes data usage by eliminating the requirement for uniform read length. Through its strategy of SNP calling based on both within-individual and across-population patterns of polymorphism, the pipeline identifies high-confidence SNPs and distinguishes them from both sequencing and PCR errors, whether or not a reference genome is available. When no reference genome is available, GBS-SNP-CROP employs a clustering approach to build a population-specific “Mock Reference” of consensus GBS fragments to guide alignment. As demonstrated with a population of 48 tetraploid Actinidia arguta (kiwiberry) accessions, GBS-SNP-CROP performs favorably compared to both the TASSEL-GBS (reference-based) and TASSEL-UNEAK (de novo) pipelines, in part due to its ability to access 4.4 and 2.0 times more sequence data, respectively, for SNP discovery. The pipeline’s modular design permits easy inspection of all intermediate results, and additional tools allow users to convert the final genotyping matrix into formats suitable for downstream analysis in R, PLINK, and TASSEL. To illustrate its practical use, results are presented from two studies of Actinidia species. In the first, de novo SNP data generated by the pipeline facilitated the identification of a large number of redundant accessions in USDA repositories, effectively deconvoluting a multi-species, multi-ploidy germplasm collection. In the second, GBS-SNP-CROP results enabled the efficient development of sex-associated markers that are now being used for high-throughput screening of breeding populations. The features of GBS-SNP-CROP make it worthy of consideration by plant curation and breeding programs, and the current version is available at https://github.com/halelab/GBS-SNP-CROP.git.
Arthur Melo completed his Ph.D. in Genetics and Plant Breeding in January 2015 at the Agronomy School of the Federal University of Goias, Brazil. In February of the same year, he joined Professor Iago Hale's lab in the College of Life Sciences and Agriculture at the University of New Hampshire as a postdoctoral researcher in bioinformatics.