A Comparison of the Efficiency in Finding Genes between Sequences Enriched For Hypo-Methylated Regions and Whole-Genome Shotgun Sequence in Bread Wheat
posted on 2025-08-08, 11:42 authored by Joshua Lee Watson
Bread wheat (Triticum aestivum) has a hexaploid genome of roughly 17 Gbp, the result of a hybridization event between tetraploid emmer wheat (Triticum dicoccoides) and diploid goat grass (Aegilops tauschii). At least 80% of this large plant genome consists of transposable elements (TEs), which makes the transcriptionally active regions (genes) difficult to locate. DNA methylation is an epigenetic mark commonly associated with low transcriptional activity and is used to silence TEs within a genome. Using restriction enzymes that cannot cut methylated DNA (HpaII and HpyCH4IV), we constructed Illumina sequencing libraries enriched for hypomethylated regions of the wheat cultivar “Chinese Spring”. The resulting sequence data (roughly 4.5 Gb) were assembled into contigs with ABySS using k-values of 36, 50, and 64. The contigs were then annotated for gene content using BLASTX and Blast2GO. These results were compared with unenriched whole-genome shotgun sequence to determine the gene enrichment potential of our selection strategy. When contigs were assembled with a k-value of 64 for the HpyCH4IV libraries and with k-values of 64 and 50 for the HpaII libraries, a higher proportion of genes was identified than in the control whole-genome shotgun sequence.
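The enrichment step described above relies on methylation-sensitive digestion, which can be illustrated with a toy in silico digest. The sketch below is not the pipeline used in this work. It assumes the standard recognition sequences for the two enzymes (CCGG for HpaII, ACGT for HpyCH4IV, each cut after the first base) and models methylation simply as a set of site positions the enzyme cannot cut. The function names `find_sites` and `fragment_lengths` and the example sequence are hypothetical.

```python
import re

# Recognition sequences; both contain a CpG, so cutting is blocked
# when that CpG is methylated (modelled here in a simplified way).
SITES = {"HpaII": "CCGG", "HpyCH4IV": "ACGT"}

# Assumption for this toy model: both enzymes cut after the first base
# of the site (C^CGG and A^CGT).
CUT_OFFSET = 1


def find_sites(seq, enzyme):
    """Return the 0-based start position of every recognition site."""
    motif = SITES[enzyme]
    return [m.start() for m in re.finditer(motif, seq.upper())]


def fragment_lengths(seq, enzyme, methylated=frozenset()):
    """Fragment lengths after a complete digest.

    `methylated` holds site start positions that are methylated and
    therefore left uncut (a hypothetical, simplified representation).
    """
    cuts = [p + CUT_OFFSET for p in find_sites(seq, enzyme)
            if p not in methylated]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    seq = "TTCCGGAAACGTTTCCGG"  # made-up example sequence
    print(find_sites(seq, "HpaII"))
    print(fragment_lengths(seq, "HpaII"))
    print(fragment_lengths(seq, "HpaII", methylated={14}))
```

In this model, a methylated site yields one longer fragment instead of two shorter ones, so regions with many unmethylated sites produce more short fragments. This is the intuition behind size-selecting digested DNA to enrich for hypomethylated regions.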