It is entitled "Sensitive Protein Alignments at Tree-of-Life Scale Using DIAMOND".
Congratulations, Benjamin!
It is entitled "Arabidopsis thaliana genome assemblies and their use in hybrid transcriptome analyses".
Congratulations, Max!
Pushing the limits of HiFi assemblies reveals centromere diversity between two Arabidopsis thaliana genomes
Rabanal et al. (2022) Nucleic Acids Res, published online Dec 1, 2022
Although long-read sequencing can often enable chromosome-level reconstruction of genomes, it is still unclear how one can routinely obtain gapless assemblies. In the model plant Arabidopsis thaliana, other than the reference accession Col-0, all other accessions de novo assembled with long-reads until now have used PacBio continuous long reads (CLR). Although these assemblies sometimes achieved chromosome-arm level contigs, they inevitably broke near the centromeres, excluding megabases of DNA from analysis in pan-genome projects. Since PacBio high-fidelity (HiFi) reads circumvent the high error rate of CLR technologies, albeit at the expense of read length, we compared a CLR assembly of accession Eyach15-2 to HiFi assemblies of the same sample. The use of five different assemblers starting from subsampled data allowed us to evaluate the impact of coverage and read length. We found that centromeres and rDNA clusters are responsible for 71% of contig breaks in the CLR scaffolds, while relatively short stretches of GA/TC repeats are at the core of >85% of the unfilled gaps in our best HiFi assemblies. Since the HiFi technology consistently enabled us to reconstruct gapless centromeres and 5S rDNA clusters, we demonstrate the value of the approach by comparing these previously inaccessible regions of the genome between the Eyach15-2 accession and the reference accession Col-0.
It is entitled "Characterization of a natural Arabidopsis thaliana - Pseudomonas viridiflava pathosystem".
Congratulations, Alejandra!
MethylScore, a pipeline for accurate and context-aware identification of differentially methylated regions from population-scale plant whole-genome bisulfite sequencing data
Hüther et al. (2022) Quant. Plant Biol. 3, e9
Whole-genome bisulfite sequencing (WGBS) is the standard method for profiling DNA methylation at single-nucleotide resolution. Different tools have been developed to extract differentially methylated regions (DMRs), often built upon assumptions from mammalian data. Here, we present MethylScore, a pipeline to analyse WGBS data and to account for the substantially more complex and variable nature of plant DNA methylation. MethylScore uses an unsupervised machine learning approach to segment the genome by classification into states of high and low methylation. It processes data from genomic alignments to DMR output and is designed to be usable by novice and expert users alike. We show how MethylScore can identify DMRs from hundreds of samples and how its data-driven approach can stratify associated samples without prior information. We identify DMRs in the A. thaliana 1,001 Genomes dataset to unveil known and unknown genotype–epigenotype associations.
Jiawei Wang, an EMBO-supported postdoc in the lab from 2005 to 2011, is one of this year's Xplorer Prize awardees. The prize recognizes his fundamental contributions to plant developmental biology, including his pioneering work on scRNA-seq analyses of plants. Congratulations, Jiawei!