Modelling of gene loss propensity in the pangenomes of three Brassica species suggests different mechanisms between polyploids and diploids

Plant Biotechnol J. 2021 Dec;19(12):2488-2500. doi: 10.1111/pbi.13674. Epub 2021 Aug 24.

Abstract

Plant genomes demonstrate significant presence/absence variation (PAV) within a species; however, the factors that lead to this variation have not been studied systematically in Brassica across diploids and polyploids. Here, we developed pangenomes of the polyploid Brassica napus and its two diploid progenitors, B. rapa and B. oleracea, to infer how PAV may differ between diploids and polyploids. Modelling of gene loss suggests that loss propensity is primarily associated with transposable elements in the diploids, whereas in B. napus it is associated with homoeologous recombination. We use these results to gain insight into the different causes of gene loss, both in diploids and following polyploidization, and pave the way for applying machine learning methods to understand the underlying biological and physical causes of gene presence/absence.
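
As an illustration of the kind of data that PAV analysis works with, the following minimal sketch summarizes a presence/absence matrix per gene. It is not the method used in the study: the gene names, the toy matrix, and the functions `loss_frequency` and `classify_genes` are hypothetical, and using the fraction of individuals lacking a gene as a stand-in for loss propensity is an assumption made only for this example.

```python
from typing import Dict, List

# Toy presence/absence matrix: gene -> one call per individual
# (1 = present, 0 = absent). Gene names and values are hypothetical.
pav: Dict[str, List[int]] = {
    "geneA": [1, 1, 1, 1],
    "geneB": [1, 0, 1, 0],
    "geneC": [0, 0, 0, 1],
}


def loss_frequency(matrix: Dict[str, List[int]]) -> Dict[str, float]:
    """Fraction of individuals in which each gene is absent.

    Used here as a simple, assumed proxy for how readily a gene is lost.
    """
    return {gene: calls.count(0) / len(calls) for gene, calls in matrix.items()}


def classify_genes(matrix: Dict[str, List[int]]) -> Dict[str, str]:
    """Label a gene 'core' if present in every individual, else 'variable'."""
    return {
        gene: "core" if all(calls) else "variable"
        for gene, calls in matrix.items()
    }


if __name__ == "__main__":
    freqs = loss_frequency(pav)
    labels = classify_genes(pav)
    for gene in pav:
        print(gene, labels[gene], freqs[gene])
```

In a modelling setting, per-gene labels or frequencies of this kind could serve as the response variable, with gene-level features (for example, nearby transposable element content) as predictors for a classifier such as XGBoost.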

Keywords: Brassica; XGBoost; gene loss propensity; machine learning; pangenome; transposable elements.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Brassica napus* / genetics
  • Brassica* / genetics
  • Diploidy
  • Genome, Plant / genetics
  • Polyploidy