Articles Related To Genetics 04/18
Apr 30, 2018

Biomol - Biotechnology Cotton Genome

List provided by Ahmed Abdelmoghny

Recent insights into cotton functional genomics: progress and future perspectives
Javaria Ashraf et al.,
Plant Biotechnology Journal (2018) 16, pp. 699–713

Functional genomics has transformed from a futuristic concept to a well-established scientific discipline during the last decade. Cotton functional genomics promises to enhance the understanding of fundamental plant biology to systematically exploit genetic resources for the improvement of cotton fibre quality and yield, as well as utilization of genetic information for germplasm improvement. However, determining cotton gene functions is a much more challenging task, which has not progressed at a rapid pace. This article presents a comprehensive overview of the recent tools and resources available with the major advances in cotton functional genomics to develop elite cotton genotypes. This effort ultimately helps to filter a subset of genes that can be used to assemble a final list of candidate genes that could be employed in future novel cotton breeding programmes. We argue that the next stage of cotton functional genomics requires refinement of the draft genomes, re-sequencing of broad diversity panels together with the development of high-throughput functional genomics tools, and the integration of multidisciplinary approaches in upcoming cotton improvement programmes.


Comparative transcriptome analysis of cotton fiber development of Upland cotton (Gossypium hirsutum) and Chromosome Segment Substitution Lines from G. hirsutum × G. barbadense
Li PT et al.,
BMC Genomics. (2017); 18: 705.

How to develop new cotton varieties possessing the high yield traits of Upland cotton and the superior fiber quality traits of Sea Island cotton remains a key task for cotton breeders and researchers. While multiple attempts have brought little significant progress, the development of Chromosome Segment Substitution Lines (CSSLs) from Gossypium barbadense in a G. hirsutum background has provided ideal materials for these breeding purposes in Upland cotton improvement. Based on their excellent fiber performance and the relatively clear chromosome substitution segment information identified by Simple Sequence Repeat (SSR) markers, two CSSLs, MBI9915 and MBI9749, together with the recurrent parent CCRI36, were chosen for transcriptome sequencing during the developmental stages of fiber elongation and Secondary Cell Wall (SCW) synthesis (from 10 DPA and 28 DPA), with the aim of revealing the mechanism of fiber development and the potential contribution of chromosome substitution segments from Sea Island cotton to fiber development of Upland cotton.

Genome-wide identification of the TIFY gene family in three cultivated Gossypium species and the expression of JAZ genes
Quan Sun et al.,
Scientific Reports (2017), 7, 42418

TIFY proteins are plant-specific proteins containing TIFY, JAZ, PPD and ZML subfamilies. A total of 50, 54 and 28 members of the TIFY gene family were identified in three cultivated cotton species (Gossypium hirsutum, Gossypium barbadense and Gossypium arboreum), respectively. The results of phylogenetic analysis showed that these TIFY genes were divided into eight clusters. The different clusters of gene family members often have similar gene structures, including the number of exons. The results of quantitative reverse transcription polymerase chain reaction (qRT-PCR) showed that different JAZ genes displayed distinct expression patterns in the leaves of upland cotton under treatment with gibberellin (GA), methyl jasmonate (MeJA), jasmonic acid (JA) and abscisic acid (ABA). Different groups of JAZ genes exhibited different expression patterns in cotton leaves infected with Verticillium dahliae. The results of the comparative analysis of TIFY genes in the three cultivated species will be useful for understanding the involvement of these genes in development and stress resistance in cotton.

Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution
Fuguang Li, et al.
Nature Biotechnology, (2015), 33, (5): 524-530.

Gossypium hirsutum has proven difficult to sequence owing to its complex allotetraploid (AtDt) genome. Here we produce a draft genome using 181-fold paired-end sequences assisted by fivefold BAC-to-BAC sequences and a high-resolution genetic map. In our assembly 88.5% of the 2,173-Mb scaffolds, which cover 89.6% to 96.7% of the AtDt genome, are anchored and oriented to 26 pseudochromosomes. Comparison of this G. hirsutum AtDt genome with the already sequenced diploid Gossypium arboreum (AA) and Gossypium raimondii (DD) genomes revealed conserved gene order. Repeated sequences account for 67.2% of the AtDt genome, and transposable elements (TEs) originating from Dt seem more active than from At. Reduction in the AtDt genome size occurred after allopolyploidization. The A or At genome may have undergone positive selection for fiber traits. Concerted evolution of different regulatory mechanisms for Cellulose synthase (CesA) and 1-Aminocyclopropane-1-carboxylic acid oxidase1 and 3 (ACO1,3) may be important for enhanced fiber production in G. hirsutum.

Gossypium barbadense genome sequence provides insight into the evolution of extra-long staple fiber and specialized metabolites
Xia Liu et al.
Scientific Reports (2015) volume 5, Article number: 14139

Of the two cultivated species of allopolyploid cotton, Gossypium barbadense produces extra-long fibers for the production of superior textiles. We sequenced its genome (AD)₂ and performed a comparative analysis. We identified three bursts of retrotransposons from 20 million years ago (Mya) and a genome-wide uneven pseudogenization peak at 11–20 Mya, which likely contributed to genomic divergences. Among the 2,483 genes preferentially expressed in fiber, a cell elongation regulator, PRE1, is strikingly At biased and fiber specific, echoing the A-genome origin of spinnable fiber. The expansion of the PRE members implies a genetic factor that underlies fiber elongation. Mature cotton fiber consists of nearly pure cellulose. G. barbadense and G. hirsutum contain 29 and 30 cellulose synthase (CesA) genes, respectively; whereas most of these genes (>25) are expressed in fiber, genes for secondary cell wall biosynthesis exhibited a delayed and higher degree of upregulation in G. barbadense compared with G. hirsutum, conferring an extended elongation stage and highly active secondary wall deposition during extra-long fiber development.

The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres
Daojun Yuan et al.,
Scientific Reports (2015) Dec 4; 5

Gossypium hirsutum accounts for most cotton fibre production, but G. barbadense is valued for its better comprehensive resistance and superior fibre properties. However, the allotetraploid genome of G. barbadense has not been comprehensively analysed. Here we present a high-quality assembly of the 2.57 gigabase genome of G. barbadense, including 80,876 protein-coding genes. The double-sized genome of the A (or At) (1.50 Gb) against D (or Dt) (853 Mb) primarily resulted from the expansion of Gypsy elements, including Peabody and Retrosat2 subclades in the Del clade, and the Athila subclade in the Athila/Tat clade. Substantial gene expansion and contraction were observed and rich homoeologous gene pairs with biased expression patterns were identified, suggesting that abundant gene sub-functionalization occurred by allopolyploidization. More specifically, the CesA gene family has adapted differential temporal expression patterns, suggesting an integrated regulatory mechanism of CesA genes from At and Dt subgenomes for the primary and secondary cellulose biosynthesis of cotton fibre in a "relay race"-like fashion. We anticipate that the G. barbadense genome sequence will advance our understanding of the mechanism of genome polyploidization and underpin genome-wide comparison research in this genus.

Genome sequence of the cultivated cotton Gossypium arboreum
Fuguang Li et al.,
Nature Genetics (2014) 46: 567-572

The complex allotetraploid nature of the cotton genome (AADD; 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled the Gossypium arboreum (AA; 2n = 26) genome, a putative contributor of the A subgenome. A total of 193.6 Gb of clean sequence covering the genome by 112.6-fold was obtained by paired-end sequencing. We further anchored and oriented 90.4% of the assembly on 13 pseudochromosomes and found that 68.5% of the genome is occupied by repetitive DNA sequences. We predicted 41,330 protein-coding genes in G. arboreum. Two whole-genome duplications were shared by G. arboreum and Gossypium raimondii before speciation. Insertions of long terminal repeats in the past 5 million years are responsible for the twofold difference in the sizes of these genomes. Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells.

The draft genome of a diploid cotton Gossypium raimondii
Kunbo Wang et al.,
Nature Genetics (2012), 44 (10): 1098-1103

We have sequenced and assembled a draft genome of G. raimondii, whose progenitor is the putative contributor of the D subgenome to the economically important fiber-producing cotton species Gossypium hirsutum and Gossypium barbadense. Over 73% of the assembled sequences were anchored on 13 G. raimondii chromosomes. The genome contains 40,976 protein-coding genes, with 92.2% of these further confirmed by transcriptome data. Evidence of the hexaploidization event shared by the eudicots as well as of a cotton-specific whole-genome duplication approximately 13–20 million years ago was observed. We identified 2,355 syntenic blocks in the G. raimondii genome, and we found that approximately 40% of the paralogous genes were present in more than 1 block, which suggests that this genome has undergone substantial chromosome rearrangement during its evolution. Cotton, and probably Theobroma cacao, are the only sequenced plant species that possess an authentic CDN1 gene family for gossypol biosynthesis, as revealed by phylogenetic analysis.
