
It’s DNA Methylation, Jim, but Not As We Know It!

In a great leap forward in epigenomics, an article in Nature by Lister et al has put DNA methylation back under the spotlight, as it is the first to report complete DNA methylation maps of the entire genome at single-base resolution. This colossal undertaking was carried out in two cell types: human embryonic stem cells (hESCs) and human foetal lung fibroblasts, using the MethylC-Seq technique on the Illumina Genome Analyzer II platform. This system allowed for a huge number of reads, and the scale of the work presented is breathtaking. Around 178 gigabases of sequence were generated, equal to 57 times the base content of the entire genome and so covering 86% of all bases, allowing 94% of all cytosines to be identified. In total, 45 million methylated cytosines were analysed in foetal fibroblasts (IMR90) and 62 million in hESCs (line H1) (see Figure 1). This huge number of reads gives confidence in the methylation status of each cytosine residue and permits the study of strand-specific methylation and transcription. The detailed analysis of DNA methylation alone is impressive enough, but the authors go on to link it with messenger RNA and small RNA expression, as well as the location of various histone modifications (tri-methylation of lysines 4, 27 and 36 on histone H3, mono-methylation of lysine 4 on histone H3 and acetylation of lysine 27 on histone H3) and DNA-binding factors (TAF1, SOX2, NANOG, p300 and OCT4). Such integrated maps generate substantial amounts of data and allow detailed correlations to be drawn between the various factors studied.

The striking initial finding was the prevalence of non-CpG cytosine methylation (in the form methyl-CHG or methyl-CHH, where H is adenine, cytosine or thymine) in hESCs, where around 25% of total methylation was outwith the CpG context, whereas in the foetal fibroblasts 99.98% of methylation was in the CpG form (see Figure 1). A comparison of methylation states between two different hESC lines (H1 and H9) showed a remarkable similarity, which suggests that this mode of methylation may be a conserved feature of hESCs linked to pluripotency. Further analysis showed that non-CpG methylation was more common in gene bodies than in promoters, and was correlated with the expression status of the gene. Enrichment was also observed on the anti-sense strand, and was further correlated with increased intronic transcription. Interestingly, non-CpG methylation was significantly enriched at genes involved in RNA processing, splicing and metabolism, and depleted at the binding sites of DNA-binding factors and at enhancer elements.
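
To make the CG / CHG / CHH distinction concrete, here is a minimal Python sketch of how a cytosine's sequence context can be read off the two bases downstream of it, and how a non-CpG fraction like the ~25% quoted above would be computed. This is an illustration only, not the pipeline used by Lister et al; the function names `cytosine_context` and `non_cpg_fraction` are hypothetical, and the sketch assumes the sequence is given 5'→3' on the strand carrying the cytosine.

```python
from typing import Iterable, Optional

H_BASES = {"A", "C", "T"}  # IUPAC code H: any base except G


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at index i of seq (read 5'->3') as CG, CHG or CHH.

    Returns None when the context cannot be determined (too close to the
    end of the sequence, or an ambiguous base such as N downstream).
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError(f"position {i} is not a cytosine")
    nxt = seq[i + 1 : i + 3]  # up to two downstream bases
    if nxt[:1] == "G":
        return "CG"
    if len(nxt) == 2 and nxt[0] in H_BASES:
        if nxt[1] == "G":
            return "CHG"
        if nxt[1] in H_BASES:
            return "CHH"
    return None


def non_cpg_fraction(contexts: Iterable[str]) -> float:
    """Fraction of methylated cytosines that lie in a CHG or CHH context."""
    contexts = list(contexts)
    if not contexts:
        raise ValueError("no methylated cytosines supplied")
    non_cpg = sum(1 for c in contexts if c in ("CHG", "CHH"))
    return non_cpg / len(contexts)
```

For a cytosine on the opposite strand, the same function would be applied to the reverse complement of the reference sequence.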



Figure 1. Cytosine Methylation in IMR90 Foetal Lung Fibroblasts and H1 hESCs (Adapted from Lister et al)


To further establish a connection between the pluripotent state and non-CpG methylation, iPSCs were generated from the fibroblasts, which had previously shown very little non-CpG methylation; accordingly, non-CpG methylation was re-established at the levels observed in hESCs. The next obvious task was to understand the mechanisms behind this modification pattern in hESCs. Thorough analysis of DNA methylation patterns found a consensus sequence for the DNMT3 DNA methyltransferases, while the periodicity of this sequence in the genome is consistent with the spacing between the active sites in the DNMT3A-DNMT3L heterotetramer complex, which mediates de novo DNA methylation. Gene expression analysis supported these findings, showing over-representation of DNMT3A in hESCs compared with fibroblasts.

This paper describes a monumental step forward in epigenetics in its scale, analytical techniques and findings and, importantly, prompts a paradigm shift in the way we think about DNA methylation.

In another recent paper in Nature Genetics, Doi et al (2) also report that DNA methylation outwith CpG islands is potentially an important regulatory mechanism. In this article, DNA methylation was studied by comparing regions of methylation which differ between fibroblasts, hESCs and iPSCs. Differentially methylated regions (DMRs) were identified between iPSCs and their fibroblasts of origin; these regions exhibited comparatively low densities of CpG dinucleotides but lay close to CpG islands (and are referred to as CpG “shores”). They were termed Reprogramming-Differentially Methylated Regions (R-DMRs) and were often located near developmental and regulatory genes. A significant proportion of R-DMRs exhibited hypomethylation in iPSCs and overlapped with bivalent domains (regions of tri-methylation of lysines 4 and 27 of histone H3) and with SOX2, NANOG and OCT4 binding sites. This suggests that sites of demethylation during reprogramming of fibroblasts to iPSCs are tightly linked to genes involved in pluripotency.
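
The idea of a CpG “shore” can be sketched as a simple interval test: a region is a shore if it does not overlap a CpG island but lies within some flanking window of one. The Python below is a rough illustration only, not the method of Doi et al. The function name `classify_region` is hypothetical, coordinates are assumed to be half-open (start, end) pairs on a single chromosome, and the default window of 2,000 bp is an assumed parameter chosen for the example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) on one chromosome


def classify_region(
    region: Interval,
    islands: List[Interval],
    shore_width: int = 2000,  # assumed flank size, for illustration only
) -> str:
    """Label a region as 'island', 'shore' or 'other' relative to CpG islands."""
    start, end = region
    # A region overlapping any island is counted as island.
    for i_start, i_end in islands:
        if start < i_end and i_start < end:
            return "island"
    # Otherwise, a region overlapping an island's flanking window is a shore.
    for i_start, i_end in islands:
        if start < i_end + shore_width and i_start - shore_width < end:
            return "shore"
    return "other"
```

A DMR list could then be passed through `classify_region` one region at a time to count how many fall in shores rather than in the islands themselves.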

A further part of this study looked for DMRs between iPSCs and hESCs to determine the similarity of the epigenome in these two apparently “similar” cell types. DMRs were found (both under- and over-methylated) between iPSCs and hESCs, with 50% of DMRs lying close to genes of interest, suggesting that iPSCs could occupy a distinct and possibly aberrant epigenetic state. However, these regions were relatively small in number (71 DMRs), and a comparison across different hESC lines and across different iPSC lines may be required to uncover a more representative set of the DMRs that “normally” exist, or indeed to uncover regions which “normally” differ. In the future, extended studies of this kind may allow certain regions to be used as epigenetic “Quality Controls” for validating the likeness of iPSCs to hESCs.

  1. Nature - Direct conversion of fibroblasts to functional neurons by defined factors (2009)
  2. Nature Genetics - Differential methylation of tissue- and cancer-specific CpG island shores distinguishes human induced pluripotent stem cells, embryonic stem cells and fibroblasts (2009)