Biology Worthy of Life
An experiment in revivifying biology
Compiled by Stephen L. Talbott (stevet@netfuture.org)
These are raw notes from my own reading of the literature of gene regulation. Please see the essential caveats in order to understand their limitations. Despite those limitations, the notes are presented here because browsing them should convince any reader — including those molecular biologists whose reading is largely confined to their own specialties — that it is the organism that makes use of its genes, not the other way around.
I periodically, if somewhat erratically, add further notes to this document, but I am far less diligent about going through and deleting outdated material, let alone cleaning up disorganized aspects of the presentation. Nevertheless, I welcome general comments and suggestions for improving things. Send them to stevet@netfuture.org.
Please do read the brief introduction, which offers a useful perspective on these notes.
For most people, the best way to digest this document would not be to read straight through it from beginning to end, but merely to page through it, reading a few notes here and there in order to get a feel for the variety of processes at work in shaping how the organism makes use of its genes.
To get a sense for the scope of the overall document, you can:
(This concern about the “difficulty” of such-and-such a molecular achievement is encountered all too often in the literature, and is strangely anthropomorphic. Since cells seem to get the job done just about right, where is the difficulty? What drives this kind of talk is apparently a naive picture of what would be the efficient way to do things, and this in turn is owing to a simplistic, machine-like view of the tasks at hand. But if you review anything like the complete contents of the notes you are now reading, it is obvious that we have hardly begun to grasp how any particular activity is interwoven with numerous others. We still have little clue about the various processes to which RNA Pol II is contributing when it “bounces off” the DNA template.)
“Through kinase-dependent mechanisms, CDK8 and/or CDK19 regulate TF function to help ‘reprogram’ gene expression patterns in response to a stimulus or developmental cues. The Mediator kinase module also functions in kinase-independent ways, through Mediator binding, which blocks Mediator-pol II interaction yet appears to promote post-initiation events, such as pol II pause release or elongation. The complexity of the pol II transcription machinery and cell signaling networks presents many opportunities for new discoveries, but also many challenges. Cell type and cell context (e.g., oxidative stress or growth factor induction) will remain important considerations in future work, as the set of active TFs will change in each case” (Luyties and Taatjes 2022, doi:10.1016/j.tibs.2022.01.002).
• Metazoan development requires the orchestration of hundreds of thousands of enhancers to establish precise spatiotemporal gene expression patterns.
• Enhancers commonly exist in a ‘suboptimal’ state with respect to their transcription factor binding affinities, and this evolutionary ‘suboptimization’ of both the sequence and binding motif arrangement is key to encoding enhancer tissue-specificity.
• Accumulating evidence suggests that enhancers regulate gene transcription by stimulating release of promoter-paused RNA polymerase II into productive elongation.
• Bidirectional transcription of enhancer DNA is now appreciated to be a general characteristic of active enhancers, and recent reports document numerous examples of how promoters can function as enhancers to stimulate long-range gene activation. Thus, the distinction between enhancers and promoters is becoming less apparent.
• Clusters of cis-regulatory elements appear to be highly
interconnected in the nucleus, and these complex regulatory ‘hubs’ are
organized into topological domains along the linear chromosome.
“The current paradigm in the field of gene regulation postulates that
regulatory information for generating gene expression is organized into
modules (enhancers), each containing the information for driving gene
expression in a single spatiotemporal context. This modular organization is
thought to facilitate the evolution of gene expression by minimizing
pleiotropic effects. Here we review recent studies that provide evidence of
quite the opposite: (i) enhancers can function in multiple developmental
contexts, implying that enhancers can be pleiotropic, (ii) transcription
factor binding sites within pleiotropic enhancers are reused in different
contexts, and (iii) pleiotropy impacts the structure and evolution of
enhancers. Altogether, this evidence suggests that enhancer pleiotropy is
pervasive in animal genomes, challenging the commonly held view of
modularity” (Sabarís et al. 2019, doi:10.1016/j.tig.2019.03.006).
“While active enhancers are characterized by open chromatin structure, not
all open enhancers are active. The binary distinction into open/closed
regions and active/inactive enhancers is insufficient to describe this
complex relationship. Instead, considering quantitative differences in the
accessibility signal allows for discriminating between different regulatory
states” (table of contents blurb for Bozek and Gompel 2020,
doi:10.1002/bies.201900188).
“Shadow enhancers are seemingly redundant transcriptional cis-regulatory
elements that regulate the same gene and drive overlapping expression
patterns. Recent studies have shown that shadow enhancers are remarkably
abundant and control most developmental gene expression in both
invertebrates and vertebrates, including mammals. Shadow enhancers might
provide an important mechanism for buffering gene expression against
mutations in non-coding regulatory regions of genes implicated in human
disease. Technological advances in genome editing and live imaging have shed
light on how shadow enhancers establish precise gene expression patterns and
confer phenotypic robustness. Shadow enhancers can interact in complex ways
and may also help to drive the formation of transcriptional hubs within the
nucleus. Despite their apparent redundancy, the prevalence and evolutionary
conservation of shadow enhancers underscore their key role in emerging
metazoan gene regulatory networks”
(Kvon, Waymack, Gad and Wunderlich 2021, doi:10.1038/s41576-020-00311-x).
“Silencers are regulatory DNA elements that reduce transcription from their
target promoters; they are the repressive counterparts of enhancers.
Although discovered decades ago, and despite evidence of their importance in
development and disease, silencers have been much less studied than
enhancers. Recently, however, a series of papers have reported systematic
studies of silencers in various model systems. Silencers are often
bifunctional regulatory elements that can also act as enhancers, depending
on cellular context, and are enriched for expression quantitative trait loci
(eQTLs) and disease-associated variants. There is not yet evidence of a
‘silencer chromatin signature’, in the distribution of histone modifications
or associated proteins, that is common to all silencers; instead, silencers
may fall into various subclasses, acting by distinct (and possibly
overlapping) mechanisms”
(Segert, Gisselbrecht and Bulyk 2021, doi:10.1016/j.tig.2021.02.002).
“Enhancers are central to control ... tissue-specific gene expression
pattern ... We find that enhancers showing tissue-specific activity are
highly enriched in intronic regions and regulate the expression of genes
involved in tissue-specific functions, whereas housekeeping genes are more
often controlled by intergenic enhancers, common to many tissues. Notably,
an intergenic-to-intronic active enhancers continuum is observed in the
transition from developmental to adult stages: the most differentiated
tissues present higher rates of intronic enhancers, whereas the lowest rates
are observed in embryonic stem cells. Altogether, our results suggest that
the genomic location of active enhancers is key for the tissue-specific
control of gene expression”
(Borsari, Villegas-Mirón, Pérez-Lluch et al. 2021, doi:10.1101/gr.270371.120).
“Dual-function regulatory elements (REs), acting as enhancers in some
cellular contexts and as silencers in others ... We herein investigated this
class of REs in the human genome and profiled their activity across multiple
cell types. Focusing on enhancer–silencer transitions specific to the
development of T cells, we built an accurate deep learning classifier of REs
and identified about 12,000 silencers active in primary peripheral blood T
cells that act as enhancers in embryonic stem cells. Compared with regular
silencers, these dual-function REs are evolving under stronger purifying
selection and are enriched for mutations associated with disease phenotypes
and altered gene expression. In addition, they are enriched in the loci of
transcriptional regulators, such as transcription factors (TFs) and
chromatin remodeling genes. Dual-function REs consist of two intertwined but
largely distinct sets of binding sites bound by either activating or
repressing TFs, depending on the type of RE function in a given cell line.
This indicates the recruitment of different TFs for different regulatory
modes and a complex DNA sequence composition of these REs with dual
activating and repressive encoding. With an estimated >6% of cell
type–specific human silencers acting as dual-function REs, this overlooked
class of REs requires a specific investigation on how their inherent
functional plasticity might be a contributing factor to human diseases”
(Huang and Ovcharenko 2022, doi:10.1101/gr.275992.121).
“We identify simple rules for enhancer-promoter compatibility: most
enhancers activated all promoters by similar amounts, and intrinsic enhancer
and promoter activities combine multiplicatively to determine RNA output. In
addition, two classes of enhancers and promoters showed subtle preferential
effects. Promoters of housekeeping genes contained built-in activating
motifs for factors such as GABPA and YY1, which decreased the responsiveness
of promoters to distal enhancers. Promoters of variably expressed genes
lacked these motifs and showed stronger responsiveness to enhancers.
Together, this systematic assessment of enhancer-promoter compatibility
suggests a multiplicative model tuned by enhancer and promoter class to
control gene transcription in the human genome”
(Bergman, Jones, Liu et al. 2022, doi:10.1038/s41586-022-04877-w).
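As a rough illustration of the multiplicative rule quoted above, here is a minimal Python sketch. The function name, the enhancer and promoter labels, and all activity values are hypothetical (arbitrary units) and are not taken from Bergman et al. The sketch ignores the class-specific tuning the authors describe and shows only what "combine multiplicatively" implies: swapping one enhancer for another changes output by the same fold factor at every promoter.

```python
import math


def rna_output(enhancer_activity: float, promoter_activity: float) -> float:
    """Purely multiplicative model: output is the product of the two intrinsic activities."""
    return enhancer_activity * promoter_activity


# Hypothetical intrinsic activities in arbitrary units (illustration only).
enhancers = {"E_weak": 0.5, "E_strong": 4.0}
promoters = {"P_low": 1.0, "P_high": 10.0}

# Predicted output for every enhancer-promoter pairing.
outputs = {
    (e_name, p_name): rna_output(e_act, p_act)
    for e_name, e_act in enhancers.items()
    for p_name, p_act in promoters.items()
}

# Fold change from swapping the weak enhancer for the strong one, per promoter.
fold_low = outputs[("E_strong", "P_low")] / outputs[("E_weak", "P_low")]
fold_high = outputs[("E_strong", "P_high")] / outputs[("E_weak", "P_high")]

# In log space the same model is additive: log(output) = log(E) + log(P).
log_sum = math.log(enhancers["E_strong"]) + math.log(promoters["P_high"])

print(fold_low, fold_high)
```

A promoter class that responds less to distal enhancers, as the quoted passage describes for housekeeping promoters, would show up as a departure from this equal-fold-change pattern.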
“Paralogues (divergent duplicated genes) are often involved in the same
developmental and cellular processes. In Drosophila, these genes are
frequently separated by large genomic distances. Levo et al. used
quantitative single-cell live imaging to analyse the transcriptional
dynamics of such genes, to determine whether they are co-regulated. The team
identified ‘topological operons’, identifying co-regulation by shared
enhancers and co-transcriptional initiation over distances of nearly 250 kb.
The coordinated transcriptional dynamics arise from associations with
discrete promoter-proximal tethering elements that enable contacts between
these genes in 3D throughout the fly genome”
(Koch 2022, doi:10.1038/s41576-022-00502-8).
“We find that the ultralong distance enhancer network has a nested
multilayer architecture that confers functional robustness of gene
expression. Experimental characterization reveals that enhancer epistasis is
maintained by three-dimensional chromosomal interactions and BRD4
condensation”
(Lin, Liu, Liu et al. 2022, doi:10.1126/science.abk3512).
“Housekeeping genes are considered to be regulated by common enhancers
across different tissues. Here we report that most of the commonly expressed
mouse or human genes across different cell types, including more than half
of the previously identified housekeeping genes, are associated with cell
type–specific enhancers. Furthermore, the binding of most transcription
factors (TFs) is cell type–specific. We reason that these cell type
specificities are causally related to the collective TF recruitment at
regulatory sites, as TFs tend to bind to regions associated with many other
TFs and each cell type has a unique repertoire of expressed TFs. Based on
binding profiles of hundreds of TFs from HepG2, K562, and GM12878 cells, we
show that 80% of all TF peaks overlapping H3K27ac signals are in the top
20,000–23,000 most TF-enriched H3K27ac peak regions, and approximately
12,000–15,000 of these peaks are enhancers (nonpromoters). Those enhancers
are mainly cell type–specific and include those linked to the majority of
commonly expressed genes. Moreover, we show that the top 15,000 most
TF-enriched regulatory sites in HepG2 cells, associated with about 200 TFs,
can be predicted largely from the binding profile of as few as 30 TFs.
Through motif analysis, we show that major enhancers harbor diverse and
clustered motifs from a combination of available TFs uniquely present in
each cell type. We propose a mechanism that explains how the highly focused
TF binding at regulatory sites results in cell type specificity of enhancers
for housekeeping and commonly expressed genes”
(Zhu and Landsman 2023, doi:10.1101/gr.278130.123).
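To make concrete what a statement such as "80% of all TF peaks are in the top N most TF-enriched regions" measures, here is a small Python sketch. The function name and the toy peak counts are hypothetical and have nothing to do with the actual HepG2, K562, or GM12878 data. It only shows the ranking-and-accumulating arithmetic behind that kind of claim.

```python
def regions_needed(peaks_per_region, target_fraction):
    """Return how many of the most peak-rich regions are needed
    to cover at least target_fraction of all peaks."""
    ranked = sorted(peaks_per_region, reverse=True)  # most TF-enriched first
    total = sum(ranked)
    covered = 0
    for n, count in enumerate(ranked, start=1):
        covered += count
        if covered >= target_fraction * total:
            return n
    return len(ranked)


# Hypothetical TF-peak counts for seven regions (illustration only).
toy_counts = [3, 50, 1, 10, 30, 5, 1]

top_n = regions_needed(toy_counts, 0.8)
print(top_n)
```

In this toy case a small number of regions carries most of the peaks, which is the kind of concentration the quoted abstract reports on a far larger scale.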
In illustrating how to track down functional information relating to the risk variants, the researchers report that “the risk allele of the rs1537373 variant showed increased interaction with the CDKN2A promoter and the enhancer in the long noncoding RNA (lncRNA) ANRIL. This is a terrific example of what the field is up against, as not only will disease-associated variants synergistically act on multiple genes, but there may also be more complex gene-regulatory mechanisms involved, like ones affecting the function of noncoding RNAs”.
A relevant side note: “Could the same information be recovered from linear 1D data? ... the findings emphasize that, even in highly related cell types, a proportion of enhancer-interacting signals will only be captured in 3D”.
(Trynka 2017, doi:10.1038/ng.3982).
Bear in mind that there is actually no evidence that the chromatin fiber is not “well-organized”; everything suggests it is wonderfully fine-tuned for the infinitely nuanced expression of thousands of genes. It’s just that it isn’t “military order”; rather, it’s more like an intricately choreographed dance.
“Acetylation and butyrylation differ only in chain length, with two and four carbon atoms, respectively. Both PTMs were initially identified in histones, where they substantially overlap. Lysine acetylation and chemically related modifications can influence gene expression simply by neutralizing the positive lysine charge, which loosens the DNA interaction with histones, making it more accessible to transcriptional machinery. In that respect, acetylation and butyrylation should be functionally equivalent, as they are both uncharged. However, histones modified with acetyl or related groups can also be recognized by chromatin-remodeling proteins through specialized domains, such as the bromodomain. In bromodomains, the acyl group is accommodated within a hydrophobic binding pocket and kept in place through interactions between the carbonyl group and an asparagine ‘anchor’. Additionally, a ‘gatekeeper’ residue within the binding pocket restricts the size of the acyl group that can fit in the pocket. For some bromodomains a less bulky gatekeeper allows binding of butyryl chains, while others bind specifically acetyl groups. This allows cells to distinguish these two similar PTMs and to respond differently. For instance, during spermatogenesis, histones become hyper-acetylated and then replaced by transition proteins. The bromodomain-containing protein Brdt is indispensable for this process. Competing butyrylation might control the timing of histone replacement. In fact, Brdt binds less efficiently to butyrylated histones and, as a consequence, histones with this modification are removed later as compared to the acetylated ones.
Crotonylation differs slightly from acetylation and butyrylation, in that a
double bond confers a uniquely planar orientation. As for butyrylation,
substantial overlap exists between acetylation and crotonylation sites on
histones. Only one bromodomain that binds butyrylated peptides also
recognizes crotonylation. This is remarkable considering that butyrylation
and crotonylation differ only by the double bond, and even antibodies cannot
reliably discriminate between the two. YEATS domains have been identified as
specific crotonyllysine ‘readers.’ The planar orientation of the crotonyl
chain allows it to slide into the YEATS binding site and to engage in a
π-π-π stack with two aromatic sidechains so that crotonylated peptides are
bound more efficiently than acetylated ones. Hence, cells have the
necessary equipment to sense specifically crotonylation. In the yeast
metabolic cycle, the phase of high energy availability coincides with a peak
in histone acetylation and expression of pro-growth genes, while a peak of
histone crotonylation is observed as cells enter the more quiescent phase.
The switch from histone acetylation to crotonylation is crucial for turning
off pro-growth genes and this involves crotonyllysine sensing by the
YEATS-domain-containing protein Taf14” (Figlia, Willnow and Teleman 2020,
doi:10.1016/j.devcel.2020.06.036).
“Histone post-translational modifications (PTMs) have emerged as exciting
mechanisms of biological regulation, impacting pathways related to cancer,
immunity, brain function, and more. Over the past decade alone, several
histone PTMs have been discovered, including acylation, lipidation,
monoaminylation, and glycation, many of which appear to have crucial roles
in nucleosome stability and transcriptional regulation”. “New studies
reveal a class of nonenzymatic histone PTMs derived from covalent binding of
highly reactive species, refuting the common notion that histone PTMs
require writers, readers, and erasers for biological significance”
(Chan and Maze 2020, doi:10.1016/j.tibs.2020.05.009).
“Pioneer transcription factors have the ability to access DNA in compacted
chromatin. Multiple transcription factors can bind together to a regulatory
element in a cooperative way, and cooperation between the pioneer
transcription factors OCT4 (also known as POU5F1) and SOX2 is important for
pluripotency and reprogramming. However, the molecular mechanisms by which
pioneer transcription factors function and cooperate on chromatin remain
unclear. Here we present cryo-electron microscopy structures of human OCT4
bound to a nucleosome containing human LIN28B or nMATN1 DNA sequences, both
of which bear multiple binding sites for OCT4. Our structural and
biochemistry data reveal that binding of OCT4 induces changes to the
nucleosome structure, repositions the nucleosomal DNA and facilitates
cooperative binding of additional OCT4 and of SOX2 to their internal binding
sites. The flexible activation domain of OCT4 contacts the N-terminal tail
of histone H4, altering its conformation and thus promoting chromatin
decompaction. Moreover, the DNA-binding domain of OCT4 engages with the
N-terminal tail of histone H3, and post-translational modifications at H3K27
modulate DNA positioning and affect transcription factor cooperativity.
Thus, our findings suggest that the epigenetic landscape could regulate OCT4
activity to ensure proper cell programming”
(Sinha, Bilokapic, Du et al. 2023, doi:10.1038/s41586-023-06112-6).
“The precise effect of chromatin modifications is influenced by multiple
contextual factors, including the underlying DNA sequence, transcription
factor occupancy and genomic positioning”
(Anonymous 2024, doi:10.1038/s41588-024-01705-x).
“Gene transcription is intimately linked to chromatin state and histone
modifications. However, the enzymes mediating these post-translational
modifications have many additional, nonhistone substrates, making it
difficult to ascribe the most relevant modification”
(Mannervik 2024, doi:10.1101/gad.351969.124). In other words, gene
regulation is so thoroughly wrapped up with everything else going on in the
cell that it is hard to disentangle gene regulation from “everything”.
“Regulated installation of large histone modifications is associated with DNA-dependent processes including transcription and replication. Krajewski suggests that this added bulk may trigger spontaneous, transient, and reversible increases in histone dynamics allowing DNA translocating enzymes to traverse nucleosomes more easily. The potential to distort native structure of canonical nucleosomes may expose intermediate nucleosome structures that can be specifically recognized by nucleosome-interacting proteins. Minor nucleosome instabilities resulting from smaller histone modifications may accelerate deposition of bulky histone modifications through allosteric effects. A tunable range of nucleosome dynamics that crescendos with the addition of bulky modifications may be written within the histone code” (Orlandi and McKnight 2019, doi:10.1002/bies.201900217).
Histone crotonylation and some of its regulating factors.
The schematic model shows the principal lysine crotonylation sites on
histones H3 and H4 and the reported writers (crotonyltransferase), erasers
(decrotonylase), readers, and other regulators for each lysine
crotonylation.
Credit: Li, Kun and Ziqiang Wang (2021). “Histone Crotonylation-Centric
Gene Regulation”, Epigenetics and Chromatin vol. 14, no. 10 (February 6).
CC BY 4.0
Based on a model: “The predicted effect of charge-altering PTMs on DNA accessibility can vary dramatically, from virtually none to a strong, region-dependent increase in accessibility of the nucleosomal DNA ... Proximity to the DNA is suggestive of the strength of the PTM effect, but there are many exceptions. For the vast majority of charge-altering PTMs, the predicted increase in the DNA accessibility should be large enough to result in a measurable modulation of transcription. However, a few possible PTMs, such as acetylation of H4K77, counterintuitively decrease the DNA accessibility, suggestive of the repressed chromatin ... For the majority of charge-altering PTMs, the effect on DNA accessibility is simply additive (noncooperative), but there are exceptions, e.g., simultaneous acetylation of H4K79 and H3K122, where the combined effect is amplified” (Fenley, Anandakrishnan, Kidane and Onufriev 2018, doi:10.1186/s13072-018-0181-5).
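The distinction drawn above between additive (noncooperative) and amplified combinations can be sketched in a few lines of Python. All numbers below are hypothetical, in arbitrary units, and are not values from Fenley et al. The dictionary and function names are likewise made up for illustration. Only the qualitative pattern follows the quoted passage: most pairs simply add, H4K77 acetylation lowers accessibility, and the H4K79 plus H3K122 pair gets an extra cooperative term.

```python
from itertools import combinations

# Hypothetical single-PTM effects on DNA accessibility (arbitrary units).
single_effect = {
    "H3K122ac": 1.0,
    "H4K79ac": 0.8,
    "H4K77ac": -0.3,  # negative: accessibility decreases
}

# Hypothetical extra term for a pair treated as cooperative.
pair_bonus = {frozenset({"H4K79ac", "H3K122ac"}): 0.9}


def combined_effect(ptms):
    """Sum the single-PTM effects, then add any pairwise cooperative terms."""
    total = sum(single_effect[p] for p in ptms)
    for a, b in combinations(ptms, 2):
        total += pair_bonus.get(frozenset({a, b}), 0.0)
    return total


additive_pair = combined_effect(["H4K77ac", "H3K122ac"])     # plain sum
cooperative_pair = combined_effect(["H4K79ac", "H3K122ac"])  # sum plus bonus
print(additive_pair, cooperative_pair)
```

The pairwise bonus term is just one simple way to represent an amplified effect. The model in the paper is a physical calculation and is not claimed to have this form.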
“Repressive lateral surface modifications can also interplay with histone tail modifications. For example, H3K64me3 co-localises with H3K9me3 on many genomic regions and the deletion of Suv39h1/2, the enzymes that catalyse H3K9me3, also reduces H3K64me3 levels. H3K64me3 relies on H3K9me3 for its deposition. However, some repetitive elements maintain their H3K64me3 status in Suv39h1/2–/– cells, indicating that H3K64me3 is not entirely dependent on H3K9me3 for its maintenance” (Lawrence, Daujat and Schneider 2016, doi:10.1016/j.tig.2015.10.007).
Histone variants have so many diverse effects in different contexts that we look here at only a random sampling:
The complexity of the role of H2A.Z in gene regulation. “The androgen receptor and estrogen receptor systems represent good examples to explain the function of H2A.Z in gene transcription. In the androgen receptor system, the PSA gene can be considered as the prototype of this pathway: In the absence of androgen (OFF [repressed] state), H2A.Z is loaded by the SRCAP and/or p400/Tip60 complexes. In this repressed configuration, H2A.Z is monoubiquitinated at both enhancers and promoters potentially by RING1B. Upon androgen stimulation (ON), H2A.Z is deubiquitinated by USP10 and its occupancy decreases. Of note, H2A.Z acetylation correlates with androgen receptor induction and similarly, the occupancy of the p400/Tip60 complex increases upon androgen receptor induction. The recruitment of the p400/Tip60 complex is mediated by its MRG15 subunit which recognizes H3K4 methylation states while SRCAP has been shown to interact with androgen receptor. In the case of the estrogen signaling cascade, we focus on the case of the TFF1 locus: In the OFF state, forkhead box protein A1 (FoxA1) binds to a distal enhancer (FoxA1-binding site) of the TFF1 locus where it recruits the p400/Tip60 complex supporting H2A.Z loading. Lack of H2A.Z at the TFF1 promoter, leads to a poorly defined nucleosome occupancy in the repressed/poised state (OFF). Upon activation of the pathway, the p400/Tip60 complex is recruited at the TFF1 promoter by estrogen receptor α which binds to its cognate sequences. At the TFF1 promoter, the p400/Tip60 complex loads H2A.Z leading to a better-defined nucleosome positioning. At the same time, H2A.Z occupancy decreases at the FoxA1-bound distal enhancer.
“From the above, some general rules for H2A.Z in gene regulation can be postulated: At genes that are poised/repressed (OFF), repressive marks of H2A.Z are found and as consequence its loss of function leads to upregulation. At genes that are active, activating PTMs of H2A.Z, such as H2A.Z acetylation, are found and as consequence H2A.Z loss of function leads to downregulation.
“In a repressed (OFF) or poised state, the H2A.Z deposition machinery is recruited by transcription factors and/or histone modifications to chromatin. This recruitment can be transient but still allows an exchange of H2A with H2A.Z. In the OFF state, H2A.Z is deacetylated by the deacetylation machinery and ubiquitinated on its C-terminus by RING1B. Upon gene activation (ON), additional TFs and/or histone modifications lead to the recruitment of the loading/acetylation/deubiquitination machinery. This triggers H2A.Z acetylation and deubiquitination, finally leading to transcriptional activation” (Giaimo, Ferrante, Herchenröther, et al. 2019, doi:10.1186/s13072-019-0274-9)
All this barely hints at the complexity of histone regulation overall.
“Similarly to other vertebrates, carp has two macroH2A genes. MacroH2A.1 is enriched at the ribosomal cistron and at the promoter of the L41 ribosomal protein gene during winter. Enrichment of macroH2A.1 at these sites colocalizes with enrichment for H3K27 methylation, a mark of repressed chromatin. Consistent with this, macroH2A.1 represses rDNA transcription in human cells” (Talbert and Henikoff 2014, doi:10.1016/j.tcb.2014.07.006).
“The eukaryotic cell has devised ways to alter the acidic patch to regulate chromatin structure and function. The replacement of both copies of canonical H2A with H2A.Z promotes chromatin compaction, and this ability is dependent upon H2A.Z creating an acidic patch that is slightly more acidic than H2A. By contrast, incorporation of H2A.Bbd or H2A.Lap1 into nucleosome arrays completely inhibits array folding, which is due to H2A.Bbd/H2A.Lap1 generating an acidic patch that is less acidic than H2A” (Soboleva, Nekrasov, Ryan and Tremethick 2014).
Class | Factor name | Function | Related factors and notes |
---|---|---|---|
GAGA factor | GAF | Generates nucleosome-free region and promoter structure for pausing | NURF |
General Transcription Factors | TFIID | Generates promoter structure for pausing | |
General Transcription Factors | TFIIF | Increases elongation rate | Near promoters |
General Transcription Factors | TFIIS | Rescues backtracked Pol II | Pol III |
Pausing factors | NELF | Stabilizes Pol II pausing | |
Pausing factors | DSIF | Stabilizes Pol II pausing and facilitates elongation | |
Positive elongation factor | P-TEFb | Phosphorylates NELF, DSIF, and Pol II CTD for pause release | |
Processivity factors | Elongin | Increases elongation rate | |
Processivity factors | ELL | Increases elongation rate | AFF4 |
Processivity factors | SEC | Contains P-TEFb and ELL | Mediator, PAF |
Activator | c-Myc | Directly recruits P-TEFb | |
Activator | NF-κB | Directly recruits P-TEFb | |
Coactivator | BRD4 | Recruits P-TEFb | |
Coactivator | Mediator | Recruits P-TEFb via SEC | |
Capping machinery | CE | Facilitates P-TEFb recruitment, counters NELF/DSIF | |
Capping machinery | RNMT | Methylates RNA 5' end to complete capping | Myc |
Premature termination factors | DCP2 | Decaps nascent RNA for XRN2 digestion | Dcp1a/Edc3 |
Premature termination factors | Microprocessor | Cleaves hairpin structure for XRN2 digestion | Tat, Senx |
Premature termination factors | XRN2 | Torpedoes Pol II with RNA 5'-3' exonucleation | |
Premature termination factors | TTF2 | Releases Pol II from DNA | |
Gdown1 | GDOWN1 | Antitermination and stabilizes paused Pol II | TFIIF, Mediator |
Histone chaperone | FACT | H2A-H2B eviction and chaperone | Tracks with Pol II |
Histone chaperone | NAP1 | H2A-H2B chaperone | RSC, CHD |
Histone chaperone | SPT6 | H3-H4 chaperone | Tracks with Pol II |
Histone chaperone | ASF1 | H3-H4 chaperone | H3K56ac |
Chromatin remodeler | RSC | SWI/SNF remodeling in gene body | H3K14ac |
Chromatin remodeler | CHD1 | Maintains gene body nucleosome organization | FACT, DSIF |
Chromatin remodeler | NURF | ISWI remodeling at promoter | GAGA factor |
Poly(ADP-ribose) polymerase | PARP | Transcription independent nucleosome loss | Tip60 |
Polymerase-associated factor complex | PAF | Loading dock for elongation factors | SEC, FACT |
Histone tail modifiers | MOF | Acetylates H4K16 and recruits Brd4 | H3S10ph, 14-3-3 |
Histone tail modifiers | TIP60 | Acetylates H2AK5 and activates PARP | |
Histone tail modifiers | Elongator | Acetylates H3 and facilitates nucleosomal elongation | Also in cytoplasm |
Histone tail modifiers | Rpd3C (Eaf3) | Deacetylates and inhibits spurious initiation in gene body | H3K36me3 |
Histone tail modifiers | SET1 | Methylates H3K4 | MLL/COMPASS |
Histone tail modifiers | SET2 | Methylates H3K36 and regulates acetylation-deacetylation cycle | Rpd3C |
Histone tail modifiers | PIM1 | Phosphorylates H3S10 and recruits 14-3-3 and MOF | |
Histone tail modifiers | RNF20/40 | Monoubiquitinates H2BK123 and facilitates nucleosomal DNA unwrapping | UbcH6, PAF |
● “TDP-43 Affects Splicing Profiles and Isoform Production of Genes Involved in the Apoptotic and Mitotic Cellular Pathways” (De Conti, Akinyi, Mendoza-Maldonado et al. 2015, doi:10.1093/nar/gkv814).
● “TRAP150 Interacts with the RNA-Binding Domain of PSF and Antagonizes Splicing of Numerous PSF-Target Genes in T Cells” (Yarosh, Tapescu, Thompson et al. 2015, doi:10.1093/nar/gkv816).
● “The DNA Replication Licensing Factor Miniature Chromosome Maintenance 7 Is Essential for RNA Splicing of Epidermal Growth Factor Receptor, c-Met, and Platelet-derived Growth Factor Receptor” (Chen, Yu, Michalopoulos et al. 2015, doi:10.1074/jbc.M114.622761).
● “The DNA Replication Licensing Factor Miniature Chromosome Maintenance 7 is Essential for RNA Splicing of Epidermal Growth Factor Receptor, c-met and Platelet Derived Growth Factor Receptor” (Luo, Chen and Yu 2015, doi:10.1096/fj.1530-6860).
● “Meta-Analysis of Multiple Sclerosis Microarray Data Reveals Dysregulation in RNA Splicing Regulatory Genes” (Paraboschi, Cardamone, Rimoldi et al. 2015, doi:10.3390/ijms161023463).
● “The Alternative Splicing of Cytoplasmic Polyadenylation Element Binding Protein 2 Drives Anoikis Resistance and the Metastasis of Triple Negative Breast Cancer” (Johnson, Vu, Griffin et al. 2015, doi:10.1074/jbc.M115.671206).
● “Arginine Methylation and Citrullination of Splicing Factor Proline- and Glutamine-Rich (SFPQ/PSF) Regulates Its Association with mRNA” (Snijders, Hautbergue, Bloom et al. 2015, doi:10.1261/rna.045138.114).
In sum: “As direct evidence that DIs can contribute to gene regulation, we showed that inhibition of Clk kinase activity as well as DNA damage can modulate the rate of splicing for particular subsets of DIs, enabling coordinated control of specific genes” (Boutz, Bhutkar and Sharp 2015, doi:10.1101/gad.247361.114).
“Defects in RNA modifications (which are distinct from splice-site
alterations) account for more than 100 human diseases, including
childhood-onset multiorgan failures, cancers and neurologic disorders.
These conditions are now referred to as ‘RNA modopathies’. This number
is likely to represent only a small percentage of the actual number of
existing RNA modopathies”
(Alfonso, Brown, Byers et al. 2021, doi:10.1038/s41588-021-00903-1).
“Studies of the role that [mRNA] modifications play in translation paint
a complex picture. When examining direct regulation of translation, it
appears that the same modification can either promote or repress
translation of an mRNA, depending on the location of the modification or
the biological system studied. Indirectly, the same modification can
either increase mRNA stability, as in the case of hypoxia-induced
stabilization of mRNA, or decrease it, as exemplified in the widely
studied YTHDF2-mediated decay of m6A-methylated mRNA. These
contrasting effects ultimately have different consequences for
translation output. With m6A, the most studied modification,
we are now beginning to understand that the delicate balance of
methylation and demethylation is involved in complex biological processes
such as differentiation and the stress response. Alterations to that
balance contribute to different pathologies. The epitranscriptome,
consisting of various RNA modifications, is thus beginning to be
unraveled as a complex layer of information with major implications for
the regulation of translation in healthy and disease states”
(Peer, Moshitch-Moshkovitz, Rechavi and Dominissini 2019,
doi:10.1101/cshperspect.a032623).
“Until recently, the role of N1-methyladenine (m1A) on mRNAs
during acute stress response remains largely unknown. Here we show that
the methyltransferase complex TRMT6/61A, which generates the
m1A tag, is involved in transcriptome protection during heat
shock. Our bioinformatics analysis indicates that occurrence of the
m1A motif is increased in mRNAs known to be enriched in SGs
[stress granules]. Accordingly, the m1A-generating
methyltransferase TRMT6/61A accumulated in SGs and mass spectrometry
confirmed enrichment of m1A in the SG RNAs. The insertion of a
single methylation motif in the untranslated region of a reporter RNA
leads to more efficient recovery of protein synthesis from that
transcript after the return to normal temperature. Our results
demonstrate far-reaching functional consequences of a minimal RNA
modification on N1-adenine during acute proteostasis stress”
(Alriquet, Calloni, Martínez-Limón et al. 2021, doi:10.1093/jmcb/mjaa023).
“Recent studies suggest noncoding RNAs interact with genomic DNA, forming
RNA•DNA-DNA triple helices, as a mechanism to regulate transcription. One
way cells could regulate the formation of these triple helices is through
RNA modifications. With over 140 naturally occurring RNA modifications,
we hypothesize that some modifications stabilize RNA•DNA-DNA triple
helices while others destabilize them. Here, we focus on a
pyrimidine-motif triple helix composed of canonical U•A-T and C•G-C base
triples. We ... examine how 11 different RNA modifications at a single
position in an RNA•DNA-DNA triple helix affect stability: Compared to the
unmodified U•A-T base triple, some modifications have no significant
change in stability, some have ∼2.5-fold decreases in stability,
and some completely disrupt triple helix formation”
(Kunkler, Schiefelbein, O’Leary et al. 2022, doi:10.1261/rna.079244.122).
Summary: “The main achievement in the field [of RNA modifications]
is the uncovering of a new, intricate, highly sensitive, tuneable layer
of gene expression regulation by mRNA modifications. This new layer of
regulation operates by taking advantage of the unique characteristics of
mRNA — namely, that it is short-lived, highly structured, mobile between
cellular compartments and amplified through transcription. These effects
are mediated in part by ‘readers’, which are exemplified by
methyl-specific binding proteins ... Regulation of gene expression is
also tuned by an interplay between the installation and removal of the
modifications by ‘writers’ and ‘erasers’. Several major lessons have
emerged in the past decade. First, mRNA modifications are highly
prevalent with thousands of gene transcripts modified. Interestingly,
some modifications cluster in specific transcript locations; for example,
inosines are found mostly in repetitive Alu sequences, m6A
preferentially decorates the stop codon vicinity and extremely large
internal exons, and m1A clusters around the AUG start codon,
suggesting that each modification acts through a different mode of
action. Moreover, some modifications, such as m6A and
m1A exhibit high conservation between humans and mice.
“Another important achievement is the discovery that a specific modification can act through different modes of action, through various readers, in a context-dependent manner. An additional important finding is the dynamic nature of some mRNA modifications that allows for a quick response to environmental stimuli; this dynamic nature has already been demonstrated for m6A and m1A. The central role of mRNA modifications is reflected by the devastating effects of aberrant modifications on early development both in humans and mice, as well as in human cancer, inflammation and neurodegeneration, further emphasizing the importance of this regulatory layer” (Gideon Rechavi, quoted in doi:10.1038/nrg.2016.47).
• “YTHDC1 interacts with m6A-modified RNAs to regulate multiple steps of RNA metabolism in the nucleus;
• “YTHDC1 is widely associated with transcriptional activation (via enhancer RNA-mediated crossregulation with active epigenetic marks);
• “YTHDC1 transcriptional repressive action is largely associated with transposable elements and long ncRNAs;
• “The diversity in YTHDC1–m6A functions is linked to their ability to promote membraneless nuclear subcompartments, such as the nuclear speckles”;
(Widagdo, Anggono and Wong 2022, doi:10.1016/j.tig.2021.11.005).
“Post-transcriptional tRNA modifications are critical for efficient and accurate translation, and have multiple different roles. Lack of modifications often leads to different biological consequences in different organisms, and in humans is frequently associated with neurological disorders” (Guy and Phizicky 2014).
“Further analysis revealed that mRNA species with shorter median poly(A) tail lengths were, on average, much more abundant than those with longer tails ... transcripts that were translationally activated during larval development had a significantly shorter median poly(A) tail size compared with those that were translationally repressed. Importantly, the correlation between high mRNA and translation levels and short poly(A) tails was found to be conserved in other eukaryotes.
“Interestingly, almost all genes produced transcripts with very long
(>200 nt) poly(A) tails, indicating that well-expressed mRNAs undergo
controlled poly(A) tail shortening (pruning). In support of this, tails
of the majority of highly expressed and codon-optimized genes had lengths
that would accommodate one or two PABPs (∼30–60 nt), whereas
less-abundant mRNAs with poorly optimized codons had much longer poly(A)
tails and a wider distribution of lengths”
(Zlotorynski 2018, doi:10.1038/nrm.2017.120).
“The poly(A) tail of mRNA has been thought to be a pure stretch of
adenosine nucleotides with little informational content except for
length. Lim et al. identified enzymes that can decorate poly(A) tails
with non-A nucleotides. The noncanonical poly(A) polymerases, TENT4A and
TENT4B, incorporate intermittent non-A residues (G, U, or C) with a
preference for guanosine, which results in a heterogenous poly(A) tail.
Deadenylases trim poly(A) tails to initiate mRNA degradation but stall at
the non-A residues. In effect, the not-so-pure tail stabilizes mRNAs by
slowing down deadenylation”
(Lim, Kim, Lee et al. 2018, doi:10.1126/science.aam5794).
“We find that the [polyadenylation] sequence can modulate gene expression
by over five orders of magnitude”
(Slutskin, Weinberger and Segal 2019, doi:10.1101/gr.247312.118).
“Premature transcription termination [PTT] is widespread in metazoans. It
can occur close to the transcription start site or further downstream in
the gene body. PTT generates transcripts that, depending on the
circumstances, are either rapidly degraded, or are stabilised by
polyadenylation, thus contributing to transcriptome diversification.
Stable premature transcripts can have independent functions as noncoding
(nc)RNA or mRNA encoding proteins with different properties compared with
those generated by the full-length transcript. PTT can negatively
regulate expression of the full-length transcript and especially controls
genes encoding transcriptional regulators. Factors triggering PTT
include not only canonical RNA 3′ processing and termination factors, but
also other players. Many metazoan factors oppose PTT, thus limiting its
damaging potential”
(Kamieniarz-Gdula and Proudfoot 2019, doi:10.1016/j.tig.2019.05.005).
“Analogous to alternative splicing, alternative polyadenylation (APA) has
long been thought to occur independently at proximal and distal polyA
sites ... we unexpectedly identified several hundred APA genes in human
cells whose distal polyA isoforms are retained in chromatin/nuclear
matrix and whose proximal polyA isoforms are released into the cytoplasm
... [We found] evidence that the strong distal polyA sites are processed
first and the resulting transcripts are subsequently anchored in
chromatin/nuclear matrix to serve as precursors for further processing at
proximal polyA sites ... Therefore, unlike alternative splicing, APA
sites are recognized independently, and in many cases, in a sequential
manner. This provides a versatile strategy to regulate gene expression in
mammalian cells”
(Tang, Yang, Li et al. 2022, doi:10.1038/s41594-021-00709-z).
“In eukaryotes, poly(A) tails are present on almost every mRNA. Early
experiments led to the hypothesis that poly(A) tails and the cytoplasmic
polyadenylate-binding protein (PABPC) promote translation and prevent
mRNA degradation, but the details remained unclear. More recent data
suggest that the role of poly(A) tails is much more complex:
poly(A)-binding protein can stimulate poly(A) tail removal
(deadenylation) and the poly(A) tails of stable, highly translated mRNAs
at steady state are much shorter than expected. Furthermore, the rate of
translation elongation affects deadenylation. Consequently, the interplay
between poly(A) tails, PABPC, translation and mRNA decay has a major role
in gene regulation. In this Review, we discuss recent work that is
revolutionizing our understanding of the roles of poly(A) tails in the
cytoplasm. Specifically, we discuss the roles of poly(A) tails in
translation and control of mRNA stability and how poly(A) tails are
removed by exonucleases (deadenylases), including CCR4–NOT and PAN2–PAN3”
(Passmore and Coller 2022, doi:10.1038/s41580-021-00417-y).
“Alternative polyadenylation (APA) generates transcript isoforms that
differ in the position of the 3′ cleavage site, resulting in the
production of mRNA isoforms with different length 3′ UTRs ... We
identified >500 Drosophila genes that express mRNA isoforms with a long
3′ UTR in proliferating spermatogonia but a short 3′ UTR in
differentiating spermatocytes due to APA. We show that the stage-specific
choice of the 3′ end cleavage site can be regulated by the arrangement of
a canonical polyadenylation signal (PAS) near the distal cleavage site
but a variant or no recognizable PAS near the proximal cleavage site. The
emergence of transcripts with shorter 3′ UTRs in differentiating cells
correlated with changes in expression of the encoded proteins, either
from off in spermatogonia to on in spermatocytes or vice versa. Polysome
gradient fractionation revealed >250 genes where the long 3′ UTR versus
short 3′ UTR mRNA isoforms migrated differently, consistent with dramatic
stage-specific changes in translation state. Thus, the developmentally
regulated choice of an alternative site at which to make the 3′ end cut
that terminates nascent transcripts can profoundly affect the suite of
proteins expressed as cells advance through sequential steps in a
differentiation lineage”
(Berry, Olivares, Gallicchio et al. 2022, doi:10.1101/gad.349689.122).
“Alternative cleavage and polyadenylation (APA) is a widespread mechanism
to generate mRNA isoforms with alternative 3′ untranslated regions
(UTRs). The expression of alternative 3′ UTR isoforms is highly cell type
specific and is further controlled in a gene-specific manner by
environmental cues. In this Review, we discuss how the dynamic,
fine-grained regulation of APA is accomplished by several mechanisms,
including cis-regulatory elements in RNA and DNA and factors that control
transcription, pre-mRNA cleavage and post-transcriptional processes.
Furthermore, signalling pathways modulate the activity of these factors
and integrate APA into gene regulatory programmes. Dysregulation of APA
can reprogramme the outcome of signalling pathways and thus can control
cellular responses to environmental changes. In addition to the
regulation of protein abundance, APA has emerged as a major regulator of
mRNA localization and the spatial organization of protein synthesis. This
role enables the regulation of protein function through the addition of
post-translational modifications or the formation of protein–protein
interactions”
(Mitschka and Mayr 2022, doi:10.1038/s41580-022-00507-5).
Two years later: “Recent studies suggest that the protein–RNA interaction network involved in PAS [polyadenylation site] recognition is more complex than previously thought, which raises many important questions for future studies”. Caption of figure at right: “Context-dependent regulation of PAS recognition. Regulatory factors bound at different locations relative to the core PAS sequence have different effects on PAS recognition by the mRNA 3′ processing factors. Positive effects are indicated by an arrow, and negative effects are indicated by a vertical line” (Shi and Manley 2015, doi:10.1101/gad.261974.115).
• “The 3′ UTRs of many protein-coding genes harbor multiple polyadenylation signals that are differentially selected based on the physiological state of cells, resulting in alternative mRNA isoforms with differing mRNA stability.
• “m6A is the most abundant base modification in eukaryotic mRNA but many functional impacts of m6A on mRNA fate, mRNA stability in particular, have been discovered only recently.
• “Codon usage in mRNA open-reading frames (ORFs) influences gene expression, with the proportion of optimal and nonoptimal codons helping to fine-tune mRNA stability in a process that is coupled to translation” (Chen and Shyu 2016, doi:10.1016/j.tibs.2016.08.014).
As the preceding only begins to suggest, many wide-ranging factors bear on RNA degradation, and therefore on gene expression. Here we list some of them, but provide explanation only for those not described elsewhere in this document:
“The current model posits that NMD is stimulated when the TC [termination codon] occurs in a microenvironment of the mRNP that is unfavorable for translation termination ... The majority of NMD-sensitive transcripts do not contain PTCs but are ordinary mRNAs coding for seemingly full-length functional proteins ... There is ample evidence that NMD can target both normal and erroneous transcripts. Among the NMD-inducing features, the presence of the 3′-most exon–exon junction >50 nt [nucleotides] downstream from the TC is the feature with the strongest predictive value for NMD susceptibility. PTCs resulting from mutations in the ORF [open reading frame] or from aberrant or alternative splicing, as well as genes with an intron in the 3′UTR mostly belong to this class of NMD targets. In addition, long 3′UTRs (>1000 nt in mammalian cells), the presence of actively translated short upstream ORFs (uORFs), or selenocysteine codons (UGA) in cells grown in the absence of selenium are also features that can — but not always do — trigger NMD. uORF translation often inhibits translation of the main ORF, either constitutively or in response to stress. Under such circumstances, ribosomes terminate at the TC of the uORF with usually several EJCs [exon junction complexes] remaining bound further downstream on the mRNA, which creates an NMD-promoting translation termination environment. Despite all these empirically determined features, it has so far remained impossible to computationally predict NMD targets with high confidence” (Karousis and Mühlemann 2019, doi:10.1101/cshperspect.a032862).
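The NMD-inducing features named in the quotation above can be collected into a simple checklist. What follows is a minimal Python sketch and not a predictor: the authors themselves stress that NMD targets cannot yet be predicted computationally with high confidence. The `Transcript` record, its field names, the flag labels, and the choice to measure distances from the end of the termination codon are all assumptions made for illustration. Only the 50 nt and 1000 nt thresholds come from the quoted text.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the quoted passage.
EJC_RULE_NT = 50     # 3'-most exon-exon junction more than 50 nt downstream of the TC
LONG_UTR_NT = 1000   # "long" 3' UTR in mammalian cells


@dataclass
class Transcript:
    """Hypothetical record; positions are nt offsets on the spliced mRNA."""
    length: int                    # total mRNA length
    stop_codon_end: int            # position just past the termination codon (assumed convention)
    exon_junctions: List[int]      # exon-exon junction positions
    has_translated_uorf: bool = False


def nmd_features(tx: Transcript) -> List[str]:
    """Return the quoted NMD-associated features present in tx.

    A non-empty result only means a feature is present; per the quoted
    text, such features can, but do not always, trigger NMD.
    """
    flags = []
    if tx.exon_junctions:
        last_junction = max(tx.exon_junctions)  # the 3'-most junction
        if last_junction - tx.stop_codon_end > EJC_RULE_NT:
            flags.append("junction_gt_50nt_downstream_of_TC")
    if tx.length - tx.stop_codon_end > LONG_UTR_NT:
        flags.append("long_3utr")
    if tx.has_translated_uorf:
        flags.append("translated_uorf")
    return flags
```

For example, a hypothetical 2,000 nt transcript whose termination codon ends at position 1,500 and whose last junction lies at position 1,600 would be flagged for the junction feature alone, since its 3′ UTR is only 500 nt long. The selenocysteine case is left out because it depends on growth conditions rather than on transcript coordinates.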
• “Optimal codons are associated with more efficient translation and correspond to cognate tRNA species that are more abundant and that are readily accommodated by the ribosome during translation.
• “The use of non-optimal codons can influence protein production by reducing ribosome translocation rates and causing ribosome collisions that can feed back to the translation initiation site.
• “Conserved, specific patterns of optimal and non-optimal codon use help to guide efficient co-translational folding and to minimize errors in translation.
• “Codon usage affects mRNA stability, and codon-influenced elongation stalling is sensed by the DEAD-box helicase Dhh1, which mediates codon-dependent variation in mRNA stability.
• “The interdependence between variable codon usage and the composition, charge status and post-transcriptional modifications of the tRNA pool enables global control of translation, which can be used to shape protein production to favour specific cellular programmes and to maintain homeostasis in conditions of stress or changes in nutritional status” (Hanson and Coller 2018, doi:10.1038/nrm.2017.91).
In sum ... There are “complicated, multifactorial webs of regulatory events that coordinate the half-lives of cellular mRNAs, depending on the stage of organismal development, the type of tissue and the surrounding environmental conditions”. And a caveat: “Although mRNAs largely function to produce proteins, there is growing support for the idea that they can also serve as sinks for regulatory proteins and antisense ncRNAs such as miRNAs by functioning as ‘competing endogenous RNAs’ [see “Competing endogenous RNAs” above]. This indicates that the regulation of mRNA decay may cast a very broad net and affect as yet unappreciated cellular processes” (Schoenberg and Maquat 2012).
“The A-rich element that Kwon et al. discovered also highlights some emerging concepts. Firstly, although this element stabilizes mRNAs by stimulating translation, this process is relatively inefficient, and many such mRNA molecules do not engage ribosomes. For these non-translating mRNAs, the A-rich element actually leads to destabilization, which is in agreement with a previous study. Thus, the same element can both stabilize and destabilize an mRNA via mechanisms that are translation-dependent and translation-independent, respectively. Furthermore, stabilizing and destabilizing effects both depend on the poly(A)-binding protein PAPB1, demonstrating that readers of RNA regulatory elements also play opposing roles. Dualities such as this are a recurrent theme in post-transcriptional regulation and highlight how RNA fate decisions involve the integration of many signals and kinetic competition between translation and decay ...
“These results portray post-transcriptional regulation as relatively chaotic, with many sequence elements and readers, each performing diverse roles. However, cells do manage to impart some order ...
“The work of Kwon et al. and others also demonstrates that regulatory interactions occur along the entire length of an mRNA transcript, rather than being concentrated in a few regions. This raises questions of whether and how signals are interpreted differently when located in distinct regions of an mRNA or, indeed, within different mRNAs altogether” (Bühler and Tuck 2020, doi:10.1038/s41594-020-0482-9).
And so increased phosphorylation of a translation initiation factor, which tends to reduce the concentration of that factor and thus repress the translation of many genes, has the opposite effect on genes such as ATF4.
“CPE–CPEB mediated translational regulation is widespread. Computational analysis with some empirical validation indicates that as many as 20–40% of mRNAs in Xenopus and mammals, including humans, may be subject to such regulation. However, not all CPE-containing mRNAs are regulated in the same way. Some ‘early’ mRNAs are polyadenylated and translationally activated at meiotic prophase I, whereas ‘late’ mRNAs are activated at metaphase I. Still, other CPE-containing mRNAs do not undergo translational regulation. Whether and how a particular CPE-containing mRNA is regulated depends not only on the CPE and its distance from the polyadenylation signal (PAS) but also on other cis-acting elements in its 3′ UTR, leading to the concept of a ‘combinatorial code’ of regulatory motifs. Other relevant cis-acting elements include additional CPEs, AU-rich elements (AREs) and elements that bind the RNA-binding proteins Pumilio and Msi (PBEs and MBEs, respectively)”.
Among those other elements in Drosophila is Msi, containing (like CPEB) two RNA recognition motifs. Depending on context, Msi can act as a translation activator or repressor. It turns out that the nature of the interaction between CPEB and Msi affects whether an mRNA is translated. Most studies on translational regulation to date have focused on single molecules. But the suggestion now is that “cooperative interactions between CPEB1 and Msi1 are widespread”, and that interactions of this sort could come to bear on a substantial percentage of mRNAs (Lasko 2017, doi:10.1038/nsmb.3445).
The authors add that, “unlike transcriptional control, which is restricted to the nucleus, translational control mechanisms operate throughout the cell and can regulate expression of cytoplasmic proteins to ensure that they are present at the positions and times that they are required”. There is a powerful argument against genocentrism in all this.
The following is just a sampling of the extraordinarily wide-ranging repertoire of cellular miRNA activity.
“NORAD can bind to PUM2 with great affinity as a result of the presence of 15 Pumilio-binding motifs ... Given the high abundance of NORAD, and the presence of multiple binding sites on each transcript, Lee and colleagues propose that NORAD functions as a PUM2/PUM1 decoy, preventing these RNA-binding proteins from interacting with and destabilizing their targets ... If these results convincingly establish the functional relevance of the NORAD/Pumilio interaction, what remains unclear are the molecular mechanisms through which loss of this interaction leads to chromosomal instability. As noted by Lee and colleagues, several of the PUM2 targets that are downregulated upon NORAD inactivation are known to control genomic stability ...
“NORAD could exert some of its functions through additional mechanisms; and it is possible that the interaction with PUM1/2 is not exclusively inhibitory, but could rather modulate the activity of Pumilio (and NORAD). This is especially relevant because although Lee and colleagues found a significant enrichment for PUMILIO targets among the genes downregulated in NORAD–/– cells, the correlation was not absolute, with a large number of genes regulated by NORAD not being PUMILIO targets and vice versa” (Ventura 2016, doi:10.1016/j.tig.2016.04.002).
The researchers go on to report these broad classes of enhancer RNA function:
“Overall, the current study fundamentally changes the discourse around eRNA functions, by demonstrating that these RNAs can have major, locus-specific roles in enhancer activity that do not require a particular RNA-sequence context or abundance. Furthermore, by providing strong evidence that CBP interacts with eRNAs as they are being transcribed, this study highlights the value of investigating nascent RNAs for understanding enhancer activity” (Adelman and Egan 2017, doi:10.1038/543183a).
“cheRNAs show several molecular characteristics that are distinct from those of eRNAs. Whereas most eRNAs are bidirectionally transcribed from the prototypical enhancers, cheRNAs show a specific strand bias. Moreover, eRNAs are marked by the histone H3K4 monomethylation (H3K4me1) and H3 lysine 27 acetylation (H3K27ac), whereas cheRNAs are associated with H3K4me3. Finally, cheRNAs are longer than eRNAs (median length of ~2,000 as compared to ~350 nucleotides) ... A majority of these cheRNAs remain attached to chromatin through interactions with RNA polymerase II ... cheRNAs are expressed in a cell-type-specific manner, and ... these RNAs promote changes in chromatin architecture and thereby contribute to the expression of nearby genes. cheRNA profiling in three divergent cell lines, HEK293, K562, and H1hESCs, showed that proximity to cheRNAs was a better predictor of cis-gene expression than features such as chromatin modification signatures or the expression of eRNAs or other lncRNAs” (Gayen and Kalantry 2017, doi:10.1038/nsmb.3430).
[Also, see Alternative cleavage, polyadenylation, and deadenylation under Post-Transcriptional Decision-Making and Alternative coding sequences (transcription start and termination) under Decision-Making Relating to Translation.]
Article highlights: “Non-coding transcription directs loop extrusion; non-coding transcription dictates compartmentalization; non-coding transcription directs enhancer-promoter communication; non-coding transcription establishes T cell identity and blocks lymphoid malignancy” (Isoda, Moore, He et al. 2017, doi:10.1016/j.cell.2017.09.001).
— “By multiple mechanisms, Hsp70 family members sense perturbation of proteostasis and then modulate aspects of mRNA metabolism”.
— “Hsp70 family members promote nascent protein folding and when defective this leads to inhibition of translation elongation”.
— “Hsp70 family members can be titrated by unfolded proteins leading to the activation of stress responsive signal transduction systems that modulate the transcriptome”.
— “Hsp70 family members can modulate the protein composition of individual mRNPs, thereby affecting their function”.
— “Hsp70 proteins promote disassembly of stress granules and are important for recovery of translation after stress”.
In sum: “the most current analyses on Alu impacts on biology are mainly focused on fixed Alu insertions in germ lines. However, Alu retrotransposon might be active in somatic tissues that continues to affect gene expression and even causes diseases, such as cancer, after birth. Thus, it will be of interest to comprehensively scrutinize how Alu insertion reshapes our genome and transcriptome in different tissues and during the lifespan in a primate-specific manner. While the impacts of some Alu repeats on the human genome have been affirmatively revealed by recent studies, the influence of other less-characterized Alus and their specific underlying mechanisms are still awaiting to be investigated. For instance, even a single point mutation in the LINE/Alu overlapped sequence of a human lncRNA could lead to lethal infantile encephalopathy. Collectively, the widespread Alu elements largely increase the complexity of gene expression and the plasticity of the human genome”.
(Many topics covered elsewhere in this document bear on the role of three-dimensional structure in gene regulation.)
Image from Dowen, Jill M., Zi Peng Fan, Denes Hnisz et al. (2014). “Control of Cell Identity Genes Occurs in Insulated Neighborhoods in Mammalian Chromosomes”, Cell vol. 159, no. 2 (Oct. 9), pp. 374-87. doi:10.1016/j.cell.2014.09.030
But all this needs to be put in a dynamic context. As the authors summarize the matter in a video abstract of their paper in the journal Cell, “A loop that turns a gene on in one cell type might disappear in another. A domain may move from subcompartment to subcompartment as its flavor changes. No two cell types [have their chromosomes] folded alike. Folding drives function. Epigenetics is origami” (Rao, Huntley, Durand et al. 2014, doi:10.1016/j.cell.2014.11.021).
The key take-home lesson: “folding drives function”. This is a long way from the old view that “sequence tells us everything we need to know about function”. It’s the difference between stasis, on the one hand (imagine the bits of a computer program, statically embodied in transistors or optical disks), and a physical embodiment that is at the same time a sculptural and balletic performance, on the other hand. In the latter case, the performance governs the functional story. Analogizing this to a computer would require the computer chips to writhe and dance, thereby imparting to the individual bits their functional meaning.
“It may be that nuclear lamina-directed genome organization is more important in differentiated cell types, as pluripotent cells lacking B-type lamins exhibit surprisingly modest defects in genome organization. Compared with LADs, NUP-binding profiles are considerably narrower and cover a smaller and more variable portion of the genome. NUPs often bind at the level of individual regulatory elements associated with a specific gene, including enhancers, promoters and transcription start sites. Unlike LADs, NUP [nucleoporin protein] interactions can be rapidly induced or disrupted by environmental stimuli.
Clearly, cells balance several strategies for organizing genomes ...” (Buchwalter, Kaneshiro and Hetzer 2018, doi:10.1038/s41576-018-0063-5).
“Recent growing evidence has shifted our view of chromatin from one in which it has a static crystal-like structure to one in which it occupies a more dynamic liquid-like state” (Maeshima, Ide, Hibino and Sasai 2016, doi:10.1016/j.gde.2015.11.006).
“RNA can serve as ‘trigger’, aggregating BioMCs, as ‘glue’, scaffolding BioMCs, as ‘exchange material’ that associates with established BioMCs, and potentially as ‘access point’ for activities changing BioMC architecture and possibly dissolving BioMCs ... Direct molecular ‘manipulation’ of expressed RNA molecules and proteins is likely the means for achieving the dynamic regulation of the multitude of non‐membranous organelles” (Drino and Schaefer 2018, doi:10.1002/bies.201800085).
And another table-of-contents blurb: “Germ granules are perinuclear condensates in germ cells. Ouyang et al. report that germ (P) granules in C. elegans harbor transcripts required for RNA-mediated interference. Localization to P granules protects these transcripts from piRNA-initiated silencing, identifying a mechanism for regulating RNAi responses in animals” (Ouyang, Folkmann, Bernard et al. 2019, doi:10.1016/j.devcel.2019.07.026).
“Postzygotic mutation is a common occurrence. The developmental stage and timing of new mutations influence their phenotypic effects and likelihood of transmission. All major classes of mutations are observed in the mosaic state”.
And from their abstract: “Nearly all of the genetic material among cells within an organism is identical. However, single-nucleotide variants, small insertions/deletions, copy-number variants, and other structural variants continually accumulate as cells divide during development. This process results in an organism composed of countless cells, each with its own unique personal genome. Thus, every human is undoubtedly mosaic. Mosaic mutations can go unnoticed, underlie genetic disease or normal human variation, and may be transmitted to the next generation as constitutional variants”.
Mitochondria present a whole additional world of genes, gene expression, translation, and so on. This is barely touched on in this document. But here are a few notes hinting at the significant phenomena thus overlooked:
Perhaps rather ambitiously, Gilbert and co-authors write: “Several reports indicate that probiotics can treat anxiety and posttraumatic stress disorder in mouse models ... Therapies that target our microbial side may hold the key to making progress against a wide range of notoriously difficult psychiatric illnesses”.
“miRNA pathways converge with diverse non-miRNA regulatory mechanisms to regulate common targets.
“Convergent regulation at different levels of gene expression imparts tissue-specific functions, synergy, and precise temporal gene regulation.
“Proteolytic pathways provide a strong commitment in gene regulation since proteolysis is irreversible without new protein synthesis.
“Genetic redundancy remains a critical barrier in understanding genomic architecture and regulatory networks.
“Genes with known functions may have additional important non-canonical functions” (Weaver and Han 2017, doi:10.1016/j.tig.2017.09.009).
This illustrates a good deal about the principle of wholeness in organisms: the way processes are globally interwoven in the interests of the overall unity of the organism. For example, any given function of a molecule tends to depend on the participation of many other sorts of molecule, and any given molecule tends to have multiple functions. To imagine how this works is to imagine all activity under the “guiding hand” of the whole.
“Our results indicate that the KZFP/KAP1 complex maintains heterochromatin and DNA methylation at ICRs [imprinting control regions] and TEs [transposable elements] in naïve embryonic stem cells partly by protecting these loci from TET-mediated demethylation. Our study further unveils an unsuspected level of complexity in the transcriptional control of the endovirome by demonstrating often integrant-specific differential influences of histone-based heterochromatin modifications, DNA methylation and 5mC oxidation in regulating TEs expression” (Coluccio, Ecco, Duc 2018, doi:10.1186/s13072-018-0177-1).
There’s also a role for noncoding RNA: “It was recently shown that a non-coding regulatory RNA, mapped in the 5' UTR of VEGF-A mRNA, plays a function in tumour development by affecting the expression of other genes independently of VEGF-A translation”.
And a G-quadruplex structure located within a VEGF-A internal ribosome entry site (IRES-A) appears to play a role as well. Mutations disrupting the structure “inactivated IRES-A function, suggesting the requirement of this structure to maintain IRES-A activity ... IRESs ensure translation activation of mRNA during stress”.
Unity, complexity, interactivity, holism
“Our initial ‘modular’ notion of a gene has been challenged by the realization that: (i) multiple layers of regulatory information permeate the transcribed region; (ii) eukaryotic genomes are pervasively transcribed, generating an ensemble of transcripts from any given locus; (iii) each of these transcripts might in turn undergo multiple rounds of cleavage to generate even greater complexity; and (iv) this panoply of transcripts can perform diverse biological roles. The overlapping nature of the genetic information and transcripts associated with a single locus limits the value of studies of any component in isolation. We therefore suggest that each gene must now be regarded as a system, comprising a genomic region with the corresponding network of control regions and ensemble of transcripts” (Tuck and Tollervey 2011).
And that is not to mention about 98 percent of the regulatory goings-on briefly pointed to in all the foregoing — goings-on that bring virtually every aspect of the organism to bear upon DNA and its transcripts.
Variability, change, dynamism
Re: Changes in yeast gene transcripts during development: “Our analyses reveal extensive changes to both the coding and noncoding transcriptome, including altered 5' ends, 3' ends, and splice sites. Additionally, 3910 (46.5%) unannotated [previously undocumented] expressed segments were identified. Interestingly, subsets of unannotated RNAs are located across from introns (anti-introns) or across from the junction between two genes (anti-intergenic junctions). Many of these unannotated RNAs are abundant and exhibit sporulation-specific changes in expression patterns. ... Our high-resolution transcriptome analyses reveal that coding and noncoding transcript architectures are exceptionally dynamic in S. cerevisiae and suggest a vast array of novel transcriptional and post-transcriptional control mechanisms that are activated upon meiosis and sporulation”. “Functionally distinct changes in [transcript] architecture frequently occur in response to signaling cascades. Therefore, identification of dynamic changes to transcript architecture often requires the study of a dynamic process” (Guisbert, Zhang, Flatow et al. 2012).
In sum: the authors speak of “dramatic architecture changes” and an “unexpectedly dynamic” transcriptome. “Not only do unannotated transcripts exhibit dynamic expression patterns in terms of both architecture and overall expression, but the regulation of this expression appears to play a critically important role in the progression of meiosis”. “We predict that many other stress conditions or developmental cascades will also induce novel examples of transcriptome regulation”.
The organism is an activity, not a collection of things
This truth is emerging on all fronts. An illustration: researchers looked at a “spectacular example of convergent evolution and phenotypic plasticity”, namely the independent arising of queen and worker castes in bees, ants, and wasps. The common notion that such cases must involve deeply conserved genes was not supported by this work. Rather, “Overall, we found few shared caste differentially expressed transcripts across the three social lineages. However, there is substantially more overlap at the levels of pathways and biological functions. Thus, there are shared elements but not on the level of specific genes. Instead, the toolkit appears to be relatively ‘loose,’ that is, different lineages show convergent molecular evolution involving similar metabolic pathways and molecular functions but not the exact same genes. Additionally, our paper wasp data do not support a complementary hypothesis that ‘novel’ taxonomically restricted genes are related to caste differences” (Berens, Hunt and Toth 2015, doi:10.1093/molbev/msu330).
Genes, in other words, are not master controllers or bearers of controlling instructions, but rather represent resources that evolving organisms can employ in their own ways. The same genes can be put to very different uses, and different genes can be caught up in the service of similar ends. The determination of a gene’s meaning is made by the organism as a whole, based on its patterns of activity.
This document: https://bwo.life/org/support/genereg.htm
Steve Talbott :: How the Organism Decides What to Make of Its Genes