TDP-43 nuclear loss in FTD/ALS causes widespread alternative polyadenylation changes

Yi Zeng 1, Anastasiia Lovchykova 1, Tetsuya Akiyama 1, Chang Liu 1, Caiwei Guo 1, Vidhya Maheswari Jawahar 2 3, Odilia Sianto 1, Anna Calliari 2 3, Mercedes Prudencio 2 3, Dennis W. Dickson 2 3, Leonard Petrucelli 2 3, Aaron D.

Chan Zuckerberg Biohub – San Francisco, San Francisco, CA, USA
Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
Neuroscience Graduate Program, Mayo Clinic Graduate School of Biomedical Sciences, Jacksonville, FL, USA

January 22, 2024. doi:10.1038/s41586-022-04436-3

In frontotemporal dementia and amyotrophic lateral sclerosis, the RNA-binding protein TDP-43 is depleted from the nucleus. TDP-43 loss leads to cryptic exon inclusion but a role in other RNA processing events remains unresolved. Here, we show that loss of TDP-43 causes widespread changes in alternative polyadenylation, impacting expression of disease-relevant genes (e.g., ELP1, NEFL, and TMEM106B) and providing evidence that alternative polyadenylation is a new facet of TDP-43 pathology.

Main

TDP-43 binds to uridine/guanine (UG)-rich motifs in RNA transcripts1,2 and plays a critical role in various aspects of RNA metabolism3–5. Defects in RNA metabolism are considered central to frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS) pathogenesis3. However, it is not yet fully understood which genes TDP-43 regulates, which aspects of their RNA metabolism it regulates, and how their dysregulation promotes neurodegeneration. A major role of TDP-43 has emerged as a repressor of so-called cryptic exons during splicing6. Cryptic exons reside in introns of genes and are normally excluded from mature mRNAs. When TDP-43 is dysfunctional (i.e., when depleted from the nucleus in FTD/ALS), these cryptic exons are spliced into final mRNAs, often leading to frameshifts, decreased RNA stability, or even the production of novel peptide sequences7–13. Importantly, some of these cryptic exons are in genes critical for neuronal functions (e.g., STMN2 7–9,14) or genes that harbor disease-associated variants that sensitize them to cryptic splicing upon loss of TDP-43 (e.g., UNC13A 10,11). Cryptic splicing events could serve as powerful biomarkers for TDP-43 dysfunction12,13 or even as therapeutic targets15,16.

Besides its well-established role in splicing, TDP-43 also plays a critical role in other aspects of RNA processing. Do some of these additional TDP-43-dependent RNA processing pathways also contribute to disease? One potential disease-relevant RNA processing pathway is alternative polyadenylation (APA), a major layer of gene regulation that occurs in >60% of human genes17. When a gene is transcribed into mRNA, the transcript is cleaved and polyadenylated, which stabilizes the mRNA, facilitates its nuclear export, and regulates its translation18. If alternative polyadenylation (polyA) sites are used, the resulting mRNA isoforms can have different 3′ untranslated region (UTR) lengths, impacting RNA/protein levels, subcellular localization, or even protein functions17,19. If APA occurs prematurely, it can truncate mRNAs, reducing full-length protein levels17,19. Previous genome-wide TDP-43 binding studies revealed that TDP-43 binding is enriched not only in introns of genes but also in 3′ UTRs1,2, suggesting a role in regulating polyadenylation. Indeed, TDP-43 knockdown or disease-associated TDP-43 mutations affect polyadenylation20–22, including its own polyadenylation2,23 and premature polyadenylation in STMN2 7,8. Notably, widespread APA changes have been observed in ALS patient samples using bulk RNA-sequencing (RNA-seq)24 and single nucleus RNA-seq25, although it is unclear whether these changes are directly owing to TDP-43 dysfunction.

To test the hypothesis that APA caused by TDP-43 dysfunction contributes to FTD/ALS, we first performed APA analysis in an RNA-seq dataset26, in which neuronal nuclei with and without TDP-43 were sorted from FTD/ALS postmortem brain samples for RNA-seq (Fig. 1a, top panel). We recently re-analyzed this dataset to discover TDP-43-dependent cryptic splicing events11. Here, we 're-re-analyzed' it to search for APA events (Fig. 1a, bottom panel). We used two different APA analysis programs and identified 41 APA changes using APAlyzer27 and 62 APA changes using QAPA28 (Fig. 1b, S1a; Table S1; adjusted p value < 0.1). Only two genes with significant APA changes were common between both programs: LRFN1 and MARK3 (see also Fig. S2k and Arnold et al.29), likely because of the different polyA databases used by each program and the low resolution of RNA-seq for APA analysis. We thus considered APA changes identified by either program, which revealed APA events in genes critical for neuronal function, such as LRFN1 and SYN2. Some of these APA events are associated with gene expression changes (Fig. S1b), likely impacting their corresponding protein levels (see below).
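The comparison between the two programs amounts to set operations on gene lists. A minimal sketch follows, assuming each tool's significant calls have already been reduced to gene symbols. The function name, the placeholder genes `GENE_A` and `GENE_B`, and the assignment of SYN2 to one tool are hypothetical and are not taken from Table S1.

```python
def summarize_overlap(calls_a, calls_b):
    """Return genes called by both tools, by either tool, and the Jaccard index."""
    a, b = set(calls_a), set(calls_b)
    shared = a & b   # genes significant in both programs
    either = a | b   # genes significant in at least one program
    jaccard = len(shared) / len(either) if either else 0.0
    return shared, either, jaccard


# Hypothetical inputs: only LRFN1, MARK3 and SYN2 are named in the text,
# and which tool reported SYN2 is an arbitrary choice made for illustration.
apalyzer_genes = ["LRFN1", "MARK3", "GENE_A"]
qapa_genes = ["LRFN1", "MARK3", "SYN2", "GENE_B"]

shared, either, jaccard = summarize_overlap(apalyzer_genes, qapa_genes)
print(sorted(shared), len(either), round(jaccard, 2))
```

Taking the union rather than the intersection mirrors the choice described above to consider changes identified by either program.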

To test if TDP-43 directly regulates APA, we knocked down TDP-43 levels in cortical neurons differentiated from human stem cells (iNeurons) and performed RNA-seq for APA analysis (Fig. 1c-e). TDP-43 knockdown caused widespread APA changes (Fig. 1e; Table S2; |ΔPUI| > 0.1 and adjusted p value < 0.05 from APAlyzer), including 24 changes also observed in FTD/ALS postmortem brain samples (Fig. 1b, S1a). We also observed APA changes in induced pluripotent stem cell (iPSC)-derived motor neurons (iMNs) upon TDP-43 knockdown7 (Fig. S1c), in iNeurons carrying a pathogenic mutation (TDP-43-K263E)30 (Fig. S1d), and in iMNs carrying a different pathogenic mutation (TDP-43-M337V)21 (Fig. S1e). Thus, TDP-43 regulates APA in neurons, and TDP-43 loss of function or ALS-linked mutations result in widespread APA changes.

To comprehensively map TDP-43-dependent APA events and define how TDP-43 dysfunction contributes to neurodegeneration via APA, we sought a more sensitive assay than RNA-seq for APA analysis, because RNA-seq cannot map novel polyadenylation events, identify actual polyA sites, or capture 'internal' premature polyadenylation events (i.e., ones that truncate the RNA/protein). We used a specialized transcriptomic method, 3′ end-seq, to map polyA sites with single-nucleotide resolution in iNeurons with or without TDP-43 knockdown (Fig. 2a, 1d). We obtained a total of 117,823 putative polyA sites with a known upstream polyA signal, of which 46,760 sites are novel (Fig. S2a) when compared to a compendium of polyA site annotations31. Consistent with previous findings2,20, the majority of polyA sites we identified are in 3′ UTRs (Fig. S2b). 3′ end-seq confirmed that loss of TDP-43 lengthened the 3′ UTR of LRFN1 (Fig. 2b, top panel), which we also observed in our analysis of FTD/ALS postmortem brain samples (Fig. 2b, top panel; Fig. 1b). Notably, 3′ end-seq also captured premature polyadenylation in STMN2, which was previously identified but not apparent in standard RNA-seq data (Fig. 2b, bottom panel), demonstrating the power of 3′ end-seq for studying premature polyadenylation, the most detrimental form of APA since it truncates the RNA/protein.

In total, TDP-43 knockdown altered the usage of 8,169 polyA sites (|ΔPUI| > 0.1 and adjusted p value < 0.05 from LAPA, a program focused on 3′ end-seq data) and caused APA changes in 2,220 genes (Fig. 2c; Table S3). By cross-referencing APA sites with a curated genome-wide TDP-43 binding site dataset32, we found that ~70% of genes with APA events have at least one TDP-43 binding site, consistent with a direct role of TDP-43 in regulating APA of these genes. Like recent findings in ALS samples24 and FTD samples (Fig. 1b, S1a) as well as our RNA-seq data from iNeurons (Fig. 1e), the majority of APA changes (1,881) detected by 3′ end-seq lengthened RNA transcripts; 340 APA events were associated with at least a 1.5-fold change in RNA level (Fig. 2d), suggesting that APA changes could alter gene expression. Notably, several of these APA changes were also observed in FTD/ALS postmortem brain samples and were in genes connected to ALS or FTD (Table S4; see below).
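The two thresholds quoted above (|ΔPUI| > 0.1 and adjusted p value < 0.05) can be illustrated with a small sketch. PUI is not defined in this section, so the sketch assumes it is the fraction of a gene's 3′ end reads that fall on a given polyA site; that definition, the function names, the read counts and the adjusted p values are all hypothetical. The adjusted p values are supplied as inputs here and are not computed the way LAPA computes them.

```python
from collections import defaultdict


def polya_usage(site_counts):
    """Map (gene, site) -> fraction of that gene's reads ending at the site.

    Assumed definition of the usage index; the source text does not define PUI.
    """
    gene_totals = defaultdict(float)
    for (gene, _site), count in site_counts.items():
        gene_totals[gene] += count
    return {
        key: (count / gene_totals[key[0]] if gene_totals[key[0]] else 0.0)
        for key, count in site_counts.items()
    }


def changed_sites(control, knockdown, padj, min_delta=0.1, max_padj=0.05):
    """Return {site: delta usage} for sites passing both thresholds."""
    usage_ctrl = polya_usage(control)
    usage_kd = polya_usage(knockdown)
    hits = {}
    for key in control:
        delta = usage_kd.get(key, 0.0) - usage_ctrl[key]
        if abs(delta) > min_delta and padj.get(key, 1.0) < max_padj:
            hits[key] = delta
    return hits


# Hypothetical read counts and adjusted p values for one gene with two sites.
control = {("GENE_X", "proximal"): 80, ("GENE_X", "distal"): 20}
knockdown = {("GENE_X", "proximal"): 40, ("GENE_X", "distal"): 60}
padj = {("GENE_X", "proximal"): 0.01, ("GENE_X", "distal"): 0.2}

hits = changed_sites(control, knockdown, padj)
print(hits)
```

In this toy input both sites shift by 0.4 in usage, but only the proximal site also passes the adjusted p value cutoff, so it is the only one reported.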

To define whether polyA site strength influences TDP-43-regulated APA, we estimated the probability of cleavage and polyadenylation for each identified polyA site using a deep residual neural network-based model, Aparent2 33, that accurately predicts polyA site usage. Interestingly, we found that polyA sites with reduced usage upon TDP-43 knockdown are weaker than the ones with increased usage upon TDP-43 knockdown (Fig. 2e), indicating that TDP-43 either promotes the usage of weak polyA sites or suppresses the usage of strong ones. Whereas genes with significant 3′ UTR shortening show no difference in polyA strength between proximal and distal polyA sites, genes with significant 3′ UTR lengthening have stronger distal sites compared to proximal ones (Fig. 2f). Given the co-transcriptional nature of cleavage and polyadenylation, our findings suggest that in the case of 3′ UTR lengthening upon TDP-43 knockdown, a stronger distal polyA site might compensate for its positional disadvantage over the proximal polyA site during transcription and therefore get activated upon TDP-43 knockdown. The impact of polyA site strength on shaping TDP-43-regulated APA provides a toehold to prioritize antisense oligonucleotide-based therapeutic strategies to target specific APA events, akin to current approaches underway to target cryptic splicing events15.
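The proximal-versus-distal comparison can be sketched as follows, assuming each site already has a predicted cleavage probability. The sketch does not call Aparent2; the probabilities are hypothetical, and the use of the natural logarithm for the log odds is an assumption.

```python
import math


def log_odds(p):
    """Convert a cleavage probability (0 < p < 1) to natural-log odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def mean_log_odds_gap(pairs):
    """Mean of (distal - proximal) log odds over (proximal_p, distal_p) pairs.

    A positive value means distal sites are, on average, the stronger ones.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one proximal/distal pair")
    gaps = [log_odds(distal) - log_odds(proximal) for proximal, distal in pairs]
    return sum(gaps) / len(gaps)


# Hypothetical per-gene (proximal, distal) probabilities for lengthened genes.
lengthened = [(0.10, 0.40), (0.20, 0.35), (0.30, 0.25)]
gap = mean_log_odds_gap(lengthened)
print(round(gap, 3))
```

A single summary number like this is only a rough stand-in for the cumulative distributions compared in Fig. 2e and 2f.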

3′ end-seq also empowered the discovery of “cryptic” polyA sites (ones not used under normal conditions but revealed upon TDP-43 knockdown). TDP-43 knockdown activated 457 cryptic polyA sites in 424 genes (Table S5); 163 cryptic polyA sites occurred downstream of a gene's annotated 3′ end, such as in the FTD risk factor RFNG 34 (Fig. S2c), SIX3, TLX1, and ELK1 (Fig. S2d-f; see also Bryce-Smith et al.35), and 153 cryptic events induced premature polyadenylation in genes such as TNIP1, EGFR, SLC24A3, and GSTO2 (Fig. S2g-j). Like the coordinated activation of splicing and polyadenylation in intron 1 of STMN2, we observed coordinated activation in PIGL, HNRNPA1, and ARHGAP32 (Fig. 2g), the latter of which was also observed by Fratta and colleagues35. Cryptic splicing of ARHGAP32 was recently reported12, and we also confirmed this in RNA-seq data from FTD/ALS postmortem brain samples (Fig. 2g). These results suggest a potential coupling between cryptic splicing and the activation of cryptic polyadenylation sites.

TDP-43 depletion increased the usage of two distal polyA sites in ELP1 that lengthened its 3′ UTR (Fig. 3a, left panel; Fig. S3a), and such lengthening is present across different datasets (Fig. S3b; see also Arnold et al.29). ELP1, also called IKBKAP, encodes a subunit of the elongator complex, which has been functionally and genetically linked to ALS36–38. In addition to ELP1, we observed APA changes in two other subunits of the elongator complex (ELP3 and ELP6) upon TDP-43 knockdown (Fig. S3c, S3d). These APA changes are associated with changes in protein levels; TDP-43 knockdown led to increased protein levels of ELP1 and ELP3 (Fig. 3a, S3c). Together with a recent finding of reduced aminoacylation of tRNAPhe in FTD/ALS39, our observations suggest that altered tRNA metabolism might be an important mechanism contributing to FTD/ALS.

Previous studies found that TDP-43 directly binds to the 3′ UTR of NEFL to stabilize its RNA levels40. NEFL encodes neurofilament light chain (NF-L), which has emerged as a sensitive prognostic biomarker for diverse neurodegenerative diseases41, including FTD/ALS42. We observed three major polyA sites in NEFL, with the most distal one being the most frequently used (Fig. 3b, left panel). TDP-43 knockdown not only reduced NEFL RNA levels (Fig. S3e), consistent with previous findings43, but also further shifted the apparent polyA site usage from proximal to distal (Fig. 3b, left and mid panels; Fig. S3f) and reduced NF-L protein levels (Fig. 3b, right panel). These observations suggest that, together with NEFL 3′ UTR-targeting microRNAs44, APA changes induced by loss of TDP-43 might further reduce NF-L levels. A direct role of TDP-43 in regulating NF-L levels adds complexity to the use of this biomarker in TDP-43 proteinopathies.

TDP-43 knockdown also shifted polyA site usage from proximal to distal in another FTD/ALS-linked gene, SFPQ (Fig. 3c; Fig. S3g). SFPQ encodes a ubiquitously expressed RNA-binding protein that plays key roles in RNA metabolism45,46. SFPQ encodes two protein isoforms; the shorter isoform uses a stop codon downstream of the proximal polyA site (Fig. 3c) and lacks a nuclear localization signal (NLS)45. Intriguingly, depletion of SFPQ from the nucleus has been observed in sporadic ALS spinal cord47, and its interaction with FUS, another FTD/ALS risk gene, is impaired in neuronal nuclei in FTLD-TDP48,49. Thus, loss of TDP-43 promotes the usage of SFPQ's distal polyA site (Fig. 3c), which might upregulate the shorter NLS-lacking SFPQ isoform and contribute to its nuclear depletion in disease.

The sensitivity of 3′ end-seq further revealed that TDP-43 knockdown lengthened TMEM106B's 3′ UTR (Fig. 3d), which we also confirmed by qRT-PCR (Fig. S3h). TMEM106B is a top genetic risk factor that emerged in a genome-wide association study for FTLD-TDP50. C-terminal fragments of TMEM106B were recently found to form amyloid fibrils in brains of older individuals and patients with neurodegenerative disorders51–54. By targeted analysis of RNA-seq read coverage across the TMEM106B 3′ UTR, we confirmed that loss of TDP-43 is associated with a longer TMEM106B 3′ UTR in FTD/ALS postmortem brain samples (Fig. 3e). We also analyzed the TMEM106B 3′ UTR in a series of 83 frontal cortex brain samples from the Mayo Clinic Brain Bank using qRT-PCR and found a significant increase in longer TMEM106B 3′ UTRs in the frontal cortices of patients with FTLD-TDP compared with healthy controls (Fig. 3f), indicating that the increased 3′ UTR length of TMEM106B might be functionally relevant in FTD/ALS.

Because TMEM106B protein levels are altered in disease50,55, we next asked whether TDP-43 knockdown affected TMEM106B protein levels. Using a condition that has been shown previously to preserve TMEM106B dimers on SDS-PAGE55 (see also Fig. S3i), we found that TDP-43 knockdown did not affect levels of TMEM106B monomers (~42 kDa) but reduced dimer levels (Fig. 3g). A recent proteomics study also detected decreased TMEM106B levels caused by TDP-43 knockdown12. TDP-43 knockdown only modestly reduced TMEM106B RNA levels (Fig. S3j), suggesting that the longer 3′ UTR might affect TMEM106B protein levels via translation. To test this hypothesis, we cloned short and long versions of the TMEM106B 3′ UTR into a dual-luciferase reporter. We mutated the proximal polyA site in the long 3′ UTR-containing reporter to prevent its usage and confirmed the production of the intended long 3′ UTR (Fig. S3k). The long 3′ UTR-containing reporter had significantly lower luciferase activity and higher RNA levels than the short 3′ UTR-containing reporter (Fig. 3h), providing evidence that the longer TMEM106B 3′ UTR induced by loss of TDP-43 reduces translation efficiency. But how could this reduce the formation of TMEM106B dimers? TMEM106B APA might impact dimer levels by influencing the subcellular localization of TMEM106B mRNA and its ability to act as a scaffold for protein-protein interactions, as has been shown for other APA events17. Recent studies discovered an Alu element insertion in the TMEM106B 3′ UTR, in perfect linkage with the top FTLD-TDP risk allele at this locus56–58. Given the proximity of these variants to TMEM106B's distal polyA site, future work will explore if and how these disease-associated genetic variants impact APA.
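The reasoning behind the reporter result (lower luciferase activity despite higher RNA levels implies lower translation efficiency) can be made explicit with a small calculation. This is a sketch under stated assumptions: the 3′ UTR is taken to sit on the firefly luciferase with Renilla as the internal control, which the text does not specify, and all readings are hypothetical.

```python
def translation_efficiency(firefly, renilla, reporter_rna):
    """Normalized luciferase signal per unit of reporter RNA.

    firefly / renilla corrects for transfection differences (assumed reporter
    layout); dividing by the relative RNA level gives output per transcript.
    """
    if renilla <= 0 or reporter_rna <= 0:
        raise ValueError("renilla and reporter_rna must be positive")
    return (firefly / renilla) / reporter_rna


# Hypothetical readings: the long 3' UTR reporter has lower luciferase
# activity and higher RNA levels than the short one, as described in the text.
short_utr = translation_efficiency(firefly=1000.0, renilla=500.0, reporter_rna=1.0)
long_utr = translation_efficiency(firefly=600.0, renilla=500.0, reporter_rna=1.5)

relative = long_utr / short_utr
print(round(relative, 2))
```

With these made-up values the long reporter yields 40% of the protein output per transcript of the short reporter; the actual effect size is shown in Fig. 3h.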

In the present study, we applied 3′ end-seq to comprehensively map neuronal polyadenylation on a transcriptomic scale with base-pair resolution and discovered that TDP-43 dysfunction causes widespread APA changes, some of which are in genes critical for neuronal function and in disease-associated genes. These changes are highlighted by coupled cryptic splicing and premature polyadenylation in ARHGAP32 (Fig. 2g), changes observed in multiple subunits of the elongator complex (Fig. 3a, S3c, S3d), and 3′ UTR lengthening in NEFL, SFPQ, and TMEM106B (Fig. 3b-d). This study is accompanied by two independent manuscripts also presenting widespread APA changes associated with TDP-43 dysfunction in FTD/ALS (Fratta and colleagues35; La Spada and colleagues29). Despite using different approaches, each of the three studies observed common APA changes (Fig. 2g, S2d-f, S2k-m), underscoring the importance of APA changes during neurodegeneration.

Our application of 3′ end-seq complements and extends RNA-seq based APA analysis because it enables de novo identification of polyadenylation events with base-pair resolution, revealing complex and highly sensitive APA changes (e.g., NEFL, ELP3, and TMEM106B). Together, the findings of these three studies provide evidence that TDP-43 pathology contributes to disease pathogenesis through not only cryptic splicing but now also changes in alternative polyadenylation.

level changes on the y-axis, illustrating that APA changes could alter RNA levels. Genes with significant APA changes are plotted; genes with significant RNA level changes are labeled in red when favoring distal polyA sites or blue when favoring proximal polyA sites. e, Cumulative plots show that for TDP-43 regulated APA sites, distal polyA sites are stronger than proximal polyA sites. f, Cumulative plots show that for genes with significant 3′ UTR lengthening upon TDP-43 KD, their distal polyA sites are stronger than their proximal ones. PolyA site scores (in panels e and f) were calculated using Aparent2, expressed as log odds ratio, and reflect polyA site strengths. g, TDP-43 KD activated cryptic splicing and premature polyadenylation in intron 12 of ARHGAP32. The “polyA DB” track marks annotated polyA sites59 and the “TDP-43 sites” track marks the observed TDP-43 binding32. Gene structure is shown on the bottom.

43 knockdown (KD) samples were used. b, Pie chart shows the distribution of 3′ end-seq identified polyA sites. Note that “intergenic” represents polyA sites not associated with annotated 3′ UTRs, exons, or introns. c-f, Examples of the use of an unannotated distal polyA site upon TDP-43 KD. g-j, Examples of premature polyadenylation upon TDP-43 KD. k, Example of complex usage change of multiple polyA sites upon TDP-43 KD. l, Example of increased proximal polyA usage upon TDP-43 KD. m, Example of 3′ UTR shortening upon TDP-43 KD. Details, as in Fig. 2g.

TMEM106B (h), confirmed by qRT-PCR. i, WB shows that as the temperature increases, a ~75 kDa band collapses to a 42 kDa band. Cell lysates were incubated at respective temperatures for 10 min before electrophoresis. j, Bar plots show that TDP-43 KD modestly reduced TMEM106B RNA levels. The adjusted p value was calculated in DESeq2. k, Bar plots show that the reporter with the long TMEM106B 3′ UTR produced the long 3′ UTR, as designed. RNA levels were measured by qRT-PCR. Unless stated otherwise, p-values were calculated by Student's t test. ns (not significant), p > 0.05; *, p ≤ 0.05; **, p ≤ 0.01; ***, p ≤ 0.001; ****, p ≤ 0.0001.
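The significance notation used in the legends maps p-values to labels by fixed cutoffs. A minimal sketch of that mapping follows; the function name is hypothetical and the cutoffs are the ones listed in the legend.

```python
def significance_label(p):
    """Map a p-value to the star notation used in the figure legends."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must be between 0 and 1")
    # Check the strictest cutoff first so each p-value gets its highest label.
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


for example_p in (0.2, 0.03, 0.004, 0.0005, 0.00001):
    print(example_p, significance_label(example_p))
```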

List of supplementary tables

Table S1: Alternative polyadenylation changes detected in Liu et al. 2019
Table S2: Significant APA changes in 3′ UTRs upon TDP-43 knockdown detected by RNA-seq
Table S3: Significant APA changes upon TDP-43 knockdown detected by 3′ end-seq
Table S7: Summary of patient data

Acknowledgments

We thank members of the Gitler lab and the Petrucelli lab for helpful discussions and comments on the manuscript. We thank Abigail Song and Maylin Fu for help with experiments. We thank Ziwei Chen for helping with Aparent2. Y.Z. is supported by a postdoctoral scholar award from The Phil and Penny Knight Initiative for Brain Resilience at the Wu Tsai Neurosciences Institute, Stanford University, and a fellowship grant from the Larry L. Hillblom Foundation. T.A. is supported by NIH (2T32AG047126-06A1) and a fellowship from the Takeda Science Foundation. C.G. is supported by the Milton Safenowitz Postdoctoral Fellowship Program. Work in A.D.G. is supported by NIH (grants R35NS097263, U54NS123743, R01AG064690) and Target ALS. A.D.G. is a Chan Zuckerberg Biohub – San Francisco Investigator. Work in L.P. is supported by NIH (U54NS123743, R35NS097273, P01NS084974) and Target ALS. Work in M.P. is supported by NIH (U54NS123743, RF1NS120992) and Target ALS.

References

Nat Neurosci 14, 452–458 (2011).
from loss of TDP-43. Nat Neurosci 14, 459–468 (2011).

RNA and Protein Homeostasis. Neuron 79, 416–438 (2013).
Neuropathy. Trends in Neurosciences 44, 424–440 (2021).
Seminars in Cell & Developmental Biology 99, 193–201 (2020).
is compromised in ALS-FTD. Science 349, 650–655 (2015).
growth and repair. Nat Neurosci 22, 167–179 (2019).
Clin Invest 130, 6080–6092 (2020).
10. Brown, A.-L. et al. TDP-43 loss and ALS-risk SNPs drive mis-splicing and depletion of UNC13A. Nature
11. Ma, X. R. et al. TDP-43 represses cryptic exon inclusion in the FTD–ALS gene UNC13A. Nature 603, 124–130 (2022).
12. Seddighi, S. et al. Mis-spliced transcripts generate de novo proteins in TDP-43-related ALS/FTD.
13. Irwin, K. E. et al. A fluid biomarker reveals loss of TDP-43 splicing repression in pre-symptomatic ALS.
2023.01.23.525202 Preprint at https://doi.org/10.1101/2023.01.23.525202 (2023).
sclerosis and frontotemporal dementia: Towards therapeutic targets and biomarkers. Clin Transl Med 12,
17. Mitschka, S. & Mayr, C. Context-specific regulation and function of mRNA alternative polyadenylation. Nat Rev Mol Cell Biol 1–18 (2022) doi:10.1038/s41580-022-00507-5.
18. Passmore, L. A. & Coller, J. Roles of mRNA poly(A) tails in regulation of eukaryotic gene expression. Nat Rev Mol Cell Biol 1–14 (2021) doi:10.1038/s41580-021-00417-y.
19. Tian, B. & Manley, J. L. Alternative polyadenylation of mRNA precursors. Nat Rev Mol Cell Biol 18, 18–30
20. Rot, G. et al. High-Resolution RNA Maps Suggest Common Principles of Splicing and Polyadenylation Regulation by TDP-43. Cell Reports 19, 1056–1067 (2017).
21. Imaizumi, K., Ideno, H., Sato, T., Morimoto, S. & Okano, H. Pathogenic Mutation of TDP-43 Impairs RNA Processing in a Cell Type-Specific Manner: Implications for the Pathogenesis of ALS/FTLD. eNeuro 9,
22. Modic, M. et al. Cross-Regulation between TDP-43 and Paraspeckles Promotes Pluripotency-Differentiation Transition. Molecular Cell 74, 951-965.e13 (2019).
23. Ayala, Y. M. et al. TDP-43 regulates its mRNA levels through a negative feedback loop. EMBO J 30, 277–288 (2011).
24. Prudencio, M. et al. Distinct brain transcriptome profiles in C9orf72-associated and sporadic ALS. Nat Neurosci 18, 1175–1182 (2015).
25. McKeever, P. M. et al. Single-nucleus multiomic atlas of frontal cortex in amyotrophic lateral sclerosis with a deep learning-based decoding of alternative polyadenylation mechanisms. 2023.12.22.573083 Preprint at
Reports 27, 1409-1421.e6 (2019).
isoforms. Bioinformatics 36, 3907–3909 (2020).
29. Arnold, F. et al. TDP-43 dysregulation of polyadenylation site selection is a defining feature of RNA misprocessing in ALS/FTD and related disorders.
30. Dafinca, R. et al. Impairment of Mitochondrial Calcium Buffering Links Mutations in C9ORF72 and TARDBP in iPS-Derived Motor Neurons from Patients with ALS/FTD. Stem Cell Reports 14, 892–908 (2020).
31. Herrmann, C. J. et al. PolyASite 2.0: a consolidated atlas of polyadenylation sites from 3′ end sequencing. Nucleic Acids Res 48, D174–D179 (2020).
32. Zhao, W. et al. POSTAR3: an updated platform for exploring post-transcriptional regulation coordinated by RNA-binding proteins. Nucleic Acids Res 50, D287–D294 (2022).
33. Linder, J., Koplik, S. E., Kundaje, A. & Seelig, G. Deciphering the impact of genetic variation on human polyadenylation using APARENT2. Genome Biology 23, 232 (2022).
34. Ferrari, R. et al. A genome-wide screening and SNPs-to-genes approach to identify novel genetic risk factors associated with frontotemporal dementia. Neurobiol Aging 36, 2904.e13–26 (2015).
35. Bryce-Smith, S. et al. TDP-43 loss induces cryptic polyadenylation and increased protein synthesis in ALS.
36. Figley, M. D., Bieri, G., Kolaitis, R.-M., Taylor, J. P. & Gitler, A. D. Profilin 1 associates with stress granules and ALS-linked mutations alter stress granule dynamics. J Neurosci 34, 8083–8097 (2014).
37. Bento-Abreu, A. et al. Elongator subunit 3 (ELP3) modifies ALS through tRNA modification. Hum Mol Genet 27, 1276–1289 (2018).
38. Simpson, C. L. et al. Variants of the elongator protein 3 (ELP3) gene are associated with motor neuron degeneration. Hum Mol Genet 18, 472–481 (2009).
(TDP-43), 14-3-3 proteins and copper/zinc superoxide dismutase (SOD1) interact to modulate NFL mRNA stability. Implications for altered RNA processing in amyotrophic lateral sclerosis (ALS). Brain Res 1305, 168–182 (2009).
41. Gaetani, L. et al. Neurofilament light chain as a biomarker in neurological disorders. J Neurol Neurosurg Psychiatry 90, 870–881 (2019).
42. Rojas, J. C. et al. Plasma Neurofilament Light for Prediction of Disease Progression in Familial Frontotemporal Lobar Degeneration. Neurology 96, e2296–e2312 (2021).
43. Strong, M. J. et al. TDP43 is a human low molecular weight neurofilament (hNFL) mRNA-binding protein. Molecular and Cellular Neuroscience 35, 320–327 (2007).
44. Ishtiaq, M., Campos-Melo, D., Volkening, K. & Strong, M. J. Analysis of Novel NEFL mRNA Targeting microRNAs in Amyotrophic Lateral Sclerosis. PLoS One 9, e85653 (2014).
45. Lim, Y. W., James, D., Huang, J. & Lee, M. The Emerging Role of the RNA-Binding Protein SFPQ in Neuronal Function and Neurodegeneration. Int J Mol Sci 21, 7151 (2020).
46. Gordon, P. M., Hamid, F., Makeyev, E. V. & Houart, C. A conserved role for the ALS-linked splicing factor SFPQ in repression of pathogenic cryptic last exons. Nat Commun 12, 1918 (2021).
47. Luisier, R. et al. Intron retention and nuclear loss of SFPQ are molecular hallmarks of ALS. Nat Commun 9, 2010 (2018).
48. Ishigaki, S. et al. Altered Tau Isoform Ratio Caused by Loss of FUS and SFPQ Function Leads to FTLD-like Phenotypes. Cell Reports 18, 1118–1131 (2017).
49. Ishigaki, S. et al. Aberrant interaction between FUS and SFPQ in neurons in a wide range of FTLD spectrum diseases. Brain 143, 2398–2405 (2020).
50. Van Deerlin, V. M. et al. Common variants at 7p21 are associated with frontotemporal lobar degeneration with TDP-43 inclusions. Nat Genet 42, 234–239 (2010).
51. Chang, A. et al. Homotypic fibrillization of TMEM106B across diverse neurodegenerative diseases. Cell 185, 1346-1355.e15 (2022).
52. Jiang, Y. X. et al. Amyloid fibrils in disease FTLD-TDP are composed of TMEM106B not TDP-43. Nature
53. Schweighauser, M. et al. Age-dependent formation of TMEM106B amyloid filaments in human brains.
54. Fan, Y. et al. Generic amyloid fibrillation of TMEM106B in patient with Parkinson's disease dementia and normal elders. Cell Res 32, 585–588 (2022).
55. Chen-Plotkin, A. S. et al. TMEM106B, the Risk Gene for Frontotemporal Dementia, Is Regulated by the microRNA-132/212 Cluster and Affects Progranulin Pathways. J. Neurosci. 32, 11213–11227 (2012).
56. Salazar, A. et al. An AluYb8 retrotransposon characterises a risk haplotype of TMEM106B associated in
57. Rodney, A., Karanjeet, K., Benzow, K. & Koob, M. D. A common Alu element insertion in the 3′ UTR of
58. Chemparathy, A. et al. A 3′ UTR Deletion Is a Leading Candidate Causal Variant at the TMEM106B Locus Reducing Risk for FTLD-TDP. 2023.07.06.23292312 Preprint at
59. Wang, R., Nambiar, R., Zheng, D. & Tian, B. PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes. Nucleic Acids Res 46, D315–D319 (2018).
60. Bieri, G. et al. LRRK2 modifies α-syn pathology and spread in mouse models and human neurons. Acta Neuropathol 137, 961–980 (2019).
61. Çelik, M. H. & Mortazavi, A. Analysis of alternative polyadenylation from long-read or short-read RNA
Neurodegener 18, 57 (2023).

Methods

Cell culture

HEK293T cells were maintained in DMEM (1x) + GlutaMax-I media (Gibco, 10564011) with 10% FBS (Gibco, 16000044) and 100 U/mL Penicillin-Streptomycin (Gibco, 15140122).

Stem cell maintenance and differentiation into iNeurons

Human embryonic stem cells (hESCs; H1) were maintained in mTeSR1 plus media (StemCell Technologies, 100-0276) on Matrigel (Corning, 354230). hESCs were fed every two days and split every 4–7 days using ReLeSR (StemCell Technologies, 100-0483) according to the manufacturer's instructions. The differentiation of hESCs to neurons by forcing NGN2 over-expression was carried out as previously described60. In brief, cells were transduced with a Tet-On induction system to drive the expression of the transcription factor NGN2. Cells were dissociated on day 3 of differentiation and replated on Matrigel-coated tissue culture plates in Neurobasal Medium (Thermo Fisher, 21103049) containing neurotrophic factors, BDNF and GDNF (R&D Systems).

TDP-43 knockdown in iNeurons

After seven days of culture for differentiation, iNeurons were transduced with lentivirus expressing scramble shRNA or TDP-43 shRNA, cultured for an additional seven or 12 days, and then collected for downstream analyses. The knockdown efficiency was assessed by Western blotting.

Immunoblotting

After seven days of TDP-43 knockdown, cells were lysed at 4 °C for 15 min in ice-cold RIPA buffer (Sigma-Aldrich R0278) supplemented with a protease inhibitor cocktail (Thermo Fisher 78429) and phosphatase inhibitor (Thermo Fisher 78426). After pelleting lysates at 20,000 x g on a table-top centrifuge for 15 min at 4 °C, the supernatant was used for bicinchoninic acid (Invitrogen 23225) assays to determine protein concentrations. Unless stated otherwise, 5–10 µg of protein lysate from each sample was denatured for 10 min at 70 °C in LDS sample buffer (Invitrogen NP0008) containing 2.5% 2-mercaptoethanol (Sigma-Aldrich). These samples were loaded onto 4–12% Bis-Tris Mini gels (Thermo Fisher NP0335BOX) for gel electrophoresis and then transferred onto 0.45-µm nitrocellulose membranes (Bio-Rad 162-0115) using the semi-dry transfer method (Bio-Rad Trans-Blot Turbo Transfer System, 1704150) or at 100 V for 2 h at 4 °C using the wet transfer method (Bio-Rad Mini Trans-Blot Electrophoretic Cell 170-3930). Membranes were blocked in EveryBlot Blocking Buffer (Bio-Rad 12010020) or 5% non-fat dry milk in TBST for 1 h, then incubated overnight at 4 °C in blocking buffer containing antibodies against TMEM106B (1:500, Cell Signaling Technology 93334), TDP-43 (1:1,500, Proteintech 10782-2-AP), ELP1 (1:500, Cell Signaling Technology 5071S), ELP3 (1:1,000, Proteintech 17016-1-AP), NEFL (1:1,000, Thermo Fisher MA5-14981), GAPDH (1:2,000, Sigma-Aldrich G8795), or GAPDH (D16H11) XP (1:1,000, Cell Signaling Technology 8884S). Membranes were subsequently incubated for 1 h in blocking buffer containing horseradish peroxidase (HRP)-conjugated anti-mouse IgG (H+L) (1:5,000, Fisher 62-6520) or HRP-conjugated anti-rabbit IgG (H+L) (1:5,000, Life Technologies 31462). Blots were developed using the Amersham ECL Prime kit (Cytiva RPN2232) or SuperSignal™ West Femto Maximum Sensitivity Substrate (Thermo Fisher 34094) and imaged using a ChemiDoc XRS+ System (Bio-Rad). The intensity of bands was quantified using Fiji and then normalized to the corresponding controls.
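Band quantification of this kind is usually a two-step ratio. The sketch below assumes that "normalized to the corresponding controls" means dividing each target band by its loading control (for example GAPDH) and then scaling to the mean of the control samples; the text does not spell this out, so both steps are an assumption. The function name and intensities are hypothetical, and no Fiji output format is implied.

```python
def relative_to_control(control_samples, treated_samples):
    """Normalize band intensities to loading control, then to control mean.

    Each sample is a (target_intensity, loading_intensity) pair.
    Returns the treated samples expressed relative to a control mean of 1.0.
    """
    control_ratios = [target / loading for target, loading in control_samples]
    control_mean = sum(control_ratios) / len(control_ratios)
    return [(target / loading) / control_mean for target, loading in treated_samples]


# Hypothetical intensities in arbitrary units.
scramble = [(100.0, 50.0), (120.0, 60.0)]   # control shRNA lanes
knockdown = [(60.0, 60.0), (45.0, 50.0)]    # TDP-43 shRNA lanes

relative_levels = relative_to_control(scramble, knockdown)
print([round(level, 2) for level in relative_levels])
```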

Immunoblotting of TMEM106B

After 12 days of TDP-43 knockdown, cells were lysed and normalized as described above. To detect the impact of TDP-43 knockdown on the dimer level of TMEM106B, normalized cell lysates were mixed with 2x Laemmli buffer (Bio-Rad) containing 5% 2-mercaptoethanol (Sigma-Aldrich) and loaded directly onto 4-20% Tris-Glycine mini gels (Thermo Fisher XP04205BOX) for gel electrophoresis on ice. To detect the temperature sensitivity of the TMEM106B dimer, cell lysates were incubated at 4 °C, 37 °C, 70 °C, and 85 °C for 10 min and then loaded onto a 4-20% Tris-Glycine mini gel for gel electrophoresis on ice. After electrophoresis, samples were transferred onto 0.45-µm PVDF membranes (Bio-Rad 162-0115) at 250 mA for 2 h using the wet transfer method (Bio-Rad Mini Trans-Blot Electrophoretic Cell 170-3930). Membranes were blocked in 5% non-fat dry milk in TBST for 1 h and then incubated overnight at 4 °C in blocking buffer containing the antibody against TMEM106B (1:500, Cell Signaling Technology 93334). The secondary antibody, imaging, and quantitation are the same as described above.

Total RNA extraction from iNeurons

Total RNA was extracted using Trizol according to the manufacturer's instructions. Total RNA was then treated with Turbo DNase (Thermo Fisher AM2238) and cleaned using Zymo's clean and concentration columns. RNA quality was examined on a High Sensitivity RNA ScreenTape (Agilent, TapeStation).

qRT-PCR from iNeurons

Total RNA (500 ng) was reverse transcribed to cDNA using the PrimeScript™ RT Reagent Kit with gDNA Eraser (Takara, RR047A). qPCR was carried out using the PowerUp™ SYBR™ Green Master Mix kit and detected on a QuantStudio 3 system (Thermo Fisher). Primers are listed in Table S6.

Total RNA sequencing

Total RNA from cells treated with scramble shRNA or TDP-43 shRNA was used to construct RNA-sequencing libraries using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian kit (Takara 634411), according to the manufacturer's instructions. The resulting libraries were quantitated, pooled, and sequenced on a NextSeq 500 machine using the 150-cycle high output kit in 75 bp paired-end mode (Illumina).

Gene expression analysis

Adaptors in FASTQ files were trimmed using fastp. The adapter-trimmed FASTQ files were used for differential gene expression analysis using Salmon and DESeq2.

Splicing analysis

The adapter-trimmed FASTQ files were mapped to the human genome (hg38) following ENCODE-recommended settings using STAR. The uniquely mapped, properly paired reads were then used for splicing analysis using leafcutter. Cryptic splicing events were called by leafcutter.

APA analysis using RNA-seq data

Both in-house RNA-seq and publicly available datasets were used for the analysis. The publicly available datasets were downloaded from GEO or SRA (GSE126542, GSE121569, GSE147544, GSE196144, and ERP126666). The adapter-trimmed FASTQ files were mapped to the human genome (hg38) following ENCODE-recommended settings using STAR. The uniquely mapped, properly paired reads were then used for APA detection by either APAlyzer or QAPA, followed by differential analysis with DEXSeq. As the polyA site reference for detecting APA events, APAlyzer uses PolyA_DB3 (https://exon.apps.wistar.org/PolyA_DB/) and QAPA uses PolyASite 2.0 (https://polyasite.unibas.ch/).

3′ end-seq

To comprehensively map polyadenylation sites and quantify alternative polyadenylation, three control samples and three TDP-43 knockdown samples were used for 3′ end-seq. For each sample, 500 ng of total RNA was used to construct the 3′ end-seq library using the QuantSeq REV kit from Lexogen, according to the manufacturer's instructions. The resulting 3′ end-seq libraries were quantified by Qubit, checked for library sizes on a D1000 high sensitivity chip (Agilent Technologies, TapeStation), pooled, and sequenced on a NextSeq 500 or NextSeq 2000 machine using the 150-cycle high output kit in 75 bp paired-end mode (Illumina).

APA analysis using 3′ end-seq data

The sequenced 3′ end-seq libraries were adapter-trimmed and quality-filtered using bbduk. The filtered reads were mapped to the human genome (hg38) using STAR, and uniquely mapped, properly paired reads were then extracted using Samtools. A read was considered mis-primed and removed if it mapped to a region immediately upstream of six consecutive As or of a 10-bp region containing at least 60% As. The resulting filtered reads were analyzed using a modified version of LAPA61 that can identify polyA site-containing reads mapped to the reverse strand; default settings were used, except that a replication rate cutoff of 0.75 was required. A change in polyA site usage was considered significant if the usage difference between the two conditions was >10% with an adjusted p value < 0.05. PolyA sites were considered cryptic if their usage was ≤5% under the control condition but ≥10% under the TDP-43 knockdown condition and their usage increase was ≥10%. To identify TDP-43-dependent APA events, genes were considered only if they had at least two identified polyA sites; for genes with more than two polyA sites, the two sites with the largest usage changes were selected and used for plotting (Table S3). Cryptic polyA sites further include polyA sites that became activated upon TDP-43 knockdown and are located downstream of a gene's annotated 3′ end, which was extracted from a curated gene annotation database. Cryptic splicing events detected in RNA-seq data and 3′ end-seq data were used to search for coordinated cryptic polyadenylation events.
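The read filter and the usage thresholds described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, polyA site usage is assumed to be given as a fraction between 0 and 1, and the mis-priming check assumes the caller supplies the genomic sequence immediately downstream of the read's 3′ end, oriented along the transcript strand. Reading "six consecutive As" as the first six downstream bases is also an interpretation.

```python
def is_misprimed(downstream: str) -> bool:
    """Flag a read whose 3' end sits immediately upstream of an A-rich stretch.

    `downstream` is the genomic sequence directly following the read's 3' end
    (transcript-strand orientation); this input convention is an assumption.
    """
    seq = downstream.upper()
    # Rule 1: six consecutive As immediately downstream.
    if seq[:6] == "A" * 6:
        return True
    # Rule 2: a 10-bp downstream window with at least 60% As (>= 6 of 10).
    window = seq[:10]
    return len(window) == 10 and window.count("A") >= 6


def is_significant(usage_ctrl: float, usage_kd: float, padj: float) -> bool:
    """Usage difference > 10% between conditions with adjusted p < 0.05."""
    return abs(usage_kd - usage_ctrl) > 0.10 and padj < 0.05


def is_cryptic(usage_ctrl: float, usage_kd: float) -> bool:
    """Usage <= 5% in control, >= 10% in knockdown, and an increase >= 10%."""
    return (
        usage_ctrl <= 0.05
        and usage_kd >= 0.10
        and (usage_kd - usage_ctrl) >= 0.10
    )
```

Note that the third cryptic-site condition is not redundant: a site going from 4% to 11% usage meets the first two thresholds but increases by only 7%, so it would not be called cryptic.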

Calculation of polyA site score

For each polyA site identified in 3′ end-seq libraries, a 205 bp sequence centered at the site was extracted and used to calculate the polyA site score using Aparent2, which was then converted to a log odds ratio.

Evaluation of TMEM106B 3′ UTR length in FTLD-TDP brain samples by qRT-PCR

Our study cohort included a total of 83 postmortem cases classified into two main groups: healthy controls (n=27) and frontotemporal dementia (FTD) cases (n=56). A summary of patient data is included in Table S7. RNA was extracted from the frontal cortices of the healthy controls or the FTD patients using the RNeasy Plus Mini Kit (Qiagen), following the manufacturer's protocol and as previously described9,11,62. Up to three cuts of each sample were used for extraction, and only high-quality RNA samples were processed for downstream analysis. RNA concentration was measured using Nanodrop technologies (Thermo Fisher), and the RNA integrity number (RIN) was evaluated on an Agilent 2100 Bioanalyzer (Agilent Technologies). Subsequently, 500 ng of the extracted total RNA was reverse transcribed into cDNA using the High-Capacity cDNA Transcription Kit (Applied Biosystems) per the manufacturer's instructions. cDNA samples were analyzed in triplicate by quantitative real-time PCR (qRT-PCR) with SYBR GreenER qPCR SuperMix (Invitrogen) on a QuantStudio™ 7 Flex Real-Time PCR System (Applied Biosystems). Relative quantification of the long TMEM106B 3′ UTR and total TMEM106B levels was determined using the ΔΔCt method and normalized to two endogenous controls, GAPDH and RPLP0. All statistical analyses were performed using GraphPad Prism 10 (GraphPad Software). The Mann-Whitney test was used to compare frontal cortex RNA levels between healthy and FTD cases. Primers are listed in Table S6.
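For readers unfamiliar with the ΔΔCt calculation, a minimal Python sketch is given below. It is illustrative only and rests on assumptions not stated in the text: the two endogenous controls are combined by averaging their Ct values (equivalent to a geometric mean of their expression levels), amplification efficiency is taken as 100% (a doubling per cycle), and the function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct: float, reference_cts: list[float]) -> float:
    """Ct of the target minus the mean Ct of the endogenous controls.

    Averaging the reference Cts (e.g. GAPDH and RPLP0) is an assumption
    about how the two controls are combined.
    """
    return target_ct - mean(reference_cts)


def relative_quantity(sample_dct: float, calibrator_dct: float) -> float:
    """Fold change 2^(-ddCt), assuming 100% amplification efficiency."""
    ddct = sample_dct - calibrator_dct
    return 2.0 ** (-ddct)


# Hypothetical Ct values, for illustration only.
sample = delta_ct(24.0, [18.0, 20.0])      # dCt = 5.0
calibrator = delta_ct(25.0, [18.0, 20.0])  # dCt = 6.0
fold_change = relative_quantity(sample, calibrator)  # ddCt = -1.0
```

With these made-up values the target amplifies one cycle earlier in the sample than in the calibrator after normalization, which corresponds to a two-fold higher relative level.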

Construction of luciferase reporters

The short TMEM106B 3′ UTR and the long TMEM106B 3′ UTR were amplified from human genomic DNA (H1) and then cloned into the pmirGLO Dual-Luciferase Vector (Promega, E1330) using Gibson assembly. The proximal polyA site in the long TMEM106B 3′ UTR was then mutated using Gibson assembly. Mutations that disrupt the proximal polyA site were identified using Aparent2. All reporters were confirmed to have the correct sequences by whole-plasmid sequencing.

Measurement of luciferase activity

HEK293T cells were plated on a 96-well plate or a 24-well plate and transfected with luciferase reporters containing no insert, the short TMEM106B 3′ UTR, or the long TMEM106B 3′ UTR using Lipofectamine 3000 (Thermo Fisher, L3000001). Two days later, luciferase activities were measured in the transfected cells on the 96-well plate using the Dual-Glo® Luciferase Assay System (Promega, E2920), according to the manufacturer's instructions; the transfected cells on the 24-well plate were used for qRT-PCR as described above.

Quantitation and statistical analysis

All quantification and statistical analyses were done in R. Analysis details can be found in the figure legends and the main text. Genome tracks were prepared using IGV. All plots were prepared using ggplot2, ggpubr, patchwork, and ggrepel in R.

Data and code availability

The sequencing data generated in this study will be available at GEO. The data and code supporting the findings of this study are available from the corresponding authors upon reasonable request. The following publicly available data are used in this study: GSE126542, GSE121569, GSE147544, GSE196144, and ERP126666.