
Mycotools: An Automated and Scalable Platform for Comparative Genomics

Zachary Konkel, Jason C. Slot (slot.1@osu.edu)

Department of Plant Pathology, The Ohio State University, Columbus, OH, 43210, United States

September 2023

doi: 10.1101/2023.06.20.545441

Comparative genomics comprises analyses that investigate the genetic basis of organismal biology and ecology, and these analyses have also been applied to high-throughput trait screening for applied purposes. The number of fungal genomes deposited in publicly available databases is currently growing exponentially. Because web-based analyses are constrained in size and efficiency and rarely offer cutting-edge software, comparative genomics research is often conducted in local computing environments. There is thus a need for an efficient, standardized framework for locally assimilating, curating, and interfacing with genomic data. We present Mycotools, a comparative genomics database software suite that automatically curates, updates, and standardizes local comparative genomics. Mycotools incorporates novel analysis pipelines built on a suite of modules that streamline routine-to-complex comparative genomic tasks. The Mycotools software suite serves as a foundation for accessible and reproducible large-scale comparative genomics on local compute systems.

INTRODUCTION

Comparative genomic analyses are widely applied in healthcare, agriculture, and ecological studies. In healthcare, phylogenomic analysis has allowed the monitoring of strain evolution for the public health response to SARS-CoV-2 (1–3). At a broad taxonomic scope, phylogenomics has radically transformed the understanding of organismal relationships across life (4). In agricultural research, comparative genomics has unveiled population-specific virulence factors in pathogens, while genome-wide association studies have generated gene targets for breeding pest resistance (5–7).

The rapid accumulation of publicly available genomes, made possible by advances in whole-genome sequencing, has expanded the potential scope and resolution of comparative genomics (Figure 1). Web-hosted public genome resources, such as GenBank at the National Center for Biotechnology Information (NCBI) and MycoCosm at the Joint Genome Institute (JGI), are central repositories for thousands of fungal genomes (8). GenBank’s fungal genome database appears to have grown exponentially since 2000 and is currently growing at a rate faster than one genome per day (Figure 1). These web databases incorporate important analysis tools, such as RefSeq BLAST (Madden, 2013) and the MycoCosm phylogeny (8). However, genomic methods have evolved rapidly in parallel with genome availability, making it impractical for web databases to implement most modern software. Therefore, online genome data are often assimilated locally and combined with genomes generated in the lab in order to take advantage of cutting-edge comparative genomics software.

Figure 1. Growth of GenBank’s fungal genome database over time (ordinary least squares, log(genomes) v. time, R² = 0.994).
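The log-linear fit reported for GenBank growth can be illustrated with a minimal sketch. The code below regresses log(count) on year by ordinary least squares and derives a doubling time from the slope. The counts are synthetic and made up for illustration (they are not GenBank data), the natural logarithm is an assumption, and the function name `fit_log_linear` is hypothetical.

```python
import math


def fit_log_linear(years, counts):
    """Ordinary least squares fit of log(count) against year.

    Returns (slope, intercept, r_squared) for log(count) = intercept + slope * year.
    """
    logs = [math.log(c) for c in counts]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, logs))
    ss_tot = sum((y - mean_y) ** 2 for y in logs)
    return slope, intercept, 1 - ss_res / ss_tot


# Synthetic counts that double every two years (NOT real GenBank data).
years = list(range(2000, 2012, 2))
counts = [2 ** i for i in range(len(years))]

slope, intercept, r2 = fit_log_linear(years, counts)
doubling_time = math.log(2) / slope  # years needed for the count to double
print(f"slope={slope:.4f}, R2={r2:.3f}, doubling time={doubling_time:.2f} years")
```

An R² close to 1 on the log scale, as reported in Figure 1, is what one would expect when growth is well approximated by an exponential curve.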

Locally analyzing genomic data is cumbersome because standardized local database software is not broadly available. Local genome databases are built by manual curation or in-house database assimilation scripting (9–14), resulting in both redundancy and non-standardized file formats. Despite the large effort required to create local databases, the rapid growth of available data quickly renders them obsolete. Acquiring available data is made more difficult by numerous legacy genome annotation file formats and discrepant formatting between genome repositories. The complexity of assimilating genome data locally is a significant inefficiency that may lead to the use of rapidly outdated or incomplete datasets.

We present Mycotools, a software suite designed to increase the efficiency, accessibility, and scalability of comparative genomics by implementing a standardized comparative genomics database format and interface. Mycotools is centered around MycotoolsDB (MTDB, Figure 2), a systematically assimilated local database of publicly available genome data. MTDB automatically downloads and curates GenBank and JGI genomic data in the mtdb format. The mtdb format contains the metadata and genomic data required for input to the Mycotools software suite and external software. This simplifies analysis by providing a single input file that is also a log of the metadata associated with each analysis. Mycotools scripts and libraries are built around the mtdb format, which enables pipelining of routine-to-complex comparative genomic analyses. Mycotools includes practical pipelines built from these modules, including homology search algorithms coupled with accession extraction, automated phylogenetic analysis, and synteny analysis pipelines. The Mycotools software suite is poised to serve as a foundation for a standardized comparative genomics interface.

DESIGN AND IMPLEMENTATION

MycotoolsDB increases the accessibility of large-scale comparative genomics

MycotoolsDB (MTDB) is designed for ease of use and for keeping pace with accumulating genome data. MTDB is initialized and updated by assimilating publicly available genomes from GenBank and MycoCosm with local data using the script mtdb update (Figure 2). This script initializes the database by dereplicating MycoCosm and NCBI data using the assembly accession (GenBank), submitter (GenBank), date modified (GenBank), and Portal ID (MycoCosm) fields. Dereplication references an existing MTDB, the primary MycoCosm genome table, and/or the NCBI primary table (prokaryotes.txt/eukaryotes.txt). NCBI genomes that contain “Joint Genome Institute” or “JGI” in the submitter field are removed because they overlap redundantly with MycoCosm. Once the data are dereplicated, Mycotools downloads the gene coordinates gff and masked assembly fasta. Each genome is systematically assigned a readable, informative “ome” code composed of the first three letters of the genus, the first three letters of the species, and a unique accession number for that codename (e.g., the first added Amanita muscaria genome is denoted amamus1). Version updates append a version number to the code (e.g., amamus1 becomes amamus1.1 following the first update). Local genomes are added to the primary MTDB by submitting a genome-delimited spreadsheet of metadata, assembly paths, and gene coordinate gff paths, which are then curated into the same format as downloaded genomes.
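The ome naming scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Mycotools implementation: the function names `next_ome` and `bump_version` are hypothetical, the accession number is assumed to be one more than the highest number already used for that genus/species prefix, and names shorter than three letters are not handled.

```python
import re


def next_ome(genus, species, existing_omes):
    """Return a new ome code such as 'amamus1' for a genus/species pair.

    Assumption: the accession number is the highest existing number for the
    same six-letter prefix plus one (version suffixes like '.1' are ignored).
    """
    prefix = (genus[:3] + species[:3]).lower()
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)(?:\.\d+)?$")
    used = [int(m.group(1)) for ome in existing_omes if (m := pattern.match(ome))]
    return f"{prefix}{max(used, default=0) + 1}"


def bump_version(ome):
    """Append or increment the version suffix: amamus1 -> amamus1.1 -> amamus1.2."""
    base, _, version = ome.partition(".")
    return f"{base}.{int(version) + 1 if version else 1}"


database = ["athter1.2", "amamus1.1", "amamus2"]
print(next_ome("Amanita", "muscaria", []))        # first Amanita muscaria genome
print(next_ome("Amanita", "muscaria", database))  # next free accession number
print(bump_version("amamus1"))                    # first update of amamus1
```

Keeping the version suffix separate from the accession number means an updated genome retains a recognizable identity across database releases.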

Mycotools increases the accessibility and software compatibility of comparative genome analyses by systematically and uniformly curating the general feature format (gff) annotation files and assembly fasta. The gff format needs curation because assembling its hierarchical structure is slow (15) and because the format has been through three primary versions, with discrepant formatting that breaks software compatibility. Mycotools expedites gff parsing by curating and applying MTDB accessions to each entry, which directly links related entries without requiring hierarchical assimilation. For example, gff data by default has to be iteratively parsed to tie CDS entries to their parent genes by first linking each CDS to its parent RNA entry. Mycotools curation additionally improves software compatibility by updating legacy gff versions to gff3 and establishing discrete, uniform guidelines for formatting. Legacy gff versions are brought to gff3 by curating introns into exons and translating start and stop codon coordinates into gene entries. RNAs, exons, genes, pseudogenes, and CDSs are other acceptable gff entry types. The original assembly contig accessions are retained and prepended with the genome’s ome code. Full formatting requirements are detailed in the Mycotools usage guide (github.com/xonq/mycotools).
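To make the cost of hierarchical parsing concrete, the sketch below resolves each CDS in an uncurated gff3 to its gene by walking `Parent` links through the intermediate RNA entry. It illustrates the default situation the paragraph describes, not Mycotools code: the function names are hypothetical, and it assumes each feature has a unique `ID` and a single `Parent`, which real gff3 files do not guarantee.

```python
def parse_attributes(field):
    """Split a gff3 attribute column (key=value;key=value) into a dict."""
    return dict(pair.split("=", 1) for pair in field.strip().split(";") if "=" in pair)


def cds_to_gene(gff_lines):
    """Map each CDS ID to its gene ID by following Parent links (CDS -> RNA -> gene)."""
    parent_of, feature_type, cds_ids = {}, {}, []
    # First pass: record every feature's type and parent.
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        attrs = parse_attributes(cols[8])
        feature_id = attrs.get("ID")
        if feature_id is None:
            continue
        feature_type[feature_id] = cols[2]
        if "Parent" in attrs:
            parent_of[feature_id] = attrs["Parent"]
        if cols[2] == "CDS":
            cds_ids.append(feature_id)
    # Second pass: climb the hierarchy from each CDS until a gene is reached.
    resolved = {}
    for cds in cds_ids:
        node = cds
        while feature_type.get(node) != "gene" and node in parent_of:
            node = parent_of[node]
        resolved[cds] = node
    return resolved


toy_gff = [
    "##gff-version 3",
    "contig1\ttoy\tgene\t1\t900\t.\t+\t.\tID=gene1",
    "contig1\ttoy\tmRNA\t1\t900\t.\t+\t.\tID=rna1;Parent=gene1",
    "contig1\ttoy\tCDS\t1\t900\t.\t+\t0\tID=cds1;Parent=rna1",
]
print(cds_to_gene(toy_gff))
```

When every entry already carries a shared accession, as the curated MTDB gff is described as doing, related entries can be grouped in a single pass without the second traversal.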

MycotoolsDB standardizes interfacing with large-scale comparative genomics

MycotoolsDB (MTDB) genome database formatting standardizes and streamlines comparative genomics by providing a file format that can scale with the exponential growth of genomic data. MTDB files serve as the sole input that references genome data for each analysis, which increases efficiency and allows analyses to be scaled quickly by extracting rows of interest for subset analyses. This format is flexible for future updates, can operate as a uniform standalone reference database, and is an easily disseminated log of genome metadata for data reporting. MTDB files are scalable tab-delimited reference files in which each row contains a unique genome, its taxonomic metadata, and assembly and annotation file paths (Table 1). Mycotools analysis scripts input the primary MTDB by default, though subsets of the primary MTDB can be generated by extracting rows (genomes) of interest for an analysis. For example, a BLAST or profile model search can be executed against the primary MTDB via db2search, or an analysis can be restricted to a particular lineage by extracting an MTDB file of a taxonomic lineage using mtdb extract and inputting the extracted MTDB file to db2search. If external software does not have built-in compatibility with the MTDB format, reference MTDB files can still be used as a database because Mycotools curates the data and db2files copies the requested genome data referenced in MTDB files. Due to the simplicity of the format, MTDB files are also amenable to standard shell scripting.

Table 1.

| MTDB Column | MTDB Data Example | POSIX character constraint | Data source |
| --- | --- | --- | --- |
| ome | athter1.2 | [a-z0-9\.] | First 3 letters genus (ath), first 3 letters species (ter), unique accession number (1), “.” and MTDB version number (2) |
| genus | Athelia | [a-zA-Z-] | Source metadata |
| species | termitophila | [a-zA-Z] | Source metadata |
| strain | TMB5 | [a-zA-Z0-9\.] | Source metadata |
| taxonomy | {‘kingdom’: ‘Fungi’, ‘phylum’: ‘Basidiomycota’, ‘subphylum’: ‘Agaricomycotina’, ‘class’: ‘Agaricomycetes’, ‘order’: ‘Atheliales’, ‘family’: ‘Atheliaceae’} | [a-zA-Z] | JSON of taxonomy derived from querying genus and kingdom to NCBI’s taxonomy hierarchy |
| version | 1.0 | [0-9\.] | Source metadata |
| source | ncbi | [a-z0-9\.] | Source metadata |
| biosample | SAMN15352002 | [a-zA-Z0-9] | Source metadata |
| assembly_acc | GCA_014898675.1 | [a-zA-Z0-9_\.] | Source metadata |
| [column name not recovered] | 20230101 | [0-9] | Date assimilated [YYYYmmdd] |
| published | Konkel et al., 2021 | [a-zA-Z0-9-_\.,; ] | Source metadata |
| fna | $MTDB_PATH/../data/fna/athter1.fna | | Curated MTDB assembly |
| faa | $MTDB_PATH/../data/faa/athter1.faa | | Curated MTDB proteome |
| gff3 | $MTDB_PATH/../data/gff3/athter1.gff3 | | Curated MTDB gene coordinates |
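Because MTDB files are plain tab-delimited text, extracting a lineage can be sketched with a short script. The example below is a hedged illustration rather than the behavior of mtdb extract: it assumes the first five fields are ome, genus, species, strain, and taxonomy (as in Table 1), that the taxonomy field is valid JSON, and that there is no header row. The names `read_rows` and `extract_lineage` and the second toy row are hypothetical.

```python
import json

# Assumed order of the leading columns; remaining fields are kept unparsed.
LEADING_COLUMNS = ["ome", "genus", "species", "strain", "taxonomy"]


def read_rows(lines):
    """Parse tab-delimited MTDB-style rows into dicts."""
    rows = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        row = dict(zip(LEADING_COLUMNS, fields))
        row["taxonomy"] = json.loads(row["taxonomy"])
        row["rest"] = fields[len(LEADING_COLUMNS):]
        rows.append(row)
    return rows


def extract_lineage(rows, rank, name):
    """Keep rows whose taxonomy has the requested name at the given rank."""
    return [r for r in rows if r["taxonomy"].get(rank, "").lower() == name.lower()]


# Toy rows: the first follows the Table 1 example, the second is illustrative.
toy_lines = [
    "\t".join(["athter1.2", "Athelia", "termitophila", "TMB5",
               json.dumps({"order": "Atheliales", "family": "Atheliaceae"}),
               "1.0", "ncbi"]),
    "\t".join(["amamus1", "Amanita", "muscaria", "toy",
               json.dumps({"order": "Agaricales", "family": "Amanitaceae"}),
               "1.0", "ncbi"]),
]

rows = read_rows(toy_lines)
subset = extract_lineage(rows, "order", "Atheliales")
print([row["ome"] for row in subset])
```

The filtered rows could be written back out as tab-delimited text to serve as a smaller reference file for a downstream analysis.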

Mycotools pipelining modules streamline comparative genomic analyses

Mycotools modules form the foundation for complex comparative genomics pipelines. Table 2 delineates Mycotools modules, which manipulate the mtdb Python object to perform routine tasks that serve as the foundation for pipelining complex analyses. For example, accession acquisition in desired formats is streamlined through the acc2fa/acc2gff/acc2gbk/acc2locus scripts (Table 2), synteny diagram generation is implemented in gff2svg, and automated phylogenetic reconstruction, which includes alignment, trimming, and tree building modules, is built into fa2tree. fa2clus is an example pipeline built from Mycotools modules and external dependencies; it iteratively clusters sequences based on sequence similarity until a cluster within a minimum and maximum number of sequences is obtained. fa2clus is additionally a module for the Cluster Reconstruction and Phylogenetic Analysis (CRAP) pipeline (16), which analyzes the evolution of an input locus by integrating locus and gene accession acquisition, homolog identification, sequence similarity clustering, phylogenetic reconstruction, and synteny analysis (Figure 3). Mycotools also implements a novel maximum-likelihood microsynteny phylogeny pipeline, db2microsyntree, which reconstructs a tree data structure that recapitulates divergence in gene order (17). When external software is used, the db2files module symbolically links MTDB genome data for seamless file acquisition (Figure 3).
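The idea of clustering iteratively until the focal cluster falls within a size range can be sketched as follows. This is not the fa2clus implementation: it swaps in a simple single-linkage cluster over a toy similarity table and a bisection search on the similarity threshold, and all function names and scores are hypothetical.

```python
def focal_cluster(similarity, focal, threshold):
    """Single-linkage cluster containing `focal`, joining pairs scoring >= threshold."""
    cluster, stack = {focal}, [focal]
    while stack:
        current = stack.pop()
        for (a, b), score in similarity.items():
            if score < threshold:
                continue
            if a == current and b not in cluster:
                cluster.add(b)
                stack.append(b)
            elif b == current and a not in cluster:
                cluster.add(a)
                stack.append(a)
    return cluster


def cluster_until_in_range(similarity, focal, min_seqs, max_seqs, max_iterations=20):
    """Bisect the threshold until the focal cluster size is within [min_seqs, max_seqs].

    Returns (threshold, cluster), or None if no threshold is found in time.
    """
    low, high = 0.0, 1.0
    for _ in range(max_iterations):
        threshold = (low + high) / 2
        cluster = focal_cluster(similarity, focal, threshold)
        if len(cluster) > max_seqs:
            low = threshold   # too many sequences: require higher similarity
        elif len(cluster) < min_seqs:
            high = threshold  # too few sequences: allow lower similarity
        else:
            return threshold, cluster
    return None


# Toy pairwise similarities between a query "q" and four other sequences.
toy_similarity = {("q", "a"): 0.9, ("q", "b"): 0.7, ("b", "c"): 0.6, ("c", "d"): 0.2}
print(cluster_until_in_range(toy_similarity, "q", min_seqs=3, max_seqs=4))
print(cluster_until_in_range(toy_similarity, "q", min_seqs=2, max_seqs=2))
```

Bisection works here because raising the threshold can only shrink the focal cluster, so cluster size is monotonic in the threshold.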

Table 2.

| Script | Operation | Output(s) | Dependencies | Citations |
| --- | --- | --- | --- | --- |
| acc2fa | Generates fasta of input accession | fasta | | |
| acc2gff | Generates gff of input accession | gff3 | | |
| acc2gbk | Generates gbk of input accession | gbk | acc2fa, acc2gff | |
| acc2locus | Acquires locus of accession | fasta, gff3 | | |
| add2gff | Adds edited gene models from programs such as Exonerate to MTDB | mtdb, gff3 | | |
| assemblyStats | Generates assembly statistics for MTDB/fasta | assembly statistics | | |
| annotationStats | Generates annotation statistics for MTDB/gff3 | annotation statistics | | |
| coords2fa | Extracts coordinates from nucleotide fasta(s) | fasta | | |
| crap | Recapitulates the evolution of a gene cluster on a gene-by-gene basis; inputs a query of gene cluster genes, searches for homologs, implements fa2clus to truncate large homolog sets below an inputted maximum sequence number, reconstructs phylogenies of each query, and maps locus synteny diagrams onto phylogeny tips | locus mapped phylogeny graphics, newick | acc2fa, acc2locus, acc2gff, db2search, fa2clus, fa2tree, gff2svg, FastTree, ETE | (18, 19) |
| db2files | Copies/symbolically links fna, faa, and/or gffs from an MTDB input | files | | |
| db2microsyntree | Reconstructs a maximum likelihood microsynteny tree from a microsynteny alignment of neighborhoods surrounding near single-copy orthologs | newick | Cogent3, IQ-TREE, MMseqs | (20–22) |
| db2search | Queries an accession/profile model via a search algorithm (BLAST, DIAMOND, or MMseqs) against an MTDB and compiles a fasta output of the results | fasta | acc2fa, BLAST, DIAMOND, MMseqs, HMMER | (22–25) |
| extract_mtdb | Generates a sub-MTDB from the primary MTDB via taxonomy or other input parameters (mtdb extract) | mtdb | | |
| fa2clus | Clusters proteins via amino acid alignment similarity; optionally, fa2clus will iteratively cluster around a focal gene until a cluster within a minimum-maximum set of sequences is acquired and output a fasta of the resulting cluster | fasta | MMseqs, scikit-learn | (22, 26) |
| fa2hmmer2fa | Queries an hmm database, such as Pfam, against a fasta, extracts hits, and outputs a fasta of either the full gene hits or query-covered coordinates | fasta | HMMER | (24) |
| fa2tree | Generates a maximum likelihood phylogeny of an input fasta/alignment by aligning, trimming, and tree generation | phylogeny | MAFFT, ClipKIT, IQ-TREE, FastTree | (19, 21, 27, 28) |
| gff2seq | Acquires nucleotide or amino acid sequences from a gff and fna fasta input | fasta | Biopython | (29) |
| gff2svg | Generates a locus diagram svg from a gff or list of gffs and annotates via the “product=” attribute | svg | DNA Features Viewer | (30) |
| jgiDwnld | Downloads JGI data (transcripts, assemblies, gene coordinate files, or proteomes) directly via Portal ID queries or an MTDB input | transcript fasta, protein fasta, assembly fasta, gff3 | | |
| manage_mtdb | Primary MycotoolsDB management utility (mtdb manage) | | | |
| mtdb | MycotoolsDB interface and central utility | | | |
| ncbiDwnld | Downloads NCBI data (accessions, assemblies, transcripts, gene coordinate files, or proteomes) directly via unique accession IDs or an MTDB input | transcript fasta, protein fasta, assembly fasta, gff3, fastq | Biopython | (29) |
| ome2name | Translates ome codes to genus, species, strain, and/or assembly accession ID | edited input data | | |
| predb2mtdb | Curates and submits in-house genomes to primary MTDB (mtdb predb2db) | mtdb | | |
| update_mtdb | Initializes/updates primary MTDB, assimilates publicly available genome data, and updates locally curated data (mtdb update) | mtdb | | |

Testing Mycotools throughput using a dataset of a fungal subphylum

To demonstrate the standardization and efficiency of Mycotools-facilitated comparative genomics, we used a test MTDB of the fungal subphylum Ustilaginomycotina (Basidiomycota). We used the Mycotools software suite to acquire genome assembly and annotation statistics, reconstruct phylogenies and synteny diagrams of the nitrate assimilation gene cluster, identify single-copy orthologs, reconstruct a 14-gene species phylogeny, and generate a microsynteny tree that recapitulates gene order divergence between species. All computation was conducted using 12 Quad Core Intel Xeon 6148 Skylake processors and used 1.43 GB RAM.

RESULTS

Mycotools automatically curates and assimilates public genomic data into a local database. The primary Mycotools-formatted database (MTDB) of all publicly available fungi contains 3,709 fungal genomes (2023, April 11) and the prokaryote MTDB contains 267,857 (2022, September 22). Data assimilation is primarily limited by the query rate of GenBank (600 queries per minute) and MycoCosm (1 query per minute).

To demonstrate the efficiency of Mycotools for routine-to-complex genomic analyses, we performed test analyses (Figure 3) referencing 42 publicly available Ustilaginomycotina (Basidiomycota, Fungi) genomes. Each analysis is executed with a single command. The local assimilation of the reference dataset MycotoolsDB was completed in 83 minutes (m) using mtdb update. Annotation statistics for the entire database were obtained in 3 seconds (s) using annotationStats, while assembly statistics were gathered in 5s using assemblyStats. Reconstructing gene phylogenies and synteny diagrams of the Ustilaginomycotina nitrate assimilation gene cluster using crap took 2m 23s. To generate a multigene species phylogeny, we identified 14 single-copy orthologs in all genomes in 31s using db2search. Generating a robust multigene partition phylogeny using fa2tree with 1000 ultrafast IQ-TREE bootstrap replicates then took 924m 5s. Finally, reconstructing a maximum-likelihood microsynteny tree representing gene order divergence between species using db2microsyntree took 2m 33s.

Beta versions of Mycotools have been implemented to facilitate novel biological discovery using large-scale local comparative genomic analysis. The Mycotools CRAP pipeline is a start-to-finish phylogenetic and synteny analysis pipeline (16) that has been used to identify horizontal transfer of the neuroactive ergot alkaloid biosynthetic gene cluster across multiple taxonomic classes (31). A separate analysis identified a prospective mechanism of fungal horizontal transfer by identifying transposases flanked with gene clusters, using MTDB to enable large-scale gene cluster identification and homology searching across a database of 1,649 publicly available fungal genomes (5). Mycotools also facilitated the identification of lineage-specific duplications of the industrially important flavor genes of shiitake mushrooms by pipelining an iterative phylogenetic reconstruction analysis using 451 genomes of the mushroom-forming order Agaricales (Basidiomycota, Fungi) (32). In addition to these analyses, Mycotools is directly integrated with the gene cluster detection algorithm CLOCI as the source of data input for a pilot analysis of 2,247 fungal genomes (33). Mycotools has also been used to expedite data acquisition, genome annotation, and phylogenetic analysis of genes for novel genomes (34–36).

AVAILABILITY AND FUTURE DIRECTIONS

Mycotools is freely available under a BSD 3-clause public license at github.com/xonq/mycotools. The MTDB design is taxonomy-agnostic, and is poised to suit all taxonomic domains in the future. Future iterations will improve efficiency and scalability by vectorizing serial computing functions and converting to compiled languages, such as Rust.

ACKNOWLEDGEMENTS

We would like to acknowledge the Ohio Supercomputer Center for offering cutting-edge computing resources and Emile Gluck-Thaler, Kelsey Scott, Guillermo Valero David, Isabel Emmanuel, Lauren Slattery, Nicolle Omiotek, and Hannah Toth for implementing and testing the alpha Mycotools software suite.

REFERENCES

1. S. W. Attwood, S. C. Hill, D. M. Aanensen, T. R. Connor, O. G. Pybus, Phylogenetic and phylodynamic approaches to understanding and combating the early SARS-CoV-2 pandemic. Nat Rev Genet. 23, 547–562 (2022).

2. A. Lai, A. Bergna, C. Acciarri, M. Galli, G. Zehender, Early phylogenetic estimate of the effective reproduction number of SARS-CoV-2. Journal of Medical Virology. 92, 675–679 (2020).

3. A. Nemudryi, A. Nemudraia, T. Wiegand, K. Surya, M. Buyukyoruk, C. Cicha, K. K. Vanderwood, R. Wilkinson, B. Wiedenheft, Temporal Detection and Phylogenetic Assessment of SARS-CoV-2 in Municipal Wastewater. Cell Reports Medicine. 1, 100098 (2020).

4. Hiras, Y.-W. Wu, S. A. Eichorst, B. A. Simmons, S. W. Singer, Refining the phylum Chlorobi by resolving the phylogeny and metabolic potential of the representative of a deeply branching, uncultivated lineage. ISME J. 10, 833–845 (2016).

5. E. Gluck-Thaler, T. Ralston, Z. Konkel, C. G. Ocampos, V. D. Ganeshan, A. E. Dorrance, T. L. Niblack, C. W. Wood, J. C. Slot, H. D. Lopez-Nicora, A. A. Vogan, Giant Starship Elements Mobilize Accessory Genes in Fungal Genomes. Molecular Biology and Evolution. 39, msac109.

6. [...] Tavera, C. P. Sansaloni, J. Burgueño, C. Ortiz, C. L. Aguirre-Mancilla, J. G. Ramírez-Pimentel, P. Vikram, S. Singh, GWAS to Identify Genetic Loci for Resistance to Yellow Rust in Wheat Pre-Breeding Lines Derived From Diverse Exotic Crosses. Frontiers in Plant Science. 10 (2019) (available at https://www.frontiersin.org/articles/10.3389/fpls.2019.01390).

7. S. Wyka, S. Mondo, M. Liu, V. Nalam, K. Broders, A large accessory genome and high recombination rates may influence global distribution and broad host range of the fungal plant pathogen Claviceps purpurea. PLOS ONE. 17, e0263496 (2022).

8. I. V. Grigoriev, R. Nikitin, S. Haridas, A. Kuo, R. Ohm, R. Otillar, R. Riley, A. Salamov, X. Zhao, F. Korzeniewski, T. Smirnova, H. Nordberg, I. Dubchak, I. Shabalov, MycoCosm portal: gearing up for 1000 fungal genomes. Nucleic Acids Research. 42, D699–D704 (2014).

9. G. Koczyk, J. Pawłowska, A. Muszewska, Terpenoid Biosynthesis Dominates among Secondary Metabolite Clusters in Mucoromycotina Genomes. Journal of Fungi. 7, 285 (2021).

10. Li, J. L. Steenwyk, Chang, Wang, T. Y. James, J. E. Stajich, J. W. Spatafora, M. Groenewald, C. W. Dunn, C. T. Hittinger, X.-X. Shen, A. Rokas, A genome-scale phylogeny of the kingdom Fungi. Current Biology. 31, 1653–1665.e5 (2021).

11. J. C. Navarro-Muñoz, J. Collemare, Evolutionary Histories of Type III Polyketide Synthases in Fungi. Front. Microbiol. 10 (2020), doi:10.3389/fmicb.2019.03018.

12. A. Rokas, J. H. Wisecaver, A. L. Lind, The birth, evolution and death of metabolic gene clusters in fungi. Nat Rev Microbiol. 16, 731–744 (2018).

13. J. F. Sanchez, R. Entwistle, J.-H. Hung, J. Yaegashi, S. Jain, Y.-M. Chiang, C. C. C. Wang, B. R. Oakley, Genome-Based Deletion Analysis Reveals the Prenyl Xanthone Biosynthesis Pathway in Aspergillus nidulans. J. Am. Chem. Soc. 133, 4010–4017 (2011).

14. J. W. Spatafora, Y. Chang, G. L. Benny, K. Lazarus, M. E. Smith, M. L. Berbee, G. Bonito, N. Corradi, I. Grigoriev, A. Gryganskyi, T. Y. James, K. O’Donnell, R. W. Roberson, T. N. Taylor, J. Uehling, R. Vilgalys, M. M. White, J. E. Stajich, A phylum-level phylogenetic classification of zygomycete fungi based on genome-scale data. Mycologia. 108, 1028–1046 (2016).

15. A. R. Quinlan, I. M. Hall, BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26, 841–842 (2010).

16. J. C. Slot, A. Rokas, Horizontal Transfer of a Large and Highly Toxic Secondary Metabolic Gene Cluster between Fungi. Current Biology. 21, 134–139 (2011).

17. T. Zhao, A. Zwaenepoel, J.-Y. Xue, S.-M. Kao, Z. Li, M. E. Schranz, Y. Van de Peer, Whole-genome microsynteny-based phylogeny of angiosperms. Nat Commun. 12, 3498 (2021).

18. J. Huerta-Cepas, F. Serra, P. Bork, ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data. Molecular Biology and Evolution. 33, 1635–1638 (2016).

19. M. N. Price, P. S. Dehal, A. P. Arkin, FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE. 5, e9490 (2010).

20. R. Knight, P. Maxwell, A. Birmingham, J. Carnes, J. G. Caporaso, B. C. Easton, M. Eaton, M. Hamady, H. Lindsay, Z. Liu, C. Lozupone, D. McDonald, M. Robeson, R. Sammut, S. Smit, M. J. Wakefield, J. Widmann, S. Wikman, S. Wilson, H. Ying, G. A. Huttley, PyCogent: a toolkit for making sense from sequence. Genome Biol. 8, R171 (2007).

21. B. Q. Minh, H. A. Schmidt, O. Chernomor, D. Schrempf, M. D. Woodhams, A. von Haeseler, R. Lanfear, IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Molecular Biology and Evolution. 37, 1530–1534 (2020).

22. M. Steinegger, J. Söding, MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol. 35, 1026–1028 (2017).

23. B. Buchfink, K. Reuter, H.-G. Drost, Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods. 18, 366–368 (2021).

24. R. D. Finn, J. Clements, S. R. Eddy, HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 39, W29–W37 (2011).

25. J. Ye, S. McGinnis, T. L. Madden, BLAST: improvements for better sequence analysis. Nucleic Acids Research. 34, W6–W9 (2006).

26. Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, É. Duchesnay, Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 12, 2825–2830 (2011).

27. K. Katoh, K. Misawa, K. Kuma, T. Miyata, MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).

28. J. L. Steenwyk, T. J. B. Iii, Y. Li, X.-X. Shen, A. Rokas, ClipKIT: A multiple sequence alignment trimming software for accurate phylogenomic inference. PLOS Biology. 18, e3001007 (2020).

29. B. Chapman, J. Chang, Biopython: Python Tools for Computational Biology. SIGBIO Newsl. 20, 15–19 (2000).

30. V. Zulkower, S. Rosser, DNA Features Viewer: a sequence annotation formatting and plotting library for Python. Bioinformatics. 36, 4350–4352 (2020).

31. Scott, Konkel, Gluck-Thaler, G. E. V. David, C. F. Simmt, Grootmyers, Chaverri, J. Slot, Endophyte genomes support greater metabolic gene cluster diversity compared with non [...]

32. S. Sierra-Patev, B. Min, M. Naranjo-Ortiz, B. Looney, Z. Konkel, J. C. Slot, Y. Sakamoto, J. L. Steenwyk, A. Rokas, J. Carro, S. Camarero, P. Ferreira, G. Molpeceres, F. J. Ruiz-Dueñas, A. Serrano, B. Henrissat, E. Drula, K. W. Hughes, J. L. Mata, N. K. Ishikawa, R. Vargas-Isla, S. Ushijima, C. A. Smith, J. Donoghue, S. Ahrendt, W. Andreopoulos, G. He, K. LaButti, A. Lipzen, V. Ng, R. Riley, L. Sandor, K. Barry, A. T. Martínez, Y. Xiao, J. G. Gibbons, K. Terashima, I. V. Grigoriev, D. Hibbett, A global phylogenomic analysis of the shiitake genus Lentinula. Proceedings of the National Academy of Sciences. 120, e2214076120 (2023).

33. Z. Konkel, L. Kubatko, J. Slot, CLOCI: Unveiling cryptic gene clusters with generalized detection [...]

34. I. B. Emanuel, Z. M. Konkel, K. L. Scott, G. E. Valero David, J. C. Slot, F. Peduto Hand, Whole-Genome Sequence Data for the Holotype Strain of Diaporthe ilicicola, a Fungus Associated with Latent Fruit Rot in Deciduous Holly. Microbiology Resource Announcements. 11, e00631-22 (2022).

35. Z. Konkel, K. Scott, J. C. Slot, Draft Genome Sequence of the Termite-Associated “Cuckoo Fungus,” Athelia (Fibularhizoctonia) sp. TMB Strain TB5. Microbiology Resource Announcements. 10, e01230-20.

36. [...] First Report of Colletotrichum sansevieriae Causing Anthracnose of Snake Plant (Dracaena trifasciata) in Ohio and its Draft Genome. Plant Dis (2022), doi:10.1094/pdis-10-22-2476-pdn.