The project was granted a no-cost extension of Year 1 into July 2020; we are now in Year 2. A comprehensive overview of progress from project inception through October 2020 follows. The relevant objectives are summarized below.
Year 1:
Objective 1: Produce suitable plant materials, isolate high-quality genomic DNA samples for sequencing, generate PacBio genome sequence data, and assemble PacBio sequence reads as sequencing data are received.
Objective 2: Prepare RNA libraries for transcriptome sequencing and generate full-length transcript sequencing data on Nanopore sequencers.
Year 2:
Objective 1: Produce Hi-C sequence data; polish and finalize genome assemblies at the chromosomal scale.
Objective 2: Assemble transcriptomes, analyze gene structures, and annotate.
Objective 3: Complete haplotype phasing, conduct structural and relational genomic studies, and generate a list of all genes in each of the five genomes. Write manuscripts.
Results
Plant materials were produced and high-molecular-weight DNA (HMW-DNA) samples were prepared from Valencia orange (S, sensitive), Ruby Red grapefruit (S), Clementine mandarin (S), LB8-9 Sugar Belle® mandarin hybrid (T, tolerant), and Lisbon lemon (T). We generated raw sequence data for all five genomes and carried out preliminary assemblies and analyses. For four of the five genomes, the results exceeded the quality of any other publicly available citrus reference genome, even before Dovetail Hi-C proximity ligation sequencing to finalize assembly at the chromosome level. However, the quantity of grapefruit sequence was insufficient, so we prepared a new sample of Ruby Red grapefruit HMW-DNA. Technical issues with the PacBio Sequel II platform, encountered at the UCB sequencing facility and, as it turned out, at other sequencing centers as well, have been resolved; there is now sufficient grapefruit sequence to proceed with assembly and other downstream activities.
Because of reduced sequencing costs, we were able to enter additional important genomes into the pipeline beyond those originally proposed, including Carrizo citrange, sour orange, and Shekwasha (an important breeding parent for HLB tolerance). We performed Hi-C sequencing on two genomes and combined these data with the PacBio sequence of one of our target genomes, resulting in an improved chromosome-scale assembly. The two parental chromosome sets of the target genome have been phased (separated) using Illumina short reads from citrons, pummelos, and mandarins. By genome alignment and comparison to the Poncirus assembly (see below), minor assembly errors in repetitive regions have been fixed, resulting in a polished assembly; materials have been collected for transcript sequencing for annotation (i.e., identifying all the genes within the genome). The availability of high-quality assemblies for the three basic species (C. medica, C. reticulata, and C. maxima) will allow a more thorough and complete characterization of large-scale structural variants (SVs: deletions, insertions, etc.) in genomes of commercial interest. These SVs are the driving force for phenotypic diversity, especially among somatic mutants (e.g., different oranges and grapefruits). A manuscript on this work is in preparation.
A previously funded CRDF project supported the initial work to produce the first-ever high-quality reference genome of Poncirus trifoliata using the same pipeline, and the task was completed under the current project. A manuscript was accepted for publication in The Plant Journal, and the sequence will be released to the global citrus research community through Phytozome upon publication. By mining this new genome, we identified candidate genes within previously identified chromosomal regions for HLB tolerance, including a transcription factor gene and one disease resistance-like gene that are up-regulated by CLas and positively selected in trifoliate orange.
These genes are promising candidates for further research.
Conclusions
1. We completed all genome sequencing work under Year 1, Objective 1. Additional important genomes have been entered into the pipeline because of reduced sequencing costs, and sequence reads have been produced.
2. We have not yet generated all the full-length transcript sequence data proposed for Year 1. This goal was compromised for several reasons beyond our control, but progress is now being made.
3. Hi-C sequencing for proximity ligation was completed for two genomes and, together with Illumina short reads, resulted in a phased chromosome-scale assembly of one target genome.
4. Using our pipeline, we have produced a genome assembly of Poncirus, an important source of resistance to CLas and HLB; this assembly exceeds the quality of all previously produced citrus genomes. It was used to repair minor assembly errors in repetitive regions of the target genome described in item 3 above, and by mining the sequence we have identified genes to be targeted for HLB resistance.