Comparative genomics of Pseudomonas syringae reveals convergent gene gain and loss associated with specialization onto cherry (Prunus avium)

Genome-wide analyses of the effector- and toxin-encoding genes were used to examine the phylogenetics and evolution of pathogenicity amongst diverse strains of Pseudomonas syringae causing bacterial canker of cherry ( Prunus avium ), including pathovars P. syringae pv morsprunorum ( Psm ) races 1 and 2, P. syringae pv syringae ( Pss ) and P. syringae pv avii. (cid:1) Phylogenetic analyses revealed Psm races and P. syringae pv avii clades were distinct and were each monophyletic, whereas cherry-pathogenic strains of Pss were interspersed amongst strains from other host species. (cid:1) A maximum likelihood approach was used to predict effectors associated with pathogenicity on cherry. Pss possesses a smaller repertoire of type III effectors but has more toxin biosynthesis clusters than Psm and P. syringae pv avii . Evolution of cherry pathogenicity was correlated with gain of genes such as hopAR1 and hopBB1 through putative phage transfer and horizontal transfer respectively. By contrast, loss of the avrPto/hopAB redundant effector group was observed in cherry-pathogenic clades. Ectopic expression of hopAB and hopC1 triggered the hypersensitive reaction in cherry leaves, conﬁrming computational predictions. (cid:1) Cherry canker provides a fascinating example of convergent evolution of pathogenicity that is explained by the mix of effector and toxin repertoires acting on a common host.

Comparative genomics of Pseudomonas syringae reveals convergent gene gain and loss associated with specialization onto cherry (Prunus avium) Introduction Pseudomonas syringae is a species complex, associated with plants and the water cycle, comprising several divergent clades that frequently recombine (Young, 2010;Parkinson et al., 2011;Berge et al., 2014;Baltrus et al., 2017). It is a globally important pathogen, causing disease on over 180 different host species. P. syringae is responsible for recurring chronic diseases in perennial crops, such as cherry canker (Lamichhane et al., 2014), and also sporadic outbreaks on annual crops, such as bacterial speck of tomato (Solanum lycopersicum) (S ßahin, 2001). Individual strains are reported to be specialized and assigned to over 60 hostspecific pathovars; some of these are further divided into races which show host genotype specificity (Joardar et al., 2005). Strains also exist that can infect a variety of crop species, indicating that specialization is not always the norm (Monteil et al., 2013;Bartoli et al., 2015a,b). This complexity makes P. syringae an important model to study the evolution of host specificity (O'Brien et al., 2011;Mansfield et al., 2012).
High-throughput sequencing has become a major tool in bacteriology (Edwards & Holt, 2013). With the increasing number of genomes available, population-level analyses allow complex evolutionary questions to be addressed, such as how disease epidemics emerge and what ecological processes drive the evolution of pathogenicity Monteil et al., 2016). Before genomic methods were available, mutational studies of P. syringae were used to identify functional virulence factors in pathogenesis, such as type III secretion system effector (T3E) repertoires and toxins (Lindgren, 1997;Bender et al., 1999). Some T3Es were also shown to act as avirulence (avr) factors when detected by a corresponding pathogen recognition (R) protein in the host, leading to effectortriggered immunity (ETI) (Jones & Dangl, 2006). ETI is often associated with the hypersensitive response (HR) which is a cell death mechanism important in preventing pathogen spread (Morel & Dangl, 1997). Evidence suggests that P. syringae has evolved a functionally redundant repertoire of effectors, which is postulated to allow pathogen populations to lose/modify expendable avr elicitors, with minimal impact on overall virulence (Arnold & Jackson, 2011). It is proposed that as pathogen lineages specialize, they finetune their effector repertoires to maximize fitness in this niche by ensuring adequate growth and transmission, whilst avoiding detection by the plant immune system. Host range becomes restricted because the pathogen may lose effectors important for disease on other hosts or gain effectors detected by other plant species (Schulze- This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. Lefert & Panstruga, 2011). Many genomics studies have therefore focused on linking variation in virulence gene complements with particular diseases (Baltrus et al., 2011(Baltrus et al., , 2012O'Brien et al., 2012). Much of the understanding of P. syringae-plant interactions has been achieved using herbaceous plant models. Woody pathosystems provide a greater challenge (Lamichhane et al., 2014). Population genomics of P. syringae pv actinidiae, the causal agent of kiwifruit canker, revealed that three pathogenic clades, with distinct effector sets, have arisen during kiwifruit cultivation (McCann et al., 2013. A study of the olive pathogen P. syringae pv savastanoi revealed that the hopBL effector family is overrepresented in wood-infecting pathovars (Matas et al., 2014). Genes involved in the metabolism of aromatic compounds, phytohormone production, tolerance to reactive oxygen species and sucrose metabolism have also been associated with virulence on woody tissues (Green et al., 2010;Bartoli et al., 2015b;Buonaurio et al., 2015;Nowell et al., 2016).
This study used genomics to examine the evolution of strains that cause bacterial canker on sweet and wild cherry (both Prunus avium). Clades of P. syringae that constitute the main causal agents of bacterial canker include P. syringae pv morsprunorum (Psm) race 1 and race 2 and P. syringae pv syringae (Pss) (Bull et al., 2010;Bultreys & Kaluzna, 2010). In addition, P. syringae pv avii causes bacterial canker of wild cherry (M enard et al., 2003). The cherrypathogenic clades of P. syringae are reported to exhibit differences in virulence, host range and lifestyle (Crosse & Garrett, 1966;Scortichini, 2010), making the P. syringae-cherry interaction a good pathosystem to study convergent gain of pathogenicity. The genomic analysis has been coupled with robust pathogenicity testing (Hulin et al., 2018) and functional analysis of potential avirulence (avr) genes. This study provides a proof of concept that genomics-based methods can be used to identify candidate genes involved in disease and will likely become the major tool in disease monitoring, diagnostics and host range prediction.

Bacteria, plants and pathogenicity tests
Methods used for bacterial culture and sources of plants were as described in Hulin et al. (2018) and are detailed in Supporting Information Methods S1. Plant species utilized included P. avium L. and Nicotiana tabacum L. Pseudomonas strains are listed in Table 1. Escherichia coli was grown on lysogeny broth (LB) agar plates and grown overnight at 37°C. Antibiotic concentrations: kanamycin, 50 lg ml À1 ; gentamycin, 10 lg ml À1 ; spectinomycin, 100 lg ml À1 ; nitrofurantoin, 100 lg ml À1 . X-gal was used at a concentration of 80 lg ml À1 . Tables S1-S3 list the P. syringae mutants, plasmids and primers used in this study.
Pathogenicity tests, performed on detached cherry leaves and analysed as in Hulin et al. (2018) are described in Methods S1.

Genome sequencing, assembly and annotation
Bioinformatics commands for analyses performed in this paper are available on Github (https://github.com/harrisonlab/pseud omonas/). Genome sequencing using Illumina and genome assembly were performed as in Hulin et al. (2018). For long-read sequencing, PacBio (Pacific Biosystems, Menlo Park, CA, USA) and MinION (Oxford Nanopore, Oxford, UK) were used. High molecular weight DNA was extracted using a cetyltrimethylammonium bromide method (Feil et al., 2012). For the PacBio sequencing of strains R1-5244, R2-leaf and syr9097, DNA samples were sent to the Earlham Institute (Norwich) for PacBio RSII sequencing.
Plasmid profiling was performed using an alkaline-lysis method and gel electrophoresis (Moulton et al., 1993;Neale et al., 2013). Genomes were submitted to GenBank and accession numbers are listed in Table 1.

Orthology analysis
OrthoMCL (Li et al., 2003) was used to identify orthologous genes. All genomes were re-annotated using RAST (Aziz et al., 2008) to ensure similar annotation quality. For this reason, the Illumina short-read assemblies of the four long-read sequenced genomes (R1-5244, R1-5300, R2-leaf and syr9097) were used in orthology analysis. OrthoMCL was run with default settings and a 50 residue cut-off length. All RAST annotated protein files used in this analysis are available on Github (https://github.com/ harrisonlab/pseudomonas/).

Phylogenetic and genomic analysis of Pseudomonas syringae
Nucleotide sequences of single-copy genes present in all genomes were aligned using CLUSTALW (Larkin et al., 2007) and trimmed using GBLOCKS (Castresana, 2000). Gene alignments were concatenated using GENEIOUS (Kearse et al., 2012). RAxML-AVX v.8.1.15 (Stamatakis, 2014) was used in partitioned mode to build the maximum likelihood phylogeny, with a general time reversible (GTR) gamma model of substitution and 100 nonparametric bootstrap replicates. To detect core genes that may have undergone recombination, the program GENECONV (Sawyer, 1989) was used as in Yu et al. (2016). Whole-genome alignments were performed using PROGRESSIVEMAUVE (Darling et al., 2010).

Virulence and mobility gene identification
All T3E-encoding protein sequences were downloaded from pseudomonas-syringae.org, including the recent classification of Hulin et al.
Hulin et al.
Hulin et al.
Hulin et al.
Hulin et al.
Hulin et al.
Hulin et al.

New Phytologist
HopF effectors into four alleles (Lo et al., 2016). tBLASTN (Altschul et al., 1990) was used to search each genome for homologues with a score of ≥ 70% identity and ≥ 40% query length to at least one sequence in each effector family. Nucleotide sequences were extracted and manually examined for frameshifts or truncations. Disrupted genes were classed as pseudogenes. A heatmap of effector presence was generated using heatmap.2 in gplots (Warnes et al., 2016). Interproscan (Quevillon et al., 2005) identified protein domains, and Illustrator for Biological Sequences was used for visualization (Liu et al., 2015). Genomic regions containing effectors were aligned using MAFFT (Katoh et al., 2002). Pathovar designation, phylogroup, isolation information, cherry pathogenicity (reference for when tested; nt, not tested), publication of genome sequence and NCBI accession numbers are listed. Strains in bold were considered pathogenic in cherry. cv, cultivar of sweet cherry or plum. Long-read sequenced genomes are highlighted with shading. Strains are ordered, first with those sequenced in this study, followed by other Pseudomonas strains from cherry and plum used in plasmid profiling analysis and previously pathogenicity tested in Hulin et al. (2018). Next, further strains isolated from cherry and plum sequenced elsewhere are listed. Finally, the remaining strains were only used in comparative analysis. Note that all 108 genomes were used in initial orthology analysis but only 102 were used in the final phylogeny and comparative genomics. *The pathogenic status of MAFF302280 on cherry is debated. This strain is reported to be the pathotype strain of P. syringae pv morsprunorum (Psm; Sawada et al., 1999), so is assumed to be equivalent to CFBP 2351, NCPPB2995, ICMP5795 and LMG5075. The strain NCPPB2995 was reported to be potentially nonpathogenic (Gardan et al., 1999). Whilst, the 'same' strain LMG5075 tested positive for pathogenicity in a recent publication (Gilbert et al., 2009). There is no definite link showing that MAFF302280 is the same strain as the others listed as it is not linked to them in online databases (http:// www.straininfo.net/) or taxonomy-focused publications (Bull et al., 2010). It is assumed to be putatively pathogenic in this study owing to its close relatedness to other Psm R2 strains; however, further pathogenicity tests would be required to fully confirm this. A similar analysis was performed for phytotoxin and auxin biosynthesis, wood degradation, ice nucleation and plasmidassociated genes. Protein sequences were obtained from NCBI (Table S4) and blasted against the genome sequence as noted earlier. Prophage identification was performed using PHASTER (Arndt et al., 2016).

Gain and loss analysis
Gain loss mapping engine (GLOOME) was used to plot the gain and loss of genes on the core-genome phylogeny (Cohen et al., 2010). Effector genes were considered present even if predicted to be pseudogenes, as these can still be gained and lost. The optimization level was set to 'very high', a mixture model allowing variable gain and loss distributions was used and the distribution type was GENERAL_GAMMA_PLUS_INV. Highly probable events (P ≥ 0.80) on the branches leading to cherry-pathogenic strains were extracted.

BayesTraits analysis
BAYESTRAITS v.2 was used to correlate T3E gene evolution with pathogenicity (Pagel, 2004). A binary matrix was created of effector family presence and pathogenicity of each strain. The effector matrix was collapsed into effector families, as the different alleles likely perform similar biological functions in planta (Cunnac et al., 2011). Putative pseudogenes were considered absent, as they may be nonfunctional. The BAYESTRAITS methodology followed an approach as in Press et al. (2013) and is described in detail in Methods S1.

Horizontal gene transfer analysis
For each effector family, best-hit nucleotide sequences were aligned using CLUSTALW (Larkin et al., 2007). RAxML was used to build a phylogenetic tree with a GTR model of evolution and 1000 bootstrap replicates. Incongruence with the core-genome tree was examined visually. To further assess horizontal transfer, a species-gene tree reconciliation method RANGER-DTL v.2 (Bansal et al., 2012) was used, as in Bruns et al. (2018). Full details are described in Methods S1.

Identification of genomic islands
Genomic islands (GIs) were identified in the PacBio-sequenced cherry pathogenic strains using ISLANDVIEWER3 (Dhillon et al., 2015). Islands were manually delimited as in McCann et al. (2013). BLASTN was utilized to determine if these GIs were present in other P. syringae genomes. As most GIs exceeded 10 kb and most genome assemblies were highly fragmented, the islands were split into 0.5 kb sections before analysis to prevent false negatives due to contig breaks. An island was concluded as likely to be present if all sections produced hits with a query length > 30%. To validate this approach, the Illumina-sequenced genome assemblies of the PacBio-sequenced strains were searched for their own islands.

General DNA manipulations and bacterial transformations
Cloning and other molecular biology techniques, including ectopic expression of potential avr genes, were as described in earlier studies (Staskawicz et al., 1984;Arnold et al., 2001;. Details are provided in Methods S1.

Genome assembly and sequencing statistics
Eighteen P. syringae strains isolated from cherry and plum were phenotyped for pathogenicity and genome sequenced in a previous study (Hulin et al., 2018). To increase this sample, nine strains isolated from wild cherry and five additional non-Prunus out-groups were genome sequenced using the Illumina MiSeq. The genomes of eight cherry strains sequenced elsewhere (Baltrus et al., 2011;Nowell et al., 2016) were also downloaded from NCBI.
Information on the origin and pathogenicity of each strain is summarized in Table 1. Twenty-eight were considered pathogenic to cherry, including all Pss isolated from cherry and plum. By contrast, three Psm race 1 strains from plum (R1-5300, R1-9326 and R1-9629) and one from cherry, strain R1-9657, failed to induce canker on cherry following tree inoculations; and three strains of unknown taxonomy isolated from plum and cherry (Ps-9643, Ps-7928C and Ps-7969) were also nonpathogenic (references in Table 1). The cherry pathogens are referred to as their described pathovar names throughout this study. To simplify figures, cherry pathogens are highlighted and the first few letters of the pathovar name were used. 'Pss' becomes 'syr', as otherwise Pss could refer to other pathovars beginning with 's' (e.g. savastanoi).
All strains included in this study were sequenced using Illumina MiSeq. Three representative strains (R1-5244, R2-leaf and syr9097) were sequenced using PacBio and the nonpathogenic Psm R1 strain R1-5300 was sequenced using Oxford Nanopore, to obtain more complete genomes. Table 2 summarizes the genome assembly statistics for all strains sequenced in this study and Hulin et al. (2018). Illumina genomes assembled into 23-352 contigs, whilst the long-read sequenced genomes assembled into one to six contigs. The number of plasmids in each strain was determined by plasmid profiling (Fig. S1). Psm R1 and R2 strains possessed between two and six plasmids, P. syringae pv avii 5271 possessed six plasmids, whereas, apart from three strains (syr5275, syr7928A, syr9644) with one plasmid each, most cherry-pathogenic strains of Pss did not possess plasmids. The strain syr9097, which was sequenced using PacBio, lacked plasmids. The genomes sequenced with long-read technology all assembled into the correct number of contigs corresponding to chromosome and plasmids, apart from R1-5300. The chromosome of this strain was separated into two contigs (tig0 and tig75), based on a whole-genome alignment with Psm R1 strain R1-5244 (Fig. S2).

Core-genome phylogenetic analysis
To examine the relatedness of strains, an analysis of core genes was carried out using 108 genomes of strains from the wellstudied phylogroups 1-3 isolated from both plants and aquatic environments. A maximum likelihood phylogeny based on 1035 concatenated core genes was constructed (Fig. S3). There was low support for certain P2 and P3 clades based on bootstrap analysis. To determine if particular taxa were causing low support, the analysis was systematically repeated for the two phylogroups, with non-cherry strains removed. Support and tree likelihood values were compared (Table S8). Within P3, the removal of P. syringae pv eriobotryae or P. syringae pv daphniphylli improved support, whilst the removal of P. syringae pv syringae 1212 improved support values in P2 (Figs S4, S5). The global analysis was then repeated with these taxa removed (Figs S6-S9). The final phylogeny (Fig. 1), with the highest mean branch support (92.8%), lacked P. syringae pv eriobotryae. The phylogeny, built using a 611 888 bp alignment, contained 102 taxa due to the removal of strains found to be identical to others (dendro4219, syr9630, R1-9629, R1-9326 and R1-5269). Most support values exceeded 70%, with good support for branches leading to cherrypathogenic clades.
One explanation for the low support within P2 and P3 was that these clades have undergone core-genome recombination. The program GENECONV (Sawyer, 1989) showed that 140 genes had putatively recombined (127 288 bp total length, 20.8% of the alignment). Table S9 lists the number of recombination events per phylogroup. The most frequent core gene recombination occurred in P3 (73 genes affected), followed by 31 genes in P2 and only 13 in P1.
Cherry pathogens were found in all three phylogroups. The two Psm races (R1 in P3, R2 in P1) and P. syringae pv avii (P1)

Research
New Phytologist formed monophyletic clades. Within Psm R1, strains pathogenic to cherry formed a clade distinct from previously classified nonpathogenic strains (Hulin et al., 2018), indicating that there has been divergence in their core genomes. By contrast, Prunusinfecting strains of Pss were found across P2, interspersed with strains isolated from other plants and aquatic environments. To ensure that genomic comparisons between P2 strains were based on differential pathogenicity, several related non-Prunus strains were pathogenicity tested on cherry leaves. In planta bacterial populations of non-Prunus strains were reduced compared with Prunus Pss strains ( Fig. S10; Table S10).

Search for virulence factors
The hrp pathogenicity island All sequenced strains contained the hrp pathogenicity island required for conventional Type III secretion. Core effector genes from the adjacent conserved effector locus (Alfano et al., 2000), such as avrE1, hopM1 and hopAA1, were present (Fig. S11), However, hopAA1 was truncated in both Psm R1 and R2 due to inversion events. The hopAA1 gene was truncated in Psm R2, whilst in Psm R1 both hopAA1 and hopM1 were truncated (Fig. S12).
Type III effectors and other virulence genes Genomes were then scanned for known virulence genes and a heatmap of presence, absence and pseudogenization was constructed (Fig. 2). In terms of T3Es, there was variation both between and within the different cherry-pathogenic clades. Notably, Psm R1, which contained strains pathogenic and nonpathogenic on cherry, showed clear differentiation in effector repertoires (Table S11). Psm R1, R2 and P. syringae pv avii possessed 24-34 effector genes, whereas Pss strains possessed nine to 15. The reduced effector repertoire of Pss was representative of P2 strains as previously noted (Baltrus et al., 2011;Dudnik & Dudler, 2014). Table 3 lists the effectors in each long-read genome assembly in order of appearance.
Non-T3 virulence factors were identified. All pathogenic Psm R1 strains possessed the coronatine biosynthesis clusters, which were plasmid borne in Psm R1-5244. All cherry-pathogenic Pss strains possessed at least one biosynthesis gene cluster for the toxins syringomycin, syringolin and syringopeptin, with several strains possessing all three. Strains within clade P2b possessed the biosynthesis genes for mangotoxin. The nonpathogenic cherry P2b strains Ps7928C and Ps7969 lacked all toxin biosynthesis clusters.
A cluster of genes named WHOP (woody hosts and Pseudomonas) thought to be involved in aromatic compound (lignin) degradation (Caballo-Ponce et al., 2016) was present in Psm R1 and R2, whilst P. syringae pv avii and most Pss strains contained no WHOP homologues. Two cherry P2d strains (syr2339 and syr7924) did, however, possess the catechol catBCA cluster. The genomes were also searched for the ice nucleation gene cluster. Members of Psm R1, Pss and P. syringae pv avii strains all possessed genes involved in ice nucleation (Fig. 2), whilst Psm R2 lacked the complete set of ice nucleation genes.
Associating type III effector evolution with host specificity T3E evolution was statistically correlated with cherry pathogenicity, using BAYESTRAITS and GLOOME (Pagel, 2004;Cohen et al., 2010). BAYESTRAITS takes a binary matrix of two traits within a phylogeny and determines if changes in the two characters (effector gene and pathogenicity) have evolved independently or dependently. Fig. 3(a) shows the likelihood ratio of cherry pathogenicity being correlated with each effector family's evolution, with significantly associated effectors highlighted.
BAYESTRAITS analysis using the core-genome phylogeny predicted the evolution of six T3E families was linked to cherry pathogenicity. These were hopBF, hopAB, hopH, hopAR, avrPto and hopBB. To account for any phylogenetic uncertainty, the program was also run 100 times on the full set of 100 bootstrapped trees generated by RAxML. The evolution of T3Es hopBF, hopAR and hopAB was always associated with pathogenicity for all 100 trees in all runs, indicating strong association. However, the T3E genes avrPto, hopBB and hopH were only significantly correlated for 88%, 77% and 62% of trees respectively, averaged across the different runs (Fig. S13). To determine how these genes had been gained or lost across the phylogeny, the program GLOOME was used (Cohen et al., 2010). Fig. 3(b) illustrates the predicted gain and loss of these T3Es on the branches leading to clades pathogenic to cherry. Those putatively associated with pathogenicity (high probability of gain in cherry-pathogenic clades) included hopAR1, hopBB1, hopBF1 and hopH1. The T3Es hopAB1 and avrPto1 were found to be lost from cherry pathogenic Psm R1, whilst the hopAB1 and hopAB3 alleles were pseudogenized in Psm R2 and P. syringae pv avii (Fig. 3b). All effector gain and loss events are presented in Fig. S14 and Table S12. Fig. S15 shows the phylogeny with branch labels used in GLOOME. GLOOME predicted that key effectors have been gained in multiple clades. The hopAR1 gene has been gained in Psm R1, Psm R2, Pss and P. syringae pv avii. The T3E hopBB1 was present in the majority of strains within Psm R1, R2 and P. syringae pv avii but was absent from Pss strains. It showed high probability of gain on branches leading to both Psm R2 and P. syringae pv avii. However, GLOOME predicted loss in two Psm R1 strains, indicating that the gene may have experienced dynamic evolution in cherry pathogens. The hopBB1 effector is closely related to members of the hopF family and avrRpm2 (Lo et al., 2016). In addition to the significant acquisition of hopBB1 homologues, the hopF family was expanded in cherry pathogens. Pathogenic strains in Psm R1 and R2 all possessed two hopF alleles each (hopF3 and hopF4, and hopF2 and hopF4; see Fig. 2). P. syringae pv avii did not possess any hopF homologues, but had gained hopBB1. By contrast, Pss strains lacked all hopF members.

Research
New Phytologist gene transfer (HGT) between the pathogenic clades, as their sequences clustered together. There has been possible effector exchange between Psm R1, R2 and P. syringae pv avii. To predict precisely where transfers had occurred on the phylogeny, the program RANGER-DTL was utilized (Bansal et al., 2012). Table 4 reports T3Es that exhibited evidence of HGT between cherry pathogens (gene trees are presented in Figs S16, S17). Full transfer events are listed in Table S13, and Fig. S18 shows the phylogeny with branch labels used in RANGER-DTL. The BAYESTRAITS pathogenicity-correlated T3Es, hopBB and hopBF, both showed evidence of HGT. Fig. 4(a) shows examples of T3Es putatively undergoing HGT between cherry pathogenic clades highlighted in red. Alignments of the flanking regions (Fig. 4b) showed homology between the cherry pathogens and included mobile elements likely involved in recombination events. Putatively transferred effectors were mostly plasmid encoded in the longread genomes (Table 3). In R1-5244, several of these genes were encoded on one plasmid (Contig 3), whilst in R2-leaf they were found on two plasmids (Contig 6 and 8).
The pathogenicity-associated T3E gene hopAR1 was present in 23/28 cherry pathogens and showed probable gain in pathogenic clades. Phylogenetic analysis of this T3E (Fig. 5a) showed that the sequences for the different cherry pathogenic clades did not cluster with each other, indicating convergent acquisition. Prophage identification (Table S14) did, however, reveal that this T3E is predicted to have been gained in Psm R1 and R2 within different phage sequences, whilst in Pss it is on a genomic island (Fig. 5b), and so has been acquired via distinct mechanisms. The Psm R1 phage is 51.5 kb, described as intact, and contains both hopAR1 and a truncated version of hopBK1. The Psm R2 phage sequence was 37.1 kb and was described as 'incomplete', indicating it did not have all the components of an active prophage. Further analysis of this region in Psm R2 and P2 strains revealed a shared adjacent tRNA-Thr gene (Fig. S19a,b). Within P2, although cherry Pss strains lacked the phage, several strains isolated from bean (syr2675, syr2676 and syr2682) possessed the hopAR1 gene within a phage homologous to that in Psm R2. The syr2675 hopAR1 sequence was also the most closely related homologue of Psm R2 hopAR1 (Fig. 5a). This evidence suggests that this effector gene may have been transferred via phage between phylogroups.
Many T3Es are mobilized between bacteria on GIs. GIs were identified for the three PacBio-sequenced strains of Psm R1, Psm R2 and Pss (Tables S15-S17). R1-5244 GIs contained the coronatine biosynthesis cluster and six T3Es. In R2-leaf, eight T3E genes were located on GIs, whilst in syr9097 three T3Es were found on genomic islands. These GIs were then searched for in other P. syringae genomes to identify potential sources of transfer, and Fig. S20 shows heatmaps of GI presence. The Psm R1 GIs included several found only in pathogenic Psm R1 strains, differentiating them from the nonpathogens. These included the coronatine biosynthesis cluster (GI1), hopF3 (GI6) and hopAT1 (GI14). Most Psm R1 GIs produced hits across P. syringae, particularly in P1 and P3. Psm R2 GIs were most commonly found in P1. Several were shared with other cherry-pathogenic clades, including those containing hopAF1 (GI36), hopAT1 (GI3) and hopD1 (GI6). Finally, although most islands identified in syr9097 were commonly found across the species complex, those containing T3Es (GI30, GI23 and GI26) appeared to be P2 specific, indicating that cherry-pathogenic strains likely gained these islands from other members of P2.

Functional analysis of potential avr genes
To validate predictions from genome analysis, cloning was used to identify avirulence factors active in cherry. The effector genes avrPto and hopAB were absent from cherry pathogens, and their evolution was theoretically linked to pathogenicity. Several other candidate avirulence effectors were identified that were absent from cherry pathogens but present in close out-groups (Fig. 6).
Avirulence-gene identification focused on Psm R1, as any T3E variation within this clade may be due to differences in host specificity rather than phylogenetic distance. Potential avirulence T3E genes included avrA1, avrPto1, hopAA1, hopAB1, hopAO2 and hopG1, which had full-length homologues in nonpathogenic Psm R1 strains but were absent from or truncated in pathogens. These genes were cloned from the strain R1-5300 (except hopAO2, which was cloned from R1-9657). The effector avrRps4 was also cloned from P. syringae pv avellanae (Psv) BPIC631, a close relative of Psm R2. This effector was absent from most cherry-pathogenic strains (Fig. 6). Several pathogens possessed the full-length gene (R2-leaf, R2-9095 and P. syringae pv avii), but lacked the KRVY domain that functions in planta (Fig. S21) (Sohn et al., 2009). The hopAW1 gene was cloned from Pph1448A as this T3E has undergone two independent mutations in Pss strains, disrupting the beginning of the gene (Fig. S22). Finally, hopC1 was cloned from the Aquilegia vulgaris pathogen RMA1, which is basal to the Psm R2 clade as it is absent from all cherry-pathogenic strains.
Nine effectors were cloned into pBBR1MCS-5 and conjugated into three pathogenic strains (R1-5244, R2-leaf and syr9644). The presence of the plasmids did not affect multiplication in vitro. Knock-out strains for the T3SS gene hrpA were obtained for R1-5244 and R2-leaf to act as nonpathogenic controls that could not secrete T3Es and failed to cause the HR on tobacco (Fig. S23).
Bacterial multiplication experiments were conducted in cherry leaves. The transconjugants expressing HopAB1 or HopC1 failed to multiply to the same levels as the pathogenic empty vector (EV) controls or produce disease lesions. The ectopic expression The values are obtained from means of 100 independent runs of the program with error bars showing AE SE above and below the mean. Those effectors that were not significantly associated with pathogenicity are coloured in grey. Coloured bars were associated with pathogenicity (P ≤ 0.05 in > 90% of runs). Those genes that were hypothesized to be gained in cherry pathogens (from gain loss mapping engine (GLOOME) analysis) are coloured in shades of blue, whilst where the significant gene was absent in cherry pathogenic clades the bar is coloured in shades of red. (b) Gain and loss of BAYESTRAITS-associated T3Es in cherry-pathogenic clades on the core-genome phylogeny predicted using GLOOME (P ≥ 0.8). The phylogeny and heatmap of these effector genes is presented (heatmap as in Fig. 2, effector gene names are colour coded based on the bar colours in (a) and cherry-pathogenic strains are highlighted by pink horizontal shading of columns). Phylogroups (P1-P3) are labelled. Strains with long-read sequenced genomes are in black boxes. *The probability of this effector being gained/lost was slightly < 0.8, but exceeded 0.65 (see Supporting Information Table S12 for details). # The hopAR1 gene has been gained on the branch leading to P. syringae pv morsprunorum (Psm) R1 (including pathogens and nonpathogens). (2018)

Research
New Phytologist of AvrA1, AvrRps4 and HopAW1 also caused significant reductions in growth, but this reduction was not consistently seen across all three pathogenic strains (Fig. 7a). As the addition of the hopAB1 gene reduced pathogenicity, full-length hopAB2 and hopAB3 genes were also cloned from PsvBPIC631 and RMA1, and were also found to reduce pathogen multiplication (Fig. 7b).
To investigate the induction of the HR by the HopAB family and HopC1, inoculations were performed at high concentrations (2 9 10 8 colony-forming units (CFU) ml À1 ) as in Hulin et al. (2018). In Psm R1 and R2, the addition of these T3Es led to more rapid tissue collapse than observed in EV controls, indicative of HR induction (Fig. 7c,d); HopC1 and HopAB1 were particularly effective. With Pss, however, EV transconjugants themselves caused rapid tissue collapse, making it impossible to recognize an induced HR as symptom development was not significantly different.
The hopAB1 gene is found in a mobile-element-rich c. 40 kb region in the nonpathogenic Psm R1-5300, missing from the pathogen Psm R1-5244 (Fig. 8a). Meanwhile, Psm R2 and P. syringae pv avii possessed putatively pseudogenized hopAB3 alleles (Fig. 8b), and P. syringae pv avii also possessed a truncated hopAB1 gene (Fig. S24). hopAB3 is truncated in Psm R2 due to a 2 bp insertion (GG at position 1404 bp) leading to a premature stop codon, whilst in P. syringae pv avii a 218 bp deletion has disrupted the C-terminus. If expressed, the E3-ubiquitin ligase domain is completely absent from the Psm R2 protein and disrupted in P. syringae pv avii (Fig. 8c). Both HopAB3 alleles were also divergent enough that the Pto-interacting domain (PID) was not identified by Interproscan. To determine if the truncated Psm R2 HopAB3 allele induced any resistance response in cherry leaves, the gene was expressed in Psm R1-5244 and population growth measured. The addition of this gene did not lead to a significant reduction in growth compared with the EV control, unlike other hopAB alleles (Fig. 8d), and the transconjugant was still able to induce disease symptoms 10 d post inoculation (dpi) (Fig. 8e).
Overall, the data supported the conclusion that expressing alleles of hopAB and hopC reduced bacterial multiplication in cherry and were consistent with HR induction by Psm R1 and R2. However, it should be noted that any growth changes exhibited might have been influenced by aberrant transcription or translation of these effectors in the plant due to expression in trans.

Core-genome phylogenetics
Phylogenetic analysis confirmed that cherry pathogenicity has evolved multiple times within P. syringae. Psm R1, R2 and P. syringae pv avii each formed distinct monophyletic clades, whereas cherry-pathogenic Pss strains were distributed across the P2 clade, indicating that cherry pathogenicity has either evolved multiple times within P2 or that this clade is not particularly specialized. To confirm this genomic prediction of pathogenicity, several additional P2 strains isolated from bean, pea and lilac were tested for pathogenicity in cherry. They each produced lower population levels in cherry leaves than cherry pathogens, suggesting that strains isolated from cherry and plum are more pathogenic to their hosts of origin (Fig. S10). Many P2 strains have previously been named Pss on the basis of lilac pathogenicity, despite being pathogenic to other plant species (Young, 1991). A new naming system within this phylogroup is desirable.

Search for candidate effectors involved in cherry pathogenicity
Gains and losses of T3Es were closely associated with pathogenicity. Virulence-associated effectors hopAR1, hopBB1, hopH1 and hopBF1 had been gained in multiple cherry-pathogenic clades. The hopAR1 effector has been studied in the bean pathogen P. syringae pv phaseolicola R3 (1302A), as a GI-located avr gene (avrPphB) whose protein is detected by the corresponding R3 resistance protein in planta (Pitman et al., 2005;Neale et al., 2016). HopAR1 also acts as a virulence factor as a cysteine protease which targets receptor-like kinases to interfere with plant R2/P. syringae pv avii Next to cluster of mobile elements (next to hopT1) -Y hopT1 R2/P. syringae pv avii Next to cluster of mobile elements (next to hopO1) -Y Where the effector gene is present in the PacBio-or MinION-sequenced strains, its chromosomal or plasmid location is indicated. The type III effector genes hopO1 and hopT1 were not present in the PacBio-sequenced strains and therefore it is uncertain if they are on plasmids or chromosomal. *Effector gene is disrupted in some strains and is labelled as a pseudogene. PAMP-triggered immunity (PTI) responses (Zhang et al., 2010). This effector could play a similar role in PTI suppression in cherry. HopBB1 and other members of the HopF family were abundant in cherry pathogens. All HopF members share an Nterminus and myristoylation sites for plant cell membrane localization (Lo et al., 2016) and interfere with PTI and ETI in model plants (Wang et al., 2010;Wu et al., 2011;Hurley et al., 2014). The presence of multiple hopF homologues in cherry pathogens and specific gain of hopBB1 suggested the importance of their function. In comparison, HopH1 and HopBF1 are understudied. HopH1 is a protease, homologous to the Ralstonia (a) (b) Fig. 4 Horizontal gene transfer has played a key role in the evolution of cherry pathogenicity. (a) Putative horizontal transfer of the hopBB1 and hopBF1 genes between different cherry-pathogenic clades based on the Pseudomonas syringae core-genome phylogeny. The different phylogroups are labelled (P1-P3), with P2 collapsed to concentrate on the other phylogroups. Strains that possess the effector gene are coloured in blue, and those that are cherrypathogenic are highlighted in red. Strains with long-read sequenced genomes are in black boxes. The transfer events predicted by RANGER-DTL are shown by purple arrows. The bar shows substitutions per site. (b) DNA alignments of genomic regions containing these two effector genes, showing similar flanking regions between cherry pathogens. Alignments are colour coded based on similarity; identical residues are in grey, whereas dissimilar residues appear in black. The effector gene is coloured in red, mobile element genes are in green and other coding sequences are in blue. Cherry-isolated strains are named in pink, whilst the nonpathogenic plum strain R1-5300 is in blue. Gene name abbreviations: Hypo, hypothetical protein gene; ISPsy4, insertion sequence; ME, mobile element. (2018) (Nahar et al., 2014). This T3E gene was found on GI37 in Psm R2-leaf and was within 3 kb of hopF4 (Fig. S25), indicating that these two T3Es may have been gained together. HopBF1 was first discovered in P. syringae pv aptata and oryzae (Baltrus et al., 2011), but its role is undetermined. This study therefore identified candidate T3Es important for cherry pathogenicity that should be the focus of future functional studies. Cherry and plum isolated strains are highlighted in pink and blue respectively; those names followed by single asterisks were nonpathogenic on cherry in controlled pathogenicity tests. Strains with long-read sequenced genomes are in black boxes. Bootstrap supports < 99% are shown. The bar is nucleotide substitutions per site. Double asterisks point to the clustering of P. syringae pv morsprunorum (Psm) R2 sequences with syr2675. (b) Genomic locations of the hopAR1 gene in the three PacBio-sequenced cherry pathogens. The gene is located within prophage sequences in Psm R1 and R2 (see Table S14 for details), whereas in syr9097 it is on a genomic island (GI) adjacent to a transfer RNA (tRNA) gene. Effector genes are coloured in red, other coding sequences in blue, phage genes predicted by PHASTER and mobile element genes are in green, tRNA genes in pink and GIs predicted (GI14 in Psm R2 and GI23 in P. syringae pv syringae (Pss)) in light blue. Predicted phage att sites are in dark green, with sites homologous to R2-leaf in Pss 9097 also indicated even though a phage is not predicted here. The ends of predicted prophage sequences are denoted with dashed green lines. # hopBK1 is a pseudogene in this strain. CDS, coding sequence; ME, mobile element; T3E, type III effector. Phytotoxin biosynthesis gene clusters were also identified. Coronatine is present on a plasmid in pathogenic Psm R1 and may be one of the factors that differentiate pathogens from nonpathogens in this clade. Coronatine functions in virulence by downregulating salicylic acid defence signalling (Grant & Jones, 2009). Necrosis-inducing lipodepsipeptide toxins were common in P2. All cherry-pathogenic Pss strains possessed at least one biosynthesis cluster. The ability of Pss strains to cause necrosis on cherry fruits has been linked to toxins (Scholz-Schroeder et al., 2001). Interestingly, two nonpathogenic P2b cherry strains lacked all phytotoxins, a deficiency that probably contributes to their lack of pathogenicity.

New Phytologist
All cherry-pathogenic Pss strains had reduced effector repertoires. This observation supports the hypothesis that a phenotypic trade-off exists, with strains retaining few T3Es, whilst relying more on phytotoxins for pathogenicity (Baltrus et al., 2011;Hockett et al., 2014). If this pathogenic strategy has evolved in the P2 clade, it raises the question as to how it affects host specificity and virulence. P2 strains often infect more than one host species (Rezaei & Taghavi, 2014). These strains probably possess fewer ETI-inducing avirulence factors that restrict effector-rich strains to particular hosts, so may be more successful generalists. The reduction in T3E repertoire, however, may be limiting, as strains may be less capable of the long-term disease suppression required at the start of a hemi-biotrophic interaction.
Most cherry-pathogenic clades possessed genes involved in aromatic compound degradation, shown to be important in virulence on olive (Caballo-Ponce et al., 2016), and ice nucleation genes that stimulate frost damage (Lamichhane et al., 2014). The fact that not all cherry-pathogenic clades possessed these genes suggests they are not essential requirements for bacterial canker; however, they may contribute to niche persistence. For example, Crosse & Garrett (1966) observed that Psm R1 survived in cankers for longer than Pss. Increased persistence might be linked to genes involved in woody-tissue adaptation.
Horizontal gene transfer has been important in the acquisition of key effectors HGT is important for effector shuffling within P. syringae (Arnold & Jackson, 2011). Pathogenicity-associated T3Es hopBB1 and hopBF1 were plasmid encoded and showed evidence of HGT between the cherry-pathogenic clades in P1 and P3. Plasmid profiling revealed that cherry pathogens in these phylogroups possessed native plasmids, some of which were putatively conjugative, indicating the importance of plasmids in gene exchange. By contrast, most cherry-pathogenic Pss strains lacked plasmids.
The T3E hopAR1 was chromosomal in all long-read sequenced genomes. This gene was found within distinct prophage sequences in Psm R1 and R2. To our knowledge this is the first reported example of a plant pathogen T3E located within a prophage sequence. Interestingly, the Psm R2 hopAR1 gene homologue was most similar to hopAR1 from a P2 bean strain syr2675, which is a close relative of cherry Pss. This strain possessed a homologous phage to Psm R2, indicating that HGT of this T3E between phylogroups may have been phage mediated. This striking example of convergent acquisition of hopAR1 in the cherry pathogens, putatively through distinct prophages in Psm R1 and R2, and a GI in Pss indicates that this T3E may have important roles in virulence. The well-characterized Identification of avirulence factors activating effector-triggered immunity in cherry. (a) Boxplot of an initial 10-d population count analysis of cherry pathogens (R1-5244, R2-leaf and syr9644) transconjugants expressing candidate avirulence genes. The data presented are based on one experiment, with three leaf replicates and three nested technical replicates (n = 9). Boxplots show median and interquartile range (IQR) and whiskers extend to values 1.59 IQR above and below the median. All data points are plotted with circles. Controls included the wild-type strain, a strain containing the empty pBBR1MCS-5 vector and a ΔhrpA deletion mutant (for R1-5244 and R2-leaf). A separate ANOVA was performed for each cherry pathogen (R1-5244, R2-leaf and syr9644) and the Tukey-HSD significance groups (P = 0.05; confidence level: 0.95) for each strain are presented above each boxplot. (b) Boxplot of 10-d population counts of cherry pathogens (R1-5244, R2-leaf and syr9644) expressing different HopAB alleles and HopC1. The data presented are based on three independent experiments (n = 27). Tukey-HSD significance groups are presented above each boxplot. (c) Symptom development of R1-5244, R2leaf, syr9644 transconjugants. Mean symptom score values are presented and represent two independent experiments (n = 6). Symptoms assessed as degree of browning of the infiltration site: 1, limited browning; 2, < 50%; 3, > 50%; 4, 100% of the infiltrated area brown. Analysis was based on area under disease progress curve (AUDPC) values (0-48 h). An ANOVA was performed on AUDPC values, with asterisks indicating significantly different from the empty vector (EV) control. (d) Symptom development over time on a representative leaf inoculated with R1-5244 transconjugants. HPI, hours post inoculation. The order of strains: 1, EV; 2, hopAB1; 3, hopAB2; 4, hopAB3; 5, hopC1. Arrows indicate the first appearance of symptoms associated with each strain and are coloured based on the graph in (c). ANOVA tables for all statistical analyses are presented in Tables S18-S24, and AUDPC values are in Table S25. CFU, colony-forming units. www.newphytologist.com P. syringae pv phaseolicola R3 homologue is not associated with a phage, but has been shown to undergo dynamic evolution on a mobile genomic island in planta in resistant bean cultivars (Neale et al., 2016). Several T3Es in Psm R1, R2 and Pss were located on GIs. To determine the likely source of GIs in cherry strains, all other P. syringae strains were searched for homologous sequences. There was evidence of Psm R1 and R2 islands being shared between cherry pathogen clades indicative of HGT events occurring between strains occupying the same ecological niche.

Functional genomics revealed convergent loss of an avr factor
Genes from the hopAB and avrPto families form a redundant effector group (REG) vital for early PTI suppression in herbaceous species (Jackson et al., 1999;Lin & Martin, 2005;Kvitko et al., 2009). Both effectors also trigger ETI by interacting with the serine-threonine kinase R protein Pto in tomato (Kim et al., 2002).
Across the P. syringae complex, the REG was common (Fig. S26), but cherry pathogens all lacked full-length genes. The hopAB1 gene has been lost from Psm R1, whilst the Psm R2 and P. syringae pv avii predicted HopAB3 proteins lacked the PID and E3-ubiquitin ligase domains through contrasting mutations. P. syringae pv avii also possessed a truncated hopAB1 gene (Fig. S24), lacking the PID domain. The lack of a PID in cherry pathogen HopAB proteins suggested that they could have diverged to avoid a Pto-based recognition system in cherry.
Full-length members of this REG were expressed in cherry pathogens to determine their role in planta. The addition of HopAB alleles (HopAB1-3) consistently reduced population growth of pathogenic strains in planta and triggered a response consistent with the HR. If this effector does trigger immunity in cherry, there may have been selection pressure for its loss or pseudogenization in cherry pathogens in order to reduce avirulence activity. The truncated version of HopAB3 in R2-leaf was found not to exhibit avirulence activity as its expression did not reduce the growth of R1-5244 in planta. Although AvrPto is part of the same REG, its expression had no effect on the ability of cherry pathogens to multiply in leaves. The absence of AvrPto from cherry pathogens is therefore unlikely to be driven by avirulence, but could be due to the lack of HopAB virulence targets in planta.  Table S26). (e) Representative image of symptoms 10 d post inoculation (dpi) with the different R1-5244 transconjugants when inoculated at a low concentration (2 9 10 6 colony-forming units (CFU) ml À1 ) to observe pathogenicity. Arrows point to pathogenic symptoms in the strain expressing hopAB3 R2-leaf and the empty vector (EV) strain, colour coded as in (d).

New Phytologist
As this REG is vital for early disease suppression in model strains, cherry pathogens must rely on other T3Es to fulfil this role.
The variation in hopAB1 presence in Psm R1 is intriguing. Psm R1 strains may be pathogenic on both cherry and plum (DhopAB1) or just pathogenic on plum (possessed hopAB1) as recorded in Hulin et al. (2018). This suggests that the host proteins in cherry that detect the presence of HopAB are not present/functioning in plum. Future studies may determine how the two host immune responses diverged and could examine hopAB diversity across Prunus pathogens. This study focused on bacterial canker of P. avium; however, strains isolated from additional Prunus spp. that cause other diseases were included, such as P. syringae pv cerasicola (bacterial gall of hybrid cherry Prunus 9 yedoensis; Kamiunten et al., 2000), P. syringae pv morsprunorum FTRSU7805 (canker of apricot), P. syringae pv amygdali (canker of almond) and P. syringae pv persicae (decline and canker of peach) ( Table 1). All apart from P. syringae pv amygdali 3205 and P. syringae pv persicae lacked HopAB (Fig. 2), indicating that there may be a conserved resistance mechanism regulating ETI activated by this effector family in Prunus species.
Linking genomics to the evolution of cherry pathogenicity Cherry pathogenicity has arisen independently within P. syringae, with strains using both shared and distinctive virulence strategies. Cherry-pathogenic clades in P1 and P3 have large effector repertoires. Cherry Pss were found across P2 with reduced T3Es and several phytotoxin gene clusters. Key events in the evolution of cherry pathogenicity (Fig. 9) appear to be the acquisition of virulence-associated effectors, often through HGT. Putatively important T3Es included hopAR1, members of the hopF family such as hopBB1 and the other T3Es hopBF1 and hopH1. Significantly, the loss/pseudogenization of HopAB effectors has also occurred in multiple clades. Within P2, the different cherry-infecting Pss clades have slight differences in their virulence factor repertoires that may reflect their convergent gain of pathogenicity. Clades differed in T3E content, phytotoxin genes and possession of genes for catechol degradation (Fig. 2), and thus pathogenicity was achieved with variable virulence factor repertoires. This study demonstrates that populations genomics can be used to examine a complex disease of a perennial plant species. A huge dataset was The key gains and losses of associated virulence genes in strains pathogenic to cherry are described based on gain loss mapping engine (GLOOME) analysis. Asterisks indicate the probability of this effector being gained/lost predicted using GLOOME was slightly lower than the threshold of 0.8, but exceeded 0.65 (see Table S12 for details).

Fig. S1
Plasmid profiles of all Prunus-infecting sequenced strains and some out-groups for comparison.           Fig. S13 Percentage of phylogenetic trees that supported the dependant model of evolution in the BAYESTRAITS analysis for T3E families that were significantly associated with pathogenicity.

Fig. S14
Predicted gain and loss of effector genes occurring on each branch leading to cherry pathogenic clades on the core genome phylogenetic tree.

Fig. S15
Core genome phylogenetic tree used in GLOOME gain and loss analysis with the branch labels added.          Table S1 All Pseudomonas syringae transconjugants and gene deletion mutants generated in this study   (Altschul et al., 1990) analysis to identify virulence factors with NCBI accession number, gene name and abbreviation and source organism