Is there foul play in the leaf pocket? The metagenome of floating fern Azolla reveals endophytes that do not fix N2 but may denitrify

Summary Dinitrogen fixation by Nostoc azollae residing in specialized leaf pockets supports prolific growth of the floating fern Azolla filiculoides. To evaluate contributions by further microorganisms, the A. filiculoides microbiome and nitrogen metabolism in bacteria persistently associated with Azolla ferns were characterized. A metagenomic approach was taken complemented by detection of N2O released and nitrogen isotope determinations of fern biomass. Ribosomal RNA genes in sequenced DNA of natural ferns, their enriched leaf pockets and water filtrate from the surrounding ditch established that bacteria of A. filiculoides differed entirely from surrounding water and revealed species of the order Rhizobiales. Analyses of seven cultivated Azolla species confirmed persistent association with Rhizobiales. Two distinct nearly full‐length Rhizobiales genomes were identified in leaf‐pocket‐enriched samples from ditch grown A. filiculoides. Their annotation revealed genes for denitrification but not N2‐fixation. 15N2 incorporation was active in ferns with N. azollae but not in ferns without. N2O was not detectably released from surface‐sterilized ferns with the Rhizobiales. N2‐fixing N. azollae, we conclude, dominated the microbiome of Azolla ferns. The persistent but less abundant heterotrophic Rhizobiales bacteria possibly contributed to lowering O2 levels in leaf pockets but did not release detectable amounts of the strong greenhouse gas N2O.


Introduction
Our growing global population is rapidly escalating the demand for nutritious food, requiring highly prolific and sustainable primary production. In tandem, the need for renewable feedstocks for the industry derived from primary production is also growing. To sustain future food and feedstock production, we need to explore novel crops that comply with limitations imposed by climate change, agrosystem inputs (e.g. water, fertilizers) and available arable land. Of particular concern is the ubiquitous requirement for nitrogen fertilizer in current agriculture that is tied to high input costs and negative climate consequences (Jensen et al., 2012). No crop plant is capable of fixing atmospheric dinitrogen (N2) autonomously. In leguminous crops such as soybean, plants recruit free-living, nitrogen-fixing bacteria in the order Rhizobiales from their environment anew each host generation (Vance, 2002), and it is within the legume's specialized root nodules that these symbiotic heterotrophic bacteria fix dinitrogen sufficient for host and bacteria. N2-fixation requires large amounts of energy derived from the oxidation of plant sugars, which are a limiting factor. Under intensive agriculture, therefore, most leguminous crops are supplied with surplus nitrogen fertilizer to improve bean yields beyond 5 t ha−1 yr−1. N2-fixing cyanobacteria are known to form symbioses with other plants such as cycads, ferns and bryophytes (Adams et al., 2013). These symbiotic cyanobacteria can use light, as well as plant-derived sugars, as an energy source to drive N2-fixation. A single fern genus, Azolla, benefits from such a cyanobacterial symbiosis, and stands out for its prolific growth, resulting in high protein biomass without nitrogen fertilizer.
For example, Azolla filiculoides produced 39 t ha−1 yr−1 dry weight (DW) biomass containing up to 25% protein (Becerra et al., 1990; Brouwer et al., 2017), whereas clover, Trifolium pratense, a high-yielding forage legume that is commonly grown with low fertilizer applications (150 kg ha−1 yr−1), produced up to 15 t ha−1 yr−1 DW biomass containing similar protein amounts (Anglade et al., 2015). The cyanobacterial symbiont Nostoc azollae is key to the fern's remarkable productivity. With its genome highly degraded (containing 31.2% pseudogenes and over 600 transposable elements), N. azollae is unable to survive without its host (Ran et al., 2010), and its spores are vertically transmitted to the next fern generation via the fern megaspores (sensu Nagalingum et al., 2006) during sexual reproduction. The cyanobacteria reside within specialized leaf pockets of Azolla, where they form heterocysts with high frequency and utilize photosystem I to drive N2-fixation. A colony of motile cyanobacteria typically resides at the meristematic tip of the branch where the leaf pockets of the young developing leaves are still open, allowing cyanobacteria to migrate inside (Perkins & Peters, 1993). Such a specialized environment that is attractive to cyanobacteria may also attract other bacteria. Electron micrographs of leaf and megasporangiate sori cross-sections have revealed the presence of other bacteria in addition to the cyanobacteria (Carrapiço, 1991; Zheng et al., 2009). Immunohistochemical detection of nitrogenase using polyclonal antisera did not, however, unequivocally reveal nitrogenase in these bacteria (Braun-Howland et al., 1988; Lindblad et al., 1991). Moreover, the number of species and taxonomy of the cyanobacteria associated with Azolla has been controversial (Pereira & Vasconcelos, 2014). The focus of our present study is to characterize the microbiome associated with A. filiculoides.
Plant-microbe interactions research has evolved rapidly over the last few decades (Turner et al., 2013). Most studies focus on microbial interactions between plant roots and the rhizosphere, whereas microbes in the phyllosphere, the above-ground plant organs, have been less thoroughly examined (Peñuelas & Terradas, 2014). Symbiotic bacterial endophytes, nonpathogenic organisms that colonize intercellular spaces in plants, have been rediscovered with next-generation sequencing approaches that permit in situ studies. Endophytes are ubiquitous and frequently found in plant species (Stone et al., 2000).
Although microbial species associated with a particular plant species can be numerous and diverse, a core microbiome can be identified, as illustrated for Arabidopsis thaliana (Lundberg et al., 2012). Microbiomes in the rhizosphere and phyllosphere are not the same, however, reflecting disparate niches in the different plant organs, each subject to its own developmental and environmental influences (Turner et al., 2013). These microbiome differences are not just opportunistic, but often convey beneficial properties to the host plant. Persistent mutualistic microbes have been shown in specific cases to increase the fitness of plants; for example, in legumes non-nitrogen-fixing rhizobia decrease grazing, thus eliminating the fitness costs of the mutualistic interaction (Simonsen & Stinchcombe, 2014). Arthrobacter species have been isolated repeatedly from Azolla and one strain was shown to produce the auxin IAA, possibly affecting fern development and growth (Forni et al., 1992); an Agrobacterium strain isolated from surface-sterilized A. filiculoides assimilated ammonium, possibly sequestering the growth inhibitory nutrient when it accumulates in the leaf pockets in excess (Plazinski et al., 1990). The capabilities of the host and microbiome, the holobiont, should therefore be viewed as one unit reflected in the metagenome, evolving through myriad environmental constraints. This idea inspired the coining of Azolla as a 'superorganism' (Carrapiço, 2010).
Recently, shotgun sequencing of DNA extracted from microbial communities without PCR and subsequent metagenome assembly have become feasible, allowing for functional analyses of multiple genomes in addition to taxonomic assignments (Castelle et al., 2013; Wrighton et al., 2014). Assembly of short sequencing reads obtained from metagenome shotgun sequencing into long scaffolds, ideally representing near complete microbial genomes, however, remains elusive (Tyson et al., 2004; Charuvaka & Rangwala, 2011; Zepeda Mendoza et al., 2015). Metagenome assembly quality is primarily influenced by the number and diversity of organisms present, as well as the length of reads. Improved assemblies with long scaffolds can be obtained by subcloning DNA into fosmids before sequencing or by using long-read technologies such as those that were validated in studies of gut microbes (Mizuno et al., 2013; Leonard et al., 2014). The presence of an organism in an environmental sample may then be computed by recruiting short reads from the environmental sample onto the assembled genome of that particular organism, as was successfully demonstrated with phage genomes from the ocean or bacterial genomes from salt brines (Pašić et al., 2009; Mizuno et al., 2013).
The focus of the present study was to characterize the identity and function of microbes persistently associated with A. filiculoides using metagenomics shotgun sequencing of total DNA from samples collected in their natural environment, and also from cultured species of Azolla.

Plant materials
A. filiculoides Lam was obtained from the Galgenwaard ditch in Utrecht, the Netherlands. In addition, six Azolla species were obtained from the bio-fertilizer germplasm collection at the International Rice Research Institute (IRRI) in the Philippines (Table 1; Watanabe, 1992).

Collection and processing of samples from the natural environment
Whole plants of A. filiculoides, its enriched leaf pocket contents and water filtrates from the surrounding water (13°C, pH 7.2) were collected in triplicate from the Galgenwaard ditch in Utrecht (Table 1) on 28 October 2015. Plant and water replicates were carried from the collection site in separate containers and treated separately. Ferns were filtered using sieves of 4 mm mesh size to remove contaminating aquatic plants and animals, then washed by vortexing at full speed for 60 s in 0.5% Tween-20, in batches of 5 g fresh weight (FW). For whole plant samples, one plant of 200 mg FW and two 3-mm-diameter glass beads were placed into tubes, snap frozen and then homogenized by a TissueLyser II (Qiagen). Leaf pocket-enriched fractions were prepared from washed ferns as described by Orr & Haselkorn (1982). Ditch water (1 litre) from every replicate was passed through a 0.45 µm filter, and the biomass on the filter was then resuspended in 500 µl water and frozen (−80°C) until DNA extraction. DNA was extracted using the Mobio PowerLyzer PowerSoil kit (Qiagen), according to the manufacturer's protocol.

Fern cultures and processing
Cultures of different Azolla species were obtained from the International Rice Research Institute (Philippines) except for A. filiculoides (Table 1; Watanabe, 1992). All Azolla species were grown on liquid medium without nitrogen and under long-day light with a far-red component as described by Brouwer et al. (2014), except where stated otherwise. To obtain sterilized cultures of A. filiculoides, explants (<1 mm3) of leaves from the ditch plants were surface-sterilized using bleach at 1% available chlorine for 40 s, with four consecutive rinses in sterile water before cultivation on agar medium (0.6% (w/v) agarose, Duchefa, Haarlem, the Netherlands). Azolla without N. azollae (referred to here as A. filiculoides-Sterilized) was first cultured on solid agar medium with 60 µg ml−1 erythromycin and 2 mM NH4NO3; the absence of N. azollae was verified using confocal microscopy and by quantitative PCR (see Fig. 4; Forni et al., 1991; Brouwer et al., 2014). Sterilized cultures were grown in enclosed glass containers with a stream of air (78 l h−1) pumped through 0.45 µm filters using aquarium pumps (SuperFish Air flow mini); A. filiculoides-Sterilized were used for DNA sequencing, the 15N2-fixation experiments and δ15N determinations of Azolla biomass. DNA was extracted from cultured plants with the Genomic-tip 100/G protocol (Qiagen), either after enrichment for nuclei (Lutz et al., 2011) for long-read sequencing, or directly for short-read Illumina sequencing.

Sequencing library preparations and sequencing of DNA
Libraries for short-read sequencing (in paired-end mode) were made after shearing the DNA as per the recommended protocol (TruSeq Nano DNA Library Prep Kit, Illumina, Madison, WI, USA). For Azolla samples from the ditch, care was taken to shear the DNA to c. 800 bp (Covaris, Woburn, MA, USA) to improve EMIRGE assemblies. Sequencing was performed using the Illumina NextSeq500 desktop sequencer, yielding c. 3 Gb sequence information per replicate (Supporting Information Table S1). For cultured Azolla samples, libraries of 250, 500 and 800 bp were generated and sequenced at high coverage such that the data needed to be sub-sampled to 10 and 30 million reads, for comparison with data obtained from ditch Azolla.
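The sub-sampling of cultured-fern reads to fixed depths can be illustrated with a minimal Python sketch. It assumes reservoir sampling over paired FASTQ records with a fixed random seed; the text does not state which tool or method was used, and the names `fastq_records`, `reservoir_sample` and `subsample_pairs` are hypothetical.

```python
import random
from itertools import islice


def fastq_records(lines):
    """Group an iterable of FASTQ lines into 4-line records (assumes unwrapped FASTQ)."""
    it = iter(lines)
    while True:
        record = tuple(islice(it, 4))
        if len(record) < 4:
            return
        yield record


def reservoir_sample(items, k, seed=0):
    """Draw k items uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(items):
        if i < k:
            sample.append(item)
        else:
            j = rng.randint(0, i)  # uniform over 0..i inclusive
            if j < k:
                sample[j] = item
    return sample


def subsample_pairs(r1_lines, r2_lines, k, seed=0):
    """Sub-sample k read pairs, keeping the two mates of each pair together."""
    pairs = zip(fastq_records(r1_lines), fastq_records(r2_lines))
    return reservoir_sample(pairs, k, seed)
```

Under these assumptions, passing open file handles for the two mate files with `k=10_000_000` or `k=30_000_000` would give subsets of the sizes mentioned above; if a library holds fewer than `k` pairs, all pairs are returned.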
Libraries for PacBioRS II (Pacific Biosciences, Palo Alto, CA, USA) sequencing of the nuclear DNA from a single plant of A. filiculoides-Sterilized (described under 'Fern cultures and processing') were generated after size separation with a cut-off at 14 kb (Blue Pippin, Sage Science, Beverly, MA, USA) according to the PacBio RS II protocol and sequenced using P5-C3 chemistry, reaching 57 times coverage of the 750 Mb genome.

Taxonomic assignments based on small ribosomal RNA (sRNA) sequences
Short-read sequences were sorted according to biological replicates and paired-end reads were trimmed using Trimmomatic (parameters 'LEADING:5 TRAILING:5 SLIDINGWINDOW:4:15 MINLEN:36'; Bolger et al., 2014). All reads passing quality control (QC) were processed in parallel by RIBOTAGGER, which directly assigns taxonomy from variable regions of rRNA genes found in single reads using a subset of the Silva database containing the V4-V7 variable regions as reference (Tange, 2011; Xie et al., 2016). Nearly whole-length rRNA genes were assembled with EMIRGE using standard parameters over 120 iterations (Miller et al., 2011). Classification of assembled rRNA genes was performed by MOTHUR, using the Silva nonredundant v119 reference database (Schloss et al., 2009; Quast et al., 2013). In addition to processing samples as individual replicates (P1-3, L1-3, W1-3), reads from the three biological replicates of whole plant, leaf juice or water were pooled (P, L and W, respectively) before analyses with either RIBOTAGGER or EMIRGE. This was done to evaluate the sensitivity of the taxon detection using either EMIRGE or RIBOTAGGER with three times more reads.

Genome assemblies with long reads
Long reads (PacBioRS II) from DNA of A. filiculoides-Sterilized were read-corrected and then assembled into scaffolds by both the CELERA and FALCON assembler pipelines, yielding two preliminary genome assemblies (Myers et al., 2000; https://github.com/PacificBiosciences/FALCON; Koren et al., 2012). Bacterial scaffolds in the genome assemblies were identified by RNAMMER (Lagesen et al., 2007). Bacterial scaffolds with a minimum length of 0.1 Mb were extracted and assigned taxonomy based on the 16S rRNA genes in MOTHUR using the Silva database (Table 2). Once identified, the scaffolds were submitted to RAST (Overbeek et al., 2014) for annotation, which scored the nearest neighbor.

Recruitment analyses
Short-read sequences were mapped to reference scaffolds and genomes with BOWTIE2 (v.2.2.6; options: --very-sensitive (-D 20 -R 3 -N 0 -L 20 -i S,1,0.50); Langmead & Salzberg, 2012). If applicable, fragmented genomes were converted to one sequential sequence for the purpose of visualization. BOWTIE2 output was parsed with a custom script to extract position and the common bases in the alignment (identity score). In a custom R script, aligned reads were binned (normalized for 0.05 Mb and 1% identity) and read count per bin was log10-transformed (Wickham, 2011; R Core Team, 2013; Dowle et al., 2014; Carr et al., 2015).
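The binning step can be sketched as follows. This is an illustrative Python version, not the custom R script used in the study; it assumes each aligned read has already been reduced to a start position on the reference and an identity score in percent, and the names `identity_percent` and `bin_hits` are hypothetical.

```python
import math
from collections import Counter


def identity_percent(matching_bases, aligned_length):
    """Identity score: share of aligned bases identical to the reference (assumed definition)."""
    return 100.0 * matching_bases / aligned_length


def bin_hits(hits, bin_bp=50_000, identity_step=1.0):
    """Count reads per (position bin, identity bin) and log10-transform the counts.

    hits: iterable of (position_bp, identity_percent) tuples, one per aligned read.
    Bins are 0.05 Mb wide along the genome and 1% wide in identity, as in the text.
    """
    counts = Counter()
    for position, identity in hits:
        key = (position // bin_bp, int(identity // identity_step))
        counts[key] += 1
    return {key: math.log10(n) for key, n in counts.items()}
```

Empty bins are simply absent from the returned dictionary, which avoids taking the logarithm of zero; a plotting step would need to treat missing bins as unmapped regions.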

Data deposition
The sequences reported in this paper have been deposited in the ENA database with the study accession number PRJEB19522; the data are separated into three categories: Illumina paired end NextSeq500 sequences (2 × 150 bases (b)) from the environmental samples, Illumina paired end NextSeq500 sequences (2 × 150 b) and short-read sequences sampled at 30 M reads from each of the different species and bacterial scaffolds (including PacBioRSII-corrected reads).

N2 fixation, δ15N determinations and N2O release
Surface-sterilized ferns (100 mg FW) were placed in enclosed bottles with 43 ml of sterile medium and a residual air space of 262 ml. To determine N2 fixation after 2 h, 15N2 (15 ml) was added at 14 h using air-tight syringes whilst overpressure was removed using a release needle; the bottles were then incubated for 2 h under growth conditions as in Brouwer et al. (2014). To determine N2 fixation after 24 h, 15N2 (5 ml) was added as well as CO2 (5 ml). After the incubation with 15N2, samples were snap frozen in liquid nitrogen, freeze-dried and homogenized before analysis of the dry weights, N content and isotope abundance determinations. In both the 2 and the 24 h incubation experiments, 15N2 provided from Sigma was washed with acid to remove ammonia. In the 24 h incubation experiment the gas was washed in addition with a base to remove NOx.

Table 2 notes
1 PacBioRSII reads were read-corrected then assembled using either the CELERA or the FALCON pipelines. The Sinorhizobium-like scaffold was assembled by both pipelines yielding 4.906 Mb and 4.138 Mb scaffolds, respectively, for CELERA and FALCON. These sequences were largely identical but RAST annotation of the N-metabolism genes differed by one gene (Overbeek et al., 2014).
2 RNAMMER detected rRNA genes in the scaffolds and taxonomy was based on the rRNA gene sequences with MOTHUR using the Silva database.
3 Length of the scaffolds in base pairs.
4 Number of features computed by RAST annotation including the number of missing genes in parentheses.
5 Presence of genes from the denitrifying pathway with the total number of nitrogen metabolism genes in the scaffold in parentheses. Small scaffolds from singleton genera were omitted.
6 The closest relative as computed by RAST.

Research
New Phytologist
Total N content and stable nitrogen isotopes (δ15N) were analyzed on a ThermoScience Delta Plus isotope ratio mass spectrometer connected on-line to a Carlo Erba Instruments Flash 1112 elemental analyzer. We assumed no isotope discrimination during the fixation process and therefore rates of fixation calculated may be underestimated.
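For readers unfamiliar with delta notation, the relation between a δ15N value and the 15N atom fraction can be sketched in a few lines of Python. The sketch assumes atmospheric N2 as the reference standard with a 15N/14N ratio of 0.0036765, a commonly used value that is not stated in the text; the function names are hypothetical and this is not the calculation pipeline used in the study.

```python
R_AIR = 0.0036765  # assumed 15N/14N ratio of atmospheric N2 (reference standard)


def delta15n_to_atom_percent(delta_permil):
    """Convert delta-15N (per mil, relative to air) to atom% 15N."""
    ratio = R_AIR * (1.0 + delta_permil / 1000.0)  # 15N/14N of the sample
    return 100.0 * ratio / (1.0 + ratio)


def atom_percent_to_delta15n(atom_percent):
    """Convert atom% 15N back to delta-15N (per mil, relative to air)."""
    fraction = atom_percent / 100.0
    ratio = fraction / (1.0 - fraction)
    return (ratio / R_AIR - 1.0) * 1000.0
```

Enrichment after labelling with 15N2 then appears as the difference in atom% 15N between incubated biomass and an unlabelled control.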
Ferns used for N2O measurements included A. filiculoides cultured in the laboratory (nonsterile), and surface-sterilized ferns with and without N. azollae (A. filiculoides-Sterilized) grown under sterile conditions. For experiments with nonsterile material, 10 g FW fern was used with 200 ml air headspace. For experiments including sterile materials, 100 mg FW fern was used with 15 ml micro-aerobic (10% (v/v) O2) head space. Gas samples of 6 ml were separated on a Hayesep Q column by GC (Hewlett Packard Agilent Technologies) and gases were detected with an electron capture detector (ECD 63Ni).

Azolla filiculoides sustains a unique microbiome
The Dutch ditch plants of A. filiculoides, together with samples of their in situ ditch water, were sampled and processed for sequencing independently in three biological replicates (i = 1-3) of the following types: whole plant (Pi), enriched leaf pocket contents (Li) and surrounding water (Wi), containing 8.42-11.99 M reads averaging 147 b (Table S1). Taxonomic groups present in samples were computed either by rRNA assembly with EMIRGE or by analysis of reads containing 16S rRNA variable regions with RIBOTAGGER, using the Silva rRNA reference database (Miller et al., 2011; Quast et al., 2013; Xie et al., 2016). The distribution of taxonomic classes or orders over replicate samples was similar for both methods and very similar among biological replicates (Fig. 1a). RIBOTAGGER taxonomic assignments were not influenced by the number of reads sampled (10 or 30 M) since it computed an identical set of classes or orders in replicates with 10 M reads compared to when the three replicates were pooled to submit 30 M reads for analysis. When assembling rRNA genes with EMIRGE, however, pooling replicates before EMIRGE assembly occasionally yielded more taxonomic assignments, probably because assemblies were dependent on read coverage (Figs S1, S2).
Ditch water surrounding A. filiculoides was more diverse in its microbial community composition than were the plant-related samples: the mean Shannon diversity of RIBOTAGGER-assigned microbial taxonomy was 3.11 ± 0.16 (SD) for water samples, compared to 1.47 ± 0.07 and 1.11 ± 0.08 for whole plant and leaf samples, respectively. The community richness was also higher in the ditch water samples than in plant-related samples. Rarefaction analysis showed saturation of the plant-associated microbiome with sampling size, but not for the ditch water (Fig. 1b). Over half of the taxa found in water samples were identified as class Betaproteobacteria, with the orders Burkholderiales, Rhodocyclales and Methylococcales being the most abundant (Fig. 1a). Overlap between Azolla-associated and water samples was zero at order level and minimal at class level.
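The Shannon diversity values reported above can be illustrated with a short Python sketch. It assumes the natural-logarithm form of the index, H = -Σ p_i ln p_i, computed from per-taxon read counts in each replicate; the text does not state the logarithm base or the software used, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one taxon must have a non-zero count")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def summarise_replicates(replicate_counts):
    """Return (mean, sample SD) of the Shannon index across replicates."""
    values = [shannon_index(counts) for counts in replicate_counts]
    return mean(values), stdev(values)
```

A community dominated by one taxon, as in the leaf pocket samples, yields a low index, whereas counts spread evenly over many taxa, as in the ditch water, yield a higher one.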

Nostoc azollae is the most abundant endophyte of Azolla filiculoides
Taxonomic identification revealed a conserved and plant-specific microbial community associated with A. filiculoides (Fig. 1a: L, P). Most rRNA hits were assigned to either fern chloroplasts, Viridiplantae nuclei or cyanobacteria. Cyanobacteria-derived rRNA sequences were more abundant in the enriched leaf pocket contents than in the whole plant samples. Fern mitochondrial rRNA was absent from the database and instead assigned to the order Rickettsiales (class Alphaproteobacteria) that was systematically present in all whole plant Azolla samples, yet less abundant in leaf pocket-enriched samples. Cyanobacteria-related sequences were the most abundant in all fern samples, making up c. 60-75% and 45% of the rRNA hits in L and P samples, respectively (Fig. 1a: L, P). The accuracy of assembled 16S rRNA genes was confirmed by aligning the rRNA assemblies assigned to cyanobacteria to the N. azollae 16S rRNA gene (NCBI reference sequence: NR_074259.1): multiple sequence alignment with CLUSTALW revealed over 99.5% similarity over the full length of the alignment. The results therefore confirmed that N. azollae is the primary symbiont of A. filiculoides.

Rhizobiales are constitutive members of the microbiome in natural and cultivated Azolla species
To help reveal microorganisms associated at low abundance with A. filiculoides from the ditch, we removed rRNA hits derived from chloroplasts, Viridiplantae nuclei, mitochondria, cyanobacteria and unclassified sequences (Fig. 2, Environmental). RIBOTAGGER found more operational taxonomic units (OTUs) in nearly all samples than did EMIRGE. Only EMIRGE, however, found Metazoa 18S rRNA in all Azolla plant (P) and one leaf pocket-enriched (L) samples. These rRNA genes all mapped to Stenopelmus rufinasus, a weevil specialized in feeding on Azolla (Hill, 1998). All five assembled Metazoa rRNA genes and GenBank reference FJ867794.1 were trimmed to corresponding lengths and aligned: 98.2% of the 1200 bp multiple sequence alignment was identical. Detection of the weevil and the perfect assembly of the N. azollae rRNA confirmed the accuracy of EMIRGE assemblies and subsequent taxonomic assignments by MOTHUR. The bacterial orders Rhizobiales and Burkholderiales were found enriched in L samples by both methods at 2% and 1% abundance, respectively, and in all but one L sample by RIBOTAGGER (Fig. 2, Environmental). For the cultured Azolla species, short-read sequencing data obtained from seven different species were also analyzed using EMIRGE and RIBOTAGGER (Table 1). Cultured ferns included A. filiculoides originating from the same ditch as the environmental sample but cultured for 2 yr so as to be devoid of N. azollae (=A. filiculoides-Sterilized). The most abundant taxonomic assignments from DNA of cultured Azolla species were Viridiplantae nuclei, chloroplast and cyanobacteria (Fig. S1); these were removed to reveal taxa present at a lower abundance (Fig. 2, Cultured). Members of Burkholderiales, present in ditch samples of A. filiculoides, were infrequently observed in cultured Azolla species. However, they were particularly prominent in A. filiculoides-Sterilized.
Similarly, Caulobacterales were infrequently observed in cultured Azolla. By contrast, Rhizobiales were observed in all cultured and environmental Azolla samples, including those devoid of N. azollae (Fig. S2, A. filiculoides-Sterilized). Azolla accessions from IRRI had been cultured for many years (Table 1), raising the likelihood that their microbiomes were considerably altered from when first collected in their natural environment. The persistent occurrence of Rhizobiales in environmental, cultured and sterilized ferns, however, suggested that these bacteria are closely associated with the fern and possibly have an added ecological function in the Azolla-Nostoc symbiosis. Detection of the rRNA genes from Rhizobiales in DNA from A. filiculoides-Sterilized further indicated that the long-read nuclear genome assembly from this plant probably contained scaffolds of persistent bacterial endophytes.

Fig. 2 Relative abundance of orders within cultured species of Azolla (Table 1) and ditch samples of Azolla filiculoides (natural and sterilized). Taxonomy was assigned to rRNA fragments found in single reads by RIBOTAGGER (RiboTagger) and to rRNA genes assembled with EMIRGE by MOTHUR (EMIRGE). Unclassified orders or those originating from Viridiplantae nuclei, fern plastids and cyanobacteria are not shown. Environmental sequencing data originated from A. filiculoides leaf pocket-enriched samples (L) and whole plants (P) in biological triplicates. Sequence reads from cultured ferns were processed as subsets of 10 M and 30 M reads.

New Phytologist (2018)

Near full-length genomes of two novel Rhizobiales species in assemblies of the Azolla filiculoides genome are present in all Azolla species
The FALCON and CELERA assemblies from the A. filiculoides-Sterilized were scanned for bacterial scaffolds (presence of 16S rRNA) with RNAMMER; scaffold taxonomy was then assigned using MOTHUR if they were longer than 0.1 Mb (Table 2). Both assemblies reproducibly yielded scaffolds from the genera Shinella and Rhizobium (Rhizobiales).
To differentiate true symbiotic partners from contaminations due to culture treatments or DNA extractions, short reads of all cultured species and environmental samples were mapped to the extracted scaffolds: only hits with an identity over 97% were counted and hit frequency was normalized for scaffold length, thus generating a heat map (Fig. 3). Scaffolds assigned to Ralstonia (Burkholderiales) were most abundant in samples of A. filiculoides-Sterilized, but absent in other species. Three other bacterial genera present in multiple Azolla species stood out with substantial counts: Hydrocarboniphaga (Nevsikiales), and Shinella and Rhizobium (Rhizobiales). The three Rhizobium and two Shinella scaffolds had the same relative frequencies in each sample, indicating that they each originated from one species of Rhizobium and Shinella, respectively. Scaffolds from the Rhizobiales were on average more frequently mapped by reads from the leaf pocket-enriched (L) samples than from whole plants (P); enrichment locates the bacteria from the Rhizobium genome in the leaf pockets (Fig. S3).
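The filtering and normalization behind the heat map can be sketched in Python as follows. The sketch assumes hits are available as (scaffold, identity in percent) pairs and expresses counts per Mb of scaffold; the per-Mb unit and the name `normalised_hit_frequency` are assumptions for illustration, since the text only states that hit frequency was normalized for scaffold length.

```python
from collections import Counter


def normalised_hit_frequency(hits, scaffold_lengths, min_identity=97.0,
                             per_bp=1_000_000):
    """Count hits above the identity threshold and scale by scaffold length.

    hits: iterable of (scaffold_id, identity_percent), one per mapped read.
    scaffold_lengths: dict mapping scaffold_id to its length in bp.
    Returns hits per `per_bp` bases for every scaffold, including those with no hits.
    """
    counts = Counter(
        scaffold for scaffold, identity in hits if identity > min_identity
    )
    return {
        scaffold: counts[scaffold] * per_bp / length
        for scaffold, length in scaffold_lengths.items()
    }
```

Running this once per sample gives one column of a sample-by-scaffold matrix; scaffolds belonging to the same organism would be expected to show similar relative values across samples, which is the pattern used above to group the Rhizobium and Shinella scaffolds.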
To evaluate their representation in the data over the full length of their genomes, short reads of all cultured and environmental samples were mapped to the longest scaffolds of these bacterial genera (Fig. 4). High identity reads (100%) mapped with high frequency to the N. azollae genome, revealing that the published N. azollae genome is the same species as that found in A. filiculoides from the Dutch ditch. The absence of reads from the A. filiculoides-Sterilized samples mapping to N. azollae confirmed that these plants were devoid of cyanobacteria. In the genomes that were absent from these samples, sporadic loci still mapping reads with high identity were localized at highly conserved genes such as rRNA. By contrast, the 3.2 Mb Rhizobium and 4.9 Mb Shinella scaffolds were represented over the full length of the scaffolds in all fern samples. High identity reads were more abundant in A. filiculoides environmental and cultured samples compared to other Azolla species; nevertheless, these scaffolds were mapped with over 90% identity over their full length in all Azolla species. The Hydrocarboniphaga scaffold was only highly represented in fern samples in an area confined to the end of the scaffold; this scaffold therefore was probably an artefact of assembly fused at its end to A. filiculoides genomic DNA (Fig. S4).

… Table 1). All reads were mapped with BOWTIE (options: --very-sensitive) and identity scores were calculated with a custom script (see the Materials and Methods section). Reads were binned according to identity score and position on the respective genome, then counted per 50 kb for normalization, and counts were log10-transformed. L, leaf pocket-enriched samples; P, whole plants; W, surrounding ditch water.


The Rhizobiales endophytes of Azolla filiculoides contain denitrification enzymes
To explore possible functions of bacteria from the Azolla microbiomes identified during our recruitment analysis, the combined Rhizobium and combined Shinella scaffolds were submitted for annotation to RAST (Aziz et al., 2008; Overbeek et al., 2014), which computed that the most similar organisms were, respectively, Agrobacterium tumefaciens and Sinorhizobium meliloti (Rhizobiales).
To evaluate the relatedness of our Sinorhizobium-like genome with the two known S. meliloti genomes (GenBank AL591688.1 and AKZZ01000000), we mapped reads from environmental samples and A. filiculoides-Sterilized to these genomes (Fig. S4). Whilst the Sinorhizobium-like genome was well represented in all Azolla samples, reads of all ditch and cultured fern samples mapped less efficiently to both known S. meliloti genomes. The Sinorhizobium-like endophyte was thus determined to be a distinct species from S. meliloti. Similarly, the Agrobacterium-like endophyte persistently detected in all Azolla ferns (Fig. 4) was distinct from known A. tumefaciens strains.
Analyses of N-cycle coding genes revealed that both Rhizobiales genomes were lacking the N2-fixing nitrogenase but instead encoded proteins from the denitrifying pathway (Fig. 5; Table S2). The Sinorhizobium-like genome contained intact nitrite reductase, nitric oxide reductase and their accessory proteins (Figs S5, S6). The Agrobacterium-like genome did not contain nitrite reductase but contained nitric oxide reductase and nitrous oxide reductase features. Closer inspection of the locus and protein alignment, however, revealed insertions of mobile elements in key genes of the nor and nos operons (Figs S7, S8). Rhizobiales endophytes hosted by Azolla ferns therefore did not contribute to N2-fixation but may have released N2O and possibly also N2.

Azolla filiculoides lacking cyanobacteria, but with the Rhizobiales present, neither fix nitrogen nor release detectable amounts of N2O
Nitrogen-fixation in surface-sterilized A. filiculoides with and without N. azollae (A. filiculoides-Sterilized) that were infected with the Rhizobiales endophytes was examined by supplying 15N2 at mid-day for 2 h (Fig. 6a), when both CO2 and N2 fixation peak (Brouwer et al., 2014). 15N2-fixation was not significant in A. filiculoides-Sterilized (Fig. 6a, -Cyano+N). Whilst N2-fixation was inhibited by N-fertilizer in the medium required to sustain growth of A. filiculoides-Sterilized (Fig. 6a, compare +Cyano-N with +Cyano+N), A. filiculoides with N. azollae fixed significant amounts of nitrogen even after 2 h (Fig. 6a, +Cyano+N). When examining nitrogen fixation after one diel cycle of 24 h incubation with 15N2, δ15N of the biomass was still not significantly increased in A. filiculoides-Sterilized compared to the boiled control whilst it reached on average 362 in ferns with cyanobacteria (Fig. S9). Endophytic Rhizobiales in A. filiculoides-Sterilized therefore did not fix N2. This result was consistent with the absence of the N2-fixing pathway in our Rhizobiales genomes (Fig. 5).

Fig. 5 Nitrogen metabolism pathway comparing merged Agrobacterium-like and Sinorhizobium-like genomes and Nostoc azollae. The KEGG database was used to retrieve proteins from the closest relative, which was manually annotated (Kanehisa et al., 2010), and the proteins were then aligned via BLAST to the merged scaffolds using the RAST/SEED viewer tool (Overbeek et al., 2014). The KEGG-map of the nitrogen metabolism pathway was used to color-in proteins detected in the merged scaffolds named after the closest relative computed by RAST, or in the N. azollae genome using the KEGG/NCBI annotation: Agrobacterium-like (yellow), Sinorhizobium-like (red) and N. azollae (green).

In air without 15N2 added, biomass δ15N of the ferns with cyanobacteria in the absence of N-fertilizer was much higher than with fertilizer (Fig. 6b, +Cyano-N vs +Cyano+N), consistent with inhibition of N2-fixation on media with 2 mM NH4NO3 in Fig. 6(a). The most negative δ15N in A. filiculoides-Sterilized confirmed the absence of N2-fixation in these ferns (Fig. 6a). N2O release was robustly detected when assayed after 6 h in darkness using nonsterile Azolla on medium with 2 mM NH4NO3, but not on medium without nitrogen fertilizer (Fig. 6c), even after much longer than 6 h incubation (data not shown). Dependence of N2O release on medium with N suggested that if any N2O was synthesized in the leaf pockets it would be efficiently converted into N2. In contrast to nonsterile A. filiculoides, N2O release was not detected when A. filiculoides-Sterilized were grown on medium with N-fertilizer, after 6 h of darkness at the end of the night and in a micro-oxic air space (Fig. 6d). N2O release from nonsterile A. filiculoides therefore probably originated from bacteria loosely associated with the fern surface, not from the endophytes. The results were consistent with the low abundance of the denitrifying Rhizobiales endophytes (Fig. 2).
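The δ15N values discussed in this section follow the usual delta notation, in which the 15N/14N ratio of a sample is expressed relative to that of atmospheric N2. A minimal Python sketch of that conversion is given here for orientation only. The reference ratio `R_AIR`, the function names and the example value are assumptions for illustration and are not taken from the methods of this study.

```python
# Assumed 15N/14N ratio of atmospheric N2, the conventional reference
# for nitrogen isotope measurements (not a value stated in this paper).
R_AIR = 0.0036765


def delta15n(r_sample: float, r_standard: float = R_AIR) -> float:
    """Return delta-15N in per mil for a sample 15N/14N ratio."""
    if r_standard <= 0:
        raise ValueError("standard ratio must be positive")
    return (r_sample / r_standard - 1.0) * 1000.0


def ratio_from_delta(delta_permil: float, r_standard: float = R_AIR) -> float:
    """Invert the delta notation: return the 15N/14N ratio for a delta value."""
    return r_standard * (1.0 + delta_permil / 1000.0)


if __name__ == "__main__":
    # Hypothetical example: an enrichment of 362 (assumed to be in per mil)
    # corresponds to a 15N/14N ratio about 36% above the reference.
    enriched_ratio = ratio_from_delta(362.0)
    print(enriched_ratio / R_AIR)
    print(delta15n(enriched_ratio))
```

A sample with the same ratio as the reference has a δ15N of zero; negative values indicate biomass depleted in 15N relative to air, and labelling with 15N2 pushes the value strongly positive when fixation is active.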

Discussion
Nostoc azollae is abundant and the only cyanobacterium that fixes N2 in Azolla filiculoides
N. azollae in A. filiculoides from the present study and the published strain from Stockholm (Ran et al., 2010) were the same species, based on > 97% identity of their rRNA. Our analyses in Fig. 1 showed enrichment of N. azollae rRNA in the leaf juice and did not detect any rRNA from another cyanobacterial species, suggesting that in the Utrecht ferns, N. azollae was the only abundant cyanobacterium in the leaf pockets; Figs 6 and S9 further demonstrated that N. azollae was responsible for N2-fixation in the ferns. The large number of reads that mapped to the N. azollae genome with < 100% identity in the recruitment analyses (Fig. 4) was probably explained by natural variation in bacterial populations and activity of insertion elements in N. azollae (Vigil-Stenman et al., 2015). Previous reports suggesting that several species of cyanobacteria may inhabit the leaf pockets (Gebhardt & Nierzwicki-Bauer, 1991) may have described very low abundance cyanobacteria not detected by our analyses, which revealed only bacteria with a relative rRNA abundance above the detection limit of 0.2%. Our analyses confirmed the presence of less abundant Gram-negative eubacteria in leaf pockets of A. filiculoides, in particular that of an Agrobacterium strain (Plazinski et al., 1990).

Two novel candidate bacterial species from the Rhizobiales are persistent endophytes of all Azolla species
Our data support that Azolla has control over the bacterial community assembly within its closed leaf pockets. First, the bacterial community of the surrounding ditch water was dominated by Proteobacteria, which are typically found in Dutch ditches (El-Chakhtoura et al., 2015), and had no overlap with taxa within the Azolla leaf pocket. Second, different Azolla species cultured under the same conditions housed reproducibly different assemblages of microbial endophytes (Fig. 2, Cultured). Third, Rhizobiales endophyte genome scaffolds were recovered from sequencing nuclear preparations of A. filiculoides-Sterilized; this Azolla strain had been grown on erythromycin then cultured in sterile conditions for over 2 yr (Fig. 4). In accordance, Arabidopsis leaf endophytes were shown to depend on the plant genotype, thus demonstrating that the plant host controls the assembly of endophytic bacterial communities (Horton et al., 2014); gene loci that influenced the bacterial communities, for example, encoded regulators of viral reproduction, pectin metabolism and trichome development. The Azolla control over the leaf pocket bacterial community may also depend on the presence of cyanobacteria, since Burkholderiales were more abundant in A. filiculoides-Sterilized (Fig. 2a). The more general lesson learnt was that bacterial scaffolds in genome assemblies deserve attention as they may represent persistent endophytic bacteria.
Rhizobiales bacteria were found in all species of Azolla examined, despite the low proportion of reads with 16S rRNA sequences when sequencing all DNA extracted from the ferns or leaf juice compared to when sequencing PCR-amplified rRNA genes. The difference in the 10 and 30 M read-based taxonomy assignments using EMIRGE/MOTHUR in Fig. 2 and the lack of saturation in Fig. 1(b) attest to this limitation. Rhizobiales were also reproducibly detected in the leaves of several species of the carnivorous angiosperm Genlisea using a meta-transcriptomics approach, which yields proportionally more rRNA sequences because of the high accumulation of rRNA in RNA extracts (Cao et al., 2015). The long-read assembly of bacterial scaffolds combined with recruitment analyses, however, allowed a very high resolution of the taxonomic assignments in the present study. With RAST, closest relatives were computed by scoring homologies of gene candidates predicted by GLIMMER3 with a set of universal proteins and 200 unduplicated proteins (Overbeek et al., 2014). Recruitment analyses further refined this and showed that the Sinorhizobium-like endophytes in A. filiculoides were not the known S. meliloti species (Fig. S4), yet were more than 90% identical over the genome length when comparing the differing Azolla species (Fig. 4). Furthermore, calculating read counts per kilobase in Fig. S3 quantified enrichment of Rhizobiales in leaf-pocket content compared to plant samples, thus locating Agrobacterium-like bacteria preferentially in the leaf pockets. Unlike the cyanobacteria in the leaf pockets, the Rhizobiales endophytes did not fix N2 and were present in much lower abundance, as judged from the recruitment analyses.
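The read-counts-per-kilobase comparison mentioned for Fig. S3 can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the pipeline used in this study: the function names and all counts are hypothetical, and the sketch ignores differences in sequencing depth between samples, which a real comparison would need to account for.

```python
def reads_per_kb(read_count: int, scaffold_length_bp: int) -> float:
    """Normalize the number of recruited reads by scaffold length in kilobases."""
    if scaffold_length_bp <= 0:
        raise ValueError("scaffold length must be positive")
    return read_count / (scaffold_length_bp / 1000)


def fold_enrichment(rpk_sample: float, rpk_reference: float) -> float:
    """Ratio of length-normalized coverage in a sample relative to a reference."""
    if rpk_reference <= 0:
        raise ValueError("reference coverage must be positive")
    return rpk_sample / rpk_reference


if __name__ == "__main__":
    # Hypothetical numbers: reads recruited to the same 4 Mb bacterial
    # scaffold set from a leaf-pocket preparation and from whole-plant DNA.
    leaf_pocket = reads_per_kb(12_000, 4_000_000)
    whole_plant = reads_per_kb(2_000, 4_000_000)
    print(fold_enrichment(leaf_pocket, whole_plant))
```

A ratio above one for a given genome would indicate that its reads are over-represented in the leaf-pocket preparation relative to the whole plant, which is the kind of evidence used above to place the Agrobacterium-like bacteria in the leaf pockets.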
A possible role for denitrifying Rhizobiales of the Azolla metagenome
Persistent Rhizobiales endophytes with denitrifying pathways suggested there may be some wasted cycling of the fixed nitrogen that is not likely to be of direct benefit to Azolla (Fig. 5). In the absence of N fertilizer Azolla will thrive entirely on N2 fixed by N. azollae; this explained the low δ15N of the fern biomass grown without N fertilizer compared to legume biomass reported earlier (Fig. 6; Hipkin et al., 2004) and suggested that growth of Azolla was not limited by nitrogen. Rhizobia are known epiphytes of cyanobacterial heterocysts (Stevenson & Waterbury, 2006). Possibly, the heterotrophic Rhizobiales help to lower the massive amounts of O2 released from leaf cell photosystem II activity at daytime in the leaf pockets, thereby preserving nitrogenase efficiency inside the heterocysts. Rhizobia may have adapted to survive the micro-oxic environment they create, particularly at night, by respiring nitrate or nitrite. Bradyrhizobium japonicum in soybean nodules is responsible for the bulk of N2O emissions when flooding soybeans: plants nodulated with B. japonicum mutants with a defect in NapA nitrate reductase producing nitrite emitted less N2O whilst plants with a defect in N2O reductase emitted more N2O (Tortosa et al., 2015). In Azolla, as in legumes, therefore, the denitrification pathway may present an adaptive advantage even though it may constitute futile cycling: survival of the bacteria when O2 levels are low. Direct N2O release from surface-sterilized Azolla containing the Rhizobiales genomes could not be detected in this study, however, even under micro-oxic conditions and after a prolonged night.
Possibly, endophyte communities co-evolve with Azolla, and the metagenome is the unit that undergoes selection by the environment. This would be demonstrated if phylogenetic relationships of Azolla and its endophytes were to mirror one another, and if the endophytes were shown to be transmitted vertically upon sexual reproduction of Azolla by way of spores. Vertical transmission has been demonstrated for N. azollae in A. filiculoides (Ran et al., 2010), and it is entirely possible that the rhizobia reported here are similarly transmitted together with N. azollae in the megasporangiate sori of A. filiculoides (Carrapiço, 1991; Zheng et al., 2009). Phylogenetic studies are underway to verify this because, if true, it would imply that crop breeding approaches would have to consider endophytic communities.
Nitrification: how could nitrate and nitrite be formed from the NH4+ released by Nostoc azollae?
Because bacterial endophytes from rice roots contained the AmoA (pfam 05145) ammonia monooxygenase (Sessitsch et al., 2012), which catalyses the first step in the oxidation of ammonium towards nitrite and nitrate, it is plausible that Azolla endophytes still awaiting characterization may be capable of converting the ammonium released by N. azollae to nitrate. Alternatively, many N2-fixing plants are capable of phototrophic nitrification (Hipkin et al., 2004). In several leguminous plants, malonate is transformed via monoamide to 3-nitropropionic acid (3-NPA) and then to nitrate and nitrite (Francis et al., 2013). 3-NPA is an inhibitor of mitochondrial succinate dehydrogenase (E.C. 1.3.5.1) and is therefore a strong antigrazing compound. It has been shown to accumulate at high levels in aquatic plants that fix N2 (e.g. Lotus), and is inactivated by 3-NPA oxidases detected in a leguminous herb and characterized in Pseudomonas aeruginosa, Burkholderia phytofirmans and fungi (Nishino et al., 2010; Francis et al., 2013; Salvi et al., 2014). It will be important to decipher whether nitrification reactions occur within the leaf pocket or inside the fern cells. The combination of nitrifying and denitrifying endophytes could permit Azolla to cope with surplus levels of NH4+ from N. azollae or micro-oxic ditch waters when phosphate availability is limiting and therefore contribute to defining the aquatic fern's ecological niche.

New Phytologist is an electronic (online-only) journal owned by the New Phytologist Trust, a not-for-profit organization dedicated to the promotion of plant science, facilitating projects from symposia to free access for our Tansley reviews.