The liverwort Pellia endiviifolia shares microtranscriptomic traits that are common to green algae and land plants

Liverworts are the most basal group of extant land plants. Nonetheless, the molecular biology of liverworts is poorly understood. Gene expression has been studied in only one species, Marchantia polymorpha. In particular, no microRNA (miRNA) sequences from liverworts have been reported. Here, Illumina-based next-generation sequencing was employed to identify small RNAs and to analyze the transcriptome and the degradome of Pellia endiviifolia. Three hundred and eleven conserved miRNA plant families were identified, and 42 new liverwort-specific miRNAs were discovered. The RNA degradome analysis revealed that target mRNAs of only three miRNAs (miR160, miR166, and miR408) have been conserved between liverworts and other land plants. New targets were identified for the remaining conserved miRNAs. Moreover, the analysis of the degradome permitted the identification of targets for 13 novel liverwort-specific miRNAs. Interestingly, three of the liverwort miRNAs show high similarity to previously reported miRNAs from Chlamydomonas reinhardtii. This is the first observation of miRNAs that exist both in a representative alga and in the liverwort P. endiviifolia but are not present in land plants. The results of the analysis of the P. endiviifolia microtranscriptome support the conclusions of previous studies that placed liverworts at the root of the land plant evolutionary tree of life.


Table S1
Oligonucleotides that were used in the experiments.

Table S2
Best matches for the miRBase mature miRNAs that were found in P. endiviifolia sRNA-seq data (as a separate Excel file).

Table S3
25 pri-miRNAs that were identified in the P. endiviifolia transcriptome.

Table S5
Target mRNAs of known miRNAs that were identified in P. endiviifolia and confirmed by degradome data.

Table S6
Targets for novel miRNAs that were identified in P. endiviifolia and confirmed by degradome data.

Methods S1

RNA and DNA isolation
Total RNA for sRNA detection was isolated using a method that permits the enrichment of sRNAs (Kruszka et al., 2013) with the following modifications: the precipitation was performed with 1.2 vol. of ethanol and 0.4 vol. of salt solution (0.8 M sodium citrate, 1.2 M NaCl). For cDNA synthesis, total RNA was isolated similarly, except that the RNA was precipitated only after chloroform extraction using 0.5 vol. of isopropanol and 0.5 vol. of salt solution (0.8 M sodium citrate, 1.2 M NaCl). The quantity and quality of RNA were measured using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA), and the RNA quality was additionally assessed by agarose gel electrophoresis.
Genomic DNA (gDNA) was isolated from the liverwort thalli that were grown in vitro using the DNeasy Plant Maxi Kit (Qiagen, Hilden, Germany). The concentration and quality of genomic DNA were estimated using a NanoDrop ND-1000 spectrophotometer and confirmed by electrophoresis on a 0.6% agarose/EtBr gel.

pri-miRNA RACE experiments and genome walking
PCR and RT-PCR reactions, 5' and 3' RACE experiments for the identification of pri-miRNA structures and 5' and 3' genome walking analyses for the identification of MIR gene structures were performed as previously described (Sierocka et al., 2011), with the exception of using Advantage® 2 Polymerase (Clontech Laboratories, Inc., Mountain View, CA, USA). The primer sequences that were used in the RACE and genome walking experiments and the primers that were used for the MIR gene amplification are shown in Table S1. The PCR products were separated on 1.2% agarose gels in 1× TBE buffer. The MIR gene structures were obtained by aligning the transcript and genomic sequences derived from the RACE and genome walking experiments for each presented gene.

Deep-sequencing and bioinformatic analyses
For the deep-sequencing analysis, RNA was isolated from different types of P. endiviifolia thalli as described in the Materials & Methods. Total RNA (10 µg) was mixed 1:1 (vol/vol) with loading buffer II (Ambion®, Austin, TX, USA), denatured for 2 min at 90°C and size-fractionated by 15% denaturing polyacrylamide gel electrophoresis; the small RNA fragments of 15-30 nt were then isolated from the gel and purified. To localize the small RNAs, a 10-bp DNA ladder was used. An aliquot of 1 µl of GlycoBlue Coprecipitant (15 mg/µl, Ambion®, Austin, TX, USA) per sample was applied. The small RNA molecules were then ligated to a 5' adaptor and a 3' adaptor (Table S1) and converted to cDNA by RT-PCR following the protocol of Pant et al. (2009). The purified cDNA libraries were sequenced on the Illumina HiScanSQ platform using SR flow cell v1.5 (Illumina, Inc., San Diego, CA, USA). Each library was loaded onto a single lane in a flow cell. The adaptor sequences were identified and trimmed from each read using a customized Perl script. Reads in which the adaptor could not be identified were discarded. In this manner, low-quality reads were automatically rejected because no adaptor sequence could be reliably identified in them. The sequencing reactions resulted in more than 14 million unique, quality-filtered and adaptor-trimmed reads. As expected for the sRNA sequencing procedure, the size distribution of the short sequences revealed the presence of a dominating class of 21-nt-long reads. Next, the BLASTn program (Altschul et al., 1997) was used to align the reads from each library to known plant mature miRNA sequences, permitting up to two mismatches. The raw counts of reads for each library were recalculated into "reads per million" (RPM) using the total number of sequences matching miRNAs.
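The adaptor identification and trimming step described above can be illustrated with a minimal Python sketch. This is not the customized Perl script used in the study: the adaptor sequence below is a hypothetical placeholder (the actual adaptors are listed in Table S1), only exact adaptor matches are considered, and the 15-30 nt length filter is an assumption that mirrors the gel-excised size fraction.

```python
from collections import Counter

# Hypothetical 3' adaptor used for illustration only; see Table S1 for the real one.
ADAPTOR_3P = "TCGTATGCCGTCTTCTGCTTG"


def trim_adaptor(read, adaptor=ADAPTOR_3P, min_len=15, max_len=30):
    """Return the insert preceding the adaptor, or None if the read is discarded."""
    pos = read.find(adaptor)
    if pos == -1:
        # Adaptor not identified: the read is rejected, as described in the text.
        return None
    insert = read[:pos]
    # Assumed length filter matching the 15-30 nt gel fraction.
    if not (min_len <= len(insert) <= max_len):
        return None
    return insert


def length_distribution(reads):
    """Count trimmed inserts by length (nt), skipping discarded reads."""
    inserts = (trim_adaptor(r) for r in reads)
    return Counter(len(s) for s in inserts if s is not None)
```

Calling `length_distribution` on the raw reads of one library gives the per-length counts from which a dominating size class, such as the 21-nt class mentioned above, would become visible.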
The 'mean count' for each sRNA sequence was calculated as the sum of its normalized RPM counts from all of the libraries divided by the number of libraries.
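The RPM normalization and the 'mean count' can be sketched as follows. This is an illustrative example rather than the pipeline used in the study; the function names are hypothetical, and each library is assumed to be a dictionary that holds only miRNA-matching sequences, so that the sum of its counts serves as the normalization total.

```python
def to_rpm(raw_counts, total_mirna_reads):
    """Scale the raw counts of one library to reads per million (RPM)."""
    if total_mirna_reads <= 0:
        raise ValueError("total_mirna_reads must be positive")
    return {seq: n * 1_000_000 / total_mirna_reads for seq, n in raw_counts.items()}


def mean_counts(libraries):
    """Mean RPM per sequence: sum of RPM over all libraries / number of libraries.

    `libraries` is a list of {sequence: raw_count} dictionaries. A sequence that is
    absent from a library contributes 0 RPM for that library.
    """
    # Assumption: each dictionary contains only miRNA-matching reads.
    rpm_libs = [to_rpm(lib, sum(lib.values())) for lib in libraries]
    sequences = set().union(*rpm_libs)
    return {
        seq: sum(lib.get(seq, 0.0) for lib in rpm_libs) / len(rpm_libs)
        for seq in sequences
    }
```

Dividing by the number of all libraries, including those in which a sequence was not detected, follows the definition given in the text.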

Degradome sequencing
Approximately 200 µg of total RNA was used for mRNA purification and degradome library construction. Only female and male thalli growing in vitro and having no reproductive organs were used for total RNA isolation. Polyadenylated RNA was twice purified by hybridization with biotinylated oligo-dT 20-mers that were attached to streptavidin-coated magnetic beads (Dynal M-280) according to the manufacturer's instructions. The next steps of the degradome library preparation were performed as previously described (Addo-Quaye et al., 2009; German et al., 2009). Briefly, mRNA was ligated to an RNA oligonucleotide adaptor containing a 3' EcoP15I recognition site. The ligation products were used to generate first-strand cDNA by reverse transcription (RT). Then, a short PCR was used to amplify the cDNA to obtain sufficient quantities of DNA products. After digestion with EcoP15I, the 5'-ends of amplified ds-cDNA of 65-66 bp were ligated to a double-stranded DNA adaptor. The PAGE-purified ligation products were amplified using bar-coded primers to obtain final libraries that were compatible with the Illumina TruSeq system.

(a) Agarose gel electrophoresis of PCR products using DNA that was isolated from P. endiviifolia that was grown in vitro (lanes 1-4) and in vivo (lane 5). Lanes 1 and 3 represent the PCR products that were obtained using primers for 18S rDNA, which is universal for plants and fungi, while lanes 2, 4, and 5 show the PCR products that were obtained using another pair of primers that are specific for fungal 18S rDNA. (b) Agarose gel electrophoresis of the PCR products using DNA that was isolated from P. endiviifolia that was grown in vitro (lane 7) and in vivo (lane 8), and using DNA that was isolated from C. reinhardtii (lane 6). All PCR reactions were performed using the same pair of primers that are specific for Chlorophyta. M: 100 bp DNA ladder (Fermentas by Thermo Fisher Scientific Inc., Waltham, MA, USA).
Y axis: mean counts; X axis: the length of a given RNA fragment in each cluster (nt). In each cluster, there are dominating RNA species in a length range that is typical for miRNAs. The very last diagram shows the length distribution for a small RNA cluster in which no particular RNA fragment was dominant (control).

sRNA homologs that were found in C. reinhardtii NGS data with 1 or 2 mismatches in the overlapping regions.
pen-miR408b*: miR408b* was originally identified as a novel miRNA and was then identified as an miR* in the pre-miRNA408.
The read counts represent a single NGS experiment.

miRNA (highly/equally abundant sRNA sequence perfectly aligned to the precursor, identified in NGS data and confirmed by northern hybridization)
alternative sRNA (identified in NGS data and, in the case of novel miRNA, confirmed by northern hybridization)
miRNA* (significantly less abundant sRNA sequence perfectly matching the precursor and identified in NGS data)
indicates precursor with experimentally confirmed primary transcript gene
(a) conservative miRNAs: pen-miR536

Fig. S6
Secondary structure predictions for the identified P. endiviifolia pre-miRNAs encoding conservative and novel miRNAs.
pen-miR8190