Identification of key amino acid residues determining product specificity of 2,3-oxidosqualene cyclase in Oryza species

Triterpene synthases, also known as 2,3-oxidosqualene cyclases (OSCs), synthesize diverse triterpene skeletons that form the basis of an array of functionally divergent steroids and triterpenoids. Tetracyclic and pentacyclic triterpene skeletons are synthesized via protosteryl and dammarenyl cations, respectively. The mechanism of conversion between the two scaffolds is not well understood. Here, we report a promiscuous OSC from rice (Oryza sativa) (OsOS) that synthesizes a novel pentacyclic triterpene, orysatinol, as its main product. The OsOS gene is widely distributed in indica subspecies of cultivated rice and in wild rice accessions. Previously, we characterized a different OSC, OsPS, a tetracyclic parkeol synthase found in japonica subspecies. Phylogenetic and protein structural analyses identified three key amino acid residues (#732, #365, #124) amongst 46 polymorphic sites that determine functional conversion between OsPS and OsOS, specifically, the chair–semi(chair)–chair and chair–boat–chair interconversions. The different orientation of a fourth amino acid residue, Y257, was shown to be important for functional conversion. The discovery of orysatinol unlocks a new path to triterpene diversity in nature. Our findings also reveal mechanistic insights into the cyclization of oxidosqualene into tetra- and pentacyclic skeletons, and provide a new strategy to identify key residues determining OSC specificity.


Introduction
Metabolic enzymes produce an enormous array of chemicals that provide adaptive strategies for plants in challenging terrestrial environments (Weng, 2014). Over 20 000 steroids and triterpenoids with c. 200 different skeletons have been identified in eukaryotic organisms (Xu et al., 2004). In addition to the essential function of sterols in maintaining cell membrane fluidity and permeability (Bloch, 1965; Benveniste, 1986; Parks & Casey, 1995), a large number of structurally diverse plant triterpenoids exist that have important functions in crop defense, food quality and as drug leads (Augustin et al., 2011; Moses et al., 2014). 2,3-Oxidosqualene cyclases (OSCs) catalyze the first committed step in triterpene biosynthesis, namely the cyclization of the universal triterpene precursor 2,3-oxidosqualene (1), and therefore define sterol and triterpene skeletal diversity (Thimmappa et al., 2014). Most steroids reported from nature to date are seemingly derived from tetracyclic skeletons, including cycloartenol, lanosterol and parkeol (2), via a chair-boat-chair (C-B-C) conformation protosteryl cation path. By contrast, the most common pentacyclic triterpenoid skeletons include lupeol (3) and β-amyrin (4), generated via a chair-chair-chair (C-C-C) conformation dammarenyl cation path (Fig. 1a). However, despite the intensive study of the OSC enzyme family, its functional diversity is yet to be fully explored.
Certain key amino acid residues have been shown to determine the functional diversity of OSCs (Chappell, 2002; Wu et al., 2008). Random mutagenesis and structure-guided, site-directed mutagenesis experiments have identified a total of at least 17 amino acid residues that are essential for the initiation or cyclization process resulting in product specificity (Hart et al., 1999; Herrera et al., 2000; Matsuda et al., 2000; Wu & Griffin, 2002; Salmon et al., 2016) (Supporting Information Table S1). Eight directly interacting residues (DIRs), the catalytic D455 of the conserved DCTAE motif (Christianson, 2006), aromatic amino

Fig. 1 Carbocation intermediates and products of 2,3-oxidosqualene cyclization by triterpene cyclases. (a) Substrate folding leads to the protosteryl C-B-C (chair-boat-chair), orysatinol C-17 C-C-C (chair-(chair)-chair) or dammarenyl C-C-C conformation in initial cyclization. Three-dimensional (3D) views of orysatinol and parkeol represented by X-ray structure (ORTEP drawing, shown at 30% probability) (Supporting Information Notes S1) and 3D-optimized structure from ACD-lab, respectively. (b) Proposed reaction scheme of cyclization. The cyclization pathway of parkeol, polypodatetraenol, marnerol and orysatinol, which initially shared the same bicyclic C-8 cation I (C-B: chair-boat) rather than bicyclic C-8 cation II (C-C: chair-chair). *Marnerol and orysaspirol are the reduction products of their aldehyde form by endogenous reductases in Pichia pastoris.

Research
New Phytologist

of premature terminated cyclization products. Mutations at the six indirectly interacting residues (IIRs), W230, G441, F442, S454, S699 and C700, and three other residues (T381, C449 and V453), caused changes in product types and/or profiles (Table S1). Although extensive mutagenesis experiments have been conducted in many laboratories, very little is known about the key residues required for functional conversion between tetra- and pentacyclic triterpene synthases.
Rice (Oryza sativa) is an important crop. Its genome contains an expanded OSC gene family with 12 members, four of which have been shown previously to synthesize a variety of triterpenes (Inagaki et al., 2011; Ito et al., 2011; Xue et al., 2012; Sun et al., 2013). OsOSC7 from the ssp. japonica of cultivated rice encodes an accurate tetracyclic triterpene synthase (OsOSC7/OsPS) that cyclizes 2,3-oxidosqualene to form tetracyclic parkeol (Ito et al., 2011; Xue et al., 2012). Sequence variation between the OsOSC7 genes in the japonica and indica subspecies suggests that OsOSC7 from indica may produce non-parkeol triterpenes. In this study, OsOSC7 genes from a total of 34 accessions of cultivated rice and the closely related wild species, Oryza rufipogon, Oryza nivara and Oryza meridionalis (Kovach et al., 2007; Sang & Ge, 2007), were cloned to investigate the functional diversity of the cognate gene products. Using a combined approach that relied on phylogenetic, molecular evolution, structural modeling and biochemical analyses, the key amino acid residues underlying triterpene product specificity were identified, providing catalytic insights into the functional conversion between tetra- and pentacyclic triterpenes.

Plant collection and growth
The cultivated rice (Oryza sativa L.) and wild rice (Oryza rufipogon, Oryza nivara and Oryza meridionalis) accessions were from collections of rice accessions maintained at the Institute of Botany, Chinese Academy of Sciences (IBCAS) in Beijing, and China National Rice Research Institute (CNRRI) in Hangzhou, China. The accessions were selected on the basis of the germplasm database records of phenotypic data and sampling localities to maximize genetic and geographic diversity, and were maintained by selfing.

Quantitative reverse transcription-polymerase chain reaction (qRT-PCR)
The seeds of Zhonghua 11 (O. sativa ssp. japonica, ZH11) and Guangluai 15 (O. sativa ssp. indica, GL15) were soaked in sterile distilled water and germinated on wet filter paper at 30°C. The radicles, coleoptiles and plumules were collected from 2-d-old seedlings; 20-d-old seedlings were transplanted to the field at IBCAS. The roots, leaf sheaths, leaf blades and shoot apical meristems were collected from plants at the five-leaf stage, and roots, nodes, culms (stems), leaf sheaths, leaf blades, lemmas, paleas, stamens and pistils were collected at the flowering stage. Embryos and endosperms were collected 21 d after pollination. Tissue samples were frozen in liquid nitrogen before RNA extraction using TRI-REAGENT (Sigma catalog no. T9424). For qRT-PCR analysis, cDNA synthesis was carried out using Superscript III Reverse Transcriptase (Invitrogen) and 2 µg of DNase-treated total RNA, according to the manufacturer's instructions. Primers for the amplification of fragments of OsOSC7j, OsOSC7i and OsACTIN1 (LOC_Os03g50885, as a reference gene) are listed in Table S2. qRT-PCR analysis was carried out on a Rotor-Gene 3000 thermocycler (Corbett Research, Mortlake, Australia) using the SYBR Green ER qPCR SuperMix Universal Kit (Invitrogen). The relative expression levels for each gene were normalized to the actin gene OsACTIN1, with three biological replicates per targeted gene.
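The normalization to OsACTIN1 can be illustrated with a short sketch. The text does not state which calculation was used; the example below assumes the common 2^-ΔCt approach with 100% amplification efficiency, and the function names and Ct values are hypothetical, not data from this study.

```python
from statistics import mean, stdev

def relative_expression(ct_target, ct_reference):
    """Return 2^-(Ct_target - Ct_reference) for one biological replicate.

    Assumes (not stated in the text) equal, 100% efficient amplification
    of the target and reference amplicons.
    """
    return 2.0 ** -(ct_target - ct_reference)

def summarize(replicates):
    """Mean and standard error over (ct_target, ct_reference) pairs."""
    values = [relative_expression(t, r) for t, r in replicates]
    sem = stdev(values) / len(values) ** 0.5
    return mean(values), sem

# Hypothetical Ct values for three biological replicates (illustration only).
example = [(24.0, 18.0), (24.5, 18.2), (23.8, 17.9)]
mean_expr, sem_expr = summarize(example)
print(f"relative expression = {mean_expr:.4f} +/- {sem_expr:.4f}")
```

If primer efficiencies differ between amplicons, an efficiency-corrected formula would be needed instead of the fixed base of 2.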
Gene cloning and yeast transformation
cDNA derived from leaf sheath material was used to clone OsOSC7 coding sequences (CDSs). The amplified products were cloned into the pGEM-T easy vector (Promega) and sequenced from both ends. They were then cloned into the expression vector pPICZA (Invitrogen) between the SpeI and XbaI restriction sites to place the OSC open reading frame (ORF) under the control of the methanol-inducible promoter AOX1. Pichia pastoris wild-type strain X33 was transformed using electroporation according to the protocol described in the EasySelect Echo-Adapted Pichia Expression kit (Catalog no. ET230-02; Invitrogen) using a Bio-Rad Gene Pulser Xcell.
Purification and structural elucidation of O. sativa OSC (OsOS) and O. rufipogon polypodatetraenol synthase (OrPtS) products
Orysatinol was extracted from a 2-l P. pastoris culture expressing wild-type OsOS. The cell pellet was mixed with 0.5 l of saponification reagent (20% (w/v) KOH in 50% (v/v) ethanol) and incubated at 70°C for 2 h before extracting twice with an equal volume of hexane. The hexane extracts were loaded onto a silica gel column (zcx. II, granularity 200-300; Haiyang, Qingdao, China), 30 cm long and 2.4 cm in diameter, and eluted with hexane : ethyl acetate (6 : 1 v/v). Fractions were collected in 10-ml tubes and analyzed by thin layer chromatography (TLC) and gas chromatography-mass spectrometry (GC-MS), as described below. Fractions containing orysatinol were dried in a rotary evaporator and further purified by reverse-phase high-performance liquid chromatography (HPLC) on an Agilent 1200 series liquid chromatograph (Santa Clara, CA, USA) equipped with a semi-preparative column (Eclipse XDB-C18, 5 µm, 9.4 mm × 250 mm, Santa Clara, CA, USA) using a gradient from 95% to 100% methanol for 35 min at a flow rate of 2.5 ml min⁻¹ at 40°C. The semi-preparative fractions were collected manually based on UV absorbance at 210 nm and monitored by GC-MS using the methods described below. Fractions containing the compound of interest were concentrated to dryness.
Marnerol, orysaspirol and polypodatetraenol were purified as described above for orysatinol from 40 l of P. pastoris culture expressing OsOS substitution strains and OrPtS, respectively. The purified compounds were analyzed by nuclear magnetic

General considerations for NMR
NMR spectra were recorded in Fourier transform mode at a nominal frequency of 800 MHz for ¹H and 200 MHz for ¹³C NMR, or 400 MHz for ¹H NMR and 100 MHz for ¹³C NMR, using the specified deuterated solvent. Chemical shifts were recorded in parts per million (ppm) and referenced to the residual solvent peak or to an internal tetramethylsilane standard. Multiplicities are described as: s, singlet; d, doublet; dd, doublet of doublets; dt, doublet of triplets; t, triplet; q, quartet; m, multiplet; br, broad; appt, apparent. Coupling constants are reported in hertz.

Alignment, phylogenetic, network and inferred ancestral sequences analysis
Multiple alignments of OSC protein sequences were performed, and a codon matrix was produced using the MUSCLE alignment package in MEGA6 (Tamura et al., 2013). The evolutionary history was inferred using the maximum likelihood method based on the JTT matrix-based model. The bootstrap consensus tree inferred from 1000 replicates was taken to represent the evolutionary history of the taxa analyzed. Median-joining network analysis was carried out using DNA ALIGNMENT v.1.3.1.1, NETWORK v.4.6.1.1 and NETWORK PUBLISHER v.1.3.0.0 software (Fluxus Technology, Clare, Suffolk, UK). The ancestral sequences at nodes P and S were inferred by ancestral sequence inference in MEGA6 using the maximum likelihood method under the general time reversible (GTR) model. The rate and pattern were gamma distributed with invariant sites (G + I).

Molecular evolutionary analysis
The free ratio model of CODEML, implemented within the PAML4 software package (Yang, 2007), was used to estimate the lineage specificity of the non-synonymous to synonymous substitution ratio ω. A branch site analysis, which compared the nearly neutral model with Model A, was performed to test the assumption that the foreground ω value of a specific branch was > 1 at sites at which positive selection appeared to have acted within a specific lineage. The resulting likelihood ratio tests were performed at the 5% level.
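The likelihood ratio test mentioned above can be sketched as follows. This is a minimal illustration, not the CODEML implementation: it assumes the test statistic (twice the log-likelihood difference between the alternative and null models) is compared with a chi-square distribution with one degree of freedom, which the text does not specify. The log-likelihood values are hypothetical; only the statistic 8.55 is taken from the Results.

```python
import math

def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)

def chi2_sf_df1(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom.

    With 1 df the variable is a squared standard normal, so the tail
    probability equals erfc(sqrt(x / 2)).
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical log-likelihoods chosen so that the statistic is about 8.55.
stat = lrt_statistic(lnl_null=-100.0, lnl_alt=-95.725)
p_value = chi2_sf_df1(stat)
print(f"2*delta(lnL) = {stat:.2f}, P = {p_value:.4f}")
print("significant at the 5% level:", p_value < 0.05)
```

Under this assumption a statistic of 8.55 gives P < 0.01, in line with the value reported for branch a in the Results.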

Modeling and plasticity residue identification
Three-dimensional models of the OsPS and OsOS proteins were generated by modeling with the HsLAS (1W6K) template (Thoma et al., 2004) using SWISS-MODEL software (Biasini et al., 2014). The three-dimensional structure of parkeol was obtained from https://pubchem.ncbi.nlm.nih.gov/. Docking searches were performed using the Lamarckian genetic algorithm, with a maximum of 25 000 000 energy evaluations and other options set as default in AutoDock Tools (Morris et al., 2009). Potential models were returned ranked on the basis of binding energy and the top ranked model was assumed to be the most likely. The models were graphically rendered using CHIMERA software (https://www.cgl.ucsf.edu/chimera/olddownload.html) (Pettersen et al., 2004). Four amino acid residues were identified within a 5-Å radius of the active site.
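A distance cutoff of this kind can be sketched in a few lines. The example below is an illustration only: it assumes the active site is represented by the atoms of a docked ligand and that a residue counts as "within 5 Å" if any of its atoms lies within that distance of any ligand atom. The function name and all coordinates are hypothetical and are not taken from the OsPS or OsOS models.

```python
import math

def residues_within(residue_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted labels of residues with any atom within `cutoff` angstroms
    of any ligand atom.

    residue_atoms: dict mapping a residue label to a list of (x, y, z) tuples.
    ligand_atoms: list of (x, y, z) tuples for the docked ligand.
    """
    hits = []
    for label, atoms in residue_atoms.items():
        if any(math.dist(a, b) <= cutoff for a in atoms for b in ligand_atoms):
            hits.append(label)
    return sorted(hits)

# Hypothetical coordinates (angstroms), for illustration only.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
residues = {
    "res_A": [(4.0, 0.0, 0.0)],    # 2.5 angstroms from the second ligand atom
    "res_B": [(0.0, 6.0, 0.0)],    # 6.0 angstroms from the nearest ligand atom
    "res_C": [(0.0, 0.0, -4.9)],   # 4.9 angstroms from the first ligand atom
}
print(residues_within(residues, ligand))
```

In practice the coordinates would be read from the model and docking output files; a dedicated structure library would normally handle the parsing.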

Mutagenesis experiments
Mutagenesis was performed using the QuikChange site-directed mutagenesis method (Cat. 200519; Stratagene, Santa Clara, CA, USA). The primers used for site-directed mutagenesis are listed in Table S2 with the substitutions underlined. The PCR mix was composed of 0.1 mM of deoxynucleoside triphosphates (dNTPs), 40 ng of plasmid template, 1× Phusion HF buffer, 0.25 µM of each primer, 2 U of Phusion DNA polymerase (New England Biolabs, Ipswich, MA, USA) and double-distilled H2O to a final volume of 20 µl. The reaction mixture was denatured at 98°C for 30 s, and then run for 20 cycles of denaturation at 98°C for 10 s, annealing at 60°C for 30 s and polymerization at 72°C for 6 min; a final extension was carried out at 72°C for 30 min. PCR products were incubated with DpnI at 37°C for 2 h to digest the parental supercoiled DNA. The plasmid DNAs were isolated using a plasmid DNA purification kit (Macherey-Nagel, Düren, Germany), according to the manufacturer's instructions. The mutations were confirmed by DNA sequencing (Eurofins Genomics, Ebersberg, Germany) and the plasmids were subsequently electroporated into the P. pastoris wild-type strain X33 and selected for growth on yeast extract peptone dextrose medium with sorbitol (YPDS) plates with 100 µg ml⁻¹ of zeocin.

Metabolite extraction and GC-MS analysis
Transformed P. pastoris strains were grown at 30°C in glycerol minimal medium (MGY, 1.34% yeast nitrogen base, 1% glycerol, 4 × 10⁻⁵% biotin) in a shaking incubator (250-300 rpm) until the culture reached the log phase (optical density at 600 nm (OD600) = 2-6). Cells were collected by centrifugation and the cell pellet was resuspended to an OD600 = 1 in minimal methanol medium (MM, 1.34% yeast nitrogen base, 4 × 10⁻⁵% biotin, 0.5% methanol). The resuspended cells were then incubated at 30°C for 72 h, with the addition of 0.5% methanol every 24 h. Cell pellets were collected from 4 ml of culture and saponified in 1 ml of saponification reagent (20% (w/v) KOH in 50% (v/v) ethanol) with constant shaking at 70°C for 1 h. Water (0.5 ml) was added to the resulting product and the mixture was extracted twice with 1.5 ml of n-hexane. Hexane extracts were combined to obtain the crude extract. Aliquots of hexane solution (100 µl) were dried under N2 and derivatized using 50 µl of trimethylsilyl imidazole/pyridine reagent (Sigma-Aldrich 92718, UK) at 70°C for 30 min. Reaction solutions were diluted with 50 µl of hexane and analyzed using an Agilent 6890 instrument under electron impact at 70 eV with a Zebron ZB-5 HT capillary column (0.25 mm × 30 m) (Torrance, CA, USA). The oven temperature was initially set at 170°C, raised from 170 to 290°C (6°C min⁻¹), maintained for 4 min, and then elevated from 290 to 340°C (10°C min⁻¹). The relative quantification of the compounds was carried out by comparison with an internal standard (betulin; Sigma-Aldrich B9757, UK).
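As a quick check on the oven program described above, the duration of the stated segments can be computed from the ramp rates. This is a small arithmetic sketch; the segment encoding and function name are illustrative, and any initial or final hold not stated in the text is not included.

```python
def oven_program_minutes(segments):
    """Total duration in minutes of a GC oven program.

    Each segment is either ("ramp", start_C, end_C, rate_C_per_min)
    or ("hold", minutes).
    """
    total = 0.0
    for segment in segments:
        if segment[0] == "ramp":
            _, start, end, rate = segment
            total += abs(end - start) / rate
        else:
            _, minutes = segment
            total += minutes
    return total

# Segments as stated in the text: 170 to 290 C at 6 C/min,
# hold 4 min, then 290 to 340 C at 10 C/min.
PROGRAM = [
    ("ramp", 170, 290, 6.0),
    ("hold", 4.0),
    ("ramp", 290, 340, 10.0),
]
print(oven_program_minutes(PROGRAM))  # 20 + 4 + 5 = 29.0
```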

Immunoblotting
To extract protein, 300 mg of cultured P. pastoris cells were suspended in 2 ml of protein extraction buffer (50 mM Tris-Cl, pH 7.5, 150 mM NaCl, 5 mM ethylenediaminetetraacetic acid (EDTA), 10% glycerol, 1% w/v polyvinylpolypyrrolidone (PVPP), 1% Triton-X100, 1× complete protease inhibitor tablet (Roche)) and lysed three times using a French press at 7756.6 kPa, followed by incubation at 4°C for 1.5 h. The lysates were centrifuged (12 000 g, 12 min, 4°C), and the supernatant was mixed with an equal volume of 4× sodium dodecylsulfate (SDS) loading buffer and heated at 95°C for 10 min. A 10-µl aliquot of each sample was loaded onto a NuPAGE Novex 4-12% Bis-Tris gel (Invitrogen) and electrophoresed at 150 V for 60 min. A wet transfer cell (Bio-Rad, Hercules, CA, USA) was used to transfer proteins from the gel to the membrane (60 min, 50 V). The membranes were blocked by exposure to 5% w/v powdered skimmed milk in Tris-buffered saline with 0.05% (v/v) Tween 20 (TBST, 10 mM Tris, 100 mM NaCl) overnight at 4°C. The membranes carrying the transferred proteins were incubated with a mouse monoclonal anti-OsPS antibody (Abmart, Shanghai, China) (1 : 250 dilution in TBST and 5% w/v skimmed milk) for 1 h at room temperature, washed in TBST at room temperature (3 × 10 min), incubated with a goat anti-mouse immunoglobulin G-horseradish peroxidase (IgG-HRP)-conjugated secondary antibody (Sigma, UK) diluted 1 : 5000 in TBST and 5% w/v skimmed milk, and rinsed thoroughly. Visualization of the conjugated secondary antibody was performed by supplying SuperSignal® West Dura Extended Duration Substrate (Thermo Scientific, Waltham, MA, USA) for 5 min, followed by exposure to X-ray film.

Results
Characterization of a novel pentacyclic orysatinol synthase from indica subspecies
S1a) based on the previously annotated 12 OSCs from ssp. japonica (cv Nipponbare, Nip) (Inagaki et al., 2011). Genes from indica and japonica are designated with the suffixes 'i' and 'j', respectively. The sequences of OsOSC7 (Os11g08569) in GL15 and Nip have 35 non-synonymous and seven synonymous substitutions, giving the highest ratio of Ka/Ks (1.559) among the 12 pairs of OsOSCs evaluated (Table S3). Both OsOSC7i and OsOSC7j are highly expressed in the plumules at the germination stage and in the sheath at the seedling and flowering stages (Fig. S2). The co-linearity and high sequence identity of OSCs between the two subspecies (96%) indicate that OsOSC7i is orthologous to OsOSC7j (Fig. S1b,c).
OsOSC7j is a parkeol synthase (OsPS) (Ito et al., 2011; Xue et al., 2012). The high Ka/Ks ratio (1.559), combined with the large number of non-synonymous nucleic acid base changes (35 nucleotides), indicates that OsOSC7i and OsPS have undergone rapid amino acid sequence divergence, suggesting that OsOSC7i may have a different function. GC-MS analysis of metabolites from P. pastoris cells expressing OsOSC7i from GL15 revealed one major product, orysatinol (66.7%) (Fig. 1a), and at least 13 minor compounds compared with the empty vector control (Fig. S3a). The structures of three products were elucidated by spectroscopic data (Figs S3b-e, S4a,b). The major product orysatinol is a novel triterpene with an unprecedented chair-(semi-chair)-chair-chair envelope (C-sC-C-C) conformation featuring a cis-methyl (C-26 and C-27) orientation at the C-D ring-fused carbons (C-13 and C-14) (Fig. 1; Notes S1, S2). In addition to marnerol (6), which has been characterized in a previous study on Arabidopsis thaliana PEN5 (Xiong et al., 2006) (Fig. S4a), another minor product, named orysaspirol (7), has an extremely unusual seco A ring and spiral B-C ring architecture (Fig. S4b).
These three products can be assigned to different carbocations in the cyclization pathway following the formation of bicyclic C-8 cation I, as proposed in Fig. 1b. Marneral is formed via Grob fragmentation of the bicyclic C-8 cation (Xiong et al., 2006) (Figs 1b, S4a). Reduction of marneral by endogenous reductases in P. pastoris gives marnerol. Orysatinol is likely to be derived from the deprotonation of H-7 of the pentacyclic C-8 cation from the proposed cyclization pathway (Fig. 1b). The unprecedented conformation of orysatinol is likely to originate from the formation of the C ring via the attack of the bicyclic C-8 cation by the side-chain C-13 and C-14 double bond from below. The formation of the C ring then ushers in an entirely new cyclization pathway that yields hitherto unseen triterpene structures, such as pentacyclic orysatinol (5) and tetracyclic orysaspirol (7). The formation of orysaspirol is likely to proceed via the C ring contraction of the pentacyclic C-8 carbocation, followed by 1,2-hydride and methyl shifts, and terminated by Grob fragmentation, forming an aldehyde which might be reduced by endogenous reductases in P. pastoris, as is the case for marneral. OsOSC7i from GL15 is referred to as orysatinol synthase (OsOS) hereafter.
Metabolite analysis of the sheath extracts of GL15 by GC-MS showed that the detected orysatinol was identical to that of the authentic standard, and the two minor peaks were also minor triterpene products (8) and (9) of P. pastoris cells expressing OsOS (Figs 2, S4c-f). However, parkeol, which has been characterized in japonica cv. Zhonghua 11 (ZH11), was not detected in the sheath extract of GL15 (Fig. 2). These results suggest that OsOS has the same catalytic function both in planta and when heterologously expressed in P. pastoris.

Functional diversity of OsPS and OsOS among Oryza species
We were able to identify 11 distinct protein variants (V1-V11) from OsOSC7 CDSs cloned from two subspecies, indica and japonica, and only three closely related wild species (AA genome), O. rufipogon, O. nivara and O. meridionalis (Kovach et al., 2007;Sang & Ge, 2007) (34 accessions in total; Table S4; Notes S3). The 11 variants were divided into two separate clades. Clade I consists of 23 sequences from ssp. indica and three wild species, O. rufipogon, O. nivara and O. meridionalis, whereas clade II contains 11 sequences that are only present in japonica cultivars and a few accessions of O. rufipogon (Fig. 3a). Expression of OsOSC7 proteins from clade II in P. pastoris showed that V9 did not produce any triterpenes, despite being successfully expressed (Figs 3a, S5, S6). V10 from O. rufipogon accession NEPc (Nepal) and V11 from eight cultivars of japonica, including ZH11, produced a single product parkeol (Fig. S5), and therefore act as parkeol synthases (Fig. 3a). Interestingly, V8 from accession KHM0225 (O. rufipogon, Cambodia) produces a compound that is distinct from parkeol, orysatinol and other minor products of OsOS (Figs 3a, S5). Structural elucidation by one-dimensional (1D) and two-dimensional (2D) NMR analysis established the compound's structure as polypodatetraenol (10), an iridal-type bicyclic triterpene, which was characterized in the previous study of yeast ERG7 C703I/H mutants (Fig. S7), and is referred to as polypodatetraenol synthase (OrPtS). Polypodatetraenol is likely to be synthesized via a 1,2-hydride and methyl shift, followed by deprotonation of the bicyclic C-8 cation (Chang et al., 2012) (Fig. 1b). In clade I, V1, V2, V3 (OsOS) and V4-V7 derived from indica cultivars and those of three wild species, O. rufipogon, O. nivara and O. meridionalis, were shown to be promiscuous OSCs that produce orysatinol as the predominant product with the formation of marnerol, orysaspirol and several other minor, uncharacterized compounds (Figs 3a,b, S5

wide range of geographic regions (Fig. 3a,b; Table S4). OsOS is a promiscuous enzyme making orysatinol and 12 other products (Fig. 3b), whereas OsPS only produces the specialized single product, parkeol, and was only present in cultivars of japonica and its closely related O. rufipogon. These accessions are from Cambodia, Laos and Nepal, and are close to the first domesticated area of japonica from its wild relative O. rufipogon. OrPtS (V8) in O. rufipogon (accession KHM0225, Cambodia) has at least 16 amino acid changes, relative to the ancestral sequence (S) of V1-V7, which resulted in the production of the bicyclic triterpene polypodatetraenol (Fig. 3b). Further, an additional six amino acid changes were responsible for the emergence of ancestral sequence (nP), producing parkeol (C-B-C) (shown in next section) (Fig. 3a,b). Molecular evolutionary analysis of OsOSC7 proteins using PAML software (Yang, 2007) revealed that branch a, which leads to parkeol synthase, was under highly significant positive selection (2δ = 8.55, P < 0.01) (Table S5). Bayes empirical Bayes analysis suggested #732 as one of the positive sites (ω > 1) having the highest probability (93.6%), whereas, for the seven other sites, the probability ranges from 57.0% to 85.6% (Table S5).
Protein variants V1, V2, V3 (OsOS), V4 and V5 generated products with similar triterpene profiles and with similar orysatinol yields (125-324 µg g⁻¹ yeast cells), whereas V6 and V7 produced much smaller amounts of orysatinol (31-33 µg g⁻¹ yeast cells) and fewer minor products (Fig. 3b,c), even though the protein expression levels were higher (Fig. S6). V11 (OsPS) from japonica produced 12 times more parkeol (14.6 µg g⁻¹ yeast cells) than V10 of O. rufipogon NEPc (Nepal) (1.21 µg g⁻¹ yeast cells); however, both proteins were expressed at a similar level (Figs 3c, S6).

Fig. 3 (a) ... The outgroup sequences used were from Sorghum bicolor (http://www.plantgdb.org/SbGDB/, ID Sb05g008010 and Sb05g008020). nP and nS denote nodes P and S for inferred ancestral sequences; a and b denote branches for branch-site model analysis; bold line, branch under positive selection; C-B-C, chair-boat-chair conformation; C-C-C, chair-chair-chair conformation. (b) Median-joining network and triterpene product profile (%) of 11 distinct OsOSC7 protein variants. Circle size corresponds to the number of species identified that carry particular protein types. Lines represent genetic distance between protein sequences, typically indicating an amino acid difference. 5, orysatinol; 7, orysaspirol; 2, parkeol; 10, polypodatetraenol; 9, 11, uncharacterized products. The numbers on the lines indicate amino acid residues that differ among variants. (c) The quantities of total triterpene products (TTPs) produced by 11 OsOSC7 proteins determined by gas chromatography-mass spectrometry (GC-MS) analysis, using betulin as an internal standard. The amounts of the expressed OsOSC7 proteins were quantified by Western blot analysis. The activities of wild-type and all mutants are presented as means ± SE, n = 3 (log10 of micrograms of the produced TTP per gram of yeast cells; n is three different biological replicates). V1-V11 represent distinct protein variants.
These results demonstrate that OsOSC7 variants from cultivated rice and its close relatives are highly divergent as inferred from their product profiles. Not only do they vary in catalytic activity, but also in the conformation of cations en route to the formation of the diverse triterpene structures.

Identification of amino acid sites determining product specificity
In total, 46 amino acid variations were identified across OsOSC7 sequences among the 34 rice cultivars or accessions analyzed. There were 21 amino acid differences between the ancestral sequences at nodes nP and nS (MEGA analysis). Expression of the synthetic ancestral gene of nP in P. pastoris gave parkeol, and that of the nS gene produced orysatinol (Fig. S8a), suggesting that the key residues that determine the functional divergence between these two lineages must be among these 21 amino acids (Figs 4a,b, S8b). Three-dimensional structural models of OsOS and OsPS were generated using SWISS-MODEL with the HsLAS structure as a template (Fig. S8b). There are 25 amino acid residues predicted to be within the 5-Å region of the active site. Of these, four sites (#124, #365, #553 and #732) vary between the predicted catalytic regions of OsOS and OsPS (Fig. 4b).
To verify which of these four sites are responsible for the functional specificity of parkeol synthase, P. pastoris expression constructs were built in such a way that each combination of residues of OsPS at these four sites was replaced by the corresponding residues from OsOS. In total, all 15 possible OsPS variants were heterologously expressed in P. pastoris (Table S6). GC-MS analysis of the resultant P. pastoris cell extracts showed that replacement of V553 by alanine had no influence on the product profiles of OsPS (Fig. S9), suggesting that this site does not play an important role in the functional specificity of OsPS. By contrast, single site mutants with substitutions of phenylalanine by leucine (OsPS F124L and OsPS F365L) resulted in substantially decreased parkeol production (from 20.98 ± 0.32 µg g⁻¹ to 4.98 ± 2.40 µg g⁻¹ and 2.62 ± 0.32 µg g⁻¹, respectively) without the formation of additional triterpene products. Mutation of both phenylalanine residues (OsPS F124L/F365L) resulted in a complete loss of activity (Figs 4c, S10; Table S7), although it made stable proteins (Fig. S11). Interestingly, when isoleucine was changed to alanine at position #732, the mutant protein OsPS I732A became a multifunctional enzyme, producing both parkeol (55.93 ± 29.93 µg g⁻¹) and orysatinol (5.76 ± 2.71 µg g⁻¹), in addition to several other triterpenes (20.78% of total triterpene products (TTPs)) (Figs 4c, S10; Table S7). The OsPS F124L/I732A and OsPS F365L/I732A double mutants yielded dramatically less parkeol, but produced orysatinol and several other triterpenes (Figs 4c, S10; Table S7), indicating that positions #124 and #365 are very important for the maintenance of the parkeol synthase function of OsPS, and position #732 plays a critical role in the catalytic specificity that distinguishes parkeol and orysatinol synthesis.
Finally, the triple mutant variant, OsPS F124L/F365L/I732A, does not produce any parkeol, and becomes an orysatinol synthase, producing orysatinol as the major product (46.32 ± 8.57 µg g⁻¹, c. 61.2% of TTP) and several other triterpenes, a profile similar to that of the wild-type OsOS (66.7%) (Figs 4c,e, S10; Table S7).
We created all possible OsOS variants by replacement of amino acid residues at each of these three sites, alone or in combination, with the corresponding residues of OsPS, and expressed them in P. pastoris (Table S8). The OsOS A732I variant, with substitution of alanine by isoleucine at the first key site (#732), acquired parkeol synthase activity, producing 3.08 ± 0.22 µg g⁻¹ of parkeol (16.8% of TTP) and a small amount of orysatinol (1.59 ± 0.28 µg g⁻¹, 8.7% of TTP), compared with that of wild-type OsOS (112.98 ± 17.55 µg g⁻¹) (Figs 4d, S12; Table S9). However, replacement of leucine with phenylalanine at the second key site (#365) caused decreased orysatinol production (43.7% of the TTP) with concomitant increases in other non-parkeol triterpenes (56.3% TTP). The OsOS L124F and OsOS L124F/L365F mutants also acquired the ability to produce parkeol, whereas the synthesis of orysatinol was reduced in comparison with OsOS. The OsOS L124F/A732I and OsOS L365F/A732I double mutants and the OsOS L124F/L365F/A732I triple mutant were unable to produce orysatinol, but were able to synthesize parkeol (Fig. 4d). Intriguingly, parkeol synthesis was greatly enhanced in the triple mutant, producing up to 17.59 ± 0.17 µg g⁻¹, which is close to that of the wild-type OsPS (20.98 ± 0.32 µg g⁻¹) (Figs 4d,e, S12; Table S9). Our results not only elucidate the role of the three key sites (#732, #365 and #124) in using different conformation cations to synthesize diverse triterpene skeletons, such as orysatinol (C-sC-C) and parkeol (C-B-C), but also provide insights into the process by which OsPS is likely to have evolved from OsOS.
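The sizes of the two mutant panels described above follow from simple combinatorics: every non-empty subset of n candidate sites gives 2^n - 1 variants, that is, 15 for the four OsPS sites and 7 for the three OsOS sites. The sketch below enumerates them; the substitution labels are taken from the text, whereas the function name and the slash-joined naming are illustrative choices.

```python
from itertools import combinations

def substitution_panel(substitutions):
    """List every non-empty combination of the given substitutions,
    ordered from single to multiple mutants, joined with '/'."""
    panel = []
    for k in range(1, len(substitutions) + 1):
        for combo in combinations(substitutions, k):
            panel.append("/".join(combo))
    return panel

# OsPS residues replaced by the corresponding OsOS residues (four sites).
OSPS_SUBSTITUTIONS = ["F124L", "F365L", "V553A", "I732A"]
# OsOS residues replaced by the corresponding OsPS residues (three key sites).
OSOS_SUBSTITUTIONS = ["L124F", "L365F", "A732I"]

osps_panel = substitution_panel(OSPS_SUBSTITUTIONS)
osos_panel = substitution_panel(OSOS_SUBSTITUTIONS)
print(len(osps_panel), len(osos_panel))  # 15 7
```

Listing the panel this way makes it easy to confirm that no single, double or triple combination is missing before constructs are built.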

Mechanism of functional conversion between OsPS and OsOS
To understand how these four sites affect the cyclization processes leading to parkeol and orysatinol formation, further docking of the OsPS and OsOS models with the carbocation intermediates was conducted (Figs S13, S14). To our surprise, only F124 is predicted to interact directly with the C-20 cation (4.951 Å), potentially providing cation-π stabilization for this intermediate when OsPS is docked with the protosteryl C-20 cation (Table S10; Fig. S13). Although residues A553 and I732 are located within 3-5 Å of the carbocations of several intermediates in the OsOS and OsPS models, respectively, these two non-polar, non-aromatic amino acid residues are unlikely to interact directly with any carbocation (Table S10; Figs S13, S14). Residues at site #365 in both the OsPS and OsOS models are even further from the carbocations of all docked intermediates (Table S10). However, F/L365 is likely to interfere with Y257 and two other residues (Table S11). Interestingly, the orientation of Y257 in OsPS differs from that in OsOS, and the orientations of Y257 in OsPS F124L/F365L/I732A and OsOS L124F/L365F/A732I are twisted and overlap with those of OsOS and OsPS, respectively (Fig. 5a,b).
To analyze the function of Y257 in parkeol and orysatinol synthases, substitutions of Y257 with leucine, phenylalanine and alanine were introduced into OsPS, OsOS and the mutant variants OsPS F124L/F365L/I732A and OsOS L124F/L365F/A732I. GC-MS analysis of the P. pastoris cell extracts showed that Y257 is not necessary for parkeol formation, as most of the OsPS and OsOS L124F/L365F/A732I mutants carrying the Y257L, Y257F or Y257A substitution could still produce parkeol, similar to the ERG H234X mutants in previous studies (Wu et al., 2005, 2006).
By contrast, the same substitutions (OsOS Y257L,F,A and OsPS F124L/F365L/I732A/Y257L,F,A) resulted in complete loss of function (Table 1), indicating that Y257 is an essential residue of orysatinol synthase. These results indicate that Y257 is more important in orysatinol formation than in parkeol formation. In addition, substitution of tyrosine with leucine or alanine led to the loss of promiscuity: OsOS L124F/L365F/A732I/Y257L and OsOS L124F/L365F/A732I/Y257A could only produce parkeol (Table 1). Substitution of tyrosine with another aromatic amino acid residue, phenylalanine, in OsOS L124F/L365F/A732I/Y257F still yielded several minor triterpenes as side products. This strongly indicates that the π-electrons of Y257 stabilize the intermediary cation(s) through cation-π interaction. OsPS Y257F is unable to produce any other triterpenes, possibly as a result of other unknown factors, such as steric bulk. These results indicate that an aromatic amino acid residue at site #257 is essential for orysatinol formation, and is also important for promiscuity.

Discussion
In this study, the three key amino acid sites that determine the functional switch between orysatinol and parkeol synthesis were successfully identified through comprehensive phylogenetic analysis and structural modeling. Of relevance, a study of diterpene synthase orthologs associated with recently diverged rice subspecies led to the successful identification of a single site that is the key determinant of diterpene synthase specificity (Xu et al., 2007). As the number of plant whole genome sequences increases (http://www.phytozome.net/), it is becoming possible and worthwhile to conduct functional analysis of metabolic enzyme families in multiple species from particular genera or families. The use of evolutionary information combined with protein homology modeling is an effective approach to identify new enzymes and unveil the catalytic mechanisms used by enzymes to create chemical diversity (Weng, 2014). In this study, such an approach enabled us to identify the new triterpenes, orysatinol and orysaspirol, and allowed us to shed light on previously unknown mechanisms of OSC cyclization. Furthermore, our investigations reveal the stepwise evolution of OsPS from a multifunctional OsOS via the intermediate enzyme, polypodatetraenol synthase, by sequential substitutions of three amino acids at #732, #365 and #124.
Recently, a study of bacterial OSCs identified four residues, at positions W230, H232, Y503 and N697 (positions of HsLAS), that determine the functional interconversion of lanosterol and isoarborinol synthases. These enzymes produce tetracyclic and pentacyclic triterpenes from the same protosteryl-type C-B-C conformation (Banta et al., 2016). Studies (Ito et al., 2013; Salmon et al., 2016) on Euphorbia tirucalli and Avena strigosa β-amyrin synthase (SAD1) have identified two conserved residues at positions F696 and S699 (positions of HsLAS) which, when mutated, result in a change in product from the pentacyclic triterpenoid β-amyrin to tetracyclic cyclization products, but maintain a dammarane-type C-C-C conformation. The three key residue positions identified here, #124, #365 and #732 in OsPS and OsOS, corresponding to L104, I335 and I702 in HsLAS, were not reported in the previous studies (Ito et al., 2013; Banta et al., 2016; Salmon et al., 2016). This may be because these residues are responsible for the alternation between C-B-C and C-sC-C conformations, whereas the residues identified in the previous studies (Ito et al., 2013; Banta et al., 2016; Salmon et al., 2016) are involved in the formation of the E ring without conformational change. More studies are required in order to gain further insights into the mechanisms of OSC cyclization and product specificity.
Cultivated rice (O. sativa) and its closest wild relative (O. rufipogon) have a broad geographical distribution, with adaptations to different ecological and agronomic conditions. We hypothesize that multifunctional enzymes, such as OsOS, have the ability to quickly create chemical diversity that enhances the survival of species in challenging environments. Our analysis revealed a fast evolutionary process from OsOS to OsPS, of a kind that is normally caused by natural selection under biotic stress (Bergelson et al., 2001). The OsPS-derived pathway may play a role in resistance to insects and/or pathogens. Alternatively, parkeol, the product of OsPS, has a similar structure to cycloartenol and lanosterol (Fig. 1a), the precursors of sterol biosynthesis. The product of OsPS may thus be involved in the biosynthesis of a new steroid hormone, and hence influence plant growth and development (Bishop & Koncz, 2002; Ohyama et al., 2009).

Supporting Information
Additional Supporting Information may be found online in the Supporting Information tab for this article:
Table S1 List of essential 2,3-oxidosqualene cyclase (OSC) or squalene-hopene cyclase residues for cyclization of triterpenes
Notes S1 Crystal X-ray diffraction experiment for orysatinol (5).
Notes S3 Sequence alignments of OsOSC7 variants.
Please note: Wiley Blackwell are not responsible for the content or functionality of any Supporting Information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.