Therapeutic recombinant protein production in plants: Challenges and opportunities

Funding information
Biotechnology and Biological Sciences Research Council; Margaret Claire Ryan Fellowship Fund at the Yale Jackson Institute for Global Affairs; U.S. Department of Energy, Grant/Award Number: DE-SC0012704

Societal Impact Statement
Therapeutic protein production in plants is an area of great potential for increasing and improving the production of proteins for the treatment or prevention of disease in humans and other animals. There are a number of key benefits of this technique for scientists and society, as well as regulatory challenges that need to be overcome by policymakers. Increased public understanding of the costs and benefits of therapeutic protein production in plants will be instrumental in increasing the acceptance, and thus the medical and veterinary impact, of this approach.

Summary
Therapeutic recombinant proteins are a powerful tool for combating many diseases which have previously been hard to treat. The most utilized expression systems are Chinese Hamster Ovary cells and Escherichia coli, but all available expression systems have strengths and weaknesses regarding development time, cost, protein size, yield, growth conditions, posttranslational modifications and regulatory approval. The plant industry is well established, and growing and harvesting crops is easy and affordable using current infrastructure. Growth conditions are generally simple: sunlight, water, and the addition of cheap, available fertilizers. There are multiple options for plant expression systems, including species, genetic constructs and protein targeting, each best suited to a particular type of therapeutic protein production. Transient expression systems provide a mechanism to rapidly transfect plants and produce therapeutic protein in a matter of weeks, rather than the months it can take for many competing expression systems, while proteins targeted to cereal seeds can be harvested, stored and potentially purified much more easily than in competing systems. Current challenges for plant expression systems include a lack of regulatory approval, environmental containment concerns and nonhuman glycosylation, which may limit the scope of the type of therapeutic proteins that can be manufactured in plants. The specific strengths of plant expression systems could facilitate the production of certain therapeutic proteins quickly and cheaply in the near future.


| INTRODUCTION
Therapeutic recombinant proteins are exogenous proteins that are expressed in a production organism and used for the treatment or prevention of disease in humans or animals. Since human insulin was first produced in Escherichia coli in 1982 (Kamionka, 2011; Pavlou & Reichert, 2004), therapeutic recombinant proteins have become the latest great innovation in pharmaceuticals. Since then, hundreds of recombinant protein drugs have come to the market, and hundreds more are currently in development (Margolin, Chapman, Williamson, Rybicki, & Meyers, 2018; Marsian & Lomonossoff, 2016; Meyer et al., 2008; Rader, 2012; Shadid & Daniell, 2016), with the promise of treating diseases from arthritis to cancer. Unlike traditional chemically produced drugs, recombinant proteins can be very large and complex molecules with sophisticated and specific mechanisms of action. Their size and complexity make chemically synthesizing proteins incredibly difficult, so these new drugs must be produced biologically using the protein synthesis machinery found in all cells (Thomas, Deynze, & Bradford, 2002). Production using plant expression systems is both cost-effective and scalable, representing a 'major paradigm shift' for the pharmaceutical industry (Margolin et al., 2018).
The most promising therapeutic recombinant proteins are monoclonal antibodies (mAbs), originally copied from human immunoglobulin G1 (IgG1) to target epitopes with high specificity. As technology has advanced, mAbs have gained the potential to perform many different functions as therapeutic molecules. For instance, mAbs can stimulate the host immune system against a target cancer cell, inhibit enzymes or inactivate other proteins, and mimic a signaling ligand or present an antigen (Dijk & Winkel, 2001). There are currently many other therapeutic recombinant proteins in production and development, including hormones, growth factors, cytokines, serum proteins, enzymes, and vaccines (Margolin et al., 2018; Marsian & Lomonossoff, 2016; Rader, 2012; Shadid & Daniell, 2016).

Most therapeutic proteins are produced in either Chinese Hamster Ovary (CHO) cell cultures or E. coli fermentations, with a significant number also being produced in Saccharomyces cerevisiae and murine myeloma cells (Rader, 2008). These expression systems are the best characterized protein production platforms and each system has its own strengths and limitations. However, there are other expression systems that have not been as well utilized, which may be able to produce new therapeutic drugs or improve the production of current proteins (Table 1).
Plant cultivation technology and practice have been optimized over thousands of years to ensure high yields and low cost production for food and industry, and plant species have been domesticated to produce high biomass yields, have simple and robust growth requirements, and facilitate easier harvesting. Many of these improvements are relevant to the production of therapeutic molecules in plants, giving plant expression systems an advantage over other platforms, where much less time and money has been spent on optimization (with the possible exception of work in yeast fermentation).
While plants have not been used extensively to produce therapeutic protein products in the past, there is a history of genetically engineering plants to produce useful compounds (Vasil, 2008), and a wealth of knowledge in the scientific literature about genetic engineering in crop plants and tobacco in particular (Shinozaki et al., 1986;Zhang, Shanmugaraj, & Daniell, 2017;Zhang, Li, et al., 2017).
Plants have been previously considered as expression systems for therapeutic recombinant proteins, and the concept is gathering steam again as scientists look to increase the efficiency of producing recombinant proteins (Kaiser, 2008;Tekoah et al., 2015).
Plants have many attractive characteristics as a recombinant protein production platform: cheap growth conditions, well-understood manufacturing practices, a high level of scalability, the ability to synthesize complex proteins, existing industry infrastructure, the potential for rapid production timescales, and a low risk of human pathogen contamination (Moustafa, Makzhoum, & Trémouillaux-Guiller, 2016).
In this article, we review the current status of therapeutic protein production in plants. We first outline the key considerations for therapeutic protein production systems, demonstrating how plants fit into the broader picture of therapeutic protein production.
Next, we describe the different tools and techniques which may be used to carry out protein production in plants. We then examine the key plant species which are commonly used in this effort, and their advantages and disadvantages for therapeutic protein production.
Finally, we discuss the challenges for the field of therapeutic protein production in plants and conclude by considering what the future holds for this exciting discipline.

| CONSIDERATIONS FOR THERAPEUTIC PROTEIN PRODUCTION SYSTEMS
There are a number of fundamental issues that must be considered when selecting the most appropriate expression system to produce a therapeutic recombinant protein.

| Protein size
E. coli and other prokaryotic cells are the expression system of choice for small proteins (<30 kDa), but they struggle to produce high yields of fully formed larger proteins, which are more easily produced in eukaryotic systems such as plants (Demain & Vaishnav, 2009).

| Folding and solubility
Correct folding of a therapeutic protein is essential for activity, and complex proteins can require specific chaperone proteins to facilitate this (High, Lecomte, Russell, Abell, & Oliver, 2000; Margolin et al., 2018). Nonmammalian cells may have difficulty folding human proteins correctly, especially prokaryotic cells, which lack protein processing organelles (Sahdev, Khattar, & Saini, 2008).
Additionally, some expression systems (notably E. coli) have issues of insoluble protein accumulation when the product is overexpressed (Verma, Boleti, & George, 1998).

| Posttranslational modification
After translation, many proteins are modified, and these modifications may include the formation of covalent bonds, as in the case of disulphide bridges, or the addition of carbohydrate molecules in a process known as glycosylation (Box 1). Most of these posttranslational modification (PTM) mechanisms are conserved across eukaryotes and prokaryotes, but glycosylation mechanisms can differ even between species. Many secretory human proteins are glycosylated, which can be essential for protein function, affecting serum half-life, immunogenicity, effector function, and solubility (Lim et al., 2010; Sethuraman & Stadheim, 2006). This raises a problem for expression systems based on nonhuman cells; glycosylation is therefore a major concern for every expression system (Box 1, Table 1). The capacity of plants to carry out glycosylation is an advantage over prokaryotic expression systems, and even in insect and yeast cells, glycosylation capacity is limited (Marsian & Lomonossoff, 2016). Glycoengineering in all of the available systems aims to increase the production of human-like glycosylation profiles in recombinant proteins, and the success of this work

TABLE 1 Advantages and disadvantages of current therapeutic protein expression systems (Ghaderi et al., 2012; Gomes et al., 2019; Lagassé et al., 2017; Ma et al., 2003; Verma et al., 1998; Walsh, 2010).

| Safety
As mentioned above, nonhuman PTMs can cause an immune response against the therapeutic protein, and some expression systems have a risk of introducing other contaminants into the drug. These risks must be addressed with purification procedures, adding to the cost of downstream processing.
Plant systems generally avoid both of these pitfalls.

| Genetic engineering
All of the expression systems require the use of transgenic organisms/cell lines, so the ease and stability of performing genetic engineering is particularly relevant. Expression systems that are well characterized and have many genetic tools, such as expression vectors and strong promoters optimized for use in that specific system, will have an advantage. Producing a transgenic E. coli is much easier (Verma et al., 1998) than producing a transgenic goat because of the complexity of the goat's genome and because genetic manipulation is well understood in E. coli. CHO cells, S. cerevisiae, and E. coli are the best understood and therefore the most used expression systems. Using a well-characterized system reduces development time and increases the predictability of the production process. Even CHO cells, a well characterized mammalian cell type, rely on essentially random integration of expression cassettes (Barnes, Bentley, & Dickson, 2003;Manivasakam, Aubrecht, Sidhom, & Schiestl, 2001), and in these less controlled genetic engineering approaches detailed screening is the key to creating productive strains. Another consideration is the genetic stability of an expression system (Barnes et al., 2003), which determines how long the system will continue to produce the target protein at the original level and specificity. Plant expression systems are relatively easy to manipulate genetically, and transgenes are generally more stable than in bacterial systems.

| Yield
The maximum yield of each system is a major consideration. It is obviously beneficial to get the highest yield of correctly folded and posttranslationally modified protein from an expression system, but this is particularly important with regard to downstream processing, which becomes significantly more expensive when purifying protein from a more dilute mixture. The type of cell used to produce the therapeutic protein will also affect the purification procedures used in downstream processing, affecting the overall yield and cost of processing (Kozlowski & Swann, 2006).

| Growth conditions and rate
The growth rate will significantly affect the productivity of each system, as production is usually run in a batch process. A faster growth rate will allow more batches over a set time. The specific growth requirements also affect the cost of a process: some cell types, for example yeast or bacteria, can be grown to a high concentration on cheap, simple media, while others, such as mammalian cells, require very complex and expensive media for optimum growth.
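The relationship between growth rate and batch throughput can be illustrated with a short calculation. The following is a minimal sketch in Python, not a model taken from the literature: the function names, the turnaround time between batches, and all numerical inputs are hypothetical and chosen only to show the arithmetic.

```python
def batches_per_year(batch_days, turnaround_days=0.0, days_available=365.0):
    """Number of whole batches that fit into the available days.

    batch_days      -- hypothetical time to grow and harvest one batch
    turnaround_days -- hypothetical cleaning/set-up time between batches
    """
    cycle = batch_days + turnaround_days
    if cycle <= 0:
        raise ValueError("cycle length must be positive")
    return int(days_available // cycle)


def annual_output_g(batch_days, yield_g_per_batch, turnaround_days=0.0):
    """Total protein per year (grams), assuming every batch gives the same yield."""
    return batches_per_year(batch_days, turnaround_days) * yield_g_per_batch
```

Under these assumptions, a system with a 7-day batch and a 1-day turnaround completes 45 batches in a year, whereas a system with a 30-day batch and no turnaround completes only 12, so the slower system needs a much higher yield per batch to match the same annual output.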
With great variation between expression systems, and the large number of different therapeutic recombinant proteins, it is unlikely that there is a 'one system fits all' solution to producing affordable protein drugs. In the same way that smaller proteins are currently produced in E. coli and larger proteins requiring human-like posttranslational modifications are produced in CHO cells, different systems are likely to prove to be the most effective expression systems for different proteins. Plants may prove to be the ideal in-between system, able to produce larger therapeutic proteins than bacteria, while being more scalable and cost-effective than mammalian cell systems, as well as reducing the risk of pathogens and toxic contaminants compared to both systems.

Box 1. Nonhuman glycosylation profiles
Each expression system faces its own glycosylation challenges. Escherichia coli does not possess any native glycosylation machinery and, when engineered to express a Campylobacter jejuni glycosylation system, can only glycosylate fully folded proteins (Kowarik et al., 2006), although this can be overcome in some cases using chemical modification (e.g., PEGylation) (DeFrees et al., 2006). Yeast expression systems can glycosylate, but glycan molecules have a much higher proportion of mannose residues than human glycans and often lack fucose and terminal sialic acid residues, reducing the half-life in patients (Ghaderi, Zhang, Hurtado-Ziola, & Varki, 2012; Walsh, 2010). Insect cells add paucimannosidic glycans, which are not found in humans. Plants exhibit a range of different glycosylation mechanisms which lack certain sugars, including terminal sialic acid residues, and often include β1-2xylose and α1-3fucose residues, which elicit an immune response when introduced intravenously. Nonhuman mammalian cells can add Galα1-3Gal (alpha-Gal) and N-glycolylneuraminic acid (Neu5Gc) residues, which cause rapid clearance of the protein from the bloodstream (Varki, 2009). Homogeneity is a desirable characteristic of any therapeutic molecule, and consistent glycosylation profiles are a challenge for mammalian cell expression systems in particular (Sethuraman & Stadheim, 2006).

| TOOLS AND TECHNIQUES FOR PROTEIN PRODUCTION IN PLANTS
There is a wide range of options available when choosing a plant expression system, ranging from the choice of expression vector and promoter to the type of plant that will be used. These options can generate huge differences in yield, protein storage capacity, ease of harvest, and posttranslational modification and must be chosen carefully to suit the requirements for the production of each specific recombinant protein.

| Expression types
Optimal yield of recombinant protein relies on a controlled, high level of transcription, translation, correct folding, targeting, and protein stability (Ma, Drake, & Christou, 2003). The keys to high levels of transcription are the regulatory genetic elements, the most important of which in plants are the promoter and the polyadenylation site.

| Nuclear expression
Nuclear expression, the basic expression system in which transgenes are incorporated into the nuclear genome of a plant, is the conventional method of genetically engineering plants (Figure 1). It involves transcription in the nucleus and translation in the cytoplasm. A foreign antigen is expressed from a transgene in the nuclear genome, introduced into the plant using either Agrobacterium tumefaciens-mediated transformation or biolistic gene gun-mediated transformation; signal peptides are used to target proteins for secretion or organellar storage (Shadid & Daniell, 2016). This is the simplest and most widely used method of genetically modifying crops. Disadvantages of this system include gene silencing, risk of transgene contamination through reproductive tissues, and low expression levels (Shadid & Daniell, 2016).

The clustered regularly interspaced short palindromic repeats (CRISPR) system is used alongside the prokaryote-derived RNA-guided nuclease Cas9 to precisely edit the genome, and has been applied in both prokaryotes and eukaryotes as a mechanism of genome editing. CRISPR/Cas9 requires co-transformation of two vectors, which give rise to a crRNA and a tracrRNA; these form a two-RNA structure and are combined into one transcript, the sgRNA, which guides the Cas9 endonuclease to the target DNA sequences (Wang et al., 2018).
CRISPR/Cas9 is site-specific and is highly efficient and robust compared with, for example, zinc finger nucleases and transcription activator-like effector nucleases. A recent study in cotton showed no off-target editing and reported genome editing with an efficiency of 66.7%-100% at each of multiple sites (Wang et al., 2018); off-target mutations seem to occur more frequently in human cells than in plant cells.

The most popular promoter for use in dicots is the CaMV 35S from the cauliflower mosaic virus (Ma et al., 2003), a strong constitutive promoter which can be boosted by duplicating the enhancer region (Kay, Chan, Daly, & McPherson, 1987). Alternative promoters such as the maize ubiquitin-1 promoter are used effectively in monocots (Ma et al., 2003; Twyman, Stoger, Schillberg, Christou, & Fischer, 2003). Strong, constitutive promoters may give a high overall protein yield, but more nuanced approaches are being explored, as documented in the literature (Ma et al., 2003; Twyman et al., 2003). Tissue-specific promoters, such as those expressed in cereal seeds, target protein production to certain tissues, allowing easier harvesting of the product and avoiding toxicity in the parent plant which may inhibit growth (Twyman et al., 2003). In fact, with the discovery of a nectary promoter, work has been done to express proteins in the nectar of a flower, which can be harvested by bees and concentrated into honey (Breithaupt, 1999). Honey has the advantages of concentrating the protein and being made up almost exclusively of sugar, greatly easing the purification process. Honey also has natural preservative properties, increasing the shelf-life of the protein (Breithaupt, 1999). Inducible promoters have also been used to initiate protein production just before or after harvest, again to avoid the growth-limiting effects of recombinant protein overexpression (Twyman et al., 2003).

| Chloroplast expression
Chloroplast expression involves the introduction of a transgene into the chloroplast genome using a particle gun. Transforming a recombinant gene into the chloroplast genome has a number of advantages over nuclear transformation (Figure 1). The chloroplast genome is more easily manipulated: if the chloroplast genome has been sequenced, a transgene cassette can be created to insert foreign genes into a spacer region between functional chloroplast genes, using two known flanking sequences in the chloroplast genome, via homologous recombination (Daniell, Lin, Yu, & Chang, 2016; Daniell, Streatfield, & Wycoff, 2001). This precise targeting avoids placing the gene into a part of the genome which is poorly transcribed, ensuring a high level of expression.
Additionally, gene silencing has not been documented using this method. Transformation into the chloroplast genome is more difficult than transformation into the nuclear genome due to the double membrane barrier found around the chloroplast and the lack of any virus known to infect the chloroplast. However, effective transformation has been achieved using the gene gun method: bombarding young plant tissue with gold or tungsten particles coated with DNA (Verma, Samson, Koya, & Daniell, 2008). Since there are thousands of copies of the chloroplast genome in each leaf cell, very high yields (over 70% of the total soluble protein in plant leaves) have been achieved using chloroplast expression, as the method allows a high gene copy number per cell (Ma et al., 2003; Shadid & Daniell, 2016). Chloroplast expression has the added benefit of reducing the risk of genes leaching into the environment, as chloroplast genes are maternally inherited in most crop plants, and expression in the chloroplasts allows harvest before the appearance of any reproductive structures, ensuring "total biological containment of transgenes" (Verma et al., 2008). Glycosylation does not occur in chloroplasts, which allows the production of therapeutic proteins completely free of glycosylation (Verma et al., 2008). This removes a source of immunogenicity but also limits the ability to produce some therapeutic proteins, such as antibodies, which require glycosylation to function.
Conversely, the lack of a glycosylation pathway provides a glycoengineering opportunity, with a "clean slate" to engineer a custom glycosylation mechanism in chloroplasts without the need to alter or interfere with host glycosylation pathways which may be essential for cell viability. The current chloroplast expression system is best suited to proteins which do not require significant posttranslational modification, and a number of vaccines and human proteins have been produced using this method including cholera toxin B

| Transient expression
Transient expression (Figure 1) allows the rapid production of recombinant proteins, drastically reducing the development time of the expression system. This can be used to test genetic constructs and for rapid sampling of recombinant proteins for functional analysis (Twyman et al., 2003). Transient expression also has the potential to be used for the production of large amounts of protein as a mainstream production platform, but ultimately has limited scale-up potential compared with transgenic plants (Vaquero et al., 2002). The speed of the system nevertheless allows a rapid response, for example to a pandemic, since the need for full transformation is eliminated (Marsian & Lomonossoff, 2016). Expression of transiently introduced genes is much more efficient than that of integrated genes, reported to be at least 1,000-fold higher (Janssen & Gardner, 1990).

Virus-like particles (VLPs) do not contain infectious genomic material, so they are considered safe, yet they are similar enough to virus particles to successfully elicit an immune response (Marsian & Lomonossoff, 2016).
The safety of these particles is a major advantage over traditional vaccines (Love et al., 2012). In addition to immunity, VLPs could also be used for drug delivery (Marsian & Lomonossoff, 2016). eVLPs (RNA-free, empty VLPs) are being developed for various applications including cell-specific drug targeting (Wen et al., 2012).
Successful VLP production may partly depend on ensuring acid- and thermostability, for both function and storage purposes; recent work has shown that amino acid substitutions introduced by site-directed mutagenesis to improve acid- and thermostability increased the stability and yield of VLPs engineered in Nicotiana benthamiana leaves (Veerapen, Zyl, Rybicki, & Meyers, 2018).

| Suspension cells
Suspension cell cultures have the same advantages of sterility, containment, and well-defined downstream processing procedures which other cell culture expression systems possess, but lose many of the aspects of plant expression systems that make them attractive including the huge scaling up potential (Twyman et al., 2003). The ability to use low cost defined growth media is an advantage over mammalian cell culture, but therapeutic protein production using plant cells in suspension offers few advantages over a yeast or insect expression system.

| Tobacco
The molecular biology workhorse of the plant world, tobacco is the most widely used species for the production of recombinant proteins in the laboratory (Ma et al., 2003). Benefits of using tobacco include a high biomass yield of "more than 100,000 kg per hectare for close-cropped tobacco" (Ma et al., 2003), and rapid scale up potential due to a huge seed production capacity. Protein storage in the leaves is not particularly stable and the product is vulnerable to degradation, so the leaves must be frozen or dried for storage or the protein extracted soon after expression. Tobacco tissues usually contain phenols and toxic alkaloids which have implications for downstream processing.

| Cereals
Cereal seeds are excellent protein storage devices equipped with protein storage vesicles and a dry intercellular environment, reducing protease activity and the rate of nonenzymatic hydrolysis. Maize has the highest biomass yield among food crops (Obembe et al., 2011) and has already been used in the production of avidin (Hood et al., 1997), bovine trypsin and recombinant antibodies, to name a few (Ma et al., 2003). Dry cereal seeds such as those from rice and wheat have the advantage of high protein stability, allowing storage at room temperature for a matter of months without significant loss of activity; additionally, rice is self-fertilizing, reducing the risk of transgenes being transferred to other plants (Rybicki, 2010). Food crops also present the opportunity to administer oral vaccines produced in the crop by feeding them to patients with minimal processing (Margolin et al., 2018). Coupled with the stability of proteins in seeds, this presents an extremely attractive opportunity to reduce the cost and distribution issues faced by conventional vaccines. However, with strict regulatory requirements, it is unlikely that an edible plant vaccine could be used in humans without a level of processing and formulation to homogenize the product and make sure the correct dose and potency was reproducible in all products (Rybicki, 2010). The concept of producing vaccines in food crops has lost favor in recent years after two incidents in the USA where transgenic plant material contaminated wild-type food crops. These incidents have resulted in a tightening of regulations and a reduced interest from drug companies in pursuing the production of vaccines in edible crops (Rybicki, 2010), although edible vaccines against E. coli, produced by potato and maize, have reached phase I clinical trials (Shadid & Daniell, 2016).

| Legumes
Therapeutic protein production has been documented in legumes such as soybean, pea, and alfalfa. Legumes have the advantage of fixing atmospheric nitrogen, removing the nitrogen requirement in their fertilizer, and therefore reducing cultivation cost. However, these plants do have lower leaf biomass than tobacco (Ma et al., 2003). Grain legumes such as peas have high protein content in their seeds, and are being developed as expression systems (Perrin et al., 2000).

| Fruits and vegetables
A number of fruit and vegetable crops have been used to produce therapeutic recombinant proteins, including lettuce, tomato, and most frequently, potato. As with cereals, a great advantage of these systems is that the protein could be delivered orally with minimal processing, although as mentioned previously, guaranteeing the dose and quality is a challenge (Daniell, Kulis, & Herzog, 2019; Ma et al., 2003; Marsian & Lomonossoff, 2016; Rybicki, 2010).

| CHALLENGES FACED BY PLANT EXPRESSION SYSTEMS
As attractive as plants may seem as therapeutic protein expression systems, there are a number of challenges that must be overcome before they can be widely adopted.

| Environmental contamination
Perhaps the biggest challenge facing protein expression in plants is the concern surrounding genetically modified (GM) crops. Major concerns include the spread of recombinant genes through seed dispersal, pollen dispersal, viral transfer or horizontal transfer; therapeutic proteins getting into the food supply of humans or animals; and adverse effects on organisms in the environment (Ma et al., 2003; Obembe et al., 2011). In recent years, USDA legislation has reacted to incidents of transgenic plants being found in food crops (Kaiser, 2008; Ma et al., 2003; Rybicki, 2010). There are a number of strategies that can be used to ease these concerns, including geographical containment, using different planting seasons than those of local food crops, the use of male sterility in GM plant strains, using the chloroplast expression system (Lau & Sun, 2009), the use of inducible promoters, producing easily identified plant varieties (e.g., white tomatoes) (Ma et al., 2003), using self-pollinating species, producing nongerminating seeds (Obembe et al., 2011), and producing inactive fusion proteins that are activated by postpurification processing (Daniell, Streatfield, et al., 2001). Growing crops inside appropriately managed greenhouses or hydroponic growth rooms, or using cell suspension cultures, can provide an effective and economical means of containing GM plant material (Ma et al., 2003; Obembe et al., 2011; Su et al., 2015).

| Regulatory approval
As promising as this technology may be, drug companies are unwilling to risk the huge sums of money required to get a new product approved by the large drug approval administrations if there is already a proven alternative expression system with regulatory approval (Rybicki, 2010). This economic constraint has a stagnating effect on the pharmaceutical industry, limiting the scale of progress and the development of new drug production technologies.
Unfortunately, this situation is unavoidable because of the high level of confidence that is needed in any therapeutic molecule to be used in humans. The production of animal vaccines in plants is making faster progress, as there are fewer regulatory hurdles (Rybicki, 2010); this could provide a proof-of-concept for the production of human vaccines in plants, demonstrating the value of the expression system to produce effective therapeutic proteins cost-effectively. A large advantage which plant expression systems have over conventional therapeutic protein production platforms is the ability to produce protein rapidly, going from gene sequence to grams of protein in under a month using transient expression techniques (Rybicki, 2010). This is preferable to the current influenza vaccine production system using eggs which 'does not provide sufficient capacity and adequate speed to satisfy global needs to combat newly emerging strains, seasonal or potentially pandemic' (Shoji et al., 2011). This provides a significant advantage over conventional methods of responding to rapidly emerging disease strains, as was shown in 2014 when an Ebola treatment was produced at short notice in Nicotiana benthamiana using a transient expression system (Gomes et al., 2019). This is an opportunity for plant expression systems to excel, producing vaccines quickly in response to emerging threats such as rapidly mutating diseases or bioterror threats. In the case of the Ebola treatment, full regulatory approval was sidestepped under compassionate protocols (Gomes et al., 2019). The first plant-produced therapeutic protein to win full regulatory approval for human use was taliglucerase alfa produced in carrot cell culture (Tekoah et al., 2015). The molecule was already approved from mammalian cell culture, so it was easier to transfer approval to a new production system than to bring an entirely new product through the regulatory process (Gomes et al., 2019).
These advances will undoubtedly make it easier for further drugs to be licenced in future and pharmaceutical companies should now be more likely to consider plant expression systems (Davies, 2010).

| Protein stability
The stability of expressed proteins is a concern which has significant bearing on the overall viability of the expression system.
Solutions to protein instability depend on the individual recombinant protein being expressed, but could include the following: creating fusion proteins in which a stabilizing peptide is co-expressed with the therapeutic protein (this method can also facilitate downstream processing through the use of affinity tags); targeting the protein to seeds, oil bodies or protein storage vacuoles; and freeze-drying plant material to preserve expressed proteins. For proteins that do not require glycosylation, the chloroplast expression system is ideal for maximizing protein yield, stability and accumulation (Daniell et al., 2019; Obembe et al., 2011).

| Posttranslational modifications
The plant proteome is highly plastic, facilitating extensive engineering: the simultaneous co-expression of many proteins enables complex protein production pathways to be established, allowing the possibility of complex glycosylation engineering (Margolin et al., 2018). Whilst plants have a similar glycosylation mechanism to humans, there are differences in N-glycan composition, notably the addition of α1-3 fucose and β1-2 xylose and the absence of α1-6 fucose, glucose and sialic acid residues (Obembe et al., 2011). These differences can have drastic effects on the distribution, half-life in serum, activity, and immunogenicity of therapeutic proteins (Twyman et al., 2003).
While safety concerns may be unwarranted (Ma et al., 2003), there is no doubt that consistent human-like N-glycosylation is a vital goal in the production of some therapeutic proteins such as monoclonal antibodies (Raju, Briggs, Borge, & Jones, 2000). However, there are therapeutic proteins which may not require such specific posttranslational modification, and these proteins may be better suited to production in plants. Several strategies have been proposed to overcome the problem of nonhuman N-glycosylation: in vitro modification using purified human β1-4 galactosyltransferase and sialyltransferase enzymes (Blixt, Allin, Pereira, Datta, & Paulson, 2002), knock-out/knock-down of the native plant fucosyltransferase and xylosyltransferase enzymes (Twyman et al., 2003), and expressing human β1-4 galactosyltransferase in the transgenic plant (Bakker et al., 2001). Recombinant viral structural proteins may be readily produced in plants, but viral glycoproteins pose a similar challenge to mammalian glycoproteins (Margolin et al., 2018).
The issue of glycosylation, whilst challenging, is not insurmountable. Plant-derived influenza haemagglutinin, the only viral glycoprotein to have been tested in humans, has successfully been engineered with glycans at all possible sites and is anticipated to have FDA approval by 2020 (Le Mauff et al., 2015; Margolin et al., 2018; Ward et al., 2014). A suite of viral glycoprotein vaccine candidates against a range of diseases, including influenza, HIV, and Ebola, has also been expressed in plants, as summarized by Margolin et al. (2018). Finally, chloroplast expression provides a 'blank slate' for in vitro or in vivo glycoengineering without interfering with the native glycosylation mechanism. Although the ability of chloroplasts to add posttranslational modifications is not fully understood, they have been shown to be capable of phosphorylation, lipidation and disulphide bond formation (Zhang, Shanmugaraj, & Daniell, 2017; Zhang, Li, et al., 2017).

| PERSPECTIVES AND FUTURE DIRECTIONS
The potential market for therapeutic proteins is huge, with products ranging from antibodies to hormones and enzymes to vaccines. Each type of recombinant protein has its own production challenges and these will inevitably match up with the strengths of the different expression systems available.
The relatively short time it takes to go from sequence to producing grams of protein, using high-yield transient expression systems such as Magnifection, is a major advantage plants have over other expression systems. This strength lends itself to the production of vaccines against emerging or rapidly mutating diseases, such as influenza, or against bioterror threats. There is also the potential for small production runs using this technology, for orphan diseases with a small number of patients, or perhaps even personalized treatments.
The rapid production combined with the ability to grow transgenic plants in low-cost greenhouses could greatly reduce the otherwise high cost of protein drugs for rare diseases.
As the therapeutic protein market matures, patents will expire, allowing the production of "biosimilars": copies of the original, licensed protein produced off patent (Davies, 2010). Plant expression systems, for example a high-yield chloroplast expression system, could allow the production of these proven drugs on a much larger scale and at a lower cost, grown in greenhouses or perhaps in the field (with the appropriate containment strategies in place). With the current state of glycoengineering in plants, these would have to be therapeutic proteins that do not require human-like N-glycosylation, as this is not yet available in plants (Strasser, 2016). But with progress in engineering the glycosylation pathway and in in vitro glycosylation procedures, N-glycosylated therapeutic proteins produced in plants could be a possibility in the near future.
One of the largest barriers to widespread acceptance of plant expression systems is the lack of regulatory approval. Although there are plant-produced recombinant protein products on the market, most are either diagnostic, veterinary or classed as medical devices, which are not required to meet the high standards of drugs for human use (Lico et al., 2012). The difficulty and cost of gaining this approval currently outweigh the benefits of using plants to produce therapeutic proteins. One supposed benefit of plant expression systems is low cost and high scalability. While it is true that plants have the potential to produce more protein more cheaply than mammalian cell culture, for example, this has only a limited impact on the overall cost of producing a therapeutic protein drug. The major part of the cost is in purification of the product, which would essentially be the same whether the starting material is a plant or a mammalian cell extract. If protein harvest and purification could be done at a lower cost in plants, the economic benefit of using a plant expression system would be much greater. This would most likely be achieved by targeting expression to storage organs such as seeds, which have a lower water content, or to nectar, which contains few other contaminants from which the protein must be separated.
Alternatively, if purification can be sidestepped entirely, as in the example of coagulation factor IX in lettuce leaves for the treatment of hemophilia B, plant expression systems become hugely attractive (Su et al., 2015).
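The cost argument in the preceding paragraph can be made concrete with a short worked expression. This is an illustrative sketch only and is not taken from the cited sources: the symbols U, D, f and r, and the example values, are assumptions introduced here for the purpose of the illustration.

```latex
% Assumed notation (hypothetical, for illustration only):
%   U = upstream (expression/growth) cost,  D = downstream (purification) cost
%   C = U + D is the total production cost
%   f = D / C is the fraction of total cost due to purification
%   r = fractional reduction in upstream cost achieved by switching to plants
\begin{align}
  C' &= (1 - r)\,U + D, \\
  \frac{C - C'}{C} &= \frac{r\,U}{C} = r\,(1 - f).
\end{align}
% Hypothetical example: if f = 0.8 and r = 0.5, the overall saving is
% 0.5 \times (1 - 0.8) = 0.1, i.e. 10\% of the total cost.
```

Under these assumptions, the overall saving is capped at 1 - f however cheap upstream production becomes, which is why lowering or avoiding the purification cost D matters more than lowering the cost of growing the plants.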
Plants may also be considered safer than many other expression systems, since they do not constitutively produce endotoxins, or naturally support the growth of viruses or prions with the potential for infecting humans (Moustafa et al., 2016).
As the understanding of recombinant protein expression systems increases and their limitations are fully understood, companies will be able to make informed choices about which of the available expression systems is best suited to producing a specific therapeutic protein. Plant expression systems will no doubt fit into this landscape, but how much they are utilized depends on how effectively the challenges can be overcome.