Amylose in starch: towards an understanding of biosynthesis, structure and function.

Starch granules are composed of two distinct glucose polymers - amylose and amylopectin. Amylose constitutes 5 to 35% of most natural starches and has a major influence over starch properties in foods. Its synthesis and storage occur within the semi-crystalline amylopectin matrix of starch granules, which poses a great challenge for biochemical and structural analyses. However, the last two decades have seen vast progress in understanding amylose synthesis - including new insights into the action of GRANULE BOUND STARCH SYNTHASE (GBSS), the major glucosyltransferase that synthesises amylose, and the discovery of PROTEIN TARGETING TO STARCH1 (PTST1) that targets GBSS to starch granules. Advances in analytical techniques have resolved the fine structure of amylose, raising new questions on how structure is determined during biosynthesis. Further, the discovery of wild plants that do not produce amylose revives a long-standing question of why starch granules contain amylose, rather than amylopectin alone. Overall, these findings contribute towards a full understanding of amylose biosynthesis, structure and function - which will be essential for future approaches to improve starch quality in crops.



I. Introduction
This review will explore the synthesis, structure and function of amylose, a ubiquitous carbohydrate polymer in starch. Starch is the main storage carbohydrate in plants, and a vital source of calories in our diet. It is typically produced in plastids of leaves, seeds and storage organs as semicrystalline insoluble granules. Amylopectin is the major polymer in starch, and has α-1,4-linked glucose chains (degree of polymerisation (DP) < 100) joined by α-1,6-linked branch points (Fig. 1). The semicrystalline granule matrix is formed by amylopectin, where adjacent chains make double helices that pack into crystalline lamellae, while the branch points give rise to amorphous lamellae. Amylose constitutes < 35% of most natural starches and, unlike amylopectin, is not necessary for the formation of semicrystalline granules. It is composed of long α-1,4-linked linear chains (typically DP 100-10 000) with rare α-1,6-linked branch points, and is believed to reside in amorphous regions of the granule (Fig. 1). The widespread occurrence of starch makes amylose one of the most abundant natural polymers on Earth, but many aspects of its biosynthesis, structure and function are not yet understood. Amylose synthesis in itself is simple: the linear chains form through processive elongation by a single enzyme, the GRANULE BOUND STARCH SYNTHASE (GBSS). However, the process must occur in tight coordination with amylopectin synthesis to form granules of correct structure and composition. Recent years have seen great progress in understanding amylose synthesis in Arabidopsis and various crop species, including new information on GBSS biochemistry and structure, as well as the discovery of a second protein involved in the process, PROTEIN TARGETING TO STARCH1 (PTST1). The fact that amylose is embedded in the semicrystalline granule matrix poses a great challenge for structural analyses.
However, recent technical breakthroughs have revealed more structural complexity in amylose than previously thought, evoking new questions on how this structure is determined during biosynthesis. Efforts have also been made to address the longstanding mystery of why starch granules contain amylose at all, rather than containing amylopectin alone.
These new insights provide a solid foundation for further discoveries on the biology of amylose, and will be the focus of this review. Vast progress has also been made on understanding the important influence of amylose on starch physicochemical behaviour, and its contribution to resistant starch, affecting starch digestibility and nutritional properties (Box 1). These effects on crop quality will not be discussed in detail here, but the reader is directed to specialist reviews in Box 1. Achieving a full understanding of amylose biosynthesis, structure and function will undoubtedly enable novel and improved approaches to produce higher quality, healthier starches.

Fig. 1 The molecular architecture of starch granules. Starch contains two glucose polymer components: amylopectin and amylose. Both polymers have α-1,4-linked linear chains and α-1,6-linked branch points. Amylose is primarily linear and has very few branches. The branched structure of amylopectin allows adjacent chains to form double helices that pack into crystalline lamellae, while branch points reside in amorphous lamellae. Alternating crystalline and amorphous lamellae give rise to the semicrystalline starch granule matrix. Large growth rings (each containing many alternating crystalline/amorphous lamellae) are visible in the scanning electron micrograph of a broken, partially digested potato starch granule (adapted from Seung & Smith (2019); bar, 5 µm). Illustrations are based on those in Zeeman et al. (2010) and Santelia & Zeeman (2010).

Box 1 Applied and technical aspects of amylose
Applications of starches with altered amylose content - There is a wide variety of food and industrial uses for starch with low- or high-amylose contents:
(1) Amylose-free/low-amylose starches - These starches produce stickier textures than normal starches in cooked foods, and form clear, stable pastes that are ideal for a wide variety of industrial applications (e.g. papermaking) (Jobling, 2004; Santelia & Zeeman, 2010). Many amylose-free varieties of cereals originate from Asia, where there was a strong preference for stickier food textures (Olsen & Purugganan, 2002; Stamp et al., 2016; Zhang et al., 2019).
(2) High-amylose starches - These act as a type of resistant starch in food, as they are less digestible than normal starches in the upper gut, and promote lower gut health by acting as a prebiotic (Topping & Clifton, 2001; Li et al., 2019a). High-amylose starches produce firm gels with film-forming properties, which are ideal for use in biodegradable plastic substitutes (Jobling, 2004).
Amylose quantification - Common methods of amylose quantification are based on one of three principles:
(1) Iodine colourimetry - Based on the principle that the wavelength absorbed by amylose in the presence of iodine is distinct from that of amylopectin. A standard curve can be produced using different ratios of purified amylopectin and amylose. This method cannot distinguish between amylose and long amylopectin chains. Thus, amylose quantified using this method is often referred to as 'apparent amylose'.
(2) Lectin precipitation - Amylopectin can be selectively precipitated from amylose using the lectin concanavalin A. Some amylose can co-precipitate with amylopectin, leading to an underestimation of amylose content.
(3) Chromatographic separation - Starch is debranched and separated using size-exclusion chromatography. Longer chains (typically DP > 100) are summed to calculate amylose content. This approach cannot distinguish between long amylopectin chains and amylose. To resolve these, chromatography may be done in two dimensions, where amylopectin and amylose are fractionated as whole molecules before debranching and chromatography.
The reader is directed to Vilaplana et al. (2012) and references therein for a full description of each method, and a discussion of advantages and limitations.
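The chromatographic approach in Box 1 amounts to summing the weight of debranched chains above a DP cutoff. The minimal sketch below illustrates that arithmetic only. The function name, the input format of (DP, weight) pairs and the example profile are hypothetical and are not taken from any cited study; the default cutoff of DP 100 follows the "typically DP > 100" convention stated in the box.

```python
def apparent_amylose_percent(chains, dp_cutoff=100):
    """Return the weight percentage of chains longer than dp_cutoff.

    chains: iterable of (dp, weight) pairs, e.g. binned values from a
    size-exclusion profile of debranched starch (hypothetical format).
    """
    chains = list(chains)
    total = sum(weight for _, weight in chains)
    if total <= 0:
        raise ValueError("total chain weight must be positive")
    # Chains above the cutoff are counted as amylose; this cannot separate
    # true amylose from extra-long amylopectin chains.
    long_weight = sum(weight for dp, weight in chains if dp > dp_cutoff)
    return 100.0 * long_weight / total


# Made-up profile for illustration only: (DP, relative weight)
profile = [(12, 40.0), (45, 35.0), (90, 5.0), (500, 12.0), (2000, 8.0)]
print(apparent_amylose_percent(profile))
```

As the box notes, a one-dimensional calculation of this kind counts any long amylopectin chains as amylose, which is why two-dimensional separation may be needed.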

II. The occurrence of amylose among starch-producing organisms
The increased availability of genome sequences in the last two decades has led to new insights on the origins of amylose biosynthesis. Amylose is present in virtually all land plants examined, as well as in green algae (Delrue et al., 1992; van de Wal et al., 1998; Busi & Barchiesi, 2014). Most species of red algae produce starch with amylopectin only, but some species in the Porphyridiales possess a GBSS homologue and produce amylose (McCracken & Cain, 1981; Shimonaga et al., 2007). This sporadic occurrence of amylose in the red lineage contrasts with the green lineage, for which there are no reports of entire species that have completely lost the GBSS gene. Amylose and GBSS are also present in starches from the cryptophyte Guillardia theta and the glaucophyte Cyanophora paradoxa (Plancke et al., 2008; Nielsen et al., 2018). Notably, the cyanobacterial strain CLg1 possesses a GBSS gene and produces insoluble starch-like granules that contain an amylopectin-like component and amylose (Suzuki et al., 2013; Nielsen et al., 2018). These findings all point towards a cyanobacterial origin of GBSS, and suggest that amylose synthesis is likely to have evolved before the divergence of the green and red lineages. Further analysis of emerging genome sequences from cyanobacteria and algae will allow a better assessment of GBSS conservation within these groups, and provide more insights into the evolutionary origins of amylose.
In starches of land plants, amylose content varies greatly between species and organs. Starches from seeds and storage organs (e.g. roots and tubers) typically contain 5-35% amylose (Waterschoot et al., 2015) (Table 1). The amylose content of starch is usually low during the early stages of seed or tuber development, and increases at the later stages until a final amylose content is reached (Geddes et al., 1965; Merritt & Walker, 1969; Tsai et al., 1970; Asaoka et al., 1985). This pattern is consistent with amylose being synthesised within a pre-existing matrix of amylopectin (Denyer et al., 1996; Tatge et al., 1999). Amylose content of leaf starches has only been quantified in a few species (Table 1). In pea, amylose content is much higher in seeds than in leaves, pods or nodules (Denyer et al., 1997). Arabidopsis leaf starch typically has < 12% amylose, which is lower than most storage starches (Table 1) (Zeeman et al., 2002b; Seung et al., 2015, 2020). The low amylose content in Arabidopsis leaves can be at least partly explained by the shorter time available for amylose synthesis in comparison with seeds and storage organs. Arabidopsis synthesises leaf starch during the day and degrades it almost completely in the subsequent night, allowing only a single photoperiod for amylose synthesis, whereas starch synthesis during seed or tuber development can take up to several months. Indeed, extending the photoperiod or introducing mutations (e.g. sex4) that prevent the complete turnover of starch increases amylose content to levels typically found in storage starches of crop species (Zeeman et al., 2002b; Seung et al., 2015). Earlier studies on leaf starches from tobacco and cotton report an amylose content that is comparable with storage starches, even after a normal photoperiod (Table 1). However, tobacco does not completely turn over its leaf starch at night, thus allowing more time for amylose accumulation (Matheson & Wheatley, 1963).
It is possible that different patterns of starch turnover contribute to wide interspecies variation in the amylose content of leaf starches. While the data in Table 1 reflect consensus values for amylose, it is important to note that differences in quantification methods prevent precise cross-study comparisons of amylose content (Box 1). This highlights the need for a broad survey of amylose content across different species and organs within a single study, or implementation of a standard quantification method.
Mutants that produce amylose-free or low-amylose starch have been isolated in various species (Table 2). Amylose-free varieties of maize, barley, rice and millet have been cultivated for centuries in Asia for their desirable textural and cooking properties (Olsen & Purugganan, 2002; Hunt et al., 2010; Stamp et al., 2016; Zhang et al., 2019). These mutants are historically referred to as 'waxy', after the wax-like appearance of the endosperm in uncooked grains, or 'glutinous', after the glue-like sticky texture of cooked grains. Outside cultivated crops, wild accessions of Arabidopsis thaliana that do not produce amylose in leaf starch have been recently reported, suggesting that amylose may not be as strictly conserved as once thought. The extent to which this phenomenon applies in other species is not known (further discussed in Section IX).
III. The specialised biochemistry of GBSS facilitates amylose synthesis
GBSS belongs to the starch synthase class of glucosyltransferases. All members of this class contain tandem GT-5 and GT-1 catalytic domains that can extend α-1,4-linked glucan chains using ADP-glucose as a glucosyl donor (Ball & Morell, 2003; Brust et al., 2013; Pfister & Zeeman, 2016) (Fig. 2) Abt et al., 2020), and specialised species-specific isoforms, such as SS6 in potato (Helle et al., 2018). GBSS has undergone several duplications during land plant evolution, leading to tissue-specific and biochemical specialisation among paralogues (Pan et al., 2009; Cheng et al., 2012). Grasses, for example, contain two GBSS paralogues, GBSS1 and GBSS2. The expression of GBSS1 is restricted to the endosperm and pollen grains, while GBSS2 is expressed in vegetative tissues and pericarp (Nakamura et al., 1998; Vrinten & Nakamura, 2000; Dian et al., 2003). There are also two distinct GBSS paralogues in pea, referred to as GBSSa and GBSSb, in which the expression of the former is enriched in embryos, while the latter is enriched in leaves, pods and nodules (Dry et al., 1992; Denyer et al., 1997, 2001; Edwards et al., 2002). These two paralogues are present in most other rosids, suggesting that rosids with just one GBSS gene (e.g. Arabidopsis) most likely lost one of the two paralogues (Cheng et al., 2012). The multiple duplications of GBSS during plant evolution, and the presence of GBSS in nonplant lineages, have created a large, diverse protein family. A detailed, comparative survey of GBSS from various species may reveal biochemical diversity within the family that contributes to the differences in amylose content between species and organs. Three common features appear to distinguish GBSS from other starch synthases:

1. GBSS is tightly associated with starch granules
GBSS is almost exclusively bound to starch granules rather than soluble in the plastid stroma (Mu-Forster et al., 1996; Smith et al., 2004; Seung et al., 2015; Hebelstrup et al., 2017). This contrasts with the other starch synthase isoforms, which are mostly present in the stroma, or partitioned between the stroma and starch granules (Smith, 1990; Denyer et al., 1993; Liu et al., 2012b; Gámez-Arjona et al., 2014). Furthermore, the treatment of the granule surface with proteases has demonstrated that the majority of GBSS is internalised within the granule matrix rather than bound to the granule surface (Denyer et al., 1993; Mu-Forster et al., 1996; Grimaud et al., 2008). This location is consistent with the synthesis of amylose within starch granules.
Due to its exclusive granule-bound localisation, GBSS abundance in Arabidopsis leaves is linked to starch content, such that no GBSS is detectable at the end of the night when starch is absent (Smith et al., 2004). This implies that an unidentified plastidial protease removes GBSS as starch is degraded during the night. Consistent with this hypothesis, proteolytic fragments of GBSS were observed in Chlamydomonas during starch degradation in the dark (Ral et al., 2006). While GBSS must be granule-bound to synthesise amylose within granules, the above observations raise questions on whether there are negative consequences of having unbound stromal GBSS. On the one hand, although amylose is amorphous within starch granules, the mature polymer is insoluble outside the granule matrix and prone to precipitation (Denyer et al., 2001; Bertoft, 2017). The precise reason for this is not known, but it is possible that the long linear chains in isolation form helical structures that precipitate, whereas the formation of such secondary structures is impeded within the granule. It can be speculated that mature crystalline amylose formed outside the granule may resist degradation in the stroma, and thus become an inaccessible pool of carbohydrate within plastids. On the other hand, nascent amylose chains at the early stages of synthesis are no different from a malto-oligosaccharide (MOS), in that they are soluble and vulnerable to degradation by both α- and β-amylases in the stroma. During normal amylose biosynthesis, the granule matrix would probably protect these nascent chains from these stromal amylases. Thus, it is also possible that the presence of soluble GBSS activity may result in a wasteful cycling of carbohydrates within the plastid.

2. GBSS elongates glucans processively
GBSS has a processive mode of action that enables it to add more than one glucose unit per substrate encounter (Denyer et al., 1999b; Cuesta-Seijo et al., 2016; Nielsen et al., 2018). All other starch synthase isoforms have a distributive mode of action, where glucan chains are extended by one glucose unit per encounter (Denyer et al., 1999b; Cuesta-Seijo et al., 2016). However, the processivity of GBSS appears to depend on acceptor substrate and/or presence of amylopectin in the assay medium. GBSS from pea and potato elongated short MOS (DP = 3) processively in the presence of amylopectin, but switched to distributive action in the absence of amylopectin (Denyer et al., 1999b). The presence of amylopectin also greatly increased GBSS affinity for MOS substrates, and enhanced activity in vitro (Denyer et al., 1999a,b). In a recent study, the barley GBSS1 elongated short MOS distributively, but switched to processive elongation on longer MOS substrates (DP ≥ 8), even in the absence of amylopectin (Cuesta-Seijo et al., 2016). It is possible that longer MOS in the assay medium can mimic amylopectin chains and switch the mode of action. The structural features that allow GBSS to act processively are not currently known. A recent study on the structures of nonplant GBSS enzymes (from Cyanobacteria CLg1 and Cyanophora paradoxa) observed loops that potentially form a 'closed dome' around the active site and contribute to processive action (Nielsen et al., 2018). These loops were less obvious in the rice GBSS1 structure. It is possible that the switch to processive action involves conformational changes induced by the presence of amylopectin or long MOS, such that structural features mediating processivity can only be observed in the presence of these substrates. While amylose is the primary product of GBSS, the enzyme can also elongate long chains of amylopectin.
In Chlamydomonas, GBSS makes an important contribution to amylopectin synthesis by synthesising the long inner chains (B-chains) that interconnect amylopectin clusters, together with SS3 (Ral et al., 2006). There is no evidence that GBSS in land plants produces similar intercluster chains. However, plant GBSSs have been implicated in the synthesis of extra-long amylopectin chains (DP 300-500) in storage starches of cereals and root/tuber crops (Hanashiro et al., 2005; Charoenkul et al., 2006). These chains are best characterised in rice, where their abundance varies between cultivars (Takeda et al., 1987; Inouchi et al., 2005). Their abundance correlates with specific polymorphisms in GBSS1 (Hanashiro et al., 2008; Crofts et al., 2019), and eliminating GBSS1 activity abolishes them (Hanashiro et al., 2008). However, these extra-long chains are likely to be distinct from the intercluster B-chains that can be synthesised by Chlamydomonas GBSS in that they are longer, and their susceptibility to hydrolysis by β-amylases suggests that they are mostly external with sparse branching (Takeda et al., 1987; Hanashiro et al., 2005; Fujita et al., 2012). The GBSS from sweet potato, potato and pea can also elongate the long chains of amylopectin in vitro (Baba et al., 1987; Denyer et al., 1996). Notably, extra-long chains have never been reported in leaf starches, suggesting either that they are absent or that they are present at levels below detection. As with amylose, it is possible that the rapid turnover of starch in leaves restricts the formation of these long amylopectin chains.
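The difference between processive and distributive elongation described in this section can be illustrated with a deliberately simple toy model. The sketch below is not a kinetic model of GBSS. It assumes (hypothetically) that a fixed pool of glucose units is handed out in enzyme-substrate encounters that visit primers in a fixed round-robin order, and that each encounter adds a fixed number of units. The function name and all numbers are illustrative only.

```python
def elongate(primers, glucose_units, units_per_encounter):
    """Distribute glucose_units over primer chains, one encounter at a time.

    primers: list of starting chain lengths (DP).
    units_per_encounter: 1 mimics distributive action; >1 mimics processive
    action (a fixed value is a simplifying assumption).
    Encounters visit chains in round-robin order (also an assumption).
    """
    if not primers:
        raise ValueError("at least one primer chain is required")
    chains = list(primers)
    remaining = glucose_units
    encounter = 0
    while remaining > 0:
        added = min(units_per_encounter, remaining)
        chains[encounter % len(chains)] += added
        remaining -= added
        encounter += 1
    return chains


primers = [3] * 10                           # ten maltotriose-like primers (DP 3)
distributive = elongate(primers, 20, 1)      # one unit per encounter
processive = elongate(primers, 20, 10)       # ten units per encounter
print(max(distributive), max(processive))
```

With the same amount of glucose incorporated, the distributive setting lengthens every primer slightly, whereas the processive setting yields a few much longer chains and leaves most primers unextended. This qualitative contrast is the kind of difference in product size that assays use to distinguish the two modes of action.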

3. GBSS has structural features that are unique among SS
Unlike the other starch synthases, GBSS does not have a specialised N-terminal extension to the GT domains (Fig. 2). It has instead a unique hydrophilic, negatively charged C-terminal tail of around 40 amino acids that is conserved among GBSS of land plants. The structure of the C-terminal tail in rice GBSS1 could not be resolved by X-ray crystallography, suggesting that it could be disordered (Momma & Fujimoto, 2012). The exact function of the C-terminal tail in vivo is not known, and removing it completely does not appear to affect activity in vitro. GBSS also has a coiled coil on a single helix of the GT-1 domain, which is important for its interaction with PTST1 (Fig. 2).
The only plant GBSS structure currently available is of rice GBSS1. Intriguingly, the structure contained a disulfide bridge that links the GT-5 and GT-1 domains (Momma & Fujimoto, 2012). However, one of the cysteine residues involved in forming this bridge is absent outside the cereal GBSS1 enzymes. This raises a possibility that GBSS1 isoforms of endosperms have distinct biochemical and structural features compared with GBSS2 in leaves, and with other plant GBSS isoforms outside the cereals. A disulfide bridge was also observed at a different location in the barley SS1 structure, and this bridge plays a role in the redox regulation of activity (Cuesta-Seijo et al., 2013). It is not known whether the disulfide bridge in GBSS1 is also regulatory or purely structural.
New Phytologist (2020) 228: 1490-1504 © 2020 The Author New Phytologist © 2020 New Phytologist Trust www.newphytologist.com

IV. PTST1 targets GBSS to starch granules
It was recently discovered that the location of GBSS on starch granules in Arabidopsis leaves is facilitated by the presence of PROTEIN TARGETING TO STARCH 1 (PTST1) (Seung et al., 2015). PTST1 was first identified as a plastidial protein containing coiled coils (specialised α-helices that often mediate protein-protein interactions) and a Carbohydrate Binding Module 48 (CBM48) at the C-terminus (Lohmeier-Vogel et al., 2008). PTST1 interacts directly with GBSS in pulldown experiments with recombinant proteins in vitro and pairwise immunoprecipitation assays in vivo (Seung et al., 2015). Mutations in the coiled coil of GBSS or the adjacent loop can greatly weaken the interaction (Seung et al., 2015). Strikingly, Arabidopsis ptst1 mutants accumulate almost no GBSS in either the starch granule or stroma, and no detectable amylose in starch (Fig. 3a,b). Also, the Arabidopsis GBSS was located to starch granules when co-expressed with Arabidopsis PTST1 in Nicotiana benthamiana, but not when co-expressed with a mutated form of PTST1 that has a nonfunctional CBM48 domain. These data suggest that PTST1, using its CBM48 domain, plays an active role in facilitating GBSS location on starch (Fig. 3c).
It should be stressed that PTST1 is not strictly required for amylose biosynthesis, but greatly enhances the process in Arabidopsis leaves. Although amylose was undetectable in ptst1 single mutants, the ptst1 sex4 double mutant accumulated detectable amounts of amylose and GBSS, particularly in older leaves (Seung et al., 2015). However, the amylose content of the double mutant was much lower than that of the sex4 single mutant. The incomplete turnover of starch in the double mutant is likely to provide more time for amylose to accumulate in the absence of PTST1. Also, the ptst1 mutant accumulates detectable levels of amylose in root tips, where complete starch turnover does not occur.
The discovery of PTST1 as a second component of amylose biosynthesis came more than 50 yr after the discovery of GBSS. Considering the extensive screening of amylose-free mutants in many crop species, this late discovery was somewhat surprising. However, as storage organs of crops do not turn over starch as leaves do, a PTST1 deficiency in these organs results in low-amylose starch rather than amylose-free starch, analogous to our observations with the Arabidopsis sex4 ptst1 mutant. Indeed, CRISPR/Cas9-generated knockout mutations of PTST1 in cassava decreased amylose content by up to 40% in storage roots (Bull et al., 2018). Such reductions in amylose content could not be distinguished visually with iodine staining. It is therefore possible that ptst1 mutants were missed in iodine-based screening methods, as the low-amylose trait is more difficult to detect than amylose-free starch. However, mutants/varieties of crops that have low amylose and GBSS levels should not be assumed to be defective in GBSS without genetic mapping, as they could be defective in PTST1.
Currently, the role of PTST1 in cereal endosperms is not clear. A recent study in rice used CRISPR/Cas9 to create knockout mutants of PTST1 (also referred to as OsGBP by the authors). Like the Arabidopsis mutant, the rice mutants produced amylose-free starch in leaves and had undetectable levels of GBSS2. By contrast, in the endosperm, the mutants only had small reductions in GBSS1 abundance, accompanied by a c. 10% reduction in amylose content of starch. A tiny amount of GBSS1 was detected in the amyloplast stroma in the mutant, but not in the wild-type. Thus, the loss of PTST1 in rice appears to drastically impact amylose synthesis in leaves, but only has minor effects in the endosperm. It seems reasonable that GBSS is more dependent on PTST1 to locate to starch granules in chloroplasts, where granules are dispersed through the stroma, than in endosperm amyloplasts, where granules occupy the majority of the amyloplast volume. Also, although both rice GBSS1 and GBSS2 isoforms can interact with PTST1, it is possible that GBSS1 has other features that allow it to associate with starch in the absence of PTST1. In stark contrast to the results in rice, CRISPR/Cas9-generated knockout mutants of PTST1 in barley produced nonviable seeds with no starch in the endosperm (Zhong et al., 2019). It is possible that PTST1 has a specialised role in barley, or that there are unidentified background factors that interact with PTST1 mutations to cause such a severe phenotype. Investigating PTST1 in close relatives of barley (e.g. wheat) will clarify its function in the endosperm.
There are now several examples of proteins targeting other proteins to starch granules (e.g. SS2, LSF1), suggesting that it could be a common theme in starch metabolism (Liu et al., 2012b; Schreier et al., 2019). Why GBSS evolved to depend on PTST1 to locate to starch, rather than having its own Carbohydrate Binding Module (CBM) domain, remains a curiosity. A possible advantage of having the CBM on another polypeptide is that it allows the enzyme to dissociate from the CBM when not required. Indeed, the vast majority of PTST1 remains stromal rather than granule-bound, suggesting that PTST1 dissociates from both GBSS and the starch granule after targeting (Seung et al., 2015) (Fig. 3c). It is plausible that the glucan affinity of a CBM is required for GBSS to initially locate to granules, but may be a hindrance once inside the granule matrix. The extent to which GBSS is mobile within the amorphous regions is not known, but its processive action requires either 'forward' movement of GBSS as it follows the growing end of the amylose chain, or the movement of the growing amylose chain as it is pushed out of an immobile enzyme. In either case, continuous glucan binding at a CBM could hinder such processive action by restricting the movement of the enzyme or by binding the growing amylose chain during deposition.

V. How is amylose synthesis primed?
There has been vast recent progress on understanding how starch granules are initiated in plastids (Seung & Smith, 2019). As amylopectin precedes the formation of amylose, the initiation of amylose synthesis is likely to be distinct from the initiation of starch granules. Indeed, mutants that are defective in granule initiation have aberrant numbers of granules, but almost no change in amylose content (Roldán et al., 2007; Seung et al., 2017, 2018). In a model proposed by Denyer et al. (2001), amylose synthesis is most likely primed by short MOS (DP 2-7), which diffuse into the granule matrix and are processively elongated by GBSS. The MOS can come from various sources, including the trimming process of amylopectin biosynthesis (Mouille et al., 1996; Streb et al., 2008) and starch degradation (Critchley et al., 2001). Once the MOS are extended beyond DP 7, they can no longer readily diffuse out of the granule and eventually become long amylose chains (Denyer et al., 2001). This model is supported by in vitro and in vivo observations. First, GBSS activity in isolated granules from Arabidopsis, pea and potato is greatly stimulated by the addition of short MOS (Denyer et al., 1996; Zeeman et al., 2002a), which act directly as an acceptor substrate rather than as an allosteric activator (Denyer et al., 1999b). Second, Arabidopsis and Chlamydomonas dpe1 mutants (defective in the plastidial disproportionating enzyme) that accumulate large amounts of MOS have increased amylose content (Colleoni et al., 1999; Zeeman et al., 2002a). The Arabidopsis esv1 mutant, which has elevated maltose levels during the day, also has significantly more amylose than the wild-type (Feike et al., 2016).
In an alternative model, amylose may be primed by the extension of amylopectin chains. Radiolabelling experiments in isolated Chlamydomonas starch granules showed elongation of long amylopectin chains by GBSS, and a transfer of labelled chains from the amylopectin to amylose fractions (van de Wal et al., 1998). Thus, the extra-long chains of amylopectin (discussed previously in Section III) may be an intermediate towards the formation of new amylose chains. However, subsequent labelling experiments in Arabidopsis using both isolated starch granules and whole rosettes failed to find evidence of label transfer from amylopectin to amylose (Zeeman et al., 2002a). Although technical differences arising from different labelling conditions and detection methods cannot be ruled out, a true difference across species is not implausible, especially considering the greater involvement of GBSS in amylopectin synthesis in Chlamydomonas than in land plants (Delrue et al., 1992;Ral et al., 2006). Interestingly, the C-terminal extension of the Chlamydomonas GBSS is much longer than that of plant GBSS, which may explain some of the differences in biochemistry.

VI. Factors that influence amylose content
The major determinant of amylose content of starch is the activity of GBSS within the granule. However, the relationship between the two parameters is not linear: in maize and potato, GBSS activity had a linear relationship with gene dosage, but the relationship between amylose content and gene dosage was hyperbolic (Tsai, 1974; Flipse et al., 1996; Denyer et al., 2001) (Fig. 4). GBSS overexpression will therefore only increase amylose content in species/cultivars that have not already approached the upper end of the GBSS activity/amylose content curve. Indeed, GBSS overexpression did not increase amylose content in potato (Flipse et al., 1994) or wheat (Sestili et al., 2012), despite substantially increasing the amount of granule-bound GBSS in both species. GBSS overexpression increased amylose content in a japonica cultivar of rice, which has inherently low GBSS1 levels due to a polymorphism in the 5′ UTR (Itoh et al., 2003). The recent overexpression of PTST1 in barley led to only small increases in amylose content (<10% increase) (Zhong et al., 2019). These results suggested that other factors limit amylose biosynthesis at high GBSS levels.

Fig. 4 Relationship between amylose content and GBSS activity. Linear relationships have been observed between GBSS gene dosage and GBSS activity on starch granules (left). However, these have a hyperbolic relationship with amylose content, which saturates towards higher GBSS activity levels (right). Illustration is based on the findings of Tsai (1974) and Flipse et al. (1996).
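The saturating relationship between GBSS activity and amylose content can be made concrete with a simple rectangular hyperbola. The sketch below is purely illustrative: the functional form is one common way to write a hyperbolic curve, and the function name and parameter values (a plateau of 30% amylose and a half-saturation point of 0.5 relative activity units) are hypothetical and are not fitted to the data of Tsai (1974) or Flipse et al. (1996).

```python
def amylose_content(gbss_activity, max_amylose=30.0, half_saturation=0.5):
    """Hypothetical hyperbolic response of amylose content (%) to GBSS activity.

    gbss_activity: relative GBSS activity (arbitrary units, >= 0).
    max_amylose: assumed plateau of amylose content (%).
    half_saturation: assumed activity giving half of the plateau.
    """
    if gbss_activity < 0:
        raise ValueError("activity must be non-negative")
    return max_amylose * gbss_activity / (half_saturation + gbss_activity)


# Doubling activity gives a smaller gain the further along the curve we start.
low_gain = amylose_content(1.0) - amylose_content(0.5)
high_gain = amylose_content(2.0) - amylose_content(1.0)
print(low_gain, high_gain)
```

Under these assumed parameters, each doubling of activity yields a smaller increase in amylose than the previous one, which is the qualitative behaviour that would explain why overexpression has little effect in lines already near the plateau.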
Such factors may include GBSS substrate availability. The high amylose contents of dpe1 mutants in Arabidopsis and Chlamydomonas suggest that the availability of MOS substrates can limit amylose synthesis in these species (Colleoni et al., 1999; Zeeman et al., 2002a). The availability of ADP-glucose (ADP-Glc) also affects amylose content in addition to overall starch levels. Mutants that are impaired in plastidial ADP-Glc production produce low amounts of starch with low amylose content (Lloyd et al., 1996, 1999; Van den Koornhuyse et al., 1996; Tjaden et al., 1998; Clarke et al., 1999; Li et al., 2017). Conversely, potato lines overexpressing the plastidial adenylate transporter to boost ADP-Glc synthesis produce more starch in tubers than the wild type, and this starch has an elevated amylose content (Tjaden et al., 1998). These findings can be at least partly explained by GBSS having a Km for ADP-Glc that is higher than that of the other starch synthases involved in amylopectin biosynthesis, so that a decrease in ADP-Glc availability will have a greater effect on amylose synthesis than on amylopectin synthesis (Delrue et al., 1992; Edwards et al., 1999; Denyer et al., 2001). However, the high Km for ADP-Glc may not be a universal feature of all GBSS proteins, as recent studies found that the barley GBSS1 has a Km for ADP-Glc that is comparable with those of the other barley starch synthases (Cuesta-Seijo et al., 2016; Hebelstrup et al., 2017).
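The Km argument above can be made concrete with a small sketch that assumes simple Michaelis-Menten kinetics for both enzymes. The Km values and ADP-Glc concentrations below are hypothetical numbers chosen for illustration, not measured values for any GBSS or starch synthase.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Reaction rate under simple Michaelis-Menten kinetics."""
    return vmax * substrate / (km + substrate)


def retained_fraction(km, substrate_before, substrate_after):
    """Fraction of the original rate that remains after a change in substrate.

    Vmax cancels out of the ratio, so it is fixed at 1.0 here.
    """
    before = michaelis_menten_rate(substrate_before, 1.0, km)
    after = michaelis_menten_rate(substrate_after, 1.0, km)
    return after / before


if __name__ == "__main__":
    # Hypothetical values (mM): a high-Km enzyme standing in for GBSS and a
    # low-Km enzyme standing in for an amylopectin-synthesising starch synthase.
    km_high, km_low = 3.0, 0.3
    adp_glc_before, adp_glc_after = 1.0, 0.5  # ADP-Glc availability is halved

    print("high-Km enzyme retains",
          round(retained_fraction(km_high, adp_glc_before, adp_glc_after), 2))
    print("low-Km enzyme retains",
          round(retained_fraction(km_low, adp_glc_before, adp_glc_after), 2))
```

With these assumed values, halving ADP-Glc reduces the rate of the high-Km enzyme proportionally more than that of the low-Km enzyme, which is the pattern expected if amylose synthesis is more sensitive to ADP-Glc supply than amylopectin synthesis.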
It is also likely that amylose content is limited by space available inside the granule (Flipse et al., 1996). A physical upper limit of amylose accumulation is presumably reached when amylose occupies all amorphous regions within a granule. Thus, the amylose-holding capacity of a granule may reflect the proportion of amorphous vs crystalline areas, which may be influenced by the structure of amylopectin itself. Arabidopsis mutants deficient in SS2 have elevated amylose contents, which appears to result from both reduced amylopectin synthesis, and a boost in amylose biosynthesis (Zhang et al., 2008). This boost was proposed to be facilitated by the altered amylopectin structure in the mutant. The available amorphous area for amylose biosynthesis may also be influenced by granule matrix organisation. Arabidopsis lines overexpressing LIKE EARLY STARVATION1 (LESV), a granule-bound protein proposed to contribute to matrix organisation, had tiny granules that accumulated normal amounts of GBSS on a starch weight basis, but synthesised almost no amylose (Feike et al., 2016). Further investigation of this line may provide insights on whether the loss of amylose is due to small granule size alone, or other changes in the matrix organisation.
It is likely that combinations of the above factors contribute to the vast variation of amylose content observed between species. A greater understanding of GBSS post-translational and transcriptional regulation may reveal further mechanisms that influence amylose content. GBSS has been observed to be phosphorylated (Grimaud et al., 2008; Chen et al., 2017) and to form oligomers, but how these affect GBSS activity and amylose content is not known. Similarly, factors that regulate GBSS transcription have been identified, but their overall influence on variation in amylose content is not known. In Arabidopsis and rice leaves and in Chlamydomonas, GBSS transcripts undergo strong diurnal oscillations regulated by the circadian clock (Dian et al., 2003; Tenorio et al., 2003; Smith et al., 2004; Ral et al., 2006; Serrano et al., 2009; Ortiz-Marchena et al., 2014). It is unlikely that these oscillations exert large control over amylose content, but they could play a role in resynthesising GBSS turned over during the night. In rice endosperm, several transcription factors that bind and regulate the GBSS1 promoter were identified, including BP-5/EBP-89, bZIP58 and SERF1/RPBF (Zhu et al., 2003; Wang et al., 2013; Schmidt et al., 2014). As most of these transcription factors also regulate other starch-related genes, they are unlikely to play a role in the specific modulation of amylose content.
A greater understanding of factors determining amylose content may also lead to more effective biotechnological approaches to increase amylose content. Most currently available high-amylose mutants/varieties are defective in branching enzyme (BE) activity. Suppression of BE can increase amylose in pea (Bhattacharyya et al., 1990), maize (Arai & Baba, 1984; Liu et al., 2012a), rice (Nishi et al., 2001), wheat (Regina et al., 2006), barley (Carciofi et al., 2012), potato (Schwall et al., 2000; Tuncel et al., 2019), and cassava (Zhou et al., 2019), but in almost all cases leads to a yield penalty. Two factors lead to the increased amylose content in these mutants. First, as BE is required for amylopectin biosynthesis, suppressing its activity reduces the amount of amylopectin and increases the relative proportion of amylose. Similarly, many mutations in the amylopectin biosynthesis pathway have been observed to increase amylose content in Arabidopsis leaves and various crops, including mutations in other starch synthases (Wang et al., 1993; Delvallé et al., 2005; Fujita et al., 2007; Zhang et al., 2008; Szydlowski et al., 2009) and isoamylase (Dauvillée et al., 2001; Tziotis et al., 2004; Blennow et al., 2020). Second, the suppression of BE reduces the frequency of branching on amylopectin, resulting in the formation of long amylose-like chains on amylopectin. Early maize genetics demonstrated that at least some amylose detected in the amylose extender mutant (defective in SBE2, accumulating up to 60% amylose) cannot be true amylose synthesised by GBSS, as removing GBSS in this background (i.e. in an amylose extender waxy double mutant) still results in detectable amylose levels (c. 15% amylose) (Creech, 1968; Boyer et al., 1976). Similar results have been obtained in rice (Nishi et al., 2001). In some cases BE suppression did not alter amylose content at all, but only increased long amylopectin chains (Butardo et al., 2011).

VII. Amylose structure
Structural analysis of amylose is challenging since the polymer is embedded in the semicrystalline granule matrix. Amylose molecules are estimated to have a mass of c. 10^6 Da, whereas amylopectin molecules are 10^7-10^8 Da (Gilbert et al., 2013; Bertoft, 2017 al., 2009). Thus, with current technologies, the exact size of whole molecules within starch granules cannot be determined. However, the analysis of debranched amylose has greatly advanced over the last decade, allowing the characterisation of an amylose fine structure. These studies have revealed a bimodal chain length distribution, with an AM1 fraction containing shorter amylose chains (DP 100-700) and an AM2 fraction containing longer chains (DP 700-40 000) (Wang et al., 2014) (Fig. 5). There is relatively little intraspecies variation in amylose chain length distributions, but large interspecies variation in the length and abundance of AM1 and AM2 chains (Wang et al., 2014; Perez-Moral et al., 2018). These data point to at least some genetic contribution to amylose fine structure, and make it timely to explore how structure is formed during biosynthesis. It appears that structure is independent of amylose content: a survey of rice varieties with different SNPs in GBSS1 saw amylose content vary between 16 and 24%, but the chain length structure did not vary (Wang et al., 2015). However, this does not rule out GBSS as an important determinant of amylose structure, as the SNPs may not have altered GBSS capacity for processivity, which may influence the lengths of amylose chains produced. Interestingly, pea GBSSa produced a shorter chained amylose compared with GBSSb when expressed in potato tubers (Edwards et al., 2002). It is also likely that the amylose chain length structure is influenced by branching patterns, but it is not known which isoforms of BE are involved in branching amylose. BE1 in maize prefers amylose over amylopectin as a substrate in vitro (Guan & Preiss, 1993; Guan et al., 1994), but there is no evidence that it does so in vivo. The relatively low branch frequency of amylose is proposed to result from amylose synthesis occurring inside the granule, protected from BEs, which are mostly in the stroma (Kram et al., 1993). However, a small proportion of BE is also granule bound (Denyer & Smith, 1992; Tetlow et al., 2004; Liu et al., 2012b), and could be involved in branching amylose. Interestingly, sorghum lines with varying pullulanase activity were recently reported to have significant changes in amylose structure, suggesting that debranching activities can also contribute to amylose structure (Li et al., 2019b).
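As a rough illustration of how debranched chains could be assigned to the two fractions described above, the sketch below bins chains by DP using the ranges quoted in the text (AM1: DP 100-700; AM2: DP 700-40 000). Assigning a chain of exactly DP 700 to AM2, the labels for chains outside these ranges, and the example DP values are all assumptions made for illustration; this is not the analysis pipeline used in the cited studies.

```python
from collections import Counter


def classify_chain(dp):
    """Assign a debranched chain to a fraction based on its DP.

    Boundaries follow the ranges quoted in the text; placing DP 700 in AM2
    is an arbitrary choice made for this sketch.
    """
    if dp < 100:
        return "below amylose range"
    if dp < 700:
        return "AM1"
    if dp <= 40000:
        return "AM2"
    return "above reported range"


def fraction_summary(dps):
    """Return the proportion of chains (by number) in each fraction."""
    counts = Counter(classify_chain(dp) for dp in dps)
    total = len(dps)
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical DP values, not measured data.
    example_chains = [150, 300, 650, 900, 5000, 12000]
    print(fraction_summary(example_chains))
```

Note that this counts chains by number; experimental chain length distributions are typically reported as weight distributions, so a real comparison would need to weight each chain by its DP.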
Emerging evidence has suggested that starch physicochemical properties do not depend only on amylose content (Box 1), but also on the amylose fine structure (Li et al., 2016; Tao et al., 2019). A greater understanding of how structure is determined during biosynthesis has the potential to generate novel approaches to alter starch quality by modifying amylose structure.
VIII. Where is amylose located within starch granules?
The precise location of amylose within starch granules is not known. This is at least partly due to our lack of knowledge regarding the arrangement of the amylopectin matrix itself. X-ray scattering data suggest that amylose occurs mostly in amorphous regions of the granule (Jenkins & Donald, 1995). Indeed, the restriction of amylose to the amorphous lamellae would allow minimal disruption of amylopectin helix packing in the crystalline lamellae. However, it is likely that amorphous regions are not restricted to the amorphous lamellae, as X-ray scattering and NMR data suggest that native starch granules only have 20-50% crystallinity (Lopez-Rubio et al., 2008). Other amorphous regions may occur in growth rings or at the hilum (or core) of starch granules (Baldwin et al., 1994; Buléon et al., 1997; Baker et al., 2001). Indeed, transgenic potato lines with silenced GBSS expression have reduced overall amylose content, but this reduction does not occur homogeneously in the granule. Staining amylose in these granules with iodine enables the visualisation of a distinctive blue core and blue lines around growth rings, suggesting that amylose is enriched in these regions (Kuipers et al., 1994a; Visser et al., 1997; Tatge et al., 1999). This finding is supported by multiple techniques, including the reducing end stain 8-amino-1,3,6-pyrenetrisulfonic acid (APTS), acid hydrolysis, and X-ray microfluorescence (Blennow et al., 2003; Wang et al., 2012; Buléon et al., 2014).
By contrast, some studies have suggested that amylose chains could be interspersed between amylopectin chains, rather than restricted to amorphous regions. Amylose chains in maize and potato starch were more susceptible to chemical cross-linking with amylopectin chains than with other amylose chains (Jane et al., 1992). The presence of amylose has been observed to partially disrupt the packing of crystalline lamellae, suggesting that at least some amylose chains occur among the amylopectin helices (Jenkins & Donald, 1995; Donald et al., 2001; Koroteeva et al., 2007; Kozlov et al., 2007). Other studies using chemical gelatinisation methods have suggested that amylose is enriched on the surface of the starch granules (Jane & Shen, 1993; Pan & Jane, 2000). These data seem to contradict those above, but it is possible that amylose has several distinct locations within the granule. Although amylose can be washed/leached out of the granule in aqueous solutions at elevated temperatures, a fraction remains trapped inside the granule - consistent with some amylose molecules being more intimately associated with the amylopectin matrix than others (Greenwood & Thomson, 1962). Also, given the vast diversity in starch granule morphology between species and organs - including granules with multiple cores (e.g. compound starch granules of rice) or granules that lack a distinct core (e.g. leaf starch) - diversity in the arrangement and location of amylose would not be unexpected (Seung & Smith, 2019).

IX. What is the function of amylose in starch?
Despite the widespread occurrence of amylose in plants, its physiological role or adaptive significance is not known. Amylose-free varieties of wheat (Vignaux et al., 2004; Yasui, 2006), barley (Howard et al., 2014), rice (Zhang et al., 2018), potato (Kuipers et al., 1994b) and cassava (Koehorst-van Putten et al., 2012) have similar growth characteristics and comparable yields to normal varieties. Also, Arabidopsis mutants deficient in gbss or ptst1 are indistinguishable from the wild-type in terms of growth and starch content (Seung et al., 2015), suggesting that the loss of amylose synthesis is fully compensated by increased amylopectin synthesis. A prevalent theory is that the presence of amylose increases the storage efficiency of the starch granule by filling amorphous spaces. However, it should be noted that normal and amylose-free maize starches have no quantifiable difference in molecular density (Donald et al., 2001). It has also been suggested that amylose could play a role in influencing sugar availability during the transition from vegetative to reproductive growth in Arabidopsis, stemming from the observation that leaf starch accumulates less amylose after flowering than before flowering (Ortiz-Marchena et al., 2014). Interestingly, this change in amylose content is dependent on photoperiod, and may result from direct transcriptional regulation of GBSS by CONSTANS, a known regulator of photoperiod-dependent flowering (Ortiz-Marchena et al., 2014). However, the link between amylose content and sugar availability in vivo has not yet been substantiated.
Spontaneous mutations in GBSS are known to arise during crop breeding. In amylose-free varieties of cereals, the types of GBSS1 mutations range from InDels to SNPs, and sometimes involve transposable element insertions (Wessler & Varagona, 1985; Kawase et al., 2005; Pedersen et al., 2005; Hunt et al., 2010). As these varieties are defective only in GBSS1, they are not affected in GBSS2-mediated amylose synthesis in leaves. However, spontaneous mutations at GBSS have also been observed in species where there is only one GBSS paralogue, including potato, cassava and Arabidopsis (Hovenkamp-Hermelink et al., 1987; Ceballos et al., 2007; Schwarte et al., 2013; Seung et al., 2020) (Table 2). The natural occurrence of loss-of-function gbss alleles in wild Arabidopsis accessions is particularly striking, given that it is an undomesticated plant in which there has been no positive human selection for starch traits. Among the 1135 sequenced Arabidopsis accessions, we identified 51 different amino acid substitutions in GBSS relative to the reference sequence, and two polymorphisms that led to the loss of a splice site or the start codon. We determined that at least five genetically distinct wild accessions produce amylose-free starch, and two of these accessions carry the same defective GBSS allele. This suggests that amylose in leaves is not necessary for reproductive fitness in the wild, at least in the short term and under some conditions. Ecological approaches to assess the prevalence of amylose-free wild plants in their natural populations over time will provide vital clues to find the adaptive advantage of amylose synthesis.
Given the undisputable conservation of amylose and GBSS in the vast majority of plants, it is probable that amylose biosynthesis presents a fitness advantage under most conditions in the long term; but the fact that it is dispensable for short-term survival allows occasional examples of plants that have lost it altogether, or perhaps traded it off for other traits that are more critical for survival.

X. Conclusions and prospects
We have seen great progress in understanding the biosynthesis and structure of amylose in recent years. This provides a solid basis for further discoveries towards a complete mechanistic understanding of amylose biosynthesis, including a greater understanding of factors that determine amylose content and structure. Further technical innovations will undoubtedly reveal more detail on the whole molecular structure of amylose and its location within granules, which will raise further questions about how these are determined during biosynthesis. Finally, ecological approaches may reveal the adaptive advantage of producing amylose. These efforts will not only increase our understanding of the biology of this abundant biopolymer, but also expand the range of biotechnological approaches available to modify amylose in plants, particularly to increase amylose content without yield penalty and to alter amylose structure.