Pod indehiscence is a domestication and aridity resilience trait in common bean
Summary
- Plant domestication has strongly modified crop morphology and development. Nevertheless, many crops continue to display atavistic characteristics that were advantageous to their wild ancestors but are deleterious under cultivation, such as pod dehiscence (PD). Here, we provide the first comprehensive assessment of the inheritance of PD in the common bean (Phaseolus vulgaris), a major domesticated grain legume.
- Using three methods to evaluate the PD phenotype, we identified multiple, unlinked genetic regions controlling PD in a biparental population and two diversity panels. Subsequently, we assessed patterns of orthology among these loci and those controlling the trait in other species.
- Our results show that different genes were selected in each domestication and ecogeographic race. A chromosome Pv03 dirigent-like gene, involved in lignin biosynthesis, showed a base-pair substitution that is associated with decreased PD. This haplotype may underlie the expansion of Mesoamerican domesticates into northern Mexico, where arid conditions promote PD.
- The rise in frequency of the decreased-PD haplotype may be a consequence of the markedly different fitness landscape imposed by domestication. Environmental dependency and genetic redundancy can explain the maintenance of atavistic traits under domestication.
Introduction
Plant domestication was a transformative evolutionary process, which turned wild plants into crops adapted to the human-mediated environment starting some 10 000 yr ago (Gepts, 2004, 2014; Meyer et al., 2012; Meyer & Purugganan, 2013; Larson et al., 2014; Martínez-Ainsworth & Tenaillon, 2016). Core domestication traits across a range of seed-propagated taxa include a reduction in seed dispersal, reduced seed dormancy, increased phenotypic diversity of harvested structures, including gigantism, changes in growth habit, and modified phenology, collectively called the domestication syndrome (Hammer, 1984; Lenser & Theißen, 2013). Global food security is entirely dependent on crops that have undergone these changes. The domestication process has also served as a series of natural experiments in evolutionary biology and genetics, a role that has been recognized since the inception of these fields (Darwin, 1859; Mendel, 1866).
Effective seed dispersal is vital for spermatophytes. In the Fabaceae, the third largest family of flowering plants (Azani et al., 2017), seed dispersal is typically mediated by the explosive dehiscence (‘shattering’) of pods at maturity. While this mechanism is effective for the propagation of plants in the wild, it results in yield reduction and constrains the temporal window for harvest in the cultivated environment. This has led to selection for pod indehiscence during and after domestication across a range of legume taxa (Ogutcen et al., 2018; Rau et al., 2019). These cultivated forms generally display pod indehiscence, also known as PD resistance.
Phaseolus beans are an exceptional experimental system to study domestication and the molecular evolution associated with this process. Humans domesticated members of this genus seven times (Gepts et al., 2008; Bitocchi et al., 2017), which are part of the 41 domestications in the Fabaceae (Harlan, 1992). The common bean (Phaseolus vulgaris L.), a dietary staple for hundreds of millions of people worldwide (Singh, 1999; Gepts et al., 2008), diverged into distinct Middle American and Andean gene pools c. 87 000 yr before present (Ariani et al., 2018), well before the first human migrations into the Americas some 16 000–23 000 yr ago (Moreno-Mayar et al., 2018; Potter et al., 2018). It was domesticated independently in Middle America and the Andes, resulting in a replicated experiment in evolution. Each of the two domesticated gene pools of the common bean is subdivided into several ecogeographic races. For example, the Middle American domesticated gene pool comprises, in part, the race Durango (sometimes clustered with the genetically indistinguishable race Jalisco to form the race Durango/Jalisco), which is adapted to the arid, higher-altitude regions of northern Mexico, and race Mesoamerica, adapted to the warmer, humid lowlands of southern Mexico and Central America (Singh et al., 1991). Atmospheric dryness has a strong PD-promoting effect in legumes, and mean annual precipitation is related to signatures of selection on PD-related candidate genes (Bandillo et al., 2017). Desiccation is also often used to induce pod fracture experimentally (Dong et al., 2014; Funatsuki et al., 2014).
Koinange et al. (1996) were the first to identify a pod fiber factor, namely a major gene on linkage group Pv02 (Freyre et al., 1998), in the recombinant inbred (RI) population derived from stringless cv ‘Midas’ and wild accession G12873. This gene, called Stringless (St), maps near the common bean ortholog of INDEHISCENT (PvIND), but a low frequency of recombination is known to exist between PvIND and the stringless trait, and no causal polymorphism is known to exist in the PvIND sequence (Gioia et al., 2013). St epistatically masks the effect of all other PD quantitative trait loci (QTLs) by dramatically decreasing fiber content but is only relevant in snap beans grown for pods as a vegetable. This locus does not explain any PD variation in the nutritionally important classes grown for grain. Recently, Rau et al. (2019) used QTL mapping to identify a single segregating locus on Pv05 in the same Midas × G12873 genetic background (Table 1). To date, a comprehensive evaluation of the genetic basis of PD in diverse germplasm has not been conducted and no molecular polymorphisms with a potential causal effect on PD have been described.
Chromosome or linkage group | Gene pool | Ecogeographic race, if available (Singh et al., 1991) | QTL location (bp, v.1.0, Schmutz et al., 2014) | Potential candidate genes (when identified) | Source in Phaseolus vulgaris | Homologies in other species (when known) |
---|---|---|---|---|---|---|
Pv02 | Andean | Nueva Granada | 43 425 893–43 900 872 | PvIND | Koinange et al. (1996); Gioia et al. (2013); Hagerty et al. (2016) | Arabidopsis (Liljegren et al., 2004) |
Pv03 | Middle American | Durango | 47 527 006–48 475 205 | PvPdh1: dirigent family | This research | Soybean (Funatsuki et al., 2014) |
Pv03 | Andean | | 39 768 300–48 451 789 | NAC family, C2H2 zinc finger | This research | Cowpea (Lo et al., 2018) |
Pv04 | Middle American | | 42 310 662 | | Hagerty et al. (2016) | |
Pv05 | Andean | Nueva Granada | 35 000 893–39 497 309 | MYB26, MYB46 | Rau et al. (2019); this research | Cowpea (Suanum et al., 2016; Lo et al., 2018); Arabidopsis (McCarthy et al., 2009) |
Pv08 | Andean and Middle American | Mesoamerica | 330 345–9 215 942 | MYB family, WRKY family, polygalacturonase | This research | Sorghum (Tang et al., 2013); Arabidopsis (Ogawa et al., 2009) |
Pv09 | Andean | | 29 587 741–37 450 759 | CESA7, polygalacturonases | This research | Cowpea (Suanum et al., 2016) |
In the research reported here, we used high-precision phenotyping techniques, both in an RI population and diversity panels, to identify PD QTLs in the common bean grown for nutritionally important dry seeds. We sequenced a locus underlying a major QTL to identify a possible causal polymorphism. We found that orthologous genes regulate PD among certain domesticated legumes. We were further able to identify associations between PD and the environmental backgrounds of common bean races. Alleles identified in this study will be valuable for developing common bean varieties suited to the increasingly arid climatic conditions of coming decades.
Materials and Methods
Germplasm
An RI population (n = 238), developed from a cross between ICA Bunsi (domesticated, PD-susceptible, Middle American) and SXB 405 (domesticated, PD-resistant, Middle American), was used for QTL mapping (Assefa et al., 2013; Berny Mier y Teran et al., 2019). For association mapping, two diversity panels were used: 208 members of the Andean Diversity Panel (ADP; Cichy et al., 2015) and 278 members of the Middle American Diversity Panel (MDP; Moghaddam et al., 2016) were grown and phenotyped. Sequencing was performed in a diverse panel of 90 varieties, representing six species, acquired from the National Plant Germplasm System (NPGS). Eighteen varieties commonly grown at UC Davis with known PD phenotypes were also genotyped. Stringless snap bean varieties were specifically excluded from the analysis to avoid the epistatic effect of the St locus on PD.
Microscopy
Pods of G12873 (wild, high dehiscence), ICA Bunsi (domesticated dry bean, dehiscence-susceptible), SXB 405 (domesticated dry bean, dehiscence-resistant), and Midas (domesticated snap bean, dehiscence-susceptible) were Vibratome-sectioned to identify anatomical differences that might be associated with PD. All sectioned pods were glasshouse-grown and harvested when pods were at full size with seeds filled, at the onset of pod color change. All sections were 100 μm thick and made in a transverse plane perpendicular to the fibers of interest. All sections were treated with Auramine O (aqueous, 0.01%) for at least 20 min to stain lignified tissue (Ursache et al., 2018). Fluorescence was visualized using an Olympus microscope (Waltham, MA, USA).
RI population cultivation and PD phenotyping
The ICA Bunsi × SXB 405 (I×S) RI population of 238 recombinant inbred lines (RILs) was field-grown during the spring and summer of 2014. The spring planting was an unreplicated trial conducted at Coachella, California. At maturity, plots were visually evaluated for the presence or absence of PD, and the data were used as a phenotype for QTL mapping. During the summer of 2014, the RI population was grown in a replicated field trial in Davis, California. At maturity, dried nondehiscing pods from 191 RILs were harvested from each plot; these were evaluated for susceptibility to PD by two methods. First, all pods were desiccated at 65°C for 7 d, and then returned to room temperature for a minimum of an additional 7 d. The proportion of dehiscing pods after this process was recorded for each plot. Second, the amount of force required to induce pod fracture was measured using an Imada force measurement gauge (ABQ Industrial, The Woodlands, TX, USA; method modified from Dong et al., 2014). Force measurements were taken on pods that had not dehisced during the desiccation treatment. A bit mounted to the gauge was used to press the ventral side of each pod at the most apical seed, and the peak force required to cause fracture at the apical end of the pod beak was recorded. Force required for PD was normalized to account for small but significant differences between note-takers, and the standardized score was used for QTL mapping. Pods that failed to produce seeds were excluded from all phenotyping analyses.
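The normalization of force readings across note-takers can be illustrated as a per-observer z-score. This is a minimal sketch, assuming that standardization means centering and scaling each note-taker's readings by their own mean and standard deviation; the text does not specify the exact procedure, and the function name, record layout, and sample readings are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize_by_note_taker(records):
    """Return (note_taker, z_score) pairs, standardizing force within each note-taker.

    records: list of (note_taker, force) tuples. This layout is an assumption
    made for illustration, not the format used in the study.
    """
    by_taker = defaultdict(list)
    for taker, force in records:
        by_taker[taker].append(force)
    # Mean and sample standard deviation of each note-taker's readings
    stats = {t: (mean(v), stdev(v)) for t, v in by_taker.items()}
    return [(t, (f - stats[t][0]) / stats[t][1]) for t, f in records]


# Hypothetical readings (arbitrary force units) from two note-takers,
# one of whom reads systematically higher than the other
readings = [("A", 10.0), ("A", 12.0), ("A", 14.0),
            ("B", 20.0), ("B", 22.0), ("B", 24.0)]
scores = standardize_by_note_taker(readings)
```

After standardization, the systematic offset between the two hypothetical note-takers disappears, so scores from different observers can be pooled for mapping.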
Genotyping
Genomic DNA was extracted from parents and RILs of the I×S population using a modified cetyl trimethyl ammonium bromide (CTAB) protocol. DNA quality was confirmed using a NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). The I×S population was genotyped using the Illumina Infinium II BARCBean6K_3 BeadChip (Illumina, San Diego, CA, USA) (Song et al., 2015); 382 segregating single nucleotide polymorphisms (SNPs) were identified in the population. Primers spanning the transcribed sequence of Phvul.003G252100, also known as Phaseolus vulgaris Pod Dehiscence 1 (PvPdh1), a candidate gene underlying a major QTL identified in this study, were developed using the NCBI Primer-BLAST tool. There are several differences in the genomic sequence between the Middle American and Andean gene pools, so a mixture of two forward primers was introduced into each PCR with a common reverse: PvPDH1 ALL Middle American Forward, CATCTCCCCCATTTTCCCCC; PvPDH1 ALL Andean Forward, CATCTCTCCCATTTTCTCCT; PvPDH1 ALL common Reverse, AACACGTGGAAGAGGAGGATT. PCR conditions for this amplification included an initial denaturation at 95°C for 180 s, 38 cycles of 95°C for 30 s, 51°C for 30 s, and 68°C for 60 s, and a final elongation step of 68°C for 300 s. Another set of primers was developed to specifically improve the amplification and sequencing of Andean common beans, with the following sequences: PvPDH1 Andes Forward, TTTTTCTTGTGAGCAAAATTGAGTT; PvPDH1 Andes Reverse, GCAGAGGAAAAACACGTGGA. Amplification with this primer set used an initial denaturation at 95°C for 300 s, 34 cycles of 95°C for 30 s, 46°C for 30 s, and 72°C for 70 s, and a final elongation step of 72°C for 300 s. PCR products were cleaned using a GeneJET PCR Purification Kit (ThermoFisher Scientific, Waltham, MA, USA) and sequenced at the UC DNA Sequencing Facility by Sanger sequencing.
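To illustrate why two forward primers were mixed, the gene-pool-specific forward primers listed above can be compared base by base. The sketch below uses the two primer sequences exactly as given in the text; the helper function name is hypothetical.

```python
def mismatch_positions(a, b):
    """Return 1-based positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Forward primers as listed in the Genotyping section
middle_american_fwd = "CATCTCCCCCATTTTCCCCC"
andean_fwd = "CATCTCTCCCATTTTCTCCT"

diffs = mismatch_positions(middle_american_fwd, andean_fwd)
print(diffs)  # [7, 17, 20]
```

The two 20-mers differ at three positions, including the 3′-terminal base, which is consistent with the use of a primer mixture rather than a single forward primer.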
QTL mapping
Composite interval mapping was conducted using the R package R/qtl (Broman et al., 2003). Field dehiscence score, proportion dehiscing in a desiccator, and force measurements were used separately to identify PD QTLs marked by SNPs. The maximum logarithm of the odds (LOD) score of 1000 randomized permutations of the data was used as a significance threshold. Single QTL scans were performed using the ‘scanone’ function. Multiple QTL mapping was conducted using the ‘scantwo’ function in R/qtl and by running the analysis with RILs subset by genotype at the most significant marker near PvPdh1 on Pv03. QTL mapping results were based on maximum likelihood via the expectation–maximization algorithm (Lander & Botstein, 1989).
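The permutation-based threshold described above can be sketched in a few lines. This is an illustrative simplification, not the R/qtl implementation: it uses a single-marker regression LOD rather than composite interval mapping, takes the largest LOD seen across all permutations as the threshold (following the wording of the text; a percentile of the permutation maxima is also commonly used), and runs on simulated genotypes and phenotypes. All names and data are hypothetical.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD for one marker: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand = sum(phenotypes) / n
    rss0 = sum((y - grand) ** 2 for y in phenotypes)
    if rss0 == 0:
        return 0.0
    rss1 = 0.0
    for allele in set(genotypes):
        ys = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        m = sum(ys) / len(ys)
        rss1 += sum((y - m) ** 2 for y in ys)
    if rss1 == 0:
        return math.inf
    return (n / 2) * math.log10(rss0 / rss1)


def permutation_threshold(markers, phenotypes, n_perm=1000, seed=0):
    """Largest genome-wide LOD observed after shuffling phenotypes n_perm times."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        max_lods.append(max(marker_lod(g, shuffled) for g in markers))
    return max(max_lods)


# Simulated data: 60 inbred lines, 10 markers, large effect at marker 0
rng = random.Random(1)
n_lines, n_markers = 60, 10
markers = [[rng.choice("AB") for _ in range(n_lines)] for _ in range(n_markers)]
phenotypes = [rng.gauss(0, 1) + (5.0 if g == "B" else 0.0) for g in markers[0]]

threshold = permutation_threshold(markers, phenotypes, n_perm=1000)
observed = marker_lod(markers[0], phenotypes)
print(f"threshold LOD = {threshold:.2f}, observed LOD at marker 0 = {observed:.2f}")
```

Because shuffling preserves the phenotype distribution while removing linkage to any marker, the permutation maxima approximate the LOD values expected by chance across the whole scan.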
Validation of QTL mapping results using association mapping
Two hundred and eight accessions of the ADP were grown in Davis, California, during summer 2016. PD in the field, proportion dehiscing in a desiccator, and force required for fracture were recorded. Principal component analysis was conducted on SNP data for the population, and the results were used as covariates to account for population structure. Two hundred and seventy-eight members of the MDP were phenotyped for PD by desiccation in 2017. Association mapping was conducted using generalized linear models in Tassel via SNiPlay (Bradbury et al., 2007; Dereeper et al., 2011). A minor allele frequency of 0.1 was used as a threshold for SNPs, and these SNPs were evaluated for significance based on a Bonferroni-corrected α of 0.05. QTL regions of significance were determined as the area between the first and last significant SNP on a chromosome arm. Individual significant SNPs without significant neighbors in the same population or others were not given further consideration, as they are unlikely to have a real biological effect. All results were visualized using the qqman package in R (Turner, 2018), with the Bonferroni-corrected significance thresholds at α = 0.05 and 0.01 and the positions of major candidate genes shown on each plot.
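The SNP filtering and significance thresholds described above reduce to simple arithmetic. The following sketch is illustrative only: the SNP names, genotype calls, and number of tests are hypothetical, and treating the minor allele frequency threshold as inclusive (MAF ≥ 0.1) is an assumption, since the text does not state which side of the threshold is kept.

```python
import math


def minor_allele_frequency(calls):
    """MAF for a biallelic SNP scored in inbred lines; missing calls are None."""
    observed = [c for c in calls if c is not None]
    counts = {}
    for c in observed:
        counts[c] = counts.get(c, 0) + 1
    if len(counts) < 2:
        return 0.0  # monomorphic among the observed calls
    return min(counts.values()) / len(observed)


def bonferroni_threshold(alpha, n_tests):
    """Per-SNP P-value cutoff and its -log10 value (the Manhattan plot scale)."""
    cutoff = alpha / n_tests
    return cutoff, -math.log10(cutoff)


# Hypothetical genotype calls for two SNPs
snps = {
    "snp_common": ["A"] * 8 + ["G"] * 2,   # MAF = 0.20, retained
    "snp_rare": ["A"] * 19 + ["G"] * 1,    # MAF = 0.05, dropped
}
kept = [name for name, calls in snps.items()
        if minor_allele_frequency(calls) >= 0.1]  # assumed inclusive threshold

# Hypothetical number of SNPs passing the filter
cutoff, neg_log10 = bonferroni_threshold(alpha=0.05, n_tests=10_000)
print(kept, cutoff, round(neg_log10, 2))
```

With 10 000 hypothetical tests, α = 0.05 corresponds to a per-SNP cutoff of 5 × 10⁻⁶, or about 5.3 on the −log10 scale; the real thresholds depend on the number of SNPs retained in each panel.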
Expression and synteny mapping
Gene expression information from a variety of tissues and developmental stages was extracted from published data (O'Rourke et al., 2014) and visualized independently using R base graphics (R Core Team, 2013). Candidate genes related to PD were identified in significant QTL intervals based on definition line terms for gene families related to PD, which were downloaded with the PhytoMine interface of Phytozome 12 (Goodstein et al., 2012). Subsequent comparisons were made using the Blast function with known amino acid sequences from related species. Synteny comparisons between common bean and soybean (Glycine max) were made using the Legume Information System 2.0 (Rice et al., 2015); these were verified using the available literature (McClean et al., 2010; Schmutz et al., 2014). The CoGe SynMap (Lyons et al., 2008) and LegumeIP 2.0 (Li et al., 2016) synteny tools were used to compare syntenic regions between Arabidopsis (Col-0, TAIR10), common bean (G19833, Pvulgaris_V1.0_218; Schmutz et al., 2014), and soybean (Williams 82, release 1.1; Schmutz et al., 2010). A neighbor-joining tree was produced to determine the pattern of homology between a common bean candidate gene (PvPdh1), a related soybean gene (GmPDH1), and other members of the dirigent gene family in these two species. The amino acid sequence of these proteins was Blast-ed against the G. max and P. vulgaris proteomes to identify closely related genes. These were then compared using a multiple BlastP to develop a distance tree based on a Grishin protein distance matrix (Grishin, 1995). A fast-minimum evolution tree (Desper & Gascuel, 2004) was generated based on a maximum sequence difference of 0.85.
Amino acid conservation analyses
The complete amino acid sequence of PvPdh1 from accession G19833 was compared via BlastP against the NCBI proteome database, using a BLOSUM62 matrix for comparison and existence and extension costs of 11 and 1, respectively (Altschul et al., 2005). The Constraint-Based Multiple Alignment Tool (Cobalt; Papadopoulos & Agarwala, 2007, https://www.ncbi.nlm.nih.gov/tools/cobalt/re_cobalt.cgi) was used to align the most similar proteins known among several plant taxa and identify conserved residues based on the BlastP results. The Protein Variation Effect Analyzer (Provean; Choi & Chan, 2015) v.1.1.3 software tool was used to estimate the effect of mutations of interest using default settings, including a cutoff threshold of −2.5 for identifying deleterious alleles.
Validation of the role of PvPdh1 in a wider population
Genomic DNA was extracted using a modified CTAB method; amplification and Sanger sequencing of PvPdh1 were conducted as described previously. An indel was identified between positions 646 and 647 of the PvPdh1 transcript reference sequence. Varieties of known Andean ancestry, including the reference accession G19833, lack two base pairs found in varieties of Middle American ancestry. This indel occurs in the gene's 3′ untranslated region and therefore does not affect the protein product's reading frame. The indel was used to distinguish Andean from Middle American varieties; only Middle American varieties included the mutant PvPdh1 allele. After sequencing, Middle American varieties were separated based on the amino acid at position 162 of PvPDH1. The degree of dehiscence between these groups was evaluated by Student's t-test. Pod shatter phenotype data from the Germplasm Resource Information Network (GRIN: https://npgsweb.ars-grin.gov/gringlobal/descriptordetail.aspx?xml:id=83053) were compared with our sequencing data for varieties acquired from NPGS.
Landrace ecogeography
Precipitation across the native range of Middle American beans was mapped in Qgis 2.18.19 using data from WorldClim 2 (Fick & Hijmans, 2017). National boundaries and coastlines were added using shapefiles available through Natural Earth (Kelso & Patterson, 2010). USGS topographical global raster data grids were also used to improve the visualization of coastlines (https://topotools.cr.usgs.gov/gmted_viewer/gmted2010_global_grids.php). Landraces genotyped by Kwak & Gepts (2009) were filtered by their ecogeographic race, and those with values of 0.5 in Structure groups K6 (race Mesoamerica) and K9 (race Durango/Jalisco) were used for subsequent analysis. Delimited text layers were added in Qgis for varieties with latitude and longitude data that belonged to one of the ecogeographic races of interest. The average annual precipitation and elevation of the region where each landrace was collected were extracted using the ‘add raster values to points’ function in Qgis, and the values were compared between ecogeographic races by Student's t-test.
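The between-race comparison can be illustrated with a two-sample Student's t statistic computed on extracted raster values. This is a minimal sketch assuming the classical pooled-variance form of the test; the precipitation values below are hypothetical and are not data from the study. Converting the statistic to a P-value requires the t distribution with the returned degrees of freedom, which a statistics library would normally supply.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical mean annual precipitation (mm) at landrace collection sites
durango = [600.0, 700.0, 800.0]
mesoamerica = [1100.0, 1200.0, 1300.0]

t_stat, dof = student_t(durango, mesoamerica)
print(round(t_stat, 3), dof)
```

A negative statistic here indicates lower precipitation in the first group; the same function applies unchanged to elevation values.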
Data availability
Segregation data for pod shattering (oven test proportion, force, and shattering in the field) as well as SNP markers in the I×S population have been deposited in the UC Davis Dash public database (doi: 10.25338/B8TW2N: Gepts et al., 2019; doi: 10.25338/B8ZG68: Parker et al., 2019).
Genotype data for the Middle American Diversity Panel (Moghaddam et al., 2016) can be accessed at http://arsftfbean.uprm.edu/beancap/research/. Genotype data for the Andean Diversity Panel can be accessed at http://arsftfbean.uprm.edu/bean/?p=472 (Cichy et al., 2015). Coding DNA sequences of PvPdh1 have been deposited in the NCBI database: accession nos. MN094634–MN094748.
Results
Anatomical analysis of developing pods
Clear differences in pod anatomy were found among domesticated snap beans, domesticated dry beans, and wild common beans (Fig. 1). Wild beans produce a lignified wall fiber layer (LFL) in the pods that is thicker than the vascular bundle sheaths (VS, or suture string) layer, while the LFL is greatly reduced in domesticated varieties. Stringless snap beans have a weakly lignified VS at the suture, with a reduction in the number of lignified cells and the extent of secondary cell wall deposition in each cell, as reported previously (Prakken, 1934; Rau et al., 2019). In stringless beans, the LFL is typically absent. In contrast to the clear anatomical differences between these three groups, no variation between PD-resistant and PD-susceptible domesticated dry bean pods was observed (Fig. 1b,c), which parallels the pattern caused by the soybean gene POD DEHISCENCE 1 (PDH1; Suzuki et al., 2009; Tiwari & Bhatia, 1995). This observation suggests that the genetic change responsible for reduction of PD among dry beans may have been related to a modification of fiber composition or structure (e.g. lignin) rather than the total quantity of lignin or cell fate in the relevant pod structures.

Variation in the I×S population
Segregation for PD was first determined in an RI population derived from PD-susceptible cv ‘ICA Bunsi’ and PD-resistant breeding line SXB 405 (Assefa et al., 2013). Both parental genotypes belong to the Middle American domesticated gene pool. Three phenotyping approaches were used to evaluate PD (Supporting Information Fig. S1) and each had a unique segregation pattern (Fig. S2). These phenotypes were strongly correlated (Fig. S3). RI lines that dehisced in the field had higher rates of PD after desiccation at 65°C (two-tailed t-test, P = 3.1 × 10⁻⁸) and required lower amounts of force to induce fracture at the sutures (two-tailed t-test, P = 1.2 × 10⁻⁹). Similarly, the proportion dehiscing in the desiccator and force required to cause PD were negatively correlated (r² = 0.71, simple linear model, P < 2 × 10⁻¹⁶).
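The relationship between the two quantitative phenotypes can be summarized by the slope and r² of a simple linear regression. The sketch below shows the calculation on hypothetical per-line values (proportion dehiscing and standardized fracture force are invented for illustration); it is not a reanalysis of the study data.

```python
def simple_linear_fit(x, y):
    """Return (slope, r_squared) for an ordinary least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, r_squared


# Hypothetical values for four lines
proportion_dehiscing = [0.0, 0.1, 0.2, 0.4]
fracture_force = [8.0, 7.2, 5.9, 4.1]

slope, r2 = simple_linear_fit(proportion_dehiscing, fracture_force)
print(round(slope, 2), round(r2, 3))
```

A negative slope with a high r² corresponds to the pattern reported above: lines that shatter more readily in the desiccator require less force to fracture.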
Quantitative trait locus mapping by composite interval mapping identified a major, PD-related QTL peak located in the same position on linkage group Pv03 using each of the three phenotyping methods (Fig. 2). The QTL mapped between SNP markers ss715639553 and ss715639323 (Table 1). Force measurement produced the most significant results (LOD score = 53.3), followed by desiccation (LOD score = 42.7), and field notes (LOD score = 8.9). Each phenotyping method produced results that were statistically significant based on 1000 randomized permutations of the data. The allele at the most significant SNP explained 17% of the variation in PD based on field notes, 59% of the variation based on desiccation, and 64% of the variation in fracture force in the population. Analyses to find additional PD QTLs failed to identify other regions of interest in the I×S population.

Validation through association mapping
Next, we examined whether the Pv03 QTL affected PD in a broader cross-section of the dry bean gene pool. A genome-wide association study (GWAS), conducted using the desiccation method in the MDP, indicated that the most significant SNP (S1_149243152) was located in the QTL interval on Pv03 (MAF threshold = 0.1; Fig. 3a). This SNP was < 5.7 kb from a candidate gene, PvPdh1 (see the following section). This association analysis also revealed loci significantly associated with PD on chromosomes Pv06 and Pv08 (Fig. 3a).

A GWAS was similarly conducted in the ADP to determine which loci control PD in this independently domesticated population. Chromosomes Pv03, Pv05, Pv08, and Pv09 all included major regions significantly associated with PD (Fig. 3b). The QTL on chromosome Pv08 was in an overlapping physical position with the QTL from the MDP (Fig. 3b; Table 1). The QTLs on chromosome Pv03 in the ADP and MDP appear to be only partially overlapping, and different candidate genes can be invoked (see the following sections).
In both the Andean and Middle American gene pools, PD varied greatly among market classes (Table S1). GWAS using only members of the race Mesoamerica (MDP with PC1 > 50) showed that the Pv08 QTL was most closely associated with PD in this race (Fig. S4). SNP S1_329543689, near the center of this interval of interest, was used for subsequent analyses. The region near PvPdh1 did not include significant SNPs in this race, further indicating that the races Durango and Mesoamerica rely on different genes for PD resistance.
To visualize the correlation between PD and population substructure in the MDP, PD was plotted against the first principal component of the genetic data. Each point was color-coded by its allele at the GWAS SNP peaks on Pv03 (S1_149243152, 5.7 kb from PvPdh1) and Pv08 (SNP S1_329543689) (Fig. 4a,b). Members of the MDP with the Pv03 PD resistance allele exhibited mean PD in the desiccator of 0.0067, with a maximum value of 0.14. Members of the MDP with the Pv08 PD resistance allele showed a mean PD of 0.021 and a maximum value of 0.08. In genotypes with no known resistance allele, the mean PD was 0.206 and ranged up to 0.63 (Fig. 4b). The mutations on Pv03 and Pv08 probably reflected independent selection for reduced PD in their respective environments (highland vs lowland). No synergistic gene action was observed between these two loci (Fig. 4b).

Identification of a candidate gene for the Pv03 QTL
The most significant SNP from the MDP GWAS (Fig. 3a) was located in an intergenic region well within the QTL mapping interval. One of the genes directly flanking this intergenic region was of immediate interest owing to its unique expression pattern. The gene, PvPdh1, is transcribed solely in developing pods (Fig. S5; data from O'Rourke et al., 2014), indicating that its function is unique to this structure. This gene encodes a dirigent-like protein, a family believed to regulate PD in soybean (Funatsuki et al., 2014). Because of the close phylogenetic relationship and extensive microsynteny between P. vulgaris and G. max (McClean et al., 2010; Schmutz et al., 2014), further analyses were conducted to determine the degree of synteny and orthology between common bean and soybean QTLs related to PD. The LegumeIP 2.0 synteny tool (Li et al., 2016) indicated that strong synteny exists between the soybean region surrounding GmPdh1 and the common bean QTL on Pv03 (Table S2), in agreement with previous synteny analyses (McClean et al., 2010; Schmutz et al., 2014). An amino acid Blast of GmPDH1 (cv. Toyosume) against the P. vulgaris G19833 proteome (Schmutz et al., 2014) indicated that the most similar common bean protein is encoded by the PvPdh1 gene model, which was immediately adjacent to our most significant GWAS SNP. A neighbor-joining tree of common bean and soybean dirigent proteins indicates that GmPDH1 and the protein product of PvPdh1 cluster together (Fig. S6). Together, these results suggest that PvPdh1 is orthologous to GmPDH1.
Sequencing of PvPDH1
Sequencing of PvPdh1 in I×S revealed a nonsynonymous single-bp substitution at position 485 of the gene's coding sequence (Fig. S7a). This substitution leads to a threonine/asparagine polymorphism (T162N) in the protein product (Fig. S7b). The 11 RILs with recombination between the most significant markers from QTL mapping showed complete cosegregation between the threonine/asparagine polymorphism and the PD phenotype (Table S3). To investigate the functional importance of T162N, we evaluated the extent of its sequence conservation, surveyed literature related to this position in closely related dirigent proteins, and used Provean to predict the effect of this substitution at the position. Sequencing of PvPdh1 in several species of wild and domesticated Phaseolus from NPGS and UC Davis showed that the asparagine at this position was unique to the Middle American domesticated gene pool (Table S4). No polymorphism in the Andean gene pool was consistently associated with PD. In the Middle American gene pool, PD was significantly higher among genotypes with a threonine at position 162 than among those with an asparagine (t-test: P = 9.97 × 10⁻⁵, n = 47; Fig. S8). This threonine was strictly conserved in Andean domesticated common bean, Middle American and Andean wild common bean, and the closely related Phaseolus dumosus and Phaseolus lunatus (Table S4).
In addition, the threonine residue is present in 99 of the 100 most similar proteins in the NCBI database (Fig. S9a), indicating its functional importance. The protein that lacks a threonine at this position is found in Trifolium subterraneum, a legume that produces pods that mature underground. PD is not relevant for seed dispersal in this species and the gene may be undergoing pseudogenization. This threonine is also conserved in the 19 most similar proteins of Selaginella moellendorffii (Fig. S9b), a member of the first diverging group of lignin-containing plants, indicating that the residue has been conserved since before the lycophyte–euphyllophyte divergence c. 400 Ma (Soltis et al., 2002; Zimmer et al., 2007). No comparable protein could be found in the proteome of Physcomitrella patens, a nonlignified moss. Studies of closely related dirigent proteins indicate that this threonine is a component of one of the protein's active sites, and that its substitution eliminates protein function. An analysis with Provean (Choi & Chan, 2015) predicted that the T162N mutation would have a deleterious effect (score = −4.587, cutoff = −2.5).
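The correspondence between coding-sequence position 485 and residue 162 follows from codon arithmetic, sketched below. The codon sets come from the standard genetic code; the actual codons present in PvPdh1 are not stated in this section, so the enumeration lists only the threonine-to-asparagine changes that a single substitution at that codon position could produce. Function names are hypothetical.

```python
def codon_of_cds_position(cds_pos):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    codon = (cds_pos - 1) // 3 + 1
    within = (cds_pos - 1) % 3 + 1
    return codon, within


# Standard genetic code (DNA alphabet)
THR_CODONS = {"ACT", "ACC", "ACA", "ACG"}
ASN_CODONS = {"AAT", "AAC"}


def second_base_changes(from_codons, to_codons):
    """List (from, to) codon pairs that differ only at the second base."""
    return sorted((f, t) for f in from_codons for t in to_codons
                  if f[0] == t[0] and f[2] == t[2] and f[1] != t[1])


position = codon_of_cds_position(485)
candidates = second_base_changes(THR_CODONS, ASN_CODONS)
print(position)    # (162, 2)
print(candidates)  # [('ACC', 'AAC'), ('ACT', 'AAT')]
```

Position 485 is the second base of codon 162, and under the standard code a threonine codon can become an asparagine codon there only through a C-to-A change on the coding strand.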
Candidate genes for other QTLs identified by association mapping
Association mapping revealed several other dehiscence-related QTLs across the gene pools and races of common bean (Table 1). Our ADP association mapping identified significant Pv03 SNPs in an interval that is syntenic with a region controlling dehiscence in cowpea (Lo et al., 2018). NAC family and C2H2-type zinc finger transcription factors are found in this region (Table 1) and members of these families affect PD in soybean (Dong et al., 2014) and rapeseed (Tao et al., 2017), respectively. Orthologs of these genes may also affect dehiscence in cowpea (Lo et al., 2018). The QTL identified in the ADP is large enough to include PvPdh1, although the QTLs discovered in Middle American beans and cowpeas are nonoverlapping (Table 1).
Another major QTL for PD in Andean beans maps to Pv05, as described recently (Rau et al., 2019), and several genes in this region are candidates for future study. Rau et al. (2019) noted that an ortholog of MYB26 exists in the qPD5.1-Pv region of interest on Pv05, which may be responsible for variation in PD. Significant Pv05 SNPs from our association mapping completely envelop the qPD5.1-Pv interval, supporting this result. Our most significant Pv05 SNPs in the ADP are found just 22 kb from MYB46. MYB46 is involved in the same pathway as MYB26 and the soybean PD resistance gene SHAT1-5 (McCarthy et al., 2009; Dong et al., 2014). MYB46 also works redundantly with MYB83, a gene that may play a role in cowpea pod development (Suanum et al., 2016; Lo et al., 2018), making MYB46 another potential subject of future study.
Several genes of interest exist near the middle of the ADP's Pv08 GWAS peak. These include an MYB family transcription factor with similarity to A. thaliana MYB17, three WRKY family transcription factors, which are related to genes involved in sorghum dehiscence (Tang et al., 2013) and a polygalacturonase, a group known to influence PD in A. thaliana (Ogawa et al., 2009) (Table 1).
The Pv09 GWAS peak found in the ADP included a gene predicted to be cellulose synthase A7 (CESA7; Table 1). CESA7 may play a role in fiber development in cowpea (Suanum et al., 2016). Similarly, two polygalacturonases are found in this interval, and members of this family are known to affect seed dispersal in A. thaliana (Ogawa et al., 2009). These genes may regulate dehiscence by altering the breakdown of cell wall material in developing pods.
Associations among ecogeographic race, environment of origin, and PD
In landraces genotyped by Kwak & Gepts (2009), individuals belonging primarily to race Durango (genetically indistinguishable from race Jalisco) came from regions with significantly lower rainfall (709 vs 1215 mm yr⁻¹; Student's t-test, P = 2.3 × 10⁻⁵) and higher elevations (1879 vs 1312 m; Student's t-test, P = 0.002) than landraces primarily belonging to race Mesoamerica (Fig. S10). These results are in agreement with previous analyses (Singh et al., 1991).
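For readers unfamiliar with the comparison used here, the following is a minimal sketch of a two-sample Student's t statistic (equal-variance, pooled form) computed with the Python standard library. The rainfall values are invented for illustration and are not the landrace data; the function name `student_t` is likewise hypothetical. A P-value would additionally require the t distribution, for example via `scipy.stats.ttest_ind` with `equal_var=True`.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, df) for a two-sample Student's t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Invented annual rainfall values (mm/yr) at hypothetical collection sites.
durango_rain = [650, 700, 720, 760, 715]
mesoamerica_rain = [1100, 1250, 1300, 1180, 1245]

t_stat, dof = student_t(durango_rain, mesoamerica_rain)
# A negative t indicates lower mean rainfall in the first group.
print(f"t = {t_stat:.2f}, df = {dof}")
```

The pooled form assumes similar variances in the two groups; where that assumption is doubtful, Welch's unequal-variance test is the usual alternative.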
The PD-resistant allele of PvPdh1 on Pv03 is found exclusively in genotypes with ancestry from the ecogeographic race Durango (Fig. 4a; Table 1), which evolved in the northern, semiarid highlands of Mexico. The conditions in these areas cause pods to become dry and brittle, which exacerbates PD. The nonfunctional PvPdh1 allele (caused by the replacement of a threonine in position 162 by an asparagine) rose to very high frequency in this ecogeographic race. By contrast, race Mesoamerica is adapted to humid lowlands, where environmental conditions mask PD and reduce selection pressure against it. In this race, the loss-of-function PvPdh1 allele remains at a low frequency and PD is widespread (Figs 4a, 5).

Discussion
Associations with environmental conditions
Pod dehiscence in common bean is correlated with environmental parameters (Fig. S10). Common bean was domesticated twice, once in the Andes and once in the western region of Middle America (Gepts et al., 1986; Kwak et al., 2009; Bitocchi et al., 2013). From the Middle American center of origin, race Durango developed as cultivated common bean spread north into the semiarid highlands of northern Mexico and the southwestern United States. By contrast, race Mesoamerica formed as the crop spread south into the lowland tropics of southern Mexico and Central America (Fig. 5; Singh et al., 1991; Kwak et al., 2009). These variable environmental conditions may have led to strong differences in selection pressure among the races, including differences in selection against PD. The arid conditions of northern Mexico are highly conducive to PD, which could lead to major yield losses. In the tropical lowlands, environmental humidity masks susceptibility to PD, reducing selection pressure against it. The wild-type PvPdh1 allele may also be responsible for the ease of threshing that has been noted in race Mesoamerica. In humid environments, the wild-type PvPdh1 allele may facilitate separation of seeds from pod material, while PD in the field remains low. In northern Mexico, the semiarid climate facilitates threshing but increases PD in the field. Under these conditions, the PD-resistance allele may be advantageous. Therefore, variation in PvPdh1 allele frequency may be the result of selection for local adaptation based on this tradeoff (Fig. 5). Nevertheless, some varieties displayed low PD despite having no known PD-resistance allele, which indicates that there could be incomplete PD expressivity or additional PD-resistance loci that remain to be identified. Future work could identify detailed spatial patterns of PvPdh1 allele frequency across a broad panel of Mexican landraces of known geographic origins.
Alleles that prevent PD will be valuable in coming decades, which are predicted to be increasingly arid (Sherwood & Fu, 2014).
The markedly different fitness landscape of domestication
The strict conservation of the threonine at position 162 in PvPdh1 highlights its functional importance in wild populations and species over hundreds of millions of years. Yet, in a remarkable example of parallelism, independent loss-of-function mutations in this gene, arising at some time in the 10 000 yr since domestication, are found in certain domesticated populations of soybean and common bean, both species having been subjected to selection for reduced dehiscence. This highlights the strong differences in selection pressure between the wild and cultivated environments, and hence the different fitness landscapes they impose. Whereas the wild environment favors PD, the cultivated environment favors pod indehiscence: a single locus with a single amino acid substitution is sufficient to bridge these two fitness peaks. The threonine to asparagine substitution further provides an additional example of strongly convergent phenotypic and molecular evolution (Lenser & Theißen, 2013). Similar examples of parallel evolution in the common bean include the determinacy trait (fin or PvTFL1y; Repinski et al., 2012; Kwak et al., 2012), absence of pigmentation (P; McClean et al., 2018), and photoperiod adaptation (Weller et al., 2019). By contrast, the major QTL on Pv05 discovered in a biparental population by Rau et al. (2019) and confirmed here in a diverse panel of Andean beans is not closely orthologous to any PD-related locus described so far in other species. Future investigations may find that this locus has also been subject to parallel molecular evolution among taxa.
Our results serve as a note of caution when assessing the 'cost of domestication' on the basis of supposedly deleterious mutations identified by sequence variation alone. This cost refers to the load of harmful mutations that accumulates as a consequence of linkage, selection, and genetic drift during and after domestication. Several studies have documented this cost, for example, in horse (Schubert et al., 2014), sunflower, globe artichoke, and cardoon (Renaut & Rieseberg, 2015), and rice (Liu et al., 2017). Conversely, our results indicate that nonsynonymous mutations may also be responsible for advantageous changes that have occurred during crop domestication and dispersal beyond the species' native range. Thus, these bioinformatic studies should be complemented by studies measuring fitness under specific environments reflecting both the ancestral, wild and the derived, domesticated environments.
Further research is needed to identify the biochemical and biophysical aspects responsible for differences in PD in domesticated dry beans. Notably, our results could shed light on the fundamental process of lignin synthesis and fate under different environmental conditions. Dirigent-like genes, including PvPdh1, encode nonenzymatic proteins that guide the dimerization of lignin and lignan monomers (Davin et al., 1997). The role of these proteins in lignin synthesis has been debated, with suggestions that polymerization is guided (Davin & Lewis, 2005; Hosmani et al., 2013) or unguided (Ralph et al., 1999, 2008). Varieties of common bean with mutations in Pdh1 could be used to elucidate the role of this protein family in lignin synthesis generally.
Redundancies in genetic control and maintenance of atavistic traits
Crosses between races have tremendous potential for crop improvement (e.g. between races Durango and Mesoamerica: Singh et al., 1993), but can also result in problematic gene complementation when the parental lines carry different PD resistance genes. Because several genes influence PD redundantly, progenies descended from such crosses could show complementation that allows PD to be expressed. In the absence of selection against PD, in a humid environment, for example, PD could reappear in breeding programs in spite of its deleterious effects in a domesticated environment. Complementation and environmental dependency of PD may lead to the maintenance of atavistic traits in a domesticated gene pool in the absence of sympatric wild populations, and may be responsible for the high degrees of dehiscence seen in some cultivars of common bean.
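The complementation scenario can be illustrated with a deliberately simplified two-locus model. The sketch below assumes two unlinked loci, A and B, at which the loss-of-function (resistance) alleles a and b are fully recessive, and assumes that a plant shatters only if it carries a functional allele at both loci. These assumptions, the locus names, and the function names are hypothetical and chosen for illustration; they are not the inheritance model estimated in this study.

```python
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """Equally likely gametes of a two-locus genotype such as ('Aa', 'Bb')."""
    return [a + b for a, b in product(*genotype)]


def is_dehiscent(offspring):
    """Assumed model: shattering needs a functional (uppercase) allele at both loci."""
    return "A" in offspring and "B" in offspring


def dehiscent_fraction(parent1, parent2):
    """Expected fraction of dehiscent progeny from a cross of two genotypes."""
    progeny = [g1 + g2 for g1, g2 in product(gametes(parent1), gametes(parent2))]
    return Fraction(sum(is_dehiscent(p) for p in progeny), len(progeny))


resistant_1 = ("aa", "BB")  # indehiscent through locus A
resistant_2 = ("AA", "bb")  # indehiscent through locus B
f1 = ("Aa", "Bb")           # progeny of resistant_1 x resistant_2

print(dehiscent_fraction(resistant_1, resistant_2))  # 1
print(dehiscent_fraction(f1, f1))                    # 9/16
```

Under these assumptions, two indehiscent parents produce a fully dehiscent F1 and an F2 in which 9/16 of plants shatter, which is why the trait could resurface in a breeding program when selection against it is relaxed.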
In conclusion, our results depict crop domestication as a complex phenomenon, going beyond a single process that took place in a single, geographically and temporally circumscribed area. Domestication embraced the genetic complexity of higher plants wherein the same phenotype can be based on contrasting molecular foundations and interactions, in addition to spatially and temporally variable environments. This stands in contrast to many earlier studies, which have been based on the assumption that domestication occurred in a very specific geographic and temporal range within any given species (e.g. Matsuoka et al., 2002; Kwak et al., 2009; Huang et al., 2012; Bitocchi et al., 2013). It also highlights the importance of studying the genetic basis of domestication traits in genetically diverse populations. Our results depict domestication as including adaptations to a series of radically different environments, in which long-standing selection regimes in the wild can be reversed and replaced by new selective paradigms and alternate monomorphisms under domestication. Our results further highlight the fact that even core domestication traits, such as seed retention, can be found in a variable state in well-domesticated species. Crop domestication was a complex process of adaptation to a range of new environments, with multiple genetic paths to increased fitness in each environment, and without a single fixed solution for overcoming any given obstacle. This genetic complexity brings the investigation of plant domestication beyond the realm of an academic exercise, and has serious implications for plant breeding and future food security.
Acknowledgements
S. Beebe (CIAT, Cali, Colombia) provided seeds of the I×S population. Seeds of the ADP and MDP were provided by R. Lee and P. McClean (North Dakota State University). Undergraduates Mayara Rocha, Poliana Silva Rezende, Guilherme Coelho Portilho, Natalie Hamada, Emily Yang, Ariel Herrera, Jose Pimentel, Matthew Bustamante, Emily White, Julia Gonzales, and Paige Augello contributed to DNA extractions, pod phenotyping, and other laboratory protocols. Paola Hurtado, Andrea Ariani, and other members of the Gepts group provided ideas for data analysis. Funding for TAP was provided through a Clif Bar Family Foundation Seed Matters fellowship and Lundberg Family Farms research support. We would like to thank two anonymous reviewers for excellent, in-depth reviews. The authors declare no competing interests.
Author contributions
TAP prepared the manuscript and conducted laboratory phenotyping, QTL mapping, GWAS, microscopy, and sequencing. JCBMT genotyped the I×S population, gathered field phenotypes, co-conducted QTL mapping, and provided guidance for other procedures. AP assisted with field and glasshouse trials. JJ led the sectioning and microscopy studies. PG conceived the initial project and provided guidance. All authors edited the manuscript.