Assemblies and analysis of the A. indica and M. azedarach genomes
We used HiFiasm (v0.19.8) for the hybrid assembly of A. indica (neem) and M. azedarach (chinaberry) using HiFi, ONT and Hi-C data (Supplementary Tables 1–3), obtaining 13 haplotype-resolved T2T chromosomes for each species. The last chromosome proved to be an rDNA-bearing one, which had been neglected in previous assemblies4,5,8. Using droplet digital PCR, the copy numbers of rDNA in neem and chinaberry were estimated to be 551 and 159, respectively (Supplementary Table 4). We manually extended the rDNA-bearing chromosome to the telomeres, resulting in a complete T2T assembly for each species. The per-base consensus quality values (QVs) exceeded 60 (Table 1), and BUSCO evaluation suggested that the genome completeness was greater than 99.5%, indicating significant improvements in completeness and continuity (Supplementary Table 5). We compared our T2T assemblies with the currently available neem and chinaberry genomes (Table 1), addressing the respective gaps within these genomes.
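To put the QV threshold in perspective, the following minimal sketch converts between a quality value and a per-base error probability. It assumes the QVs in Table 1 follow the usual Phred convention, QV = −10 log10(p); the text does not define the scale, and the function names are illustrative only.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability for a Phred-scaled quality value (assumed scale)."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(p: float) -> float:
    """Phred-scaled quality value for a per-base error probability."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Under the Phred assumption, QV 60 is roughly one error per million bases.
    print(f"QV 60 -> error rate {qv_to_error_rate(60):.1e}")
```

Under that assumption, a QV above 60 corresponds to fewer than about one consensus error per million bases.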
In the genomes of neem and chinaberry, repetitive sequences accounted for 32.14% and 32.10%, respectively, with 28,215 and 28,889 genes annotated (Supplementary Tables 6–9). Phylogenetic analysis using 138 single-copy 1:1:1 orthologous genes across 19 species supported two clades of Sapindales. Clade one included Meliaceae, Rutaceae, and Anacardiaceae, whereas clade two included Sapindaceae. Within Meliaceae, a chromosomal comparison revealed high-level synteny between neem and chinaberry, whereas both Toona sinensis and Swietenia macrophylla underwent whole-genome duplications (Fig. 2a–d).

a–d Dot plots of the neem genome compared with four species, including M. azedarach (a), S. macrophylla (b), T. sinensis (c), and C. sinensis (d). e Enlarged dot plots of chinaberry compared with four species towards the long arm end of Chr12. Lineage-specific inversion on the neem chromosome is obvious. f Distribution of Ks for paired syntenic genes between three meliaceous species. The tetraploid T. sinensis genome was arbitrarily split into two subgenomes for comparison.
Karyotypic evolution underlines allopatric speciation of A. indica
While the evergreen neem trees are widely observed from Africa to America, they are endemic to Assam of the Indian subcontinent, and their global distribution started approximately 100 years ago through private or commercial initiatives9. On the other hand, chinaberry is endemic to the Sichuan Province of Southwest China. The distance between the two endemic sites is approximately 700 miles, passing through the Himalaya–Hengduan Mountains. The vertical distribution of neem can reach 1500 m, while most mountains of the Himalaya–Hengduan region are higher than 4000 m, preventing further gene flow between the two sites. This implied that the distinct morphological traits of the two species resulted from allopatric speciation (Supplementary Fig. 1).
We performed a comparative analysis of the two genomes to evaluate the genetic basis of species divergence. Corresponding chromosomes between neem and chinaberry are highly conserved and syntenic, with only one terminal inversion of 1.08 Mb on chr 12 observed. Detailed comparison with outgroup species of Sapindales confirmed that this inversion occurred only in the neem lineage (Fig. 2e), suggesting that chinaberry represents the ancestral chromosomal block while neem is derivative.
Chromosomal inversion has long been viewed as a driving force of speciation, because it disrupts recombination in heterozygotes by reducing crossing-over within inverted regions10,11,12,13. Gene flow is then restricted, eventually leading to speciation. The neem-chinaberry species pair, isolated by the uplift of the Himalaya–Hengduan Mountains, therefore appears to represent an ideal model of allopatric speciation.
We calculated the synonymous divergence (Ks) of orthologous syntenic gene pairs between neem and chinaberry (n = 18,165), which peaked at approximately 0.09 (Fig. 2f). With a neutral mutation rate of 2.5 × 10−9 mutations per bp per year14, and with divergence accumulating along both lineages (T = Ks/2r), this implies that allopatric speciation of neem from chinaberry started approximately 18 MYA. To date, the uplift history of the Tibetan Plateau has been extensively analysed using geological data, with estimates ranging from 15 to 55 MYA15,16. The allopatric speciation of neem from chinaberry provides a rare biological perspective on the geological evolution of the Tibetan Plateau, and agrees well with the cessation time of rapid Pacific trench migration (15–20 MYA)17.
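The dating arithmetic above can be reproduced with a minimal sketch. It assumes the standard relation T = Ks / (2r), in which substitutions accumulate independently along both lineages; this relation is inferred from the reported numbers rather than stated in the text, and the function name is illustrative.

```python
def divergence_time_years(ks_peak: float, rate_per_site_per_year: float) -> float:
    """Divergence time from a Ks peak, assuming T = Ks / (2 * r)."""
    if rate_per_site_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    # The factor of 2 accounts for divergence accumulating on both lineages.
    return ks_peak / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    ks_peak = 0.09   # Ks peak for neem-chinaberry syntenic pairs (Fig. 2f)
    rate = 2.5e-9    # neutral mutation rate, mutations per bp per year
    t_years = divergence_time_years(ks_peak, rate)
    print(f"Estimated divergence: {t_years / 1e6:.1f} MYA")
```

With the values from the text, this yields an estimate of approximately 18 MYA. The estimate scales inversely with the assumed mutation rate, so it inherits any uncertainty in that rate.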
Expansion of genes involved in the biosynthesis of sulphur-containing volatiles in Meliaceae
Meliaceae trees, notably species of Toona and Azadirachta, have sulphur-containing volatiles with garlic-like smells18. Tender leaves of T. sinensis, which contain abundant diallyl disulphide derived from γ-glutamyl-S-allyl-L-cysteine, have been eaten in China as a woody vegetable for thousands of years. The proposed biosynthetic pathway of volatile organosulphur compounds includes (1) the conjugation of cysteine and glutamic acid by γ-glutamylcysteine synthetase, (2) the addition of glycine to the C-terminal site of γ-glutamylcysteine by glutathione synthetase, (3) the S-conjugation of glutathione, (4) the removal of the glycyl group by phytochelatin synthase, (5) the modification of the S-alk(en)yl group, and (6) the removal of the γ-glutamyl group by γ-glutamyl transpeptidase (GGT)19 (Supplementary Fig. 2).
The accumulation of odour compounds implies a possible expansion of genes involved in sulphur-containing volatile biosynthesis in Meliaceae. As shown in Fig. 3, C. sinensis has only one GGT copy on chr1, whereas a total of 16 and 14 copies were observed on the orthologous neem and chinaberry chromosomes, respectively (Fig. 3a, b). A significant expansion of GGTs was also detected in the T. sinensis genome, but the tetraploid S. macrophylla genome had only one copy, suggesting that this expansion occurred during Meliaceae species radiation. In comparison, the garlic genome is characterised by extensive duplication of downstream alliinase genes, which catalyse the hydrolysis of S-alk(en)yl-l-cysteine sulfoxides20. High expression of tandemly duplicated GGTs, particularly in stems and leaves, was observed, which presumably contributed to the distinctive odours of these plants (Fig. 3c). We quantified the content of S-allylcysteine in the leaves of these species. The contents ranked T. sinensis > neem > chinaberry > C. sinensis, consistent with the GGT copy numbers in these species (Fig. 3b and Supplementary Fig. 2).

a Phylogeny of five Sapindales species with pairwise chromosome alignments. Each horizontal bar represents one chromosome, and the chromosome IDs of the chinaberry are labelled at the top. Chromosomes that experienced GGT duplication are highlighted in colour. b Orthologous relationships of duplicated GGTs across five species. The synteny of genes around GGTs is illustrated by grey bands connecting orthologues across species. c Heatmaps of GGT expression across different tissues. The expression of each gene was measured as a log (1 + TPM). d Orthologous relationships of duplicated ATs across five species. e Heatmaps of AT expression across different tissues.
Expansion of acyltransferases for limonoid biosynthesis in Meliaceae
Meliaceous plants contain large amounts of limonoids with unique anti-insect activities21,22,23. Despite recent progress in protolimonoid biosynthesis, the elucidation of the complete biosynthetic route of diverse limonoids from azadirone to toosendanin or azadirachtin remains a major challenge. Hydroxylation and acetylation, which are catalysed by CYP450s and ATs, respectively, are involved in protolimonoid modifications8,24 and limonoid diversity. For this reason, we focused on acyltransferase (AT) family genes in neem and chinaberry.
Systematic annotation and curation revealed 11 ATs in C. sinensis and various copies in Meliaceae plants, from 15 in T. sinensis to 39 in chinaberry. Tandem clusters of 9 and 15 ATs on chr3 were identified in neem and chinaberry, respectively. Meanwhile, 8, 6, and 0 copies were annotated in syntenic regions of outgroup species T. sinensis, S. macrophylla and C. sinensis, respectively (Fig. 3d), suggesting that tandem expansion postdates Meliaceae species radiation but predates the divergence of neem and chinaberry.
We performed transcriptome sequencing of different neem and chinaberry tissues in triplicate. Weighted gene co-expression network analysis (WGCNA) revealed a gene module containing 1352 genes that correlated strongly with fruit in chinaberry. GO enrichment analysis of these genes revealed that, among the top five enriched GO terms, two were significantly associated with acetylation functions: acyltransferase activity and O-acyltransferase activity (Supplementary Fig. 3). We subsequently extracted all the genes from the two GO terms and aligned them with all the chinaberry proteins; the tandem cluster of ATs on chr3 stood out. We further analysed the expression of these ATs. While the syntenic relationships of these ATs were obvious, their expression patterns diverged extensively, especially in fruits (Fig. 3e). We postulated that different expression levels of these ATs in fruits contributed to the acetylation diversity of limonoids between neem and chinaberry.
Characterisation of the ATs for limonoid C12-OH acylation
Acylation at C12 is a divergent feature between the limonoids of neem and chinaberry, which present different activities against insects and cancer cell lines21,22,23,25. We therefore speculated that the AT gene cluster identified by WGCNA of chinaberry might be involved. We selected six chinaberry ATs (MaAT-1, 3, 6, 7, 9, and 15) and four neem copies (AiAT-1, 2, 3, and 4) from the tandem clusters for in vitro functional characterisation using Escherichia coli, since these ATs are highly expressed in fruit, as indicated by WGCNA (Supplementary Fig. 3). First, all the candidate MaATs were expressed in E. coli. Two compounds, 1 and 3, were then prepared by chemical hydrolysis, and their structures were verified by nuclear magnetic resonance (NMR) spectroscopy (Supplementary Figs. 4 and 5). The recombinant proteins were assessed with compound 1 as the acceptor and acetyl-CoA as the acyl donor. Liquid chromatography‒mass spectrometry (LC-MS) analysis revealed that the MaAT8824 and MaAT1704 proteins formed a product with the same retention time (RT = 5.48 min) and mass spectrum ([M+Na]+ = 597.2215) as authentic toosendanin 2 (Fig. 4a–d and Supplementary Fig. 6). When compound 3 was used as the substrate, isotoosendanin 4 was detected (Fig. 4a, e–g and Supplementary Fig. 7). Notably, the additional peaks at 6.5 min in Fig. 4b and at 4.3 min in Fig. 4e correspond to isomers of compounds 2 and 4, respectively, owing to the unstable hemiacetal hydroxyl group in their structures26. Therefore, MaAT-6 (MaAT8824) and MaAT-7 (MaAT1704) can transfer acetyl groups to the C12-hydroxyl moieties of compounds 1 and 3 and produce toosendanin 2 and isotoosendanin 4, respectively. The remaining four chinaberry ATs (MaAT-1, 3, 9, and 15) failed to acylate compounds 1 and 3 (Supplementary Fig. 8). We also characterised the function of MaAT8824 and MaAT1704 in Nicotiana benthamiana. LC-MS analysis of N. benthamiana leaf extracts confirmed the acetylation of compounds 1 and 3 at C12-OH to give compounds 2 and 4, respectively, which agreed well with the results from E. coli (Supplementary Fig. 9).

a Reactions for compounds 1 to 2 and compounds 3 to 4 catalysed by MaAT8824 and MaAT1704. b LC-MS chromatograms of the standard, the control samples and the reaction mixtures for MaAT8824 and MaAT1704 using acetyl-CoA as the acetyl donor and compound 1 as the substrate. c, d LC-MS and LC-MS/MS analyses of reaction product 2 for MaAT8824 and MaAT1704 in positive ion mode; e LC-MS chromatograms of the standards, the control samples and the reaction mixtures for MaAT8824 and MaAT1704 using acetyl-CoA as the acetyl donor and compound 3 as the substrate. f, g LC-MS and LC-MS/MS analyses of reaction product 4 for MaAT8824 and MaAT1704 in positive ion mode. h–k The contents of the four compounds in the four tissues of the two plants. The data are presented as the means ± SDs (n = 3 biological replicates). Source data are provided as a Source Data file.
We quantified the contents of compounds 1 to 4 in different tissues. Compounds 1 and 3 had the highest contents in the fruits of both plants (Fig. 4h, j), whereas compounds 2 and 4 were detected only in chinaberry, with the highest contents in fruits (Fig. 4i, k). This finding agreed well with the transcriptomic and enzymatic results. These observations suggested that MaAT8824 and MaAT1704 could transfer an acetyl group to the C12-OH of limonoids to finalise toosendanin/isotoosendanin biosynthesis in chinaberry. In contrast, the syntenic neem genes have no acetylation activity, resulting in limonoid diversity between the two allopatric species.
The absence of compounds 2 and 4 in neem may result from loss of function of the syntenic neem genes or from neo-functionalization of the chinaberry copies27. To clarify the evolutionary scenario, genes from neem and outgroup species were assayed for acetylation activity. As expected, none of the neem copies could transfer acetyl groups to the C12-OH of compounds 1 or 3 (Fig. 4b, e and Supplementary Fig. 8). Four ATs from S. macrophylla, T. sinensis, and C. sinensis were selected according to their collinear relationships (Fig. 3d and Supplementary Fig. 10), and MaAT8824 was used as a positive control. These genes were expressed in E. coli, and the recombinant proteins were assessed with compound 1 as the acceptor and acetyl-CoA as the acyl donor. None of these ATs, apart from the MaAT8824 control, showed acetylation activity towards compound 1 (Supplementary Fig. 11). Accordingly, we propose that the C12-OH acylation activities of MaAT8824/MaAT1704 represent neo-functionalization and emerged after the divergence of chinaberry and neem.
Acylation activity of MaAT8824 towards the C3-OH group of limonoid 6
We then tested the acylation activity of MaAT8824/MaAT1704 towards other positions, such as C3-OH, by preparing compounds 5 and 6 (Supplementary Figs. 12 and 13). When compound 5 was used as the substrate, a product that, to the best of our knowledge, had not been observed previously was detected for MaAT8824 and MaAT1704. The product was confirmed to be compound 6 by comparing its retention time (RT = 3.6 min) and mass spectrum ([M+Na]+ = 555.2204) with those of the authentic product 6 (Fig. 5a–d and Supplementary Fig. 14). This result indicated that MaAT8824 and MaAT1704 could also catalyse acetylation at the C-12 hydroxyl group of compound 5 to form compound 6. Interestingly, compound 6 can be further converted into toosendanin 2 by C3-OH acylation (Fig. 5a, e and Supplementary Fig. 15).

a Reactions from compounds 5 to 6 and compounds 6 to 2 catalysed by MaAT8824 and MaAT1704. b LC-MS chromatograms of the standard samples, the control samples and the reaction mixtures of MaAT8824 and MaAT1704 with acetyl-CoA as the acetyl donor and compound 5 as the substrate. c, d LC-MS and LC-MS/MS analyses of the reaction product 6 catalysed by MaAT8824 and MaAT1704 in positive ion mode. e LC-MS chromatograms of the reaction mixtures for MaAT8824 and MaAT1704 using acetyl-CoA as the acetyl donor and compound 6 as the substrate. f, g The contents of compounds 5 and 6 in different neem and chinaberry tissues. The data are presented as the means ± SDs (n = 3 biological replicates). Source data are provided as a Source Data file.
The acylation activity of MaAT8824/MaAT1704 against compound 14, which was derived by deacetylation of compound 4 at both C-3 and C-12 (Supplementary Fig. 16), was further tested. While a peak of suspected compound 15 was detected using MaAT8824 (Supplementary Fig. 17), attempts to purify this compound failed because of low catalytic efficiency (Supplementary Fig. 18), and the catalytic capacity was even lower when using MaAT1704 (Supplementary Fig. 17). As expected, the remaining ATs within the chr3 cluster had no acetylation activity towards C3-OH.
Taken together, these results implied that MaAT8824 and MaAT1704 could catalyse the acetylation of both the C-12 and C-3 hydroxyl groups of limonoids, with a preference for the C-12 hydroxyl group. These activities were also confirmed in vivo in N. benthamiana (Supplementary Fig. 19).
Notably, high contents of compound 5 were detected in both chinaberry and neem, especially in fruits, yet compound 6 was detected only in neem (Fig. 5f, g). This may be attributed to the higher catalytic activity of MaAT8824 towards compound 6 than towards compound 5, resulting in rapid consumption of compound 6 in chinaberry. On the other hand, since the syntenic neem ATs had no acylation activity, the accumulation of compound 6 in neem might imply the existence of unidentified AT gene(s) elsewhere in the genome with C12-OH acylation activity towards compound 5, which warrants further analysis.
Catalytic mechanism of MaAT8824
Since MaAT8824 possesses higher catalytic activity than MaAT1704 (Supplementary Fig. 20), we attempted to obtain a crystal structure of the MaAT8824 complex, but these attempts failed. We therefore predicted its structure using AlphaFold2 (pLDDT = 90). To elucidate the catalytic mechanism of MaAT8824, we performed molecular docking into the catalytic pocket with compound 1 as the acceptor and acetyl-CoA as the donor. Ac-CoA and compound 1 were docked into the structure according to the donor and substrate positions in SbHCT28 and AmAT7-329 (Fig. 6a and Supplementary Fig. 21). Compounds 3, 5 and 6 were docked in the same way (Fig. 6b–d). To verify the reliability of our analysis, we selected key residues within 5 Å that interact with the substrate or donor and mutated them according to the docking results with compound 1. Mutation of residues I39, H164, E360, and W377 to alanine led to enzyme inactivation, confirming the reliability of the molecular docking data (Fig. 6e).

a–d Docking conformations of compounds 1, 3, 5, and 6 and Ac-CoA into MaAT8824. e In vitro assay of the MaAT8824 mutants using compound 1 as the substrate. f The distance of C1-O1/C1-O3 during the MD simulations for compounds 1, 3, 5, and 6 (C1-O1 for compounds 1, 3, and 5, and C1-O3 for compound 6). Box plots show median values (solid horizontal lines), 25th and 75th percentile values (box), and 90th percentile values (whiskers). g The distance (C1-O1/C1-O3) frequency during the MD simulations of compounds 1, 3, 5, and 6. h The average rates of reaction of three MaAT8824 mutants for compounds 1, 3, 5, and 6. The data are presented as the means ± SDs (n = 3 biological replicates). Source data are provided as a Source Data file.
Molecular dynamics (MD) simulations were further conducted with three replicates to understand the acetylation mechanism of this enzyme at C12-OH and C3-OH. In acetylation by BAHD-ATs, the nucleophilic attack of the substrate oxyanion on the carbonyl carbon of the acyl donor is critical for catalysis28. Therefore, the distances between the acyl C and the substrates, C1-O1/C1-O3 (C1-O1 for compounds 1, 3, and 5, and C1-O3 for compound 6), were calculated. The average distances between the acyl C and the oxyanions in compounds 1 and 3 are 5.34 Å and 5.6 Å, respectively, which are shorter than those in compounds 5 and 6 (> 6.1 Å) (Fig. 6f). The distance frequencies between the acyl C and the oxyanion in the four substrates are shown in Fig. 6g. For compounds 1 and 3, this distance was predominantly within 5 Å, but it was mostly longer than 6 Å for compounds 5 and 6. These simulations indicated that the relatively shorter distance between the acyl C and the oxyanion of compound 1 and the steady binding of Ac-CoA are responsible for its preferential catalysis towards C12-OH.
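The distance-frequency comparison can be illustrated with a minimal sketch that computes the fraction of trajectory frames in which a donor-acceptor distance falls within a cutoff. The function name and the toy distance trace are hypothetical and are not taken from the simulations reported here; the 5 Å cutoff mirrors the threshold mentioned in the text.

```python
from typing import Sequence


def fraction_within(distances: Sequence[float], cutoff: float) -> float:
    """Fraction of frames whose distance (in Å) is at or below the cutoff."""
    if len(distances) == 0:
        raise ValueError("need at least one frame")
    return sum(1 for d in distances if d <= cutoff) / len(distances)


if __name__ == "__main__":
    # Hypothetical per-frame C1-O1 distances in Å, for illustration only.
    toy_trace = [4.6, 4.9, 5.2, 4.8, 6.3]
    print(fraction_within(toy_trace, cutoff=5.0))  # 3 of 5 frames -> 0.6
```

A substrate that spends a larger fraction of frames within the cutoff is, on this simple measure, more often positioned for nucleophilic attack on the acyl carbon.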
The substrate specificity of MaAT8824 was further validated by kinetic analysis. The apparent affinities (Km) of MaAT8824 for acceptors 1, 3, 5, and 6 were calculated as 64.06 µM, 205.1 µM, 390.7 µM, and 232.5 µM, respectively. Moreover, the catalytic efficiencies (kcat/Km) of MaAT8824 with respect to compounds 1, 3, 5, and 6 were 318.4 s−1 M−1, 180.2 s−1 M−1, 35.1 s−1 M−1, and 106.8 s−1 M−1, respectively. These results implied that the catalytic efficiency of MaAT8824 towards compound 1 was much greater than that towards compounds 5 and 6 (Table 2 and Supplementary Fig. 22). A comparison of compounds 1, 6 and 5 implied that MaAT8824 presented higher catalytic activity towards substrates with acetyl groups than towards those without acetyl groups, regardless of the presence of C3-OH or C12-OH. Finally, the preference for substrates with C12-OH rather than C3-OH was evident from a comparison of compounds 1 and 6. Taken together, the substrate preferences of MaAT8824 were in the order of compounds 1 > 3 > 6 > 5, which was consistent with the MD analysis. Kinetic analysis of MaAT1704 was also conducted, and the data indicated that the substrate preference of MaAT1704 was in the order of compounds 1 > 3 > 5 > 6 (Supplementary Fig. 22). These data revealed that both MaAT8824 and MaAT1704 preferentially catalyse the acetylation of the C12-OH of compounds 1 and 3, and that MaAT8824 presented notably higher catalytic activity.
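The preference order for MaAT8824 follows directly from the reported kinetic constants, as the minimal sketch below shows. The Km and kcat/Km values are those given in the text; the back-calculated kcat is a derived quantity based on kcat = (kcat/Km) × Km and is not a figure reported here, and the function and variable names are illustrative.

```python
# Reported constants for MaAT8824: compound -> (Km in µM, kcat/Km in s^-1 M^-1)
MAAT8824_KINETICS = {
    1: (64.06, 318.4),
    3: (205.1, 180.2),
    5: (390.7, 35.1),
    6: (232.5, 106.8),
}


def kcat_per_second(km_um: float, efficiency: float) -> float:
    """Back-calculate kcat (s^-1) from Km (µM) and kcat/Km (s^-1 M^-1)."""
    return efficiency * km_um * 1e-6  # convert µM to M before multiplying


def rank_by_efficiency(table: dict) -> list:
    """Compounds ordered from highest to lowest catalytic efficiency."""
    return sorted(table, key=lambda compound: table[compound][1], reverse=True)


if __name__ == "__main__":
    print("Preference order:", rank_by_efficiency(MAAT8824_KINETICS))
    for compound, (km, eff) in MAAT8824_KINETICS.items():
        print(f"compound {compound}: derived kcat = {kcat_per_second(km, eff):.4f} s^-1")
```

Ranking by kcat/Km rather than by Km alone matters here: compound 6 has a higher Km than compound 3 yet still ranks above compound 5 because efficiency combines binding and turnover.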
We further conducted site-specific mutagenesis to elucidate the structural basis of substrate specificity. To this end, dozens of key amino acids located in the catalytic pocket were mutated, and the substrate specificity of these mutants was assessed by calculating their average rates of reaction towards compounds 1, 3, 5, and 6 (Fig. 6h). Among these mutants, C32A showed decreased reaction rates towards compounds 1, 3, and 6 and an increased rate towards compound 5, indicating that the C32 residue plays a critical role in substrate specificity. Similarly, compared with the wild type, H33A decreased reaction rates by 90% towards compounds 1, 3 and 6, but had no obvious effect on the rate towards compound 5. Moreover, W377A could not acetylate compounds 3 and 5 but retained residual activity towards compounds 1 and 6, which was consistent with the MD simulation results (Fig. 6h).
In addition to compounds 1, 3, 5, and 6, other compounds, including compounds 7, 9, and 13, were also tested as possible acetyl acceptors of MaAT8824. However, no products were observed when these chemicals were used as substrates (Supplementary Fig. 23). Different acyl donors, including malonyl-CoA, isobutyryl-CoA, succinyl-CoA, and benzoyl-CoA, were tested using compound 1 as the acceptor. MaAT8824 could not use any of these donors and recognised only acetyl-CoA (Supplementary Fig. 24). Taken together, these results implied that MaAT8824 has relatively strict substrate specificity towards donors and slightly broader substrate specificity towards acceptors.
A critical region drives the gain-of-function of AiAT0635
The neo-functionalization of MaAT8824 and MaAT1704 for C12-OH acetylation of limonoids, together with the existence of a non-functional ancestral copy in neem (AiAT0635), enabled us to identify the critical regions responsible for the emergence of this unique function. We aligned the three protein sequences to pinpoint candidate residues (Supplementary Fig. 25). We observed seven divergent regions along the whole alignment, among which a critical 7-aa region towards the N-terminus (31–37) diverged significantly between MaAT8824 and AiAT0635 (Fig. 7a and Supplementary Fig. 26). We generated seven mutants by replacing each of the seven divergent regions of AiAT0635 with the corresponding sequence of MaAT8824 (Supplementary Fig. 26a), after which the catalytic activity of these mutants was assayed in vitro using compound 1 as a substrate. Interestingly, a product peak appeared for AiAT0635-8824-F1, in which segment 31–37 (ISAGAVP) of AiAT0635 was replaced with segment 31–37 (LCHRSSG) of MaAT8824. The distinct product was proven to be toosendanin 2 (Supplementary Fig. 26b). Relative to MaAT8824, the conversion rate of this mutant recovered to approximately 7.3%, which was even higher than that of wild-type MaAT1704 (Fig. 7b). In comparison, no product was detected for the remaining six mutants (Supplementary Fig. 26). The acetylation activity of AiAT0635-8824-F1 was also confirmed in N. benthamiana (Supplementary Fig. 27). Conversely, we replaced the segment of MaAT8824 with those of MaAT1704 (ISASAAP) and AiAT0635 (ISAGAVP). Compared with that of MaAT8824, the relative conversion rates decreased by 15% and 60% for MaAT8824-mut-1704 and MaAT8824-mut-0635, respectively (Fig. 7b). The acetylation activity of MaAT1704 was completely abolished when its segment (ISASAAP) was replaced with the AiAT0635 version (Fig. 7b).
Taken together, these results confirmed the critical role of the segment (LCHRSSG) for optimal acetylation activity in MaAT8824, whereas the segment (ISAGAVP) of AiAT0635 confers no acetylation activity. The critical region was further narrowed down to segment 32–37 (SAGAVP) by substituting individual amino acids in different combinations (Fig. 7c–e and Supplementary Fig. 26c). These results suggested that the acylation function of the neem copy could be resurrected by substitution of six amino acids within the critical region.
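The segment-swap design can be sketched in silico as a simple string operation on 1-based, inclusive residue coordinates. The helper name and the placeholder sequence below are hypothetical; the real AiAT0635 sequence is not reproduced here, and only the two 7-aa segments (ISAGAVP and LCHRSSG) come from the text.

```python
def swap_segment(sequence: str, start: int, end: int, replacement: str) -> str:
    """Replace residues start..end (1-based, inclusive) with an equal-length segment."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("segment coordinates fall outside the sequence")
    if len(replacement) != end - start + 1:
        raise ValueError("replacement must match the segment length")
    return sequence[: start - 1] + replacement + sequence[end:]


if __name__ == "__main__":
    # Placeholder backbone (hypothetical): 'X' stands in for the unknown flanking residues.
    toy_aiat0635 = "X" * 30 + "ISAGAVP" + "X" * 10
    chimera = swap_segment(toy_aiat0635, 31, 37, "LCHRSSG")
    print(chimera[30:37])  # residues 31-37 of the chimera
```

Requiring an equal-length replacement keeps downstream residue numbering unchanged, so positions such as W377 refer to the same residue in the wild type and the chimera.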

a N-terminal sequence alignment of MaAT8824 and AiAT0635. b Relative conversion rates of different mutants when compound 1 was used as the substrate. c LC-MS chromatograms of the standard and the reaction mixtures for AiAT0635 and AiAT0635-mut using compound 1 as the substrate and acetyl-CoA as the acetyl donor. d, e LC-MS and LC-MS/MS analyses of products detected in reaction mixtures for AiAT0635-mut in positive ion mode. f, g Reactive snapshots of conformations for compound 1 in AiAT0635 and AiAT0635-mut from MD simulations. h The distance (O2-W377) between acyl O and W377 during the MD simulations using compound 1. Box plots show median values (solid horizontal lines), 25th and 75th percentile values (box), and 90th percentile values (whiskers). i The distance (C1-O1/C1-O3) frequency in AiAT0635 and AiAT0635-mut. The data are presented as the means ± SDs (n = 3 biological replicates). Source data are provided as a Source Data file.
We performed MD simulations of AiAT0635 and AiAT0635-mut with compound 1 as the acceptor and acetyl-CoA as the donor to elucidate the possible catalytic mechanism of AiAT0635-mut. Compound 1 was docked into AiAT0635 and AiAT0635-mut based on two key distances: that between the O atom of the acetyl donor and the H atom of the catalytic tryptophan (O2-W377), and that between the C atom of acetyl-CoA and the O atom of the acceptor substrate (C1-O1). We found that the relative position of compound 1 was overturned in AiAT0635-mut because the mutated amino acids have larger side chains (such as C32), leading to a smaller cavity and a changed position of the substrate (Fig. 7f, g and Supplementary Fig. 28a, b). According to the MD simulations, the root mean square deviations (RMSDs) of the two proteins and the donor did not differ substantially, whereas the RMSD of acetyl-CoA in AiAT0635-mut varied only slightly after 6 ns, implying steadier binding of the acetyl donor in AiAT0635-mut than in AiAT0635 (Supplementary Fig. 28c–h). Moreover, the key distance of O2-W377 for AiAT0635-mut was 4.17 Å, which was shorter than that of wild-type AiAT0635 (5.6 Å) (Fig. 7h). The steady binding of the acetyl donor here was also consistent with the RMSD results. Taken together, these results indicate that the stable binding of the acetyl group of the donor in the protein may account for the gain-of-function of AiAT0635 after mutation. The other key distance, C1-O1, was similar between AiAT0635-mut and AiAT0635 (Fig. 7i and Supplementary Fig. 29), indicating that the distance between the donor and substrate did not change substantially after mutation. This result may explain why, although AiAT0635-mut exhibited acetylation activity, its catalytic activity was still lower than that of MaAT8824 (Fig. 7b).