ELUCIDATING THE FUNCTIONAL ANNOTATION AND EVOLUTIONARY RELATIONSHIPS OF CYTOCHROME P450 GENES IN XYLARIA SP. FL1777 USING IN-SILICO APPROACHES

Intensifying human activity has produced many forms of anthropogenic pollution with adverse effects on human health and environmental sustainability. Traditional means of handling xenobiotic pollutants are no longer sustainable because of their high cost, complex procedures and demanding regulatory requirements. Bioremediation using fungi (mycoremediation) is now recognized as an efficient and workable biotechnological tool that employs fungal enzymes to remove contaminants through absorption and mineralization. Cytochrome P450s (CYPs) are diverse and unique gene families with varying degrees of complexity in eukaryotes. CYPs mainly utilize molecular oxygen to modify substrate conformation, thereby establishing a mechanism of action for their important physiological and ecological functions. Xylariaceae is one of the major and highly diversified families of filamentous Ascomycota; its members play an important role as saprotrophs of wood, soil, litter and dung. Genome-wide annotation analysis was carried out to explore the possibility of utilizing the CYPs of Xylaria sp. for mycoremediation. Evolutionary analysis divided the 214 Xylaria CYPs into fifteen (15) clades. The CYPs were categorized into forty-seven (47) clans and eighty-six (86) families. The MEME Suite identified ten (10) conserved motifs. Gene structure analysis revealed a highly dynamic intron-exon organization. Most of the CYPs were predicted to be localized in the endoplasmic reticulum. This study therefore calls for deeper exploration of Xylaria sp. and its high potential for application in the bioremediation of environmental contaminants.


INTRODUCTION
Humankind engages in various activities for survival, which has consequently released different pollutants into the environment. Li et al. (2019) reported that pollutants from human activities constitute a major risk to human health and environmental sustainability. The age-old solutions for eliminating xenobiotic pollutants include ultraviolet decomposition, high-temperature incineration, pit disposal and chemical degradation (Bhandari et al., 2021). These techniques are gradually being phased out because of high cost, complex procedures, burdensome regulatory requirements, inadequate space and the secondary pollutants arising from the processes (Bhandari et al., 2021). Hence, there is an urgent need for environmentally friendly techniques that can be applied for effective remediation of these contaminants (Kevin et al., 2019). Singh (2006) stated that fungi, because of their ability to actively decompose various chemicals, have been recognized as a workable biotechnological tool for the bioremediation of heavily polluted environments. Similarly, Buddolla et al. (2014) identified fungi as essential organisms for bioremediation because they can exploit minimal living conditions by producing enzymes capable of carrying out chemically difficult reactions. Jasu et al. (2021) reported that fungi can effectively remove toxic and intractable products such as pharmaceutical waste, polyaromatic and chlorinated hydrocarbons, pesticides and mineral oils from soil, and named cytochrome P450 monooxygenases among the intracellular enzymes that perform this task. Bioremediation using fungi (mycoremediation) utilizes enzymes in live fungi to clear contaminants through mineralization or absorption (Kevin et al., 2019). Similarly, shiyuki et al. (2013) reported using microorganisms in the activated sludge process as a bioremediation technique for industrial extract chemicals and the polychlorinated forms of dibenzo-p-dioxin and dibenzofuran (PCDD and PCDF).
Cytochrome P450s (CYPs) are diverse and unique gene families with varying degrees of complexity in eukaryotes. CYPs mainly utilize molecular oxygen to modify substrate conformation, thereby establishing a mechanism of action for their important physiological, toxicological and ecological functions (Nelson et al., 2013). Cytochrome P450 enzymes (P450s) are widely distributed across organisms and perform essential roles in biosynthesis (of steroids and natural products), xenobiotic degradation and drug metabolism. P450s are generally regarded as the most adaptable natural biocatalysts because of the wide range of substrates they accept and the kinds of reactions they catalyze (Li et al., 2010).
Generally, ascomycetes inhabit a wider niche in soil than their basidiomycete counterparts, yet they have received less attention in bioremediation studies than the well-studied basidiomycetes (Li et al., 2019). Xylariaceae is one of the major and highly diversified families of filamentous Ascomycota. According to U'Ren et al. (2016), Xylariaceae are active saprotrophs of litter, dung, wood and soil, as well as plant pathogens in natural and agricultural systems, like other facultative fungal organisms (Dauda et al., 2018; Palnam et al., 2019; Zarafi and Dauda, 2019). Xylariaceous fungi are increasingly being recognized as a major source of novel metabolites for use in biofuel, environmental, agricultural, medical and industrial applications (Wu et al., 2017). Li et al. (2019) reported that Xylaria sp. BNL1 can degrade carbaryl in contaminated soil with a degradation rate of 59.0% in fifteen (15) days; this implies that Xylaria sp. BNL1 can survive various attacks from indigenous microorganisms.
The roles played by P450s in economically important fungi such as Aspergillus spp. (Kelly et al., 2009; Dauda et al., 2022a), Alternaria spp. (Dauda et al., 2022b), Candida tropicalis (Dauda et al., 2022c) and Trichoderma spp. (Chadha et al., 2018) have been well elucidated.
Considering the diverse potential biotechnological applications of Xylaria and the impact of cytochrome P450s on the biological, physiological and biochemical activities of fungi, this study performs an evolutionary and genome-wide analysis of cytochrome P450 genes in Xylaria sp. FL1777 to open room for commercial exploitation of these proteins, especially in bioremediation.

MATERIALS AND METHODS
Sequence Retrieval and Alignment
Protein, genomic and coding sequences of the cytochrome P450s of Xylaria sp. FL1777 were downloaded from the Joint Genome Institute (JGI) fungal genome database, MycoCosm (mycocosm.jgi.doe.gov/pages/search-for-genes.jsf).

Structural feature analysis of CYP protein sequences:
The conserved domains of the cytochrome P450s in Xylaria sp. FL1777 were analyzed using the Conserved Domain Database (CDD). The protein sequences were analysed for the presence of the CYP family signature domains, namely the heme-binding and oxygen-binding motifs. Only sequences with both CYP signature domains were used (Matowane et al., 2018).

Evolutionary relationships of taxa
The evolutionary relationship was analysed using 214 amino acid sequences. All ambiguous positions were removed for each sequence pair (pairwise deletion option). The phylogenetic tree was constructed in MEGA X (Kumar et al., 2018) using the Neighbour-Joining method (Olszewska-Tomczyk et al., 2016), as described by Dauda et al. (2021). The optimal tree, with a sum of branch lengths of 110.19813185, is shown. The Poisson correction method was used to compute the evolutionary distances (Tomczyk et al., 2016). The final dataset comprised a total of 2078 positions.
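The distance calculation described above can be illustrated with a minimal sketch. It assumes the standard Poisson correction, d = -ln(1 - p), where p is the proportion of differing sites among those compared, and it applies pairwise deletion by skipping any site where either sequence carries a gap or ambiguous symbol. The function name, the set of symbols treated as ambiguous and the toy sequences are illustrative assumptions; this is not the MEGA X implementation.

```python
import math

# Assumption: these symbols mark gaps or ambiguous residues in the alignment.
AMBIGUOUS = {"-", "?", "X"}


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance between two aligned protein sequences.

    Sites where either sequence has a gap or ambiguous symbol are skipped
    for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in AMBIGUOUS or b in AMBIGUOUS:
            continue  # pairwise deletion: ignore this site for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    p = differences / compared  # proportion of differing sites
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy aligned fragments (hypothetical, not Xylaria sequences).
d = poisson_distance("MKV-LA", "MRVTLA")
```

In the toy pair, five sites are compared and one differs, so p = 0.2 and the corrected distance is -ln(0.8). A Neighbour-Joining tree would then be built from the matrix of such distances over all sequence pairs.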

Identification of clans, families and putative functions
The putative CYP names for all P450 genes in Xylaria sp. FL1777 were assigned using the FCPD pipeline (http://p450.riceblast.snu.ac.kr), following the nomenclature proposed by Nelson (2006), under which two CYPs with more than 40% sequence similarity belong to the same family. Each Xylaria CYP was therefore searched with BLAST against all known fungal cytochrome P450s in the Fungal Cytochrome P450 Database, and a query was assigned to the family of its best hit when that hit had greater than 40% sequence similarity. Clans were identified by comparing the families obtained against the clans and families in the Fungal Cytochrome P450 Database.
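The family assignment rule can be sketched as follows. This is a minimal illustration of the "best hit above 40%" logic only; the function name, the input format (a list of family and percent-similarity pairs per query) and the example hits are hypothetical and do not reproduce the database pipeline itself.

```python
def assign_family(hits, threshold=40.0):
    """Return the family of the best hit if it exceeds the threshold.

    hits: list of (family_name, percent_similarity) tuples for one query.
    Returns None when there are no hits or the best hit is at or below
    the threshold, leaving the query unassigned.
    """
    if not hits:
        return None
    best_family, best_score = max(hits, key=lambda hit: hit[1])
    return best_family if best_score > threshold else None


# Hypothetical BLAST summaries for two query proteins.
example_hits = {
    "query_1": [("CYP51", 62.3), ("CYP61", 38.0)],
    "query_2": [("CYP52", 35.0)],
}
assignments = {q: assign_family(h) for q, h in example_hits.items()}
```

Here query_1 would be placed in CYP51, while query_2 stays unassigned because its best hit does not exceed 40%.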

Identification of Motif and Analysis of Gene Structure:
The conserved motifs of the cytochrome P450 genes of Xylaria sp. FL1777 were identified with an online server, the Multiple Expectation Maximization for Motif Elicitation (MEME) Suite (http://meme-suite.org/tools/meme), using the genomic sequence (Bailey et al., 2009). A set of 214 protein sequences between 95 and 1153 residues in length, with an average length of 491.6, was submitted. The number of motifs was set to 10, the minimum motif width to 6 amino acids and the maximum to 100 amino acids. Similarly, the intron and exon structures of the cytochrome P450 genes in Xylaria sp. FL1777 were analysed using the online Gene Structure Display Server (GSDS 2.0) (http://gsds.gao-lab.org/) (Bo et al., 2015); the server graphically displayed the positions and numbers of introns and exons after the coding and genomic FASTA sequences of Xylaria sp. FL1777 were loaded.

Sub-cellular localization analysis:
The localization of the Xylaria CYPs was predicted using Euk-mPLoc 2.0, an online web server for predicting the subcellular localization of eukaryotic proteins, including those with multiple sites (Cheng et al., 2018), which is accessible at http://www.csbio.sjtu.edu.cn/bioinf/Cell-PLoc/. The protein FASTA sequences of the organism were used as input.

Identification of Xylaria CYPs Involved in secondary metabolism-related gene clusters
Secondary metabolism-related gene clusters of Xylaria sp. FL1777 were identified from the Joint Genome Institute MycoCosm using the annotations on the homepage and a search for all cluster types, namely dimethylallyltryptophan (DMAT), PKS/NRPS (hybrid), non-ribosomal peptide synthase (NRPS), NRPS-like, polyketide synthase (PKS), PKS-like and terpene cyclase (TC).

RESULTS AND DISCUSSION
Phylogenetic Analysis:
The phylogenetic analysis shown in Figure 1 divided the 214 protein sequences into thirteen (13) clades. About half of these proteins (95) clustered in four clades (I, VIII, IX and XI), which have 24, 24, 18 and 31 proteins, respectively. In contrast, the clades with the fewest proteins were IV, VI and X, with 3, 2 and 3 proteins, respectively. Clades II, V and XIII had a relatively equal distribution, with 14, 15 and 14 proteins, respectively.
(8), consisting of families CYP531, CYP532, CYP536, CYP629, CYP631, CYP675, CYP5080 and CYP5104. The entire clan has 11 proteins that are involved in xenobiotic metabolism. Clan CYP58 is the second-largest by number of families, consisting of six (6) families: CYP58, CYP682, CYP5104, CYP5112, CYP5094 and CYP551. This clan has the highest number of proteins (12), which participate in xenobiotic and secondary metabolism. Clan CYP54 has four families with 11 proteins that are involved in secondary metabolism. Nineteen clans are orphans with only a single protein each. Twenty-one (21) clans have no corresponding putative function in the Fungal Cytochrome P450 Database. Overall, five (5) clans are involved in primary metabolism, seven (7) in secondary metabolism and thirteen (13) in xenobiotic metabolism.

Spread of Conserved Motifs in Xylaria sp. FL1777
Furthermore, the distribution of the ten conserved motifs across the 214 cytochrome P450 genes was established in this study, as shown in Figure 2. Thirty-six (36) genes have all ten conserved motifs, while thirteen (13) CYPs have only one conserved motif each. Motif 1 (FXXGXXXCXG), motif 2 (EXXR), motif 3 (PERW), motif 5 (LXXPXXXLXE) and motif 7 (HXGXRXP) appeared most often, occurring at 154, 154, 126, 128 and 123 sites, respectively. Motif 10 (HXXXRXFSXXR) is the widest, while motif 3 is the shortest. The other motifs (2, 4, 5, 6, 7 and 8) have relatively equal widths. Motif 6 was the least conserved, as it appeared at only 48 sites.
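As a rough cross-check on motif occurrence, the consensus patterns reported above can be scanned for with regular expressions. The sketch below assumes that X in a consensus stands for any residue; the function names and toy sequences are hypothetical. MEME scores sites with probabilistic position-specific matrices rather than exact patterns, so counts from this kind of literal scan would not be expected to match the site counts reported here.

```python
import re


def motif_to_regex(motif):
    """Turn a consensus such as 'EXXR' into a regex, treating X as any residue."""
    pattern = "".join("." if ch == "X" else re.escape(ch) for ch in motif.upper())
    return re.compile(pattern)


def count_sequences_with_motif(motif, sequences):
    """Count how many sequences contain at least one match to the consensus."""
    regex = motif_to_regex(motif)
    return sum(1 for seq in sequences.values() if regex.search(seq.upper()))


# Toy protein fragments (hypothetical, not Xylaria sequences).
toy_sequences = {
    "protein_a": "MAEKLRPERWFGAGPRNCIG",
    "protein_b": "MKKLLPERW",
    "protein_c": "AAAA",
}
counts = {
    motif: count_sequences_with_motif(motif, toy_sequences)
    for motif in ("FXXGXXXCXG", "EXXR", "PERW")
}
```

With real data, the same two functions could be applied to the protein FASTA records to list which genes lack a given signature motif.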

Exon-intron Analysis
The exon-intron structures of the cytochrome P450 genes in Xylaria sp. FL1777 are shown in Figure 3. All the genes have between one and nine introns, except for twelve (12) genes (XYFL801046, XYFL789729, XYFL787010, XYFL783228, XYFL781286, XYFL761684, XYFL643781, XYFL164544, XYFL350436, XYFL324633, XYFL437355 and XYFL783920), which have none. XYFL437355 is the longest single-exon gene, at about 2,500 bp. XYFL799538 has the highest number of introns (9), while twelve others have eight introns each. None of the genes has untranslated regions (UTR).
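The relationship between exon coordinates and intron counts can be illustrated with a short sketch. It assumes exons are given as 1-based, inclusive (start, end) positions on the genomic sequence and treats every gap between consecutive exons as an intron. The function name and coordinates are hypothetical; GSDS derives structures from the submitted coding and genomic sequences, which this sketch does not attempt to reproduce.

```python
def introns_from_exons(exons):
    """Return intron intervals implied by a list of exon (start, end) pairs.

    Coordinates are 1-based and inclusive. A single-exon gene yields an
    empty list, i.e. it is intronless.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:  # a gap between exons is an intron
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Hypothetical gene models: one with three exons, one mono-exonic.
three_exon_gene = [(1, 120), (181, 400), (461, 900)]
single_exon_gene = [(1, 2500)]
intron_counts = {
    "three_exon_gene": len(introns_from_exons(three_exon_gene)),
    "single_exon_gene": len(introns_from_exons(single_exon_gene)),
}
```

For the three-exon example the implied introns span positions 121 to 180 and 401 to 460, while the single-exon example has none.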

Sub-cellular localization of CYPs of Xylaria sp. FL1777
This study found the Xylaria CYPs (Table 2) to be localized mainly in the endoplasmic reticulum (160 of the 214 CYPs, representing 74.77%). Forty-nine CYPs, representing 22.9%, were localized in the cytoplasm. Three genes each were found in the plasma membrane, chloroplast, peroxisome, nucleus and microsome. The extracellular compartment and the mitochondrion were each shown to contain 10 CYPs. Twenty-one (21) CYPs are localized in at least two organelles, with XYFL763710, XYFL413182 and XYFL382656 occurring in 6, 5 and 4 locations, respectively. These three CYPs were all present in the mitochondrion, cytoplasm and plasma membrane.
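The percentages above are simple proportions of the 214 CYPs (for example, 160/214 is about 74.77%). A minimal sketch of the tally follows, assuming predictions are held as a mapping from gene to a list of predicted compartments; the function name and toy predictions are hypothetical. Because a protein predicted in several compartments is counted once in each, the percentages can sum to more than 100.

```python
from collections import Counter


def localization_summary(predictions):
    """Tally predicted compartments and express each as a share of all proteins.

    predictions: dict mapping gene ID to a list of predicted compartments.
    Returns {compartment: (count, percent_of_proteins)}.
    """
    total = len(predictions)
    counts = Counter(loc for locs in predictions.values() for loc in set(locs))
    return {loc: (n, round(100.0 * n / total, 2)) for loc, n in counts.items()}


# Toy predictions (hypothetical gene IDs and locations).
toy_predictions = {
    "gene_1": ["Endoplasmic reticulum"],
    "gene_2": ["Endoplasmic reticulum", "Cytoplasm"],
    "gene_3": ["Cytoplasm"],
    "gene_4": ["Endoplasmic reticulum"],
}
summary = localization_summary(toy_predictions)

# The proportions reported in the text, recomputed from the stated counts.
er_percent = round(100.0 * 160 / 214, 2)
cytoplasm_percent = round(100.0 * 49 / 214, 2)
```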

Secondary metabolism-related gene clusters
The annotations on the mycoCosm homepage of joint genome institute for Xylaria sp.

Discussion
The phylogenetic analysis revealed an unequal distribution of cytochrome P450 cluster sizes in Xylaria sp. FL1777, in line with Chadha et al. (2018), who stated that certain CYP families have undergone extensive expansion and contraction in the course of evolution. Expansion of cytochrome P450s across different clades in Xylaria sp. FL1777 could be instrumental to its survival in its respective habitats. The numerous branches observed in the tree imply highly evolved divergence (Chen et al., 2014). The high evolutionary diversity observed in the Xylaria CYPs may be due not only to significant sequence variation but also to extensive functional diversification, as earlier reported in a similar study by Sezutsu et al. (2013). Most of the cytochrome P450 genes in Xylaria sp. FL1777 are closely related phylogenetically, which suggests a common ancestral lineage and agrees with the earlier report of Chen et al. (2014) on fungal cytochrome P450s. The observed variation in Xylaria cytochrome P450s might be linked to gene duplication; moreover, the resemblance in protein sequence identity, seen in the clustering of about half of the genes in just four clades, is an indication of recent duplication in the species. This also agrees with the findings of Chen et al. (2014).
In Xylaria sp. FL1777, clan CYP52 comprises four families (CYP52, CYP538, CYP539 and CYP655) with five (5) proteins. Werner et al. (2017) reported that this clan is known to catalyze the formation of alpha,omega-dicarboxylic acids from alkanes and fatty acids. The CYP51, CYP61 and CYP505 proteins observed in this study are linked to primary metabolism in Xylaria and comprise only five proteins. CYP61 has been reported by Venegas et al. (2020) to encode sterol 22-desaturase, which plays a significant role in the late phase of the ergosterol pathway by converting ergosta-5,7,24(28)-trienol to ergosta-5,7,22,24(28)-tetraenol through the introduction of a C-22(23) double bond in the sterol side chain. Clan CYP51 has been reported to be involved in sterol biosynthesis in basidiomycetes and ascomycetes and is known as a housekeeping CYP. This has made its members the target of most antifungal agents used to control fungal diseases in humans (Shin et al., 2018). Seven of the forty-seven clans found in Xylaria in this study (CYP54, CYP65, CYP526, CYP547, CYP550, CYP559 and CYP574) have been linked to secondary metabolism. CYP65 has been reported to catalyze the epoxidation reaction during the biosynthesis of trichothecenes in F. graminearum (Gao et al., 2020) and of radicicol (Chadha et al., 2018). Similarly, thirteen clans (CYP613, CYP548, CYP537, CYP533, CYP531, CYP530, CYP528, CYP507, CYP504, CYP62, CYP59, CYP53 and CYP52) comprising 56 proteins in Xylaria have been linked to xenobiotic metabolism. This finding agrees with an earlier study by Chadha et al. (2018), who reported the involvement of CYP507, CYP530, CYP531, CYP532 and CYP548 in xenobiotic metabolism. The abundance of these proteins in Xylaria may be responsible for the exceptional ability of this fungus to degrade a diverse range of xenobiotics, including fungicides.
The study established that motif 2 (EXXR), which contains glutamic acid and arginine residues, is highly conserved in Xylaria sp. This signature motif was earlier reported by Deng et al. (2007) to be actively involved in stabilizing the main structure of CYP proteins. Motif 4 (AGXDTT) was reported to constitute the domain for binding and activation of oxygen (Chen et al., 2014). These motifs are therefore widely distributed and are the most strongly conserved in the gene sequences of Xylaria sp. The evaluation of conserved motifs is one way to predict the functions of Xylaria CYP genes (Jiu et al., 2020). Many signature motifs have been reported to be conserved in the CYP proteins of fungi (Chadha et al., 2018). Generally, overall sequence similarity is low, yet the characteristic motifs are highly conserved across the cytochrome P450 genes. This finding agrees with the earlier submission of Yu et al. (2014), who reported highly conserved characteristic motifs in fungal cytochrome P450s despite very low overall sequence resemblance. The FXXGXXXCXG conserved motif (also known as CXG) is reported to form a heme-binding domain containing an invariant cysteine residue that binds to the iron in the heme. Also, Deng et al. (2007) and Moktali et al. (2012) reported that the few conserved domains in fungal cytochrome P450s are responsible for their major characteristics, which tallies with the conservation of enzymatic function and tertiary structure. The analysis of exon-intron structures of the cytochrome P450 genes in Xylaria sp. in this study revealed twelve (12) intronless (mono-exonic) genes, which can therefore be readily translated. One gene has 9 introns. The longest single exon is about 2,500 bp.

None of the genes has untranslated regions (UTR). The observed variation in the length of both introns and exons of the Xylaria CYPs also agrees with the findings of Raghavendra et al. (2012), who reported the highly dynamic nature of the intron-exon structure of the cytochrome P450 superfamily. The localization of over 70% of Xylaria CYPs in the endoplasmic reticulum and about 20% in the cytoplasm, as shown in this study, also supports the claim by Kelly et al. (2009) that eukaryotic cytochrome P450s are generally attached to the endoplasmic reticulum via its cytoplasmic surface. The functions of cytochrome P450s in Xylaria sp. depend on the ability of these enzymes to interact with their oxidation-reduction partners, NADPH-cytochrome P450 reductase and cytochrome b5, in the endoplasmic reticulum (Park et al., 2014). These interactions with redox partners in the endoplasmic reticulum might be responsible for the intracellular catalysis of xenobiotics and other environmental pollutants. Fungi are among the most prolific producers of secondary metabolites, which are both beneficial (as antibiotics and pharmaceuticals) and harmful (toxic and carcinogenic) to humankind and the environment (Keller et al., 2005). The 29.9% of cytochrome P450s involved in secondary metabolism-related gene clusters in Xylaria sp. FL1777 points to an abundance of secondary metabolites in the organism. Therefore, systematic studies are needed that could lead to the discovery of a new pathway, intermediate or metabolite that can be harnessed for improved bioremediation.

CONCLUSION
This study revealed the distribution of 214 protein sequences into fifteen (15) clades, with more than half of them (125) clustering in just four clades. This result implies a close phylogenetic relationship and hence a common ancestral lineage. Moreover, the 214 genes were putatively distributed into 47 clans and 86 families. The majority of these CYPs have been implicated in xenobiotic metabolism. Furthermore, ten conserved motifs were predicted in the study. The signature motifs EXXR, AGXDTT and FXXGXXXCXG have been linked with stabilizing CYP structure, binding and activation of oxygen, and heme binding, respectively. The study also showed consistency in the organization of exon-intron structures. In addition, 97.67% of Xylaria CYPs are localized in the endoplasmic reticulum and cytoplasm. Xylaria CYPs have not been characterized or classified before now; this work lays a foundation for further characterization and systematic studies that will fully annotate the functions of these genes. Therefore, there is a need to identify a gene or set of genes that can be effectively harnessed for application in bioremediation.

Declaration of Interest:
The