Genomic and Proteomic Approaches

Arabidopsis thaliana was the first higher plant to have its genome sequenced (The Arabidopsis Genome Initiative, 2000). The same benefits to human health attributed to the human genome project will be realized for plants: namely, knowledge of the number of genes and their locations and, most importantly, the ability to assign function to genes based upon the translation of the nucleotide sequence into amino acids and proteins. The Arabidopsis genome has been estimated to contain about 25,500 genes encoding 11,000 protein families (The Arabidopsis Genome Initiative, 2000). Thirty-five percent of the predicted proteins are unique to the Arabidopsis genome. Functional analysis of the proteins was based upon sequence similarity to proteins of known function from all organisms, which have been shown to share a surprising amount of trans-kingdom sequence homology at the primary protein level (DellaPenna, 1999). However, many phytochemicals are expected to be unique to one or a few species, and the enzymes that synthesize them will not have homologs in Arabidopsis.

Nevertheless, Arabidopsis has been useful in elucidating phytochemical pathways using a variety of strategies. For example, Borevitz et al. (2000), using activation-tagged Arabidopsis lines, found that different transcription factors were involved in the phenylpropanoid pathway, and the authors discussed how phytochemical production could be increased by activation tagging methodology. In another study that took advantage of the public database, a genomic approach based on fungal and human orthologs and Arabidopsis sequence data was used to increase α-tocopherol in seeds 80-fold by over-expressing γ-tocopherol methyltransferase (Shintani and DellaPenna, 1998).

Proteomics has been defined as the systematic analysis of expressed proteins of a given genome (Jacobs et al., 2000). Because sequencing and transcription profiling do not directly give information about gene function, proponents in the field of proteomics believe that gene function can be determined more efficiently by global protein profiling. The most complete protein database has been derived from yeast, where 6100 of the 6800 proteins have been identified and 56% of them experimentally characterized (Jacobs et al., 2000). In contrast, only 9% of the expressed proteins have been characterized experimentally in Arabidopsis (The Arabidopsis Genome Initiative, 2000). The question that remains to be answered is how much of the information gained from other (trans-kingdom) organisms will be useful in manipulating the phytochemical content of plants. For improving vitamin and micronutrient content, this information will be useful because the genes utilized in these pathways are highly conserved among plants as well as bacteria and yeast. Genetic databases can be used to identify orthologous genes, elucidate pathways, and serve as a blueprint to add enzymes to pathways in staple crops in which these are lacking (Croteau et al., 2000; DellaPenna, 1999; Ye et al., 2000). The Arabidopsis databases contain a wealth of gene sequence information that should be utilized to the extent possible, but many phytochemicals are limited to specific plant taxa, and therefore orthologs will not exist in Arabidopsis databases. Gene function is not known for approximately 30% of the Arabidopsis proteins and will have to be determined empirically. Thus, at this point, having the genomic sequence of a higher plant will not aid the engineering of pathways for rare secondary metabolites.
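The database-driven identification of orthologous genes described above can be illustrated with a small sketch. It uses the reciprocal best hit heuristic, a common rule of thumb for proposing ortholog pairs from similarity scores; this is an assumption made for illustration, not the procedure reported by any of the studies cited here. The gene names and scores are hypothetical placeholders standing in for the output of a real sequence-similarity search.

```python
from typing import Dict, List, Optional, Tuple

# Similarity scores: query gene -> {subject gene: score}.
# All names and numbers below are hypothetical placeholders.
Scores = Dict[str, Dict[str, float]]


def best_hit(scores: Scores, query: str) -> Optional[str]:
    """Return the highest-scoring subject for a query, or None if it has no hits."""
    hits = scores.get(query, {})
    if not hits:
        return None
    return max(hits, key=hits.get)


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Pair genes that are each other's best hit in both search directions."""
    pairs = []
    for gene_a in sorted(a_vs_b):
        gene_b = best_hit(a_vs_b, gene_a)
        if gene_b is not None and best_hit(b_vs_a, gene_b) == gene_a:
            pairs.append((gene_a, gene_b))
    return pairs


# Known enzymes (e.g. from a microbial pathway) searched against a plant database.
known_vs_plant: Scores = {
    "known_enzyme_1": {"plant_gene_A": 310.0, "plant_gene_B": 95.0},
    "known_enzyme_2": {"plant_gene_B": 120.0},
    "known_enzyme_3": {},  # no hit at all: no candidate ortholog in this plant
}

# The reverse search: plant genes searched against the known enzymes.
plant_vs_known: Scores = {
    "plant_gene_A": {"known_enzyme_1": 305.0},
    "plant_gene_B": {"known_enzyme_1": 90.0, "known_enzyme_2": 88.0},
}

if __name__ == "__main__":
    print(reciprocal_best_hits(known_vs_plant, plant_vs_known))
```

In this toy input only the first enzyme yields a reciprocal pair. The second is rejected because its best plant hit prefers a different enzyme, and the third has no hit at all, which is the situation the text anticipates for enzymes of taxon-specific phytochemicals.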
