S.cerevisiae - Yeastract

Tutorial

The YEASTRACT (Yeast Search for Transcriptional Regulators And Consensus Tracking; www.yeastract.com) database presently contains more than 206000 regulatory associations between yeast transcription factors (TFs) and their target genes, based on more than 1300 bibliographic references. Each regulatory association was annotated manually, after examination of the relevant references. The database also contains the description of 326 specific DNA binding sites shared among 113 characterized TFs. Further information about each yeast gene was obtained from the Saccharomyces Genome Database (SGD), Regulatory Sequence Analysis Tools (RSAT) and the Gene Ontology (GO) Consortium.

The YEASTRACT database provides assistance in three major areas: prediction of gene transcriptional regulation, DNA motif analysis and global expression analysis, all according to the yeast transcription networks described in the literature. This tutorial presents three case studies that exemplify the use of different query options and utilities. Various other ways to exploit the available options and utilities are possible.

- Example 1: Identification of the documented and potential regulatory associations for an ORF/Gene

- Example 2: Gene expression analysis based on regulatory associations

- Example 3: Search for a DNA motif within known TF binding sites and promoter regions

Throughout the YEASTRACT database and this tutorial, regulatory associations are termed "Documented" or "Potential":

A documented association between a Transcription Factor (TF) and a target gene is supported by published data showing at least one of the following types of experimental evidence: i) a change in the expression of the target gene due to a deletion (or mutation) in the gene encoding the transcription factor; this evidence may come from detailed gene-by-gene analysis or from genome-wide expression analysis; ii) binding of the transcription factor to the promoter region of the target gene, as supported by band-shift, foot-printing or Chromatin ImmunoPrecipitation (ChIP) assays. Therefore, the user is urged to check the literature references provided in the database to fully understand the nature of the evidence underlying the identified regulatory associations.
A potential association between a TF and a target gene is based on the occurrence of the TF binding site in the promoter region of the target gene. The binding sites associated with each TF in this database are supported by published experimental evidence for the binding of the TF to the specific nucleotide sequence (data coming from foot-printing or ChIP assays). Again, the user is urged to check the literature references provided in the database.

The accuracy and currency of the information gathered, curated and inserted in this database are crucial to YEASTRACT users. We therefore value any contribution from the yeast community towards this goal.

The results presented in section 2.3 (Rank by TF) were computed with data from YEASTRACT on June 16, 2013. Due to subsequent updates, the current ranking may differ from the one presented.

Example 1: Identification of the documented and potential regulatory associations for an ORF/Gene

The functional analysis of an ORF or gene can be guided by the identification of its documented and potential transcription factors (TFs). This example describes one of the possible ways to explore the regulatory associations for ORF YNR070w, encoding a putative ATP-binding cassette transporter, using various queries and utilities provided by YEASTRACT.

1.1 - Search for Documented Transcription Factors (TFs)

The "Search Transcription Factors" query identifies the TFs that are Documented and/or Potential transcriptional regulators of a given gene. The search for documented transcription factors acting directly upon YNR070w uncovers Nrg1p and Rap1p. The user may check the associated bibliographic references to learn the experimental basis for these regulatory associations.

According to the SGD description of Nrg1p and Rap1p, these regulators are involved in glucose repression and chromatin silencing, respectively. It may therefore be of interest to examine the possible link of ORF YNR070w to these biological processes.

1.2 - Search for Potential Transcription Factors (TFs)

The "Search Transcription Factors" query may also identify the potential regulators of YNR070w. By default, all of the potential transcription factors found are displayed in tabular form. The Promoter link can be followed to see the binding sites for each TF in the promoter sequence of YNR070w. The distribution of TF binding sites in the promoter region of YNR070w can be viewed by checking the Image option while searching.

The display of potential TFs on the image can be controlled by un-checking their respective box in the color palette below the image and pressing the Redisplay button. The color palette displays the color for only those TFs for which binding sites are found in the promoter region of the given gene(s). A close observation of the image for TFs which are documented regulators of YNR070w (i.e., Nrg1p and Rap1p) reveals that the binding site for Nrg1p is present, while that for Rap1p is not. The role of Rap1p in YNR070w regulation may be indirect, or may occur through a binding site not yet described in the literature or not listed in the database.
This query provides a large number of potential regulators for the ORF under study, which can be grouped based on Gene Ontology terms associated to them.
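
The comparison between documented and potential regulators made in sections 1.1 and 1.2 amounts to a few set operations. The following is a minimal Python sketch, not part of YEASTRACT: the TF names are the ones reported in this example, typed in by hand, and the variable names are illustrative.

```python
# Documented regulators of YNR070w reported in section 1.1.
documented = {"Nrg1p", "Rap1p"}

# Potential regulators of YNR070w reported in this example
# (a binding site for the TF occurs in the promoter region).
potential = {
    "Abf1p", "Ace2p", "Ash1p", "Azf1p", "Bas1p", "Cup2p", "Fkh1p", "Mot3p",
    "Nrg1p", "Oaf1p", "Pho4p", "Pip2p", "Rtg1p", "Rtg3p", "Swi5p", "Tec1p",
    "Yap1p", "Yap3p", "Yap8p",
}

both = documented & potential             # supported by both kinds of evidence
documented_only = documented - potential  # no known binding site in the promoter
potential_only = potential - documented   # binding site found, no expression/binding data

print("documented and potential:", sorted(both))
print("documented only:", sorted(documented_only))
print("potential only:", len(potential_only), "TFs")
```

A TF that falls in the "documented only" group, such as Rap1p here, is a candidate for indirect regulation or for a binding site that is not yet in the database.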

1.3 - Gene Grouping based on shared Gene Ontology (GO) terms

The YEASTRACT utility "Group Genes by GO" groups a list of genes according to the GO terms they share. The following list of genes, identified as potential regulators of YNR070w, is subjected to GO-based grouping, selecting the Biological Process ontology and level 5.

Abf1p
Ace2p
Ash1p
Azf1p
Bas1p
Cup2p
Fkh1p
Mot3p
Nrg1p
Oaf1p
Pho4p
Pip2p
Rtg1p
Rtg3p
Swi5p
Tec1p
Yap1p
Yap3p
Yap8p

The output (Table 1) displays GO terms in the first column, the percentage of genes out of the given list associated with respective GO terms in the second column and the cluster of genes associated to the GO term in the last column. Depending on the chosen Gene Ontology and level, grouping may differ.

Table 1 - GO associations for genes using Biological Process at level 5

GO terms	%	Genes
cell cycle	20.8%	FKH2, ABF1, ACE2, SWI5, FKH1
DNA metabolism	4.2%	ABF1
chromatin silencing	8.3%	FKH1, ABF1
nucleobase, nucleoside, nucleotide and nucleic acid metabolism	87.5%	YAP1, ABF1, OAF1, PIP2, ACE2, SWI5, AZF1, BAS1, CUP2, RTG1, RTG3, FKH2, GCN4, HAC1, HSF1, NRG1, YAP3, GCR1, TEC1, ARR1, MOT3
regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism	62.5%	ARR1, ABF1, OAF1, PIP2, ACE2, SWI5, AZF1, FKH2, GCN4, HAC1, HSF1, NRG1, YAP3, GCR1, TEC1
positive regulation of metabolism	29.2%	HAC1, ABF1, OAF1, PIP2, GCR1, TEC1, ARR1
pseudohyphal growth	16.7%	TEC1, ASH1, FKH1, FKH2
regulation of transcription, mating-type specific	4.2%	ASH1
response to carbohydrate stimulus	4.2%	AZF1
amino acid and derivative metabolism	8.3%	GCN4, BAS1
amine metabolism	8.3%	GCN4, BAS1
organic acid metabolism	16.7%	PIP2, BAS1, GCN4, OAF1
cellular biosynthesis	8.3%	GCN4, BAS1
aromatic compound metabolism	4.2%	BAS1
heterocycle metabolism	4.2%	BAS1
response to abiotic stimulus	12.5%	YAP1, CUP2, NRG1
alcohol metabolism	8.3%	NRG1, GCR1
carbohydrate metabolism	8.3%	NRG1, GCR1
cellular macromolecule metabolism	8.3%	NRG1, GCR1
macromolecule catabolism	4.2%	GCR1
cellular catabolism	4.2%	GCR1
generation of precursor metabolites and energy	4.2%	GCR1
regulation of catabolism	4.2%	GCR1
regulation of carbohydrate metabolism	4.2%	GCR1
cellular lipid metabolism	12.5%	PIP2, HAC1, OAF1
intracellular signaling cascade	4.2%	HAC1
response to unfolded protein	4.2%	HAC1
response to heat	4.2%	HSF1
invasive growth (sensu Saccharomyces)	4.2%	NRG1
organelle organization and biogenesis	8.3%	PIP2, OAF1
phosphorus metabolism	4.2%	PHO4
cellular response to starvation	4.2%	PHO4
response to oxidative stress	4.2%	YAP1
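
The layout of Table 1 (GO term, percentage of the list, genes) can be reproduced with a small grouping routine. This is only a sketch of the idea, not the YEASTRACT implementation: the function name `group_by_go` is hypothetical, the annotations are a hand-typed excerpt of three rows of Table 1, and the list size of 24 genes is an assumption inferred from the percentages (1/24 = 4.2%).

```python
from collections import defaultdict

def group_by_go(annotations, total_genes):
    """Return {GO term: (percentage of the list, sorted genes)}."""
    groups = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            groups[term].add(gene)
    return {
        term: (round(100 * len(genes) / total_genes, 1), sorted(genes))
        for term, genes in groups.items()
    }

# Hand-typed excerpt of Table 1 (illustrative only, not a full annotation set).
annotations = {
    "FKH2": {"cell cycle", "pseudohyphal growth"},
    "ABF1": {"cell cycle", "chromatin silencing"},
    "ACE2": {"cell cycle"},
    "SWI5": {"cell cycle"},
    "FKH1": {"cell cycle", "chromatin silencing", "pseudohyphal growth"},
    "TEC1": {"pseudohyphal growth"},
    "ASH1": {"pseudohyphal growth"},
}

# Assumption: the submitted list holds 24 genes in total.
table = group_by_go(annotations, total_genes=24)
for term, (pct, genes) in sorted(table.items()):
    print(f"{term}\t{pct}%\t{', '.join(genes)}")
```

Because a gene can carry several GO terms, the percentages of different rows overlap and do not add up to 100%.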

The information in Table 1 reveals that most of the TFs potentially binding to the promoter region of YNR070w are involved in cell cycle, pseudohyphal growth, organic acid metabolism, response to abiotic stimulus and cellular lipid metabolism. The possible involvement of YNR070w in these processes can thus be hypothesized. The association of this ORF with the GO term "response to abiotic stimulus" appears to be consistent with its previous association with the PDR network (de Risi et al., 2000), as encoding a putative multidrug transporter (Bauer et al., 1999).

If the ORF/gene under study is predicted to encode a TF, the query "Search Regulated Genes", with the options Documented or Potential, can be used to retrieve all documented or potential targets of the TF, respectively. Grouping the retrieved target genes by GO may also provide clues on the biological processes or molecular functions controlled by the TF.

1.4 - References

Bauer, B. E., Wolfger, H., and Kuchler, K., 1999, Inventory and function of yeast ABC proteins: about sex, stress, pleiotropic drug and heavy metal resistance. Biochim Biophys Acta 1461: 217-236.

Example 2: Gene expression analysis based on regulatory associations

YEASTRACT provides tools for the classification and grouping of large lists of genes of interest, such as those found up- or down-regulated under a specific environmental or biological situation, as suggested by genome-wide expression data inspection. These analyses are based on known or algorithmically identified potential regulatory associations, deposited in the YEASTRACT database, and on the GO-based schema.

2.1 - Transform an ORF list into a gene list and vice-versa

The utility ORF List<->Gene List converts a given list of ORFs or Genes to a list of Genes or ORFs, respectively. In addition, it separates a mixed list into two lists, one of ORFs and one of Genes. This makes gene/ORF lists easier to read.
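
The kind of conversion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the YEASTRACT utility: the function names are hypothetical, the lookup table holds only three hand-typed ORF/gene pairs, and ORF names are recognised by the standard systematic naming pattern (Y, chromosome letter A-P, arm L/R, three digits, strand W/C, optional suffix).

```python
import re

# Systematic S. cerevisiae ORF names, e.g. YOR153W or YPL088w.
ORF_PATTERN = re.compile(r"^Y[A-P][LR]\d{3}[WC](-[A-Z])?$", re.IGNORECASE)

# Tiny hand-written lookup table (a real tool would query the database).
ORF_TO_GENE = {"YOR153W": "PDR5", "YGL013C": "PDR1", "YDR011W": "SNQ2"}
GENE_TO_ORF = {gene: orf for orf, gene in ORF_TO_GENE.items()}

def split_mixed(names):
    """Separate a mixed list into (ORF names, gene names)."""
    orfs = [n for n in names if ORF_PATTERN.match(n)]
    genes = [n for n in names if not ORF_PATTERN.match(n)]
    return orfs, genes

def convert(names):
    """Swap each name for its counterpart; names without an entry are kept as-is."""
    converted = []
    for name in names:
        key = name.upper()
        converted.append(ORF_TO_GENE.get(key) or GENE_TO_ORF.get(key) or name)
    return converted

orfs, genes = split_mixed(["PDR5", "YPL088w", "SNQ2", "YGL013C"])
print(orfs, genes)
print(convert(["YOR153W", "SNQ2", "YPL088w"]))
```

ORFs without a standard gene name, such as YPL088w in Table 2, simply pass through unchanged in this sketch.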

2.2 - Rank Genes - Rank by Gene Ontology (GO)

The grouping of genes based on the GO terms they share is a feature common to many gene expression analysis tools and is also implemented in YEASTRACT. Depending on the chosen Gene Ontology and level, grouping may differ. To exemplify this utility, we used the list of genes up-regulated in response to the expression of PDR1-3, a point-mutant allele of the PDR1 gene, which encodes a transcription factor involved in Pleiotropic Drug Resistance in yeast; the list was retrieved from de Risi et al. (2000). The grouping of this gene list, based on the Biological Process ontology at level 5, results in the following table:

Table 2 - GO associations for genes using Biological Process at level 5

GO terms	%	Genes
drug transport	4.0%	PDR5
response to abiotic stimulus	16.0%	PDR16, PDR5, SNQ2, YOR1
response to oxidative stress	4.0%	SNQ2
amine transport	4.0%	TPO1
lipid transport	4.0%	PDR16
cellular lipid metabolism	8.0%	IPT1, PDR16
alcohol metabolism	8.0%	HXK1, PDR16
aldehyde metabolism	4.0%	YPL088w
cytokinesis	8.0%	DSE4, SCW11
response to nutrients	4.0%	YGP1
carbohydrate metabolism	4.0%	HXK1
cellular macromolecule metabolism	4.0%	HXK1
amino acid and derivative metabolism	4.0%	MET17
amine metabolism	4.0%	MET17
organic acid metabolism	4.0%	MET17
sulfur metabolism	4.0%	MET17
cellular biosynthesis	4.0%	MET17
siderochrome transport	4.0%	FRE4
cell cycle	4.0%	REV1
DNA metabolism	4.0%	REV1
response to DNA damage stimulus	4.0%	REV1
vesicle-mediated transport	4.0%	COS10

In agreement with the published analysis of these results (de Risi et al., 2000) the main functional groups include "response to abiotic stimulus" (drugs included), "drug transport" and "cellular lipid metabolism", among others.

2.3 - Rank Genes - Rank by TF

The query “Rank by TF” enables automatic selection and ranking of transcription factors potentially involved in the regulation of the genes in a list of interest. The TFs and their direct targets are presented in a table in decreasing order of a relevance score calculated for each TF, based on either regulations or regulatory paths targeting the genes in the list of interest and deposited in the YEASTRACT database. Different filters can be used to steer the search towards a particular type of regulatory activity. To exemplify the “Rank by TF” utility, we analyse the results obtained for a list of genes found up-regulated upon exposure to quinine (dos Santos et al., 2009), hereafter termed the QN dataset. This is a relatively simple dataset, which corresponds to a well characterized biological response, and is thus adequate to illustrate the usefulness of the different ranking methods. The results presented below were obtained using YEASTRACT on June 16, 2013.

2.3.1 - Rank by TF based on regulation enrichment

When ranking by statistical significance of regulations, the TF score is given by a p-value denoting the overrepresentation of regulations of the given TF targeting genes in the list of interest relative to the regulations of that TF targeting genes in the whole YEASTRACT database. The p-value further denotes the probability that the TF regulates at least the number of genes found to be regulated in the list of interest if we were to sample a set of genes of the same size as the list of interest from all the genes in the YEASTRACT database. This probability is modelled by a hypergeometric distribution and the p-value is finally subject to a Bonferroni correction for multiple testing.
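
The calculation described above can be sketched with the Python standard library. This is a minimal illustration of a hypergeometric upper-tail p-value followed by a Bonferroni correction, not the code used by YEASTRACT; the function and parameter names are hypothetical and the numbers are toy values.

```python
from math import comb

def enrichment_p_value(population, tf_targets, list_size, hits):
    """P(X >= hits) for X ~ Hypergeometric(population, tf_targets, list_size).

    population : number of genes in the database
    tf_targets : number of genes regulated by the TF in the database
    list_size  : number of genes in the list of interest
    hits       : number of genes in the list regulated by the TF
    """
    upper = min(list_size, tf_targets)
    total = comb(population, list_size)
    tail = sum(
        comb(tf_targets, i) * comb(population - tf_targets, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total

def bonferroni(p_value, n_tests):
    """Bonferroni correction: multiply by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)

# Toy numbers (hypothetical): 10 genes in the database, 4 targeted by the TF,
# a list of 3 genes of which 2 are targets, and 2 TFs tested.
p = enrichment_p_value(population=10, tf_targets=4, list_size=3, hits=2)
print(p, bonferroni(p, n_tests=2))
```

In this sketch the number of tests would be the number of TFs evaluated against the list; which count YEASTRACT uses for the correction is not stated here.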

Below is the output of the utility “Rank by TF” based on regulation enrichment for the QN dataset, using the default filtering options. In Table 3, the first column indicates the name of the TF, the second column the % of genes in the list targeted by the TF, the third column the % resulting from the ratio between the number of genes in the list targeted by the TF and the number of genes targeted by the TF in the whole YEASTRACT database, the fourth column the enrichment p-value, and the fifth and final column the genes from the list of interest targeted by the TF.

Table 3 - Genes grouped by TF, ordered by regulation enrichment p-value for the QN dataset. Only the first 15 rows of the table are shown.

Transcription Factor	% in User Set	% in Yeastract	p-value	Target ORF/Genes
Mig1p	41.25%	8.01%	0.000000000000000	PHO89 MAL33 MAL31 SNF3 ADR1 MTH1 HXT7 HXT6 YFL054C GSY1 FMP43 MAL11 HXT4 GUT2 HAP4 JEN1 YKR075C UBP11 ALT1 TMA10 RPM2 HXT2 YMR103C GCV2 CAT8 TDA1 YNL144C SUR1 ALD6 YPL113C GAL4 CSR2 YPR196W
Mig3p	63.75%	3.06%	0.000000000000002	ACS1 FUI1 AGP2 OM14 PHO89 MAL33 MAL31 SNF3 GDH2 ENA5 GIS1 ADR1 MTH1 HXT7 HXT6 GLC3 SIT1 FAA2 RGI1 AVT6 YFL054C CMK1 GSY1 CLD1 FMP43 MAL11 MUP3 GUT2 ATG36 HAP4 JEN1 GLG1 YKR075C ALT1 YLR149C BOP2 TMA10 HXT2 YMR103C ASI2 PFK27 CRC1 PYK2 ALD4 SUR1 ALD6 YPL113C GAL4 HAL1 ATH1 CSR2
Aft1p	50.00%	3.59%	0.000000000000111	ACS1 FUI1 AGP2 OM14 PHO89 MAL33 MAL31 GLK1 HXT7 GLC3 SIT1 FAA2 RGI1 YFL054C GSY1 FMP43 MUP3 HXT4 ATG7 ATG36 HAP4 JEN1 GLG1 YKR075C UBP11 YLR122C YLR149C RPM2 NDI1 HXT2 GCV2 CAT8 YNL144C MDG1 ALD4 YPL113C CIT3 PDH1 ATH1 CSR2
Nrg2p	18.75%	10.27%	0.000000000003908	AGP2 ENA5 ENA2 MTH1 HXT6 GLC3 GSY1 CLD1 RPM2 NDI1 MDG1 CRC1 PYK2 ALD4 YPL113C
Adr1p	32.50%	4.86%	0.000000000008845	ACS1 AGP2 YCL042W ADR1 GLC3 FAA2 RGI1 YFL054C GSY1 MUP3 GUT2 REE1 JEN1 YKR075C YLR149C TMA10 HXT2 YMR103C GCV2 TDA1 CRC1 ALD4 ALD6 YPL113C CIT3 PDH1
Sok2p	63.75%	2.52%	0.000000000009713	ACS1 FUI1 AGP2 OM14 PHO89 MAL33 YCL042W SNF3 ENA5 GIS1 MTH1 HXT7 HXT6 GLC3 SIT1 FAA2 ICL1 RGI1 YFL054C CMK1 RRT6 RMR1 HXT4 ATG7 GUT2 ATG36 REE1 HAP4 SSH4 JEN1 GLG1 YKR075C UBP11 SDH2 ALT1 YLR149C BOP2 TMA10 RPM2 HXT2 CAT8 YNL144C PFK27 CRC1 ALD4 SKS1 ALD6 CIT3 PDH1 ATH1 CSR2
Nrg1p	28.75%	4.39%	0.000000001317895	AGP2 PHO89 ENA5 ENA2 MTH1 HXT6 GLC3 GSY1 CLD1 HAP4 YKR075C ALT1 BOP2 RPM2 NDI1 HXT2 TDA1 MDG1 CRC1 PYK2 ALD4 YPL113C ATH1
Snf1p	20.00%	5.86%	0.000000006117325	PHO89 MAL31 HXT7 HXT6 AVT6 FMP43 MAL11 UBP11 YLR122C YMR103C GCV2 TDA1 YNL144C SUR1 ALD6 GAL4
Msn2p	61.25%	2.19%	0.000000009033572	FUI1 OM14 PHO89 MAL31 GLK1 YCL042W GDH2 ENA2 GIS1 ADR1 MTH1 HXT7 HXT6 GLC3 ICL1 RGI1 CMK1 GSY1 RRT6 CLD1 FMP43 MAL11 HXT4 GUT2 REE1 HAP4 JEN1 GLG1 SDH2 ALT1 BOP2 TMA10 YLR345W RPM2 NDI1 YMR103C TDA1 YNL144C ASI2 MDG1 PFK27 PYK2 ALD4 SKS1 SUR1 ALD6 CIT3 ATH1 YPR196W
Stp3p	6.25%	26.32%	0.000000040863892	PHO89 SIT1 CLD1 BOP2 HAL1
Gcn4p	62.50%	2.08%	0.000000041302342	ACS1 FUI1 PHO89 MAL31 GLK1 GDH2 ENA5 ENA2 ADR1 MTH1 GLC3 SIT1 FAA2 ICL1 RGI1 AVT6 YFL054C CMK1 GSY1 CLD1 FMP43 MAL11 MUP3 HXT4 ATG36 REE1 HAP4 SSH4 JEN1 GLG1 UBP11 SDH2 ALT1 YLR149C BOP2 TMA10 RPM2 NDI1 GCV2 TDA1 PFK27 CRC1 PYK2 SUR1 ALD6 YPL113C CIT3 HAL1 ATH1 YPR196W
Msn4p	36.25%	3.04%	0.000000050178772	FUI1 PHO89 MAL31 GLK1 YCL042W ADR1 HXT7 HXT6 GLC3 ICL1 RGI1 GSY1 IMO32 MAL11 HXT4 GUT2 GLG1 SDH2 BOP2 TMA10 YLR345W YMR103C YNL144C CRC1 PYK2 ALD4 SUR1 ALD6 CIT3
Pdr1p	35.00%	3.05%	0.000000080739877	OM14 PHO89 GLK1 YCL042W ENA2 MTH1 HXT7 HXT6 GLC3 SIT1 RGI1 FMP43 MUP3 HXT4 ATG36 HAP4 YLR149C TMA10 RPM2 HXT2 YMR103C GCV2 TDA1 YNL144C CRC1 ALD4 ALD6 PDH1
Bas1p	66.25%	1.93%	0.000000155422455	ACS1 FUI1 OM14 PHO89 MAL33 GLK1 KIN82 SNF3 GDH2 ADR1 HXT7 HXT6 GLC3 SIT1 ICL1 YFL054C GSY1 RRT6 CLD1 FMP43 MUP3 GUT2 ATG36 REE1 HAP4 SSH4 NNK1 JEN1 YKR075C UBP11 SDH2 ALT1 YLR122C YLR149C BOP2 TMA10 YLR345W RPM2 NDI1 GCV2 CAT8 TDA1 CRC1 PYK2 ALD4 SKS1 ALD6 CIT3 PDH1 HAL1 ATH1 CSR2 YPR196W
...	...	...	...	...
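
The two percentage columns of Table 3 can be checked by hand. The sketch below is illustrative only: the function name is hypothetical, the list size of 80 genes is an assumption inferred from the percentages, and the denominator for the second column is taken from the "Regulations in Yeastract" count reported for the same TF in Table 4.

```python
def table3_percentages(hits, list_size, tf_targets_in_db):
    """Return (% in User Set, % in Yeastract), rounded to two decimals."""
    pct_user = round(100 * hits / list_size, 2)
    pct_db = round(100 * hits / tf_targets_in_db, 2)
    return pct_user, pct_db

# Adr1p: 26 target genes are listed in Table 3; Table 4 reports 535 regulations.
# Assumption: the QN list holds 80 genes.
print(table3_percentages(hits=26, list_size=80, tf_targets_in_db=535))
```

A TF with a high percentage in the user set but a low percentage in YEASTRACT regulates many genes overall, which is why the p-value rather than the raw coverage is used for ranking.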

The enrichment-based ranking placed several transcription factors involved in glucose derepression (Mig1, Nrg2 and Adr1) among the top-ranking TFs, marking them as potential key regulators of the yeast response to low-inhibitory concentrations of quinine. Had the TFs instead been ranked by the number of genes they regulate in the dataset, these TFs would appear only after more general regulators known to play a role in the yeast response to several environmental stresses. Significantly, these results are consistent with the fact that yeast adaptation to quinine was shown to involve a glucose limitation response, probably as a consequence of glucose uptake inhibition by the drug (dos Santos et al., 2009).

2.3.2 - Rank by TF using TFRank

The second kind of ranking involves the use of the TFRank method (Gonçalves et al., 2011). This method exploits every regulatory path containing the genes in the list of interest to select the relevant part of the network. It prioritizes regulators by computing a relevance measure reflecting their contribution within the network under study. Advantages of the TFRank algorithm include its ability to consider multiple levels of regulation and interactions between transcription factors in an integrated, rather than isolated-per-TF, network analysis perspective.

The relevance score is obtained using a personalized ranking method related to local clustering on graphs based on a discrete approximation of the heat kernel. It works by diffusing a signal through the transpose of the network (to diffuse in reverse order, that is, reach TFs from their target genes), starting from the genes of interest, and accumulating a score in every gene/node in the network. In YEASTRACT, the TFRank method enables the customization of a parameter, termed heat diffusion coefficient, which controls the range of influence of the regulatory cascade in the network. A low value causes slow diffusion and thus sets a preference for more local regulators, while a large value promotes rapid diffusion, resulting in a preference for more global regulatory players.
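
The diffusion idea described above can be illustrated with a toy example. The sketch below is not the TFRank implementation: it assumes one common discrete approximation of the heat kernel, (I + (t/N)(P - I))^N, where P is a random walk on the transposed network, and all names (`diffuse_scores`, `coefficient`, `steps`, the toy network) are hypothetical.

```python
from collections import defaultdict

def diffuse_scores(regulations, genes_of_interest, coefficient=0.25, steps=50):
    """Spread a unit signal from the genes of interest towards their regulators.

    regulations: iterable of (tf, target) pairs.
    The signal moves along transposed edges (target -> TF) in `steps` small
    increments of size coefficient / steps.
    """
    regulators = defaultdict(list)  # transposed network: target -> its TFs
    nodes = set(genes_of_interest)
    for tf, target in regulations:
        regulators[target].append(tf)
        nodes.update((tf, target))

    signal = {node: 0.0 for node in nodes}
    for gene in genes_of_interest:
        signal[gene] += 1.0 / len(genes_of_interest)

    dt = coefficient / steps
    for _ in range(steps):
        moved = {node: 0.0 for node in nodes}
        for node, mass in signal.items():
            tfs = regulators.get(node)
            if tfs:
                for tf in tfs:
                    moved[tf] += mass / len(tfs)
            else:
                # Assumption: nodes without regulators keep their mass.
                moved[node] += mass
        signal = {n: signal[n] + dt * (moved[n] - signal[n]) for n in nodes}
    return signal

# Hypothetical toy network: TF_C regulates TF_A, which regulates both genes.
toy = [("TF_A", "g1"), ("TF_A", "g2"), ("TF_B", "g2"), ("TF_C", "TF_A")]
scores = diffuse_scores(toy, ["g1", "g2"], coefficient=0.25)
ranking = sorted(("TF_A", "TF_B", "TF_C"), key=scores.get, reverse=True)
print(ranking)
```

In this toy network a low coefficient ranks the direct regulators first, and raising it moves more of the signal towards the upstream node TF_C, which mirrors the local versus global preference described above.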

Below we present the output of the utility “Rank by TF” using TFRank for the QN dataset, combined with the default filtering options. In Table 4, the first column indicates the position of the TF in the ranking, the second column the name of the TF, the third column the number of regulations of the TF in the YEASTRACT database, the fourth column the score given to the TF by TFRank, and the fifth and final column the genes from the list of interest targeted by the TF (note that this list does not necessarily contain all the genes in the regulatory paths flowing from the TF and leading to the genes of interest, as only direct regulations between the TF and the target genes in the list of interest are considered).

Table 4 - Potential regulators of the genes in the QN dataset, selected and ranked using the TFRank method with a heat diffusion coefficient value of 0.25. Only the first 15 rows of the table are shown.

Rank	Transcription Factor	Regulations in Yeastract	Weight	Target ORF/Genes
1	Adr1p	535	0.92583	ACS1 ...
2	Hap4p	579	0.90014	ACS1 ...
3	Gal4p	1213	0.87210	MAL31 ...
4	Mal33p	640	0.85672	OM14 ...
5	Gis1p	265	0.85117	PHO89 ...
6	Cat8p	175	0.84575	ACS1 ...
7	YPR196W	167	0.79740	GLK1 ...
8	Mth1p	346	0.79404	PFK27 ...
9	Sok2p	2021	0.39849	ACS1 ...
10	Bas1p	2750	0.38652	ACS1 ...
11	Mig3p	1667	0.37966	ACS1 ...
12	Gcn4p	2408	0.35873	ACS1 ...
13	Msn2p	2232	0.34218	FUI1 ...
14	Sfp1p	3264	0.31548	FUI1 ...
15	Cst6p	2593	0.29658	ACS1 ...
...	...	...	...	...

In this analysis the TFRank algorithm was used with the diffusion coefficient set to a low value (0.25) in order to favor proximal regulators, presumably more specific to the biological response under study. In this case, TFRank indicated Adr1, Hap4, Gal4, Mal33, Gis1 and Cat8 as the most relevant mediators of the yeast transcriptional response to quinine. Notably, all these TFs were found up-regulated in response to quinine stress and are known to play a role in yeast adaptation to alternative carbon sources. This is in agreement with the fact that quinine induces intracellular glucose limitation. In clear contrast, the top TFs obtained based on the percentage of documented regulatory associations targeting the up-regulated genes in the quinine dataset are associated with more general cellular responses, and none of them was found up-regulated in response to quinine stress (dos Santos et al., 2009). Even when compared to the TF enrichment tool described above, TFRank highlights a higher number of glucose limitation TFs and proposes the network framework in which they operate. This highlights the importance of using the different ranking methods as complementary tools of analysis.

2.4 - Search for Regulatory Associations

The query "Find regulatory associations" may be used to group genes according to their documented and potential co-regulations. This query displays all the information obtained using the several options of "Group genes by TF" functionality in a single table, allowing the comparison of the potential and documented regulons deduced for an array gene list. To save space, this comparison is exemplified in Table 5 just for Pdr1p, although the whole list of the implicated TFs appears when using this functionality.

Table 5 clearly shows that there is a significant discrepancy between the genes considered documented targets of Pdr1p and those considered potential targets. The same is observed for other TFs. The observed differences may be due to the fact that: i) the documented targets of each TF may include indirect targets; ii) the existence of the TF binding site in the promoter region of a gene does not necessarily make it a target of the corresponding TF; iii) there may be gene targets and binding sites for a specific TF that are not yet described in the literature or included in the database. For example, the HXK1, SCW11, MET17, FMP43, FRE4, DSE4, COS10 and REV1 genes, all confirmed targets of Pdr1-3p, do not possess any Pdr1p binding site in their promoter regions. These genes may be indirect targets of Pdr1p, or their promoter region may include a binding site for this TF that is not yet defined (or introduced in this database).

Table 5 - Regulatory Associations

Transcription Factor	Documented Regulated Genes - Ref	Potential Regulated Genes
Pdr1p Zinc cluster protein that is a master regulator involved in recruiting other zinc cluster proteins to pleiotropic drug response elements (PDREs) to fine tune the regulation of multidrug resistance genes

Notice that within the query "Find regulatory associations", there are two search options, Any Transcription Factors to Any Gene and All Transcription Factors to Any Gene. The former option was used in the previous analysis to search for regulatory associations. The latter option searches for regulatory associations in which all the input TFs control at least one of the input genes. This option enables the identification of groups of genes whose transcription is potentially under the simultaneous control of a number of different transcription factors. For instance, we may search for the regulatory associations between the PDR-related TFs Pdr1p, Pdr3p, Pdr8p and Yrr1p and the de Risi gene list using the All Transcription Factors to Any Gene option:

Table 6 - Regulatory associations

Transcription Factor	Documented Regulated Genes - Ref	Potential Regulated Genes
Pdr1p Pdr3p Yrr1p Pdr8p

The results in Table 6 reveal that there are four documented gene targets shared by Pdr1p, Pdr3p, Yrr1p and Pdr8p, although there is no potential gene target common to all four TFs within the list under examination. The possibility that these TFs act together in the up-regulation of their overlapping targets has been examined to some extent. For instance, Pdr1p and Pdr3p can act as homo- or heterodimers (Mamnun et al., 2002) and the transcriptional regulation of Yrr1p or Pdr8p was found to be dependent on Pdr1p or Pdr3p (Hikkel et al., 2003; Akache et al., 2004).
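
The difference between the two search options of "Find regulatory associations" can be sketched as set operations. The code below is an interpretation of the option descriptions given in this section, not the YEASTRACT implementation; the function names and the regulation data are hypothetical.

```python
def any_tf_to_any_gene(targets_by_tf, tfs, genes):
    """For each input TF, the input genes it regulates."""
    wanted = set(genes)
    return {tf: targets_by_tf.get(tf, set()) & wanted for tf in tfs}

def all_tfs_to_any_gene(targets_by_tf, tfs, genes):
    """Input genes regulated by every one of the input TFs."""
    shared = set(genes)
    for tf in tfs:
        shared &= targets_by_tf.get(tf, set())
    return shared

# Hypothetical regulation data, for illustration only.
targets_by_tf = {
    "TF1": {"geneA", "geneB", "geneC"},
    "TF2": {"geneB", "geneC", "geneD"},
}
genes = ["geneA", "geneB", "geneC", "geneD"]

per_tf = any_tf_to_any_gene(targets_by_tf, ["TF1", "TF2"], genes)
shared = all_tfs_to_any_gene(targets_by_tf, ["TF1", "TF2"], genes)
print(per_tf)
print(sorted(shared))
```

Running the same functions once on documented and once on potential associations would give the two gene columns of a table laid out like Table 6.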

2.5 - References

Akache, B., and Turcotte, B., 2002, New regulators of drug sensitivity in the family of yeast zinc cluster proteins. J Biol Chem 277: 21254-21260.
Akache, B., MacPherson, S., Sylvain, M. A., Turcotte, B., 2004, Complex interplay among regulators of drug resistance genes in S. cerevisiae. J. Biol. Chem. 279: 27855-27860.
de Risi, J., van den Hazel, B., Marc, P., Balzi, E., Brown, P., Jacq, C., Goffeau, A., 2000, Genome microarray analysis of transcriptional activation in multidrug resistance yeast mutants. FEBS Letters 470: 156-160.
dos Santos, S.C., Tenreiro, S., Palma, M., Becker, J. and Sá-Correia, I., 2009, Transcriptomic profiling of the Saccharomyces cerevisiae response to quinine reveals a glucose limitation response attributable to drug-induced inhibition of glucose uptake. Antimicrob Agents Chemother, 53, 5213-5223.
Gonçalves, J.P., Francisco, A.P., Mira, N.P., Teixeira, M.C., Sá-Correia, I., Oliveira, A.L. and Madeira, S.C., 2011, TFRank: network-based prioritization of regulatory associations underlying transcriptional responses, Bioinformatics, 27, 3149-3157.
Hikkel, I., Lucau-Danila, A., Delaveau, T., Marc, P., Devaux, F., Jacq, C., 2003, A general strategy to uncover transcription factor properties identifies a new regulator of drug resistance in yeast. J Biol Chem 278(13): 11427-11432.
Le Crom, S., Devaux, F., Marc, P., Zhang, X., Moye-Rowley, W. S., Jacq, C., 2002, New insights into the pleiotropic drug resistance network from genome-wide characterization of the YRR1 transcription factor regulation system. Mol Cell Biol 22: 2642-2649.
Mamnun, Y. M., Pandjaitan, R., Mahé, Y., Delahodde, A., Kuchler, K., 2002, The yeast zinc finger regulators Pdr1p and Pdr3p control pleiotropic drug resistance (PDR) as homo- and heterodimers in vivo. Mol. Microbiol. 46: 1429-1440.
Onda, M., Ota, K., Chiba, T., Sakaki, Y., Ito, T., 2004, Analysis of gene network regulating yeast multidrug resistance by artificial activation of transcription factors: involvement of Pdr3 in salt tolerance. Gene 332: 51-59.
Van den Hazel, H. B., Pichler, H., do Valle Matta, M. A., Leitner, E., Goffeau, A., and Daum, G., 1999, PDR16 and PDR17, two homologous genes of Saccharomyces cerevisiae, affect lipid biosynthesis and resistance to multiple drugs. J Biol Chem 274: 1934-1941.

Example 3: Search for a DNA motif within known TF binding sites and promoter regions

The search for over-represented consensus sequences or DNA motifs in the promoter regions of co-regulated genes, revealed by global expression analysis, may contribute to the identification of known or new transcription associations underlying the yeast response under study. YEASTRACT provides the "Search by DNA Motif" option to facilitate this analysis. This is exemplified below for the motif CGGGC, found to be over-represented in the upstream regions of the genes up-regulated in yeast cells under glucose- or ethanol-limited growth (Wu et al., 2004).

3.1 - Search for a DNA Motif within known TF Binding Sites

This query lets the user check whether a DNA motif has already been documented as the binding site for a specific TF, that is, whether a newly identified motif matches perfectly, is contained in, or contains a previously described TF binding site.

The result of this query shows that the CGGGC motif has no exact match to any of the 284 different TF binding sites described in the literature and compiled in YEASTRACT, but is contained in the Cup2p binding site, in its most degenerate region (HTHNNGCTGD; Beaudoin and Labbé, 2001). This result suggests that the examined motif does not correspond to any of the TF binding sites described so far.
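
The three outcomes mentioned above (exact match, contained in, contains) can be illustrated with the standard IUPAC degenerate nucleotide codes. The sketch below is one possible way to make the comparison, not the YEASTRACT algorithm: the function names are hypothetical, only one strand is considered, and a symbol is taken to fit another when its set of bases is a subset of the other's.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def fits(inner, outer):
    """True if every symbol of `inner` is allowed by the aligned symbol of `outer`."""
    return all(set(IUPAC[a]) <= set(IUPAC[b]) for a, b in zip(inner, outer))

def find_inside(short, long_):
    """Offsets (0-based) at which `short` fits inside `long_`."""
    width = len(short)
    return [i for i in range(len(long_) - width + 1)
            if fits(short, long_[i:i + width])]

def compare(motif, site):
    """Classify a motif against a known binding site (same strand only)."""
    if len(motif) == len(site):
        return "matches site" if fits(motif, site) else "no match"
    if len(motif) < len(site):
        return "contained in site" if find_inside(motif, site) else "no match"
    return "contains site" if find_inside(site, motif) else "no match"

print(compare("CGGGC", "HTHNNGCTGD"))
print(find_inside("CGGGC", "HTHNNGCTGD"))
```

With the Cup2p site quoted above, the motif aligns over the HNNGC stretch, so three of its five positions are matched only by degenerate symbols, in line with the remark that the overlap falls in the most degenerate region of the site.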

3.2 - Search for Genes having a DNA Motif in their Promoter Regions

This query searches for occurrences of a new DNA motif in the promoter regions of all genes present in the yeast genome. The result of this query shows that the CGGGC motif occurs in the promoter region of 2169 of the approximately 6000 yeast genes. In the promoter region of 567 of these genes it occurs at least twice. This information, together with the tests of statistical significance (Wu et al., 2004), may be useful to anticipate the biological significance of a newly proposed consensus.
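
Counting motif occurrences per promoter, as in the figures quoted above, can be sketched as follows. This is an illustration only: the function names are hypothetical, the promoter fragments are made up, and searching both strands is an assumption rather than a documented property of the query.

```python
def reverse_complement(seq):
    """Reverse complement of a plain A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_occurrences(motif, promoter):
    """Count (possibly overlapping) occurrences of a plain motif on both strands."""
    promoter = promoter.upper()
    total = 0
    # A set avoids counting a palindromic motif twice.
    for pattern in {motif, reverse_complement(motif)}:
        start = promoter.find(pattern)
        while start != -1:
            total += 1
            start = promoter.find(pattern, start + 1)
    return total

# Made-up promoter fragments (hypothetical names and sequences).
promoters = {
    "geneX": "AACGGGCTTACGT",
    "geneY": "CGGGCAAGCCCGTT",
    "geneZ": "ATATATATAT",
}
counts = {gene: count_occurrences("CGGGC", seq) for gene, seq in promoters.items()}
with_motif = [gene for gene, n in counts.items() if n >= 1]
at_least_twice = [gene for gene, n in counts.items() if n >= 2]
print(counts, with_motif, at_least_twice)
```

A short motif such as CGGGC is expected to occur by chance in many promoters, which is why the raw counts are best read together with a test of statistical significance.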

3.3 - References

Beaudoin, J., Labbé, S., 2001, The Fission Yeast Copper-sensing Transcription Factor Cuf1 Regulates the Copper Transporter Gene Expression through an Ace1/Amt1-like Recognition Sequence, J Biol Chem 276: 15472-15480.
Wu, J., Zhang, N., Hayes, A., Panoutsopoulou, K., Oliver, S. G., 2004, Global analysis of nutrient control of gene expression in Saccharomyces cerevisiae during growth and starvation, Proc Natl Acad Sci 101: 3148-3153.