This web page was produced as an assignment for Genetics 564, an undergraduate course at UW-Madison
Protein domains
Protein domains are specific structural and/or functional units of a protein. Each domain is usually responsible for a specific function that contributes to the overall role of the protein within a cell. [1] Proteins that contain the same domain may be derived from a common ancestor and therefore fall into the same protein family, but proteins may also carry the same domain and yet be part of different families. [2] Knowing which domains are found within a protein can give insight into the potential functions of a protein of interest. Three different databases, namely Pfam, SMART, and PROSITE, can be used to identify protein domains.
THE TUB DOMAIN
The TUB protein contains a single domain, the tub domain (PF01167). This domain is arranged as a 12-stranded, all-antiparallel, closed beta-barrel that surrounds a central alpha helix. The helix lies at the extreme carboxyl terminus of the protein and forms most of the hydrophobic core [3]. The tub domain is a member of the Tubby C superfamily (CL0395), which contains the scramblase protein family, the Tub family, and DUF567, a family of plant and bacterial proteins of unknown function. Members of this superfamily are described as membrane-tethered transcription factors, although this cannot yet be said of DUF567, whose function is unknown. [4]
Three different protein databases, each using a slightly different algorithm, all found the same tub domain within the TUB protein, as shown below. The SMART database also highlighted a number of low-complexity regions, shown as small purple boxes. The low-complexity region closest to the N-terminus contains the nuclear localization signal that was previously identified in the TUB protein [5]. The fact that the TUB protein carries only this single domain suggests that the domain may be responsible for much of the protein's function within the cell. This is consistent with the hypothesis that TUB acts as a membrane-bound transcription factor, as that is the characteristic function of the tub domain.
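To give a feel for what a low-complexity region is, the sketch below flags stretches of a sequence whose amino acid composition is dominated by only a few residues. It is a simplified illustration only and is not the algorithm SMART uses; the function names, the window size of 12, the entropy cutoff of 2.2 bits, and the toy sequence are all assumptions made up for this example.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def low_complexity_windows(seq: str, size: int = 12, max_entropy: float = 2.2):
    """Return merged (start, end) regions, 0-based and end-exclusive,
    covered by windows whose entropy is at or below max_entropy."""
    regions = []
    for start in range(len(seq) - size + 1):
        if shannon_entropy(seq[start:start + size]) <= max_entropy:
            end = start + size
            if regions and start <= regions[-1][1]:
                # Window overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Toy sequence invented for illustration; this is NOT the TUB sequence.
toy = "MKTAYIAKQRQISFVKSHFSRQ" + "SSSSSSGSSSSS" + "LEERLGLIEVQAPILSRVGDGT"
print(low_complexity_windows(toy))
```

A window made of a single repeated residue has an entropy of 0 bits, while a window of 12 different residues has an entropy of about 3.58 bits, so lowering the cutoff makes the scan stricter.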
PROTEIN MOTIFS
Protein motifs are short sequences of amino acids whose signatures can often be used as tools for the prediction of protein function. [6] While motifs are often found within domains, they can also be found outside of known domains, in the intervening amino acid sequences. The conservation of motifs across species can provide insight into regions of a protein that may be functionally significant, especially when they lie outside of known protein domains. The MEME tool, part of the MEME Suite, can be used to identify conserved protein motifs.
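One common way to write down a motif signature is a PROSITE-style pattern, which can be translated into a regular expression and scanned against a sequence. The sketch below shows that idea under stated assumptions: the function names and the toy sequence are invented for this example, and the converter handles only the basic pattern syntax. MEME itself works differently, by fitting a statistical model rather than matching a fixed pattern [7].

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regex string."""
    pattern = pattern.rstrip(".")  # patterns conventionally end with a period
    parts = []
    for element in pattern.split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("(", "{").replace(")", "}")    # x(2,4) -> x{2,4}
        element = element.replace("x", ".")                      # x = any residue
        element = element.replace("<", "^").replace(">", "$")    # terminus anchors
        parts.append(element)
    return "".join(parts)


def find_motif(pattern: str, seq: str):
    """Return (1-based start, matched text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(seq)]


# N-{P}-[ST]-{P} is the PROSITE pattern for an N-glycosylation site.
# The sequence is a toy example invented for illustration.
print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(find_motif("N-{P}-[ST]-{P}.", "MKNGSAANPTL"))
```

In the toy sequence, the first asparagine is followed by a non-proline residue and then a serine, so it matches; the second asparagine is followed directly by proline, so it does not.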
TUB MOTIFS
MEME was used to identify protein motifs that are highly conserved among TUB homologues. Three motifs were identified, all of which fall within the tub domain. The first motif contains the site of the G-to-T transversion that is responsible for the tubby mouse phenotype (Figure 1). That this motif was found in all homologues tested points to the importance of this site within the protein and is consistent with the loss-of-function phenotype observed upon its disruption. The second motif was also found in all homologues included on this site and is located at the far N-terminal end of the tub domain (Figure 2). The third motif, located just N-terminal to the first motif, was slightly less conserved: it was found in all homologues except the one from Arabidopsis (Figure 3). The conservation of these motifs across a wide variety of species points to their importance and potential function within the tub domain. [7]
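How strongly each position of a motif is conserved can be put into numbers by computing the information content of each column across the aligned motif instances, which is the quantity a sequence logo displays. The sketch below is a minimal version under stated assumptions: the function names are invented, no small-sample correction is applied, and the four-residue instances are toy strings, not the TUB motifs reported by MEME.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Information content in bits: log2(alphabet size) minus column entropy."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def motif_conservation(instances):
    """Per-position information content for equal-length, ungapped instances."""
    if len({len(s) for s in instances}) != 1:
        raise ValueError("all motif instances must have the same length")
    # zip(*instances) walks the alignment column by column.
    return [column_information("".join(col)) for col in zip(*instances)]


# Toy motif instances, one per hypothetical homologue (invented for illustration).
toy_instances = ["FLKR", "FLRR", "FIKR"]
for position, bits in enumerate(motif_conservation(toy_instances), start=1):
    print(position, round(bits, 2))
```

A column in which every homologue carries the same residue reaches the maximum of log2(20), about 4.32 bits, and any variation in a column lowers its score.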
REFERENCES
[1] EMBL-EBI. What are protein domains? Accessed 2 March 2014.
[2] EMBL-EBI. What are protein families? Accessed 2 March 2014.
[3] Pfam. Family: Tub (PF01167). Accessed 24 February 2014.
[4] Pfam. Clan: Tubby C (CL0395). Accessed 24 February 2014.
[5] Santagata, S., Boggon, T.J., Baird, C.L., Gomez, C.A., Zhao, J., Shan, S., Myszka, D.G., & Shapiro, L. (2001). G-Protein Signaling Through Tubby Proteins. Science, 292(5524).
[6] D'haeseleer, P. (2006). How does DNA sequence motif discovery work? Nature Biotechnology, 24, 959-961.
[7] Bailey, T.L., & Elkan, C. (1994). Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36. AAAI Press, Menlo Park, California.
Site created by Rachael Baird.
Genetics 564 Assignment, Spring 2014
University of Wisconsin-Madison
Last Updated: 5-10-14