This is the second installment in my new series on short, understandable, powerful evidences for evolution. The first installment is here:
Installment 1: Endogenous retroviral sequences
This week, we'll be considering a very interesting way that the relatedness of two species can be measured. My main source for this week's material is here:
http://www.talkorigins.org/faqs/comdesc/section4.html#protein_redundancy
To start off, let's review a few basic points.
- DNA contains instructions for creating proteins, from which all life is constructed.
- Proteins are long chains of amino acids. There are often more than 300 amino acids in one protein (reference).
- After the amino acids have been chained together, they fold up into a three-dimensional shape that performs a specific biological function. The protein is that three-dimensional shape.
There are certain proteins that are present in all life. These are called ubiquitous proteins. Everything has them - frogs, giraffes, bacteria, birds - everything. These proteins are good candidates for studying the relatedness of species because they perform basic functions shared by all life, rather than functions specific to any one species. For example, humans and chimpanzees look somewhat similar and do somewhat similar things, so we might expect them to have genes that code for similar proteins. By instead studying ubiquitous proteins - proteins that all life has in common - we can ensure that we are not studying proteins that have something to do with specifically human or chimp functionality.
Another important fact about proteins is that many different combinations of amino acids can fold up into the same protein. That is to say, there are many different sequences of amino acids that will fold together into the same shape and perform the same biological function. This is called functional redundancy. You might say there are many recipes for the same basic protein.
Since proteins are nothing more than strings of amino acids, we can calculate a "percent different" value for any two proteins by lining them up and seeing which amino acids are the same and which are different. For example, let's imagine a protein that is 10 amino acids long, where we'll use letters of the alphabet to represent the different amino acids. Say we find a protein performing the exact same function in two different creatures. We sequence the proteins and we get two slightly different chains of amino acids:
Species 1: ABBEHGTAAA
Species 2: ABCEHGTABA
             x     x
Notice that the chains are the same except in two locations (the third and ninth positions, marked with an x). Two differences out of a total of ten amino acids means that the proteins are 20% different, even though they are the same protein. This is where it starts getting interesting.
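For anyone who wants to check the arithmetic, here is a minimal sketch of the "percent different" calculation in Python. The function name `percent_different` is my own invention for illustration, and the sketch assumes the two chains are the same length and already lined up, just like the made-up example above. Real sequence comparisons also have to deal with insertions and deletions, which this ignores.

```python
def percent_different(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two aligned sequences differ.

    Assumes both sequences are already lined up and equal in length
    (hypothetical helper, not taken from the source article).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions where the two amino acids are not the same.
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


species_1 = "ABBEHGTAAA"
species_2 = "ABCEHGTABA"
print(percent_different(species_1, species_2))  # 20.0
```

The two mismatches are at the third and ninth positions, which gives 2 out of 10, or 20%.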
The protein we'll be discussing today is called cytochrome c. This protein is absolutely essential for life - no creature can live without it. Interestingly, this particular protein can take an incredibly large number of different forms. For example, cytochrome c in yeast is 40% different from cytochrome c in humans. How do we even know we're talking about the same protein? Because when we take the cytochrome c out of yeast and insert cytochrome c from a human, it works as expected. But it's not just human cytochrome c that works in yeast:
The cytochrome c genes from tuna (fish), pigeon (bird), horse (mammal), Drosophila fly (insect), and rat (mammal) all function in yeast that lack their own native yeast cytochrome c (Clements et al. 1989; Hickey et al. 1991; Koshy et al. 1992; Scarpulla and Nye 1986).
Notice that the cytochrome c from several different classes of animal all worked in yeast. This demonstrates that there are many possible varieties of cytochrome c which are all functionally equivalent. As it turns out, analysis of this particular protein has revealed that the majority of its amino acids can be swapped out for other amino acids and still make a functional cytochrome c protein.
The number of different possible functional versions of cytochrome c is staggering:
Hubert Yockey has done a careful study in which he calculated that there are a minimum of 2.3 x 10^93 possible functional cytochrome c protein sequences, based on these genetic mutational analyses (Hampsey et al. 1986; Hampsey et al. 1988; Yockey 1992, Ch. 6, p. 254). For perspective, the number 10^93 is about one billion times larger than the number of atoms in the visible universe. Thus, functional cytochrome c sequences are virtually unlimited in number.
Since there is an outrageous number of possible functional cytochrome c proteins, if species are not related in an evolutionary sense, we should expect to find random variations in their cytochrome c protein sequences. We should not see any patterns of relatedness in the cytochrome c sequences of different species.
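To get a feel for what "random variations" would look like, here is a toy sketch in Python. It rests on a big simplifying assumption of my own: that every position is picked independently and with equal probability from the 20 standard amino acids. Real functional cytochrome c sequences are more constrained than that, so treat this only as a rough illustration. The function names and the sequence length of 100 are hypothetical, chosen just for the example.

```python
import random

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def expected_percent_different(alphabet_size: int) -> float:
    """Expected percent difference between two independent, uniformly
    random sequences: each position matches with probability 1/alphabet_size."""
    return 100.0 * (alphabet_size - 1) / alphabet_size


def random_sequence(length: int, rng: random.Random) -> str:
    """Build a sequence by picking each amino acid uniformly at random."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def simulated_percent_different(length: int, rng: random.Random) -> float:
    """Generate two unrelated random sequences and compare them position by position."""
    seq_a = random_sequence(length, rng)
    seq_b = random_sequence(length, rng)
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / length


rng = random.Random(0)  # fixed seed so the run is repeatable
print(expected_percent_different(len(AMINO_ACIDS)))  # 95.0
print(simulated_percent_different(100, rng))
```

Under this toy model, two unrelated sequences would disagree at roughly 19 out of every 20 positions, with no pattern to which species end up closer to which.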
On the other hand, if species are related in an evolutionary sense, then we should expect to find that cytochrome c protein sequences are more similar for closely related species, and less similar as relatedness decreases. Since other evidence leads us to believe that chimps and humans (for example) are closely related, we will go way out on a limb and predict that human and chimp cytochrome c should be very similar, whereas yeast cytochrome c should be very different from both.
Those are the predictions. What are the observations?
Humans and chimpanzees have identical cytochrome c sequences. Further, human and chimpanzee cytochrome c differs by about 10 amino acids from that of all other mammals. All mammals, in turn, have cytochrome c that is similar (within about 10%) to that of any other mammal, but shows a greater degree of difference from that of any non-mammal. As it turns out, for any species, its cytochrome c sequence is extremely similar to that of other species that are phylogenetically related to it. As the distance between two species increases, the sequences become increasingly different. Human cytochrome c and yeast cytochrome c are, as mentioned, about 40% different.
The only observed mechanism that can account for this pattern is hereditary relationship.
So that's it for this week's installment. Hope you enjoyed it!
SNG