THE "RANDOM" PROTEINS ARGUMENT

by David C. Wise


The faith that even one protein arose by chance is tremendous. Lets look at statistics. Proteins are made up of chains of amino acids, just like a train is made up of box cars. A chain of box cars makes up a train. A chain of amino acids makes up a protein. Humans have 20 different types of amino acids that make up our proteins, and the average human protein is 400 amino acids long. Remember, the arrangement of these amino acids is crucial to the function of the protein. If it is the proper arrangement it does its job, if the order is mixed up, it is worthless chemical junk.

Imagine many box cars at a train station, and these box cars are made up of twenty different colors. The owner of the station tells you he wants a train to be 400 box cars long, and you are to pick the combination of colored box cars, but if it is not the order he has in mind (and he didn't tell you it) he will fire you.

What are the odds you will get the box cars in the right order? They are the same odds the amino acids will align themselves by chance to make one protein in you. The odds are 20 to the 400th power! This is the same as 10 to the 520th power, that is a 1 followed by 520 zeros! You have better odds of winning California Super Lotto every week for 11 years than the odds of one protein in your body having the amino acids being properly aligned by chance. The odds are really much worse because the amino acids must be left handed, they must form a chain "in series," no parallel branching, their shape (proteins are wound up like a ball of yarn) is crucial, you need an oxygen free environment, etc etc. And remember, this is for just one protein. Your body has countless trillions of proteins.

The model that a brilliant designer made proteins requires much less faith than to trust random chance and natural processes.

(from AOLCREAT.DOC by Bill Morgan)

We've all heard that argument before.


This is a reprint of a short exchange between Bill Morgan and myself. Bill is the author of a cyber-tract, AOLCREAT.DOC, which on multiple occasions in 1996 got spammed across several newsgroups, including a couple of the professional football groups. He also authored a comic-book tract called "Weird Science," a crude rip-off of Chick Pubs' infamous anti-evolution "Big Daddy?".


 >  I know you know too much about protein to honestly think that teh
 >  materilaistic arguments of their origin are stronger than the argumet
 >  they are the result of design, plan and purpose.

At least I know enough about proteins to see the errors in your presentation of standard creation science doctrine concerning proteins. It's a pity that your "open and testing mind" never scrutinizes creation science claims.

From AOLCREAT.TXT:

"Life requires many things. Long amino acids chains make proteins...chains in the proper order and shape. Miller's experiment did NOT produce any chains. Life also requires DNA, RNA and never has any experiment produced DNA or RNA from base materials. Never have chains of DNA or RNA been produced. A cell membrane has never been produced.

"The faith that even one protein arose by chance is tremendous. Lets look at statistics. Proteins are made up of chains of amino acids, just like a train is made up of box cars. A chain of box cars makes up a train. A chain of amino acids makes up a protein. Humans have 20 different types of amino acids that make up our proteins, and the average human protein is 400 amino acids long. Remember, the arrangement of these amino acids is crucial to the function of the protein. If it is the proper arrangement it does its job, if the order is mixed up, it is worthless chemical junk.

"Imagine many box cars at a train station, and these box cars are made up of twenty different colors. The owner of the station tells you he wants a train to be 400 box cars long, and you are to pick the combination of colored box cars, but if it is not the order he has in mind (and he didn't tell you it) he will fire you.

"What are the odds you will get the box cars in the right order? They are the same odds the amino acids will align themselves by chance to make one protein in you. The odds are 20 to the 400th power! This is the same as 10 to the 520th power, that is a 1 followed by 520 zeros! You have better odds of winning California Super Lotto every week for 11 years than the odds of one protein in your body having the amino acids being properly aligned by chance. The odds are really much worse because the amino acids must be left handed, they must form a chain "in series," no parallel branching, their shape (proteins are wound up like a ball of yarn) is crucial, you need an oxygen free environment, etc etc. And remember, this is for just one protein. Your body has countless trillions of proteins.

"The model that a brilliant designer made proteins requires much less faith than to trust random chance and natural processes."

First, we both know (now that you have read some actual protein sequences in my HUMAN.CMP file) that your assumption that every single amino acid in a protein is specified, so that any change in the specific sequence would destroy the protein's functionality, is false ("Remember, the arrangement of these amino acids is crucial to the function of the protein. If it is the proper arrangement it does its job, if the order is mixed up, it is worthless chemical junk."). For many amino acid positions it is the class of amino acid (eg, hydrophilic, hydrophobic, charged, uncharged) and not the amino acid itself that is important.

In reality, only some positions on a protein require a specific amino acid, others require any of a few different amino acids, and many will accept practically any amino acid. Indeed, it is precisely this fact that allows us to compare the differences in the same FUNCTIONAL protein in different species and find the degree of difference between more closely related species to be less than that between less closely related species.

At the same time that they claim that any change in that specific amino acid sequence would destroy a protein's functionality, creation science is well aware of the fact that the same FUNCTIONAL protein can have different sequences (honestly, would a little consistency be too much to expect of creation science?) and of the patterns of relatedness that those differences show. Which is why there are so many false creation science claims of distantly related species having more similar proteins than more closely related ones. Like Walter Brown's blatantly deceptive rattlesnake protein claim. And Duane Gish's infamous bullfrog protein (which claim he made on national TV, then refused to produce his source, except to let slip at one point that it was based on a joke he had heard, and which thereafter caused similarly outrageous creation science claims to be met with the cry of "Bullfrog!" -- I have a file which tells the entire story, if you'd like to read it).

Rather than bandying about a hypothetical protein, let's look at a specific case. In the class notes of Frank Awbrey & William Thwaites' creation/evolution class at UCSD (the Institute for Creation Research conducted half the lectures and Awbrey & Thwaites the other half), they give the example of a calcium binding site with 29 amino acid positions: only 2 positions (7%) require specific amino acids, 8 positions (28%) can be filled by any of 5 hydrophobic amino acids, 3 positions (10%) can be filled by any one of 4 other amino acids, 2 positions (7%) can be filled with two different amino acids, and 14 of the positions (48%) can be filled by virtually any of the 20 amino acids.

The sequence of the 15 specified positions is:
L* L*L* L*D D* D*G* I*D* EL* L*L* L*

 Where:
    L* = hydrophobic - Leu, Val, Ilu, Phe, or Met
       Prob = (5/20)^8

    D* = (a) Asp, Glu, Ser, or Asn
       Prob = (4/20)^3
      OR (b) theoretically also Gls or Thr
       Prob = (6/20)^3

    D = Asp
       Prob = (1/20)

    E = Glu
       Prob = (1/20)

    G* = Gly or Asp
       Prob = (2/20)

    I* = Ilu or Val
       Prob = (2/20)

    Remaining positions = any of 20
       Prob = (20/20)^14 = 1^14 = 1

Total Prob = Prob(L*) * Prob(D*) * Prob(D) * Prob(E) * Prob(G*) * Prob(I*) = (a) 3.05 x 10^(-12) OR (b) 10.3 x 10^(-12)

Your own calculation of the probability of a functional order coming up (ie, the standard creation science method) would be: (1/20)^29 = 1.86 x 10^(-38).

Comparing the lower probability to yours shows it to be 1.64 x 10^26 times greater.

This invalidates your colored-box-car analogy as it stands (to correct it, you would need to allow for a variety of different combinations) and it invalidates your probability calculations.
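The arithmetic behind that comparison can be checked with a few lines of Python. This is only a sketch: the names SPECIFIED, FREE_POSITIONS and site_probability are made up for illustration, and the residue counts are read straight from the legend above. Exact fractions are used so that no rounding creeps in before the final conversion to a decimal.

```python
from fractions import Fraction

ALPHABET = 20        # amino acids to choose from
FREE_POSITIONS = 14  # positions that accept virtually any amino acid

# (label, acceptable amino acids per position, number of such positions)
SPECIFIED = [
    ("L*", 5, 8),  # hydrophobic: Leu, Val, Ilu, Phe, or Met
    ("D*", 4, 3),  # case (a); case (b) allows 6 instead of 4
    ("D", 1, 1),
    ("E", 1, 1),
    ("G*", 2, 1),
    ("I*", 2, 1),
]

def site_probability(d_star_choices):
    """Chance that random residues satisfy every specified position."""
    p = Fraction(1)
    for label, allowed, count in SPECIFIED:
        if label == "D*":
            allowed = d_star_choices
        p *= Fraction(allowed, ALPHABET) ** count
    return p  # the free positions each contribute a factor of 20/20 = 1

total_positions = sum(count for _, _, count in SPECIFIED) + FREE_POSITIONS
prob_a = site_probability(4)
prob_b = site_probability(6)
prob_every_position_fixed = Fraction(1, ALPHABET) ** total_positions

print(f"positions:           {total_positions}")
print(f"case (a):            {float(prob_a):.3g}")
print(f"case (b):            {float(prob_b):.3g}")
print(f"every position fixed: {float(prob_every_position_fixed):.3g}")
print(f"ratio (a) / fixed:   {float(prob_a / prob_every_position_fixed):.3g}")
```

Changing the counts in SPECIFIED shows how quickly the odds improve as more positions tolerate more than one amino acid.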

The second problem lies in the assumptions of your protein model, exemplified in your statement: "[The odds for success in the box car analogy] are the same odds the amino acids will align themselves by chance to make one protein in you." Whatever is that supposed to have to do with evolution? What your model describes is CREATION EX NIHILO, not evolution.

Do you believe that proteins are formed by "aligning themselves by chance"? That is not how life works. I will not patronize you by describing how cells produce proteins based on DNA base sequences transcribed onto RNA; you should know about that already and doubtless do.

An evolutionary accounting for modern proteins would be that they had EVOLVED through their "descent with modification" (the basic definition for the "fact of evolution") from ancestral proteins; ie, that the genes for modern proteins were inherited from a long line of ancestors and had undergone changes along the way. The evolutionary account does not depend upon modern proteins being created ex nihilo, whereas the creationist account does. Hence your probability arguments apply to creationism and not to evolution, which uses an entirely different model to which different probabilities apply, as examined in my MONKEY program (attached).

Rather, your complaint is against Abiogenesis. Please read my discussion in WEIRDSCI.WP, so as to keep the bandwidth down here.

 >  "Life requires many things.  Long amino acids chains make proteins...
 >  chains in the proper order and shape.  Miller's experiment did NOT
 >  produce any chains.

No, the Urey-Miller experiment only produced amino acids. But Sidney Fox's experiments showed that when heated, amino acids formed quite readily into chains, some of which were observed to possess catalytic properties. From that point, all we would need is the ability to replicate these thermal proteins for evolutionary processes to come into play.

 >  The faith that even one protein arose by chance is tremendous.

Fox's experiments showed that these thermal proteins formed quite readily, so the probability is extremely high.

Ascribe not to chance that which is deterministic.

 >  honestly......your intelligently planned and purposeful program
 >  gives you"faith" that amino acids formed chains to produce parts
 >  of a cell (don't forget you need DNA RNA, a cell membrane too)
 >  and these cells redproduced into skin cells, nerve cells blood
 >  cells etc to make a living organism?

There you go again, trying to discredit that which you know absolutely nothing about. You don't know what my program does, nor how it does it, and yet you immediately try to discount it. Is this your living example of "[taking the path] of testing and examining with an open mind"?

Well, Bill, *I* did take the path of testing and examining, not only with an open mind, but also with a critical and skeptical mind. Rather than accept a claim unquestioningly on blind faith, I dared to say that I didn't believe what I had just read, so I tested it and examined the results. MONKEY is the product of that testing and examining. And as a result of that testing and examining, I am very much impressed with the power, the speed, and the certainty (ie, ability to converge rapidly) of Natural Selection (even you said "Natural Selection is a true concept.").

In Chapter 3 of "The Blind Watchmaker" (did you ever read it as you had promised, Bill?), Richard Dawkins addressed the old analogy (by Eddington, I believe) of an infinite number of monkeys at an infinite number of typewriters pounding away continuously for an infinite amount of time and thus being able to produce Hamlet. Dawkins toyed with the probability of randomly producing just one line out of Hamlet, "Methinks it is like a weasel" (two characters are looking at the shapes of clouds) and came up with an astronomical number for the odds against succeeding, such that a computer making a million attempts per second would require millions of billions of years to succeed (my calculation).
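The size of that number is easy to reproduce. The sketch below assumes a 27-symbol alphabet (26 capital letters plus the space) and the 28-character target line; both are assumptions for illustration, and a different alphabet changes the exponent but not the conclusion. Under these assumptions the expected wait comes out far beyond even millions of billions of years.

```python
# Assumptions for illustration: 26 capital letters plus the space character,
# and one million completely random attempts per second.
ALPHABET_SIZE = 27
TARGET = "METHINKS IT IS LIKE A WEASEL"
TRIES_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Every position is drawn independently, so the number of equally likely
# strings is ALPHABET_SIZE raised to the length of the target.
possible_strings = ALPHABET_SIZE ** len(TARGET)

# With success chance 1/N per attempt, the expected number of attempts is N.
expected_years = possible_strings / TRIES_PER_SECOND / SECONDS_PER_YEAR

print(f"target length:    {len(TARGET)}")
print(f"possible strings: {possible_strings:.3e}")
print(f"expected years:   {expected_years:.3e}")
```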

But then that is not how life would do it. In life, a parent produces a number of offspring that are almost exactly like him, yet slightly different. Then the most fit survive to become the parents of the next generation, and so on. So he wrote a program, which he called WEASEL, that started with a random string, produced copies of that string which differed only by one randomly selected letter in a randomly selected position. Then the fittest string (measured by its relative proximity to the target string) became the "parent" of the next generation. He wrote the program in interpreted BASIC, started it, left for lunch, and it had the answer by the time he returned. Indeed, it succeeded over and over again, without fail.

Well, I just could not believe that! So I wrote MONKEY (named after the aforementioned simian steno pool) in Turbo Pascal (my language of choice at that time; at present it would have been written in C++, though I'm returning to Pascal for Delphi). It produced the string (the alphabet in alphabetical order) within a minute! I ran it over and over again and it succeeded over and over again -- repeatedly, consistently, without fail.
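For readers who want to try this themselves, here is a minimal Python sketch of the procedure as described above: copies that differ from the parent in one randomly chosen position, with the copy closest to the target becoming the next parent. It is not the MONKEY source (that is MONKEY.PAS) and not Dawkins' WEASEL; the function names, the brood size of 100 and the generation cap are all assumptions chosen for illustration.

```python
import random
import string

def fitness(candidate, target):
    """Number of positions at which candidate already matches target."""
    return sum(c == t for c, t in zip(candidate, target))

def cumulative_selection(target, alphabet, offspring=100,
                         max_generations=10_000, rng=None):
    """Return the generation at which the target is reached, or None."""
    rng = rng or random.Random()
    parent = [rng.choice(alphabet) for _ in target]  # random starting string
    for generation in range(max_generations + 1):
        if fitness(parent, target) == len(target):
            return generation
        brood = []
        for _ in range(offspring):
            child = parent[:]
            # One randomly selected position gets one randomly selected letter.
            child[rng.randrange(len(child))] = rng.choice(alphabet)
            brood.append(child)
        # The fittest copy becomes the parent of the next generation.
        parent = max(brood, key=lambda s: fitness(s, target))
    return None

# Target: the alphabet in alphabetical order.
generations = cumulative_selection(string.ascii_uppercase,
                                   string.ascii_uppercase,
                                   rng=random.Random(1))
print("target reached after", generations, "generations")
```

Setting offspring to 1 and watching the run crawl gives a feel for how much of the work the selection step is doing.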

Well, I still couldn't believe it! So I developed a mathematical model of what MONKEY was doing and calculated the probabilities involved. Then finally I could believe it because I could see how it worked. Indeed, I found that the system would converge rapidly to a probability of success of over 99.99%, near dead certainty. I gained a great appreciation for the observation of a famous biologist (Ernst Mayr? or John Maynard Smith?) that natural selection makes the improbable inevitable. Ironically, I got the idea of expressing the model as a finite-state machine (which allowed me to use Markov chains) from the math-genius son of my former boss (the kid is third-generation Fundamentalist); after hearing my description of the abysmally poor performance of single-step selection (YOUR model of selection; read MONKEY.DOC), his jaw literally dropped as he watched MONKEY succeed in 30 seconds.
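A rough idea of such a model can be sketched in Python. This is an illustration under stated assumptions, not a transcription of MPROBS.PAS: the state is the number of positions that already match the target, each copy changes one random position to a random letter, the best of the brood becomes the next parent, and the brood size (100) and generation count (300) are arbitrary choices. The state with every position matching is treated as absorbing, which makes the whole thing a Markov chain with a stochastic matrix.

```python
import math

def transition_matrix(length, alphabet, offspring):
    """Row k: distribution of the next parent's match count, given k now."""
    rows = []
    for k in range(length + 1):
        row = [0.0] * (length + 1)
        if k == length:
            row[k] = 1.0  # absorbing state: the target has been reached
        else:
            # One copy: a wrong position is picked and fixed ...
            p_up = (length - k) / (length * alphabet)
            # ... or a correct position is picked and spoiled.
            p_down = k * (alphabet - 1) / (length * alphabet)
            best_up = 1.0 - (1.0 - p_up) ** offspring  # at least one copy improves
            all_down = p_down ** offspring             # every copy got worse
            row[k + 1] = best_up
            if k > 0:
                row[k - 1] = all_down
            row[k] = 1.0 - best_up - all_down
        rows.append(row)
    return rows

def success_probability(length, alphabet, offspring, generations):
    """Probability that the target is reached within the given generations."""
    matrix = transition_matrix(length, alphabet, offspring)
    p = 1.0 / alphabet
    # A random starting string matches each position with chance 1/alphabet.
    dist = [math.comb(length, k) * p ** k * (1 - p) ** (length - k)
            for k in range(length + 1)]
    for _ in range(generations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(length + 1))
                for j in range(length + 1)]
    return dist[length]

single_step = (1 / 26) ** 26  # one try, all 26 letters right at once
cumulative = success_probability(26, 26, offspring=100, generations=300)
print(f"single-step, one try:           {single_step:.3g}")
print(f"cumulative, within 300 rounds:  {cumulative:.6f}")
```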

As a result of my work with MONKEY, I have also become interested in artificial life experiments. Of particular interest are genetic algorithms (GA), which use mechanisms rather similar to MONKEY's to find optimal solutions of complex engineering problems. The classic GA example was Goldberg's problem of controlling pressure in a complex pipeline network. More recently, Stanford professor John Koza demonstrated using GAs to design high-order analog circuits automatically (eg, 5th-order filters, 20-rung ladder filters).

I consider it important for you and other creationists to learn about MONKEY because it points out a common mistake that you are constantly making and need to correct. Your probability arguments typically misrepresent evolution as using single-step selection, which is an abysmally poor technique, whereas evolution, and life itself, uses cumulative selection, which is an extremely powerful technique with incredibly high probability of success, as demonstrated by MONKEY. Besides, as I have already pointed out, single-step selection is more descriptive of special creation (one-time good deal get it all together in one single try from scratch). The sooner you abandon your bogus arguments and address the real issues, the sooner you MIGHT start to be taken half seriously (ie, as anything other than a threat to science education).

Attached you will find MONKEY.ZIP. Extract the files with PKUNZIP v2.04g or later. The files are:


 MONKEY.DOC -- Text file explaining how to use MONKEY
 MONKEY.EXE -- Executable copy of MONKEY.  Should run on any IBM PC
               or compatible.  Does not need any graphics card.
 MONKEY.PAS -- Turbo Pascal source file for MONKEY.  Read it to see
               how MONKEY works.
 MPROBS.DOC -- Text file containing a discussion of the probabilities
               involved in MONKEY and a description of how MPROBS
               works.
 MPROBS.PAS -- Turbo Pascal source file.  Calculates the probabilities
               for cumulative selection using Markovian chains and
               stochastic matrices.  Very primitive interface (ie, none)
               which requires the program to be recompiled for every
               change of parameters.
 README.MNK -- Distribution MONKEY README file.
 MONKEY.DIR -- List of files in MONKEY.ZIP (output from PKUNZIP -v)
 READ.ME    -- README file for Dan.  Offers a little more explanation and
               addresses some of his specific questions.

If you do not have PKUNZIP, please let me know and I will send you a copy.

As you read it, do try to follow the path of testing and examining with an open mind.

If you can.


Earlier CompuServe Discussion on Same Subject


#: 230849 S4/Biology
    20-Feb-96  21:15:48
Sb: #229253-Lamarquian Evolution
Fm: David C. Wise 72747,3317
To: charles wagner 72401,2203

 >  Take for example a hypothetical polypeptide enzyme composed of 20
 >  different amino acids. These amino acids must be in the correct order and
 >  of the correct kind for the enzyme to work. The first amino acid has a 1:20
 >  chance of being correct. Same with the second and third anon... This gives
 >  us a probability of randomly discovering the correct sequence as about
 >  10e20. Now there are about 2000 enzymes needed for a mammal to function.
 >  This gives us a probability of 10e40000 of finding them all by random
 >  trial and error. The earth is about 10e17 seconds old. How many mutations
 >  would have had to occur to get where we are?

Your first mistake is in assuming, quite incorrectly, that one and only one
amino acid sequence would be correct and any variation in that sequence would
be non-functional.  Human lysozyme, chicken lysozyme, and bullfrog lysozyme all
have different amino acid sequences and yet are still the same protein,
lysozyme (BTW, human and chimpanzee lysozyme are identical).  Human and
rattlesnake cytochrome c differ from each other by about 14 amino acids,
rattlesnake cytochrome c differs even more from cytochrome c in other species
(in the Dayhoff study, which included no other snakes, the rattlesnake protein
happened to be slightly more similar to the human protein than to the other 15
or so species in the study, which gave rise to a particularly and intentionally
deceptive creationist claim), and human and rhesus monkey cytochrome c differ
by 1 amino acid (which the creationist claim carefully avoided mentioning), and
yet it is still cytochrome c.  I have a protein database with many more such
examples.

From the class notes of Frank Awbrey & William Thwaites' creation/evolution
class at UCSD (the Institute for Creation Research conducted half the lectures
and Awbrey & Thwaites the other half), we read their response to Duane Gish's
standard argument which you have repeated here.  Gish calculates the
probability of a replaceable pool of 20 amino acids randomly forming into
RNAase, a sequence of 124 amino acids, as:

 Prob = (1/20)^124 = 1/(2.1 x 10^161) = 4.7 x 10^(-162).

His basic assumptions, identical to yours, are:
 1. each amino acid is uniquely specified.
 2. parts of proteins serve no useful purpose

Evolutionist response:
 1. only a few amino acids in a protein are uniquely specified, many are
 loosely specified, and many are not specified at all; eg, in hemoglobin,
 although sickle cell trait involves only one amino acid substitution, 17
 other hemoglobin variants are known which are not harmful -- these have
 substitutions at nonessential sites.
 2. parts of proteins may well have served useful functions in primitive
 life forms.

A&T point out that there are different classes of amino acids:  hydrophilic,
hydrophobic, charged, uncharged, etc, and that for many amino acid positions it
is the class of amino acid and not the amino acid itself that is important.

Their counter example is of a calcium binding site with 29 amino acid
positions:  only 2 positions require specific amino acids, 8 positions can be
filled by any of 5 hydrophobic amino acids, 3 positions can be filled by any
one of 4 other amino acids, 2 positions can be filled by either of two amino
acids, and 14 of the positions can be filled by virtually any of the 20 amino
acids.

The sequence of the 15 positions is:  L* L*L* L*D D* D*G* I*D* EL* L*L* L*

 Where:
 L* = hydrophobic - Leu, Val, Ilu, Phe, or Met  Prob = (5/20)^8
 D* = (a) Asp, Glu, Ser, or Asn                 Prob = (4/20)^3
      (b) theoretically also Gls or Thr         Prob = (6/20)^3
 D  = Asp                                       Prob = (1/20)
 E  = Glu                                       Prob = (1/20)
 G* = Gly or Asp                                Prob = (2/20)
 I* = Ilu or Val                                Prob = (2/20)

 Total Prob =  (a) 3.05 x 10^(-12)
               (b) 10.3 x 10^(-12)

 By Gish's (and your) method:  (1/20)^29 = 1.86 x 10^(-38)

So you see that the world and its probabilities work quite differently than
Duane Gish and you imagine them to.


Your second mistake should have been covered in your biology courses.  Each
generation's proteins are not specified de novo and randomly, but rather each
generation inherits the previous generation's encoded sequences with the
possibility of some random changes.  Of course, you can now see that several
sites in those sequences can suffer base substitutions very gladly and still
produce functional proteins.  Indeed, we repeatedly find that the degree of
difference found between the proteins of different species fits very well what
we would expect had they descended from a common ancestor (AKA "evolved").
These patterns show up even when one tries to demonstrate otherwise, as in the
case of Michael Denton (whose attempts to disprove the standard phylogenetic
trees by grouping species according to the degree of difference between their
corresponding proteins only resulted in the very same standard phylogenetic
trees that he was trying to disprove).

Come to think of it, how does your panspermia account for the patterns of
relatedness shown by protein comparisons, or for pseudogenes, pieces of "junk"
DNA that are only shared by related species?

And as for the very first such protein having to have formed at random,
research by Sidney Fox shows that, when heated, amino acids form quite readily
into short proteins which, in the presence of water, form into microspheres.
These microspheres are very stable, so long as there are no micro-organisms
about to eat them.  Many of these microspheres also display catalytic activity.
All that is missing at this point is a mechanism for replicating these thermal
proteins, after which the power of cumulative selection could be brought to
bear.

Which brings us to your third mistake.  You assume that life would operate by
single-step selection whereas it is blatantly obvious that life operates by
cumulative selection.  As much as you would love to discount my MONKEY, it does
still demonstrate, through a controlled experiment, the immensely vast
probability difference between the single-step and cumulative selection
methods.  Your single-step example would indeed be very improbable and require
an immense time-span to produce results, whereas cumulative selection would
succeed in a matter of generations.


Walter Brown's Rattlesnake Protein Claim

Please permit me to share with you again the facts of Walter Brown's
rattlesnake protein claim, still in abbreviated form (to save a little time).
There are some obvious parallels to be drawn with your comparing the lamprey
to other species, none of which were fish:

Walter Brown's Rattlesnake Protein Claim
Creation/Evolution Newsletter Vol4 No5 Sep/Oct 1984 pp15-17

data came from cytochrome c comparisons by Dr. Margaret Dayhoff
Walter Brown's son used the data in a high school science fair project in
   which he concluded that rattlesnakes were more closely related to humans
   by cytochrome c than to any other organism.
When the author asked Brown to explain his claim, Brown explained that of
   the 47 organisms in the study, the one closest to the RATTLESNAKE was
   the human.  However, Brown was careful NOT to say that the one closest
   to the human was the rattlesnake, which would have been totally false.

   human -- rattlesnake  14 amino acids difference
   human -- rhesus monkey  1 amino acid different
   human -- chimp          identical, no differences

   no other snakes were included in the study, so the rattlesnake was
   about equally different from all the other organisms in the study and
   just happened to be one amino acid less different from the human.

   Brown was shortly afterwards observed after a debate telling a group of
   his followers about rattlesnakes being more closely related to humans.
   When the author started explaining to the group the facts about the
   claim, Brown very quickly changed the subject.

There is no way that Brown could have not known that he was misrepresenting
the facts and that he was lying (ie, telling a deliberate falsehood) with the
intent of deceiving his audience.




First uploaded on 1997 July 02.
Last updated on 2011 August 02.