GC-NSF(a) new gene generation hypothesis

I started from a study of entirely new ancestor genes, that is, the first ancestor genes of gene families consisting of homologous genes. From analyses of microbial genes and proteins obtained from the GenomeNet Database, I found that these first ancestor genes could be produced from non-stop frames on the anti-sense strands of GC-rich, rather than AT-rich, microbial genes [GC-NSF(a)].
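The idea of a non-stop frame on the anti-sense strand can be made concrete with a small sketch. The function names below are hypothetical, and scanning all three reading frames of the reverse complement is an assumption made for illustration; the original analysis may have used a specific frame or a different procedure.

```python
# Stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the anti-sense strand, read 5'->3', of a sense-strand DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_nonstop_run(seq, frame):
    """Length, in codons, of the longest stop-free stretch in one reading frame (0, 1 or 2)."""
    best = run = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            run = 0          # a stop codon ends the current stretch
        else:
            run += 1
            best = max(best, run)
    return best


def antisense_nonstop_frames(sense_seq):
    """Map each anti-sense reading frame to its longest stop-free stretch (assumed scan of all three frames)."""
    anti = reverse_complement(sense_seq)
    return {frame: longest_nonstop_run(anti, frame) for frame in range(3)}
```

A frame whose stop-free stretch covers the whole gene length would be a candidate non-stop frame in the sense used here.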

This conclusion rests mainly on two facts. First, hypothetical proteins encoded by GC-NSF(a)s satisfied the six conditions for folding of a polypeptide chain into a water-soluble globular protein: hydropathy, α-helix formation, β-sheet formation, turn/coil formation, acidic amino acid composition and basic amino acid composition. Second, the probability of stop codon appearance on GC-NSF(a)s is small enough for non-stop frames to occur.
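Why a high GC content lowers the chance of a stop codon can be illustrated with a simple calculation. The sketch below assumes that bases occur independently, with G and C each at half the GC content and A and T each at half the remainder. This is a simplification chosen for illustration, not the method used in the original analysis, and the function names are hypothetical.

```python
def stop_codon_probability(gc):
    """Probability that a random codon is TAA, TAG or TGA, assuming independent bases."""
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be a fraction between 0 and 1")
    at_base = (1.0 - gc) / 2.0   # probability of A (and of T)
    gc_base = gc / 2.0           # probability of G (and of C)
    p_taa = at_base ** 3
    p_tag = at_base ** 2 * gc_base
    p_tga = at_base ** 2 * gc_base
    return p_taa + p_tag + p_tga


def nonstop_frame_probability(gc, codons):
    """Probability that a frame of the given number of codons contains no stop codon."""
    return (1.0 - stop_codon_probability(gc)) ** codons
```

Under this model the stop codon probability at a GC content of 0.5 is 3/64, since three of the 64 equally likely codons are stops, and it falls steadily as GC content rises because every stop codon contains at least two A or T bases.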

The six conditions were defined as the average values for extant proteins plus or minus the standard deviations. These averages were calculated from amino acid structural indexes and the amino acid compositions of currently observed microbial proteins encoded by seven microbial genomes with different GC contents. For most proteins, the values stayed at nearly constant levels regardless of GC content.
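The check described above, an index averaged over amino acid composition and compared against the mean plus or minus the standard deviation of extant proteins, might be sketched as follows. The function names are hypothetical, the use of the sample standard deviation is an assumption, and no actual index values are supplied because the source does not list them.

```python
from statistics import mean, stdev


def composition_weighted_index(composition, index):
    """Average a per-residue index over an amino acid composition (residue -> count)."""
    total = sum(composition.values())
    if total == 0:
        raise ValueError("composition is empty")
    return sum(index[aa] * count for aa, count in composition.items()) / total


def acceptance_range(reference_values):
    """Mean minus and plus one sample standard deviation of values from extant proteins (assumed convention)."""
    centre = mean(reference_values)
    spread = stdev(reference_values)
    return centre - spread, centre + spread


def satisfies_condition(value, bounds):
    """True when a hypothetical protein's value lies inside the acceptance range."""
    low, high = bounds
    return low <= value <= high
```

Applied once per condition, with the appropriate structural index or composition fraction, this gives six pass/fail results for each hypothetical protein.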