PAST WORK
Data Analysis -
Sequence Alignments and Indels
DNA sequence analyses
are critically reliant upon the ability to identify
and compare homologous nucleotide positions. This task
is greatly complicated by insertion and deletion events
because, unless they are recognized, sequence comparisons
downstream of their occurrence will involve sites that
are not homologous. The analysis of ribosomal DNA is
particularly difficult because indels are relatively
common there: the excision or insertion of a few nucleotides
often has little impact on rRNA function. Moreover,
there are no clear structural cues to signal their precise
position. By contrast, indels are much less common in
protein-coding genes because they ordinarily lead to
a shift in the reading frame, disrupting the gene product.
Because indels are rare, protein-coding genes
are easier to align. Moreover, when an indel is present,
it invariably involves the addition or deletion of a multiple
of 3 nucleotides, which preserves the reading
frame.
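The multiple-of-3 rule above can be checked mechanically on an aligned sequence. The following is a minimal sketch, assuming gaps are written as "-" in each alignment row; the function names and the toy sequences are hypothetical illustrations, not real COI data or part of the original analysis.

```python
import re

def gap_runs(aligned_seq, gap_char="-"):
    """Return (start, length) for each run of gap characters in an aligned row."""
    pattern = re.escape(gap_char) + "+"
    return [(m.start(), m.end() - m.start())
            for m in re.finditer(pattern, aligned_seq)]

def frame_preserving(aligned_seq, gap_char="-"):
    """True if every gap run has a length divisible by 3 (no frameshift implied)."""
    return all(length % 3 == 0 for _, length in gap_runs(aligned_seq, gap_char))

# Hypothetical toy alignment rows (assumption: "-" marks an indel position).
in_frame = "ATGGCA---TTAGGC"    # one 3-nucleotide gap, reading frame kept
frameshift = "ATGGCA--TTAGGC"   # one 2-nucleotide gap, reading frame shifted

print(gap_runs(in_frame), frame_preserving(in_frame))
print(gap_runs(frameshift), frame_preserving(frameshift))
```

A row that fails this check in a protein-coding alignment is more likely to reflect a sequencing or alignment error than a genuine indel, which is one reason such genes are easier to align.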
Indels in COI
Complete COI sequences
are now available for 70 mammals, 66 bony fishes, 30
arthropods and 26 members of other phyla. The shortest
reported COI gene product is 510 amino acids long (in
the silkworm, Bombyx mori), while the longest
is 534 amino acids long (in the colubrid snake, Dinodon
semicarinatus).
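To put the reported length range in nucleotide terms, the arithmetic can be sketched as follows. This assumes 3 nucleotides per amino acid and leaves out the stop codon; the variable names are illustrative only.

```python
CODON_LENGTH = 3  # nucleotides per amino acid

def coding_length_nt(n_amino_acids):
    """Nucleotides needed to encode n amino acids (stop codon excluded, by assumption)."""
    return n_amino_acids * CODON_LENGTH

shortest_aa = 510  # Bombyx mori, as reported in the text
longest_aa = 534   # the colubrid snake, as reported in the text

span_codons = longest_aa - shortest_aa
span_nt = coding_length_nt(longest_aa) - coding_length_nt(shortest_aa)

print(coding_length_nt(shortest_aa), coding_length_nt(longest_aa))
print(span_codons, span_nt)
```

Under these assumptions the two extremes differ by 24 codons, or 72 nucleotides, which is the total length variation that indels must account for across the reported sequences.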
Although further work is required to provide a detailed
view of their occurrence, past studies have indicated
that most of the indels responsible for these length
variations occur in the COI-3' section. However, indels
are not unknown in the COI-5' segment. In some cases,
all members of a major taxonomic group share an indel,
suggesting its occurrence early in the evolution of the
lineage. For example, members of the phylum Aschelminthes
possess a 3 bp insertion at nucleotide position 78. In other
cases, indels have a much narrower taxonomic distribution,
suggesting their recent origin. These indels tend to
recur in certain regions of COI-5'. For example, almost
all of the indels detected in gastropod molluscs occur
in E2, the second external loop.