Sequence Alignment Homework Online

This course represents all the training materials for the BMMB:852 Applied Bioinformatics course offered at Penn State.

The course offers a structured path through the Biostar Handbook. Various sections of the book are presented via smaller, logical consistent units. We recommend learning two-four units per week.

The lectures consist of slides, links to various chapters, links to supporting materials and homework. There are no videos.

Please consult the synopsis for details on what is covered and how to learn the materials.

LectureQuizzes
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ
TAKE QUIZ

MSA Tools

compare the amino acid sequence of protein 1 with nine homologous proteins and make a multi-alignment of the sequences.

EBI have a portal for many MSA tools and there are also other MSA tools available elsewhere.

In research, it's good practice to use several alignment techniques and look at which generates sensible indels. Usually, this is the lowest number of indel events.

Clustal Omega is probably the most sophisticated MSA tool hosted on the EBI site, however, it is relatively new and isn't as established as T-coffee or MUSCLE.

Note that these tools are updated fairly regularly. This question and top answer about "cutting edge" MSA tools from 2014 refer to a paper from 2011 that attempts to benchmark MSA tools. As you can imagine the state-of-the-art tools change rapidly (for example, clustal-w2 was released, and now clustal omega since that benchmarking paper). For most researchers though, it's a personal preference, and different MSA tools are "better" for different situations (speed of computation, number of alignments, similarity of sequences, complexity of secondary structure, local vs global alignments etc.).

Consensus sequence

Determine a consensus sequence for the proteins based on the multialignment.

This depends entirely on the information from your alignment.

A common way to make a consensus sequence is to just take the most abundant residue at each position in the MSA. I don't like this approach because it heightens the significance of abundant residues, and diminishes the appearance of less abundant residues. This distorts the biochemistry and can easily lead to sequences that would be impossible or useless in biology. I would always rather see a raw alignment.

Conservation amongst alignments

Find any specific parts of the proteins that are conserved, then explain why these parts are conserved.

Conservation usually implies function, and hopefully you'll have a homologue that has a categorised function. Perhaps in your case one could attribute this function to a conserved sequence in the other homologues.

One brilliant tool is called consurf.

You can upload your MSA file to it and it will colour code regions of conservation from purple to blue. A purple region implies an evolutionary selection to not change that region, implying that it is a "functional" region.

Consurf works better with more sequences, so perhaps it isn't appropriate for this project. Instead try loading the alignment in Jalview and show "sequence conservation".

Caution.

When performing an MSA remember that the algorithm assumes that the sequence is homologous and that assumption can produce errors. If it looks wrong, it probably is!

0 Replies to “Sequence Alignment Homework Online”

Lascia un Commento

L'indirizzo email non verrĂ  pubblicato. I campi obbligatori sono contrassegnati *