Structural Bioinformatics
1 Introduction
- Bioinformatics is an interdisciplinary field combining biology and computer science. It is crucial for analyzing massive amounts of biological data.
- Structural bioinformatics focuses on the structure of biological macromolecules such as proteins, DNA, and RNA. Knowing their structure is essential for understanding their function.
- Predicting the structure of biological macromolecules is a significant part of structural bioinformatics. This involves using bioinformatics tools and algorithms to predict a protein’s structure.
- Proteins are essential for various bodily functions, including hormone production, tissue formation, and enzymatic activity. Understanding how proteins fold is crucial for understanding their function and for potentially developing treatments for diseases caused by misfolded proteins.
- Structural bioinformatics uses computational modeling and analysis to understand protein structure. This involves collecting biological data, building computational models, interpreting results, and informing further experimental design.
2 Viewing protein structures
- Molecular visualization is crucial in structural biology and biophysics, helping researchers understand and communicate their findings.
- Software tools have become essential for displaying and analyzing complex 3D protein structures on 2D screens.
- Popular molecular visualization software includes Chimera, Jmol, PyMOL, and VMD, each offering unique features and strengths.
  - Chimera (including ChimeraX) excels in exploring molecular structures and analyzing cryo-EM data.
  - Jmol is versatile, suitable for education and web-based applications.
  - PyMOL is popular among experimentalists due to its features for crystallographic and NMR-derived structures.
  - VMD specializes in visualizing molecular dynamics data.
- These tools offer scripting capabilities for automation and are continually updated to leverage hardware advancements.
3 Alignment of protein structures
- Template-based approaches are the most reliable for predicting macromolecule structures.
- There are three main alignment-based strategies:
  - Sequence-based: High specificity but lower sensitivity.
  - Profile-based: Higher sensitivity than sequence-based, but lower specificity.
  - Structure-based: The most sensitive and also the most specific.
- Alignment methods differ in their algorithms, gap penalties, and amino acid substitution matrices.
- Examples of alignment algorithms include:
  - Needleman-Wunsch: Global alignment.
  - Smith-Waterman: Local alignment.
  - FASTA and BLAST: Heuristic-based alignment.
- PAM and BLOSUM substitution matrices are used to score amino acid substitutions in an alignment.
- Sequence profiles improve alignment accuracy and are built from multiple sequence alignments (MSAs).
- PSI-BLAST can be used to find homologs for creating a sequence profile.
- Tools for aligning sequences with profiles include:
  - HMMER and SAM.
  - DIALIGN and FFAS.
- Tools for profile-to-profile alignment include:
  - FORTE, HHpred, and Sculptor.
- Profile-based methods rely on position-specific scoring matrices (PSSMs) and position-specific frequency matrices (PSFMs).
- Hidden Markov Models (HMMs) offer advantages over PSFMs and PSSMs by modeling gaps and correlations between residues.
- Examples of tools for protein structure alignment include:
  - MADOKA and iPBA.
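
To make the Needleman-Wunsch global alignment named in this section concrete, here is a minimal Python sketch of the dynamic-programming recurrence and traceback. It is an illustration under simplifying assumptions, not a reference implementation: the function name needleman_wunsch and its parameters are hypothetical, the scoring is a flat match/mismatch value rather than a PAM or BLOSUM matrix, and the gap penalty is linear.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b.

    Returns (score, aligned_a, aligned_b). Scoring values are
    illustrative assumptions, not taken from PAM or BLOSUM.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    total, top, bottom = needleman_wunsch("ACGT", "AGT")
    print(total)
    print(top)
    print(bottom)
```

Because every residue of both sequences must appear in the result, the first row and column are filled with accumulated gap penalties. A Smith-Waterman local alignment would instead floor each cell at zero and start the traceback from the highest-scoring cell.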
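
The position-specific frequency and scoring matrices (PSFMs and PSSMs) mentioned in this section can be illustrated with a toy calculation from a small multiple sequence alignment. The sketch below is a simplified illustration only: the names build_profile and score_sequence are hypothetical, the background distribution is assumed to be uniform over the 20 amino acids, a fixed pseudocount is used, and gap characters are simply ignored. Production tools typically add sequence weighting and use empirically derived background frequencies.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(msa, pseudocount=1.0):
    """Build a toy PSFM and PSSM from equal-length aligned sequences.

    Returns (psfm, pssm), each a list with one dict per alignment column.
    The uniform background and the pseudocount are illustrative assumptions.
    """
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    psfm, pssm = [], []
    for col in range(length):
        # Count residues in this column; gaps and unknown symbols are skipped.
        counts = Counter(seq[col] for seq in msa if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        freqs = {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        psfm.append(freqs)
        # Log-odds score: how much more frequent than background each residue is.
        pssm.append({aa: math.log2(f / background) for aa, f in freqs.items()})
    return psfm, pssm


def score_sequence(pssm, seq):
    """Sum the per-column log-odds scores of an ungapped sequence."""
    return sum(column[aa] for column, aa in zip(pssm, seq))


if __name__ == "__main__":
    toy_msa = ["ACD", "ACE", "AGD"]  # made-up alignment for illustration
    psfm, pssm = build_profile(toy_msa)
    print(round(psfm[0]["A"], 3))
    print(score_sequence(pssm, "ACD") > score_sequence(pssm, "WWW"))
```

A sequence that matches the conserved columns scores higher than one made of residues never seen in the alignment, which is the property that lets profile-based searches detect more distant homologs than a single-sequence comparison.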