Pymol 1.8.x now fetches mmCIF files by default and shows missing residues in Sequence View

Long-time Pymol users will be familiar with the fetch command, which retrieves the selected file from the Protein Data Bank (PDB). Before version 1.8, Pymol fetched the PDB version of the file by default rather than the mmCIF version, which the biomolecular crystallography community has long been advocating because of deficiencies in the legacy PDB file format. Now, however, Pymol fetches the mmCIF version by default, and a new default behaviour comes with it. When viewing the sequence bar,* Pymol now shows the residues that were missing in the crystal structure, whether due to insufficient resolution or poor electron density. These are displayed in grey lettering in the sequence bar (Figure 1).

Figure 1. Pymol now imports mmCIF files by default with the fetch command, and displays missing residues as described in the metadata.

There are also significant differences in how water molecules are listed. Previously, water molecules were associated with each subunit or chain within the PDB file. In Figure 2 I have loaded a second copy of the 3e90 structure, which I had downloaded as a .pdb file. I had to rename it to 3e90a.pdb; otherwise Pymol would have assumed it was a second state of the same molecule, which I did not want. You can see clearly that in the PDB version, the water molecules associated with chain A are listed after the amino acids.

Figure 2. Second copy of the crystal structure, manually loaded from a previously downloaded PDB file.
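Renaming the file is one way around the second-state problem. Another is to give the object an explicit name when loading it. Below is a minimal sketch using Pymol's Python API, assuming a file called 3e90.pdb sits in the working directory. The helper names `object_name_for` and `load_pdb_copy` and the `_pdb` suffix are illustrative choices, not anything Pymol requires.

```python
import os


def object_name_for(path, suffix="_pdb"):
    """Derive a distinct object name from a file name, e.g. 3e90.pdb -> 3e90_pdb."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem + suffix


def load_pdb_copy(path):
    """Load a local PDB file as its own object rather than as a new state."""
    # Imported here because the pymol module is only available inside Pymol
    # (or in a Python environment that has Pymol installed).
    from pymol import cmd

    name = object_name_for(path)
    cmd.load(path, name)  # the second argument is the object name
    return name
```

Run from within Pymol, `load_pdb_copy("3e90.pdb")` should create an object called 3e90_pdb alongside the fetched 3e90. The command-line equivalent would be `load 3e90.pdb, 3e90_pdb`.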

Similarly, the water molecules and ligands in the PDB file are shown at the end of chain B (Figure 3), whereas the initially loaded mmCIF file puts them all at the end of the line as separate chains, along with the extracted ligands and salts.

Figure 3. In mmCIF files, ligands and water molecules are listed as separate chains, whereas in PDB files they are associated with a protein chain.
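To see how the PDB format ties each water to a protein chain, it helps to look at the records themselves: every HETATM line carries a chain identifier in a fixed column. The sketch below counts waters per chain in plain Python. The sample records are made-up, truncated lines in PDB column layout (not taken from 3e90), and `waters_per_chain` is a hypothetical helper name.

```python
from collections import Counter

# Made-up, truncated records in PDB fixed-column layout:
# columns 18-20 hold the residue name, column 22 the chain identifier.
SAMPLE_PDB = """\
ATOM      1  CA  GLY A  20
ATOM      2  CA  SER B  33
HETATM 3001  O   HOH A 301
HETATM 3002  O   HOH A 302
HETATM 3003  O   HOH B 301
"""


def waters_per_chain(pdb_text):
    """Count water (HOH) HETATM records for each chain identifier."""
    counts = Counter()
    for line in pdb_text.splitlines():
        # Python slices are 0-based: [17:20] is columns 18-20, [21] is column 22.
        if line.startswith("HETATM") and line[17:20] == "HOH":
            counts[line[21]] += 1
    return dict(counts)


print(waters_per_chain(SAMPLE_PDB))
```

In this sample, two waters belong to chain A and one to chain B, which is why the PDB version lists them after the amino acids of their chain in the sequence bar.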

 

I have not yet worked out a way to hide those missing residues short of deleting them, and they can cause issues when trying to align different structures. For example, in Figure 4 the alignment of the missing residues in Box A with all the water molecules of chain A doesn’t make a lot of sense. On the other hand, the small Box B shows how the mmCIF file conveys useful information about the residues that are missing in that portion of the PDB version of the structure.

Figure 4. Alignments in Pymol can be problematic when missing residues “align” with water molecules

 

This way of displaying missing residues quickly gets confusing when overlaying multiple structures. Figure 5 shows an example with multiple versions of the NS2B/NS3 protease from West Nile Virus. For the moment, I would recommend downloading the structures individually as PDB files and holding off on switching to mmCIF if Pymol is your principal viewer.

Figure 5. Using mmCIF versions for multiple alignments with several deletions represented by grey lettering can become confusing
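If you want to grab the PDB-format files outside of Pymol, the download can be scripted. The following is a rough sketch, assuming the RCSB download address pattern `https://files.rcsb.org/download/<code>.<ext>`. The function names `rcsb_url` and `download` are hypothetical, and only the 3e90 entry mentioned above is used as an example.

```python
import urllib.request

# Assumed RCSB download pattern; the file extension selects the format.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{code}.{ext}"


def rcsb_url(code, file_type="pdb"):
    """Build the download URL for a PDB entry in 'pdb' or 'cif' format."""
    if file_type not in ("pdb", "cif"):
        raise ValueError("file_type must be 'pdb' or 'cif'")
    return RCSB_DOWNLOAD.format(code=code.upper(), ext=file_type)


def download(code, file_type="pdb"):
    """Download one entry to the working directory and return the file name."""
    filename = "{}.{}".format(code.lower(), file_type)
    urllib.request.urlretrieve(rcsb_url(code, file_type), filename)
    return filename


print(rcsb_url("3e90"))         # legacy PDB format
print(rcsb_url("3e90", "cif"))  # mmCIF format
```

Inside Pymol itself, `fetch 3e90, type=pdb` should request the PDB version for a single structure. There also appears to be a `fetch_type_default` setting that changes the default for the whole session, though I would check that against your own version.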

 

Test Platform: MacBook Pro (2011), MacOS El Capitan 10.11.3, Pymol v. 1.8.4.0

*To turn the sequence bar on, you can either press the S button, third from the right at the bottom of the screen, or change the seq_view setting from the command line (set seq_view, 1).

The sequence bar toggle button

 

About martin

almost on holidays
