OK rather than string a bunch of tweets together I’m going to try and collect my thoughts here. This page will be evolving over the course of the weekend. Forgive me if it seems a bit higgledy-piggledy.🐖
Disclaimer: I have seen no confirmation of price increases from Perkin Elmer.
Firstly I became aware of this when someone (I forget who, sorry) RT’d this from Ken Knott into my timeline this morning. As you can see he’s been quoted a five-fold increase in price for their license, from $6K to $30K. I’m not sure how many concurrent users this is for; I suspect this is the single-user price in US$.
This is bad, really bad. This would be catastrophic for our chemistry researchers. It’s not just having to look for a replacement product going forward (hello ChemDoodle); the cost of being unable to open ~30 years of documents, edit old schemes and thesis figures etc. would be incalculable.
Now currently the Perkin Elmer Informatics web page lists prices for both subscription and perpetual licenses for the 2021 version. Again, I’m assuming US$ and the following are for single users:
We have students with hundreds of thesis figures etc. halfway through writing theses. Remember for a moment the historical problems of retrieving ChemDraw figures out of Word documents (“Round-trip editing”). Forget that, this is now about our ability to use documents we deliberately saved as .cdx files to avoid the problem. Like many others, we moved to an annual site license rather than buying each researcher a perpetual single seat. We share this cost between the Chemistry and Pharmacy departments and the Institute for Molecular Bioscience, where I’m based. I don’t have access to what our split of the cost was in 2021, but this gives all students and staff access to the software on their work machines, not take-home copies as we have for some things (e.g. MS Office).
Apparently ChemDoodle can open ChemDraw .cdx and .cdxml files but I’m not sure if it can handle the older file formats (e.g. chm files).
I’ve not used ACDLabs Chemsketch in over a decade so I’m not sure of its current status. But they have a freeware version which I should probably look at.
I’ve used Marvin Sketch in the past, along with its plugins like pKa predictions. I see that they’ve now moved to a Cloud option as well with the Marvin Pro product.
Strategies moving forward
Don’t Panic. Let’s wait and see what response, if any, is received from PE.
As a contingency, any new files you create between now and the end of your current subscription should probably be saved in both the native .cdx format and an older interchange format such as .mol.
If you’re not a time-constrained graduate student just trying to get their thesis figures finalised, maybe try out one or two of the alternatives above. They may not have the plugins you’d like in the free/cheaper versions (think NMR prediction, Scifinder access), but then neither does the base version of ChemDraw Prime.
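For the dual-format contingency above, something like the following could be used to spot drawings that don’t yet have an interchange copy. It is only a minimal sketch: the function name is made up for illustration, it assumes the .mol copy is saved next to the .cdx file under the same base name, and it does not convert anything itself (the export still has to be done from your drawing program).

```python
from pathlib import Path


def missing_interchange_copies(root, native_ext=".cdx", interchange_ext=".mol"):
    """Return native files under root that have no same-named interchange copy.

    Assumption: the interchange file sits in the same folder and shares the
    base name, e.g. scheme1.cdx alongside scheme1.mol.
    """
    root = Path(root)
    missing = []
    for native in sorted(root.rglob(f"*{native_ext}")):
        # with_suffix swaps ".cdx" for ".mol" while keeping folder and name.
        if not native.with_suffix(interchange_ext).exists():
            missing.append(native)
    return missing
```

Calling `missing_interchange_copies("/path/to/thesis_figures")` (a placeholder path) returns the list of .cdx files that still need exporting.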
Anyway I will try and collect more useful information, and if anyone has actual info from PE about this (new pricing/quotes etc.) they can drop me a DM.
There continue to be social media arguments that the recently approved Pfizer SARS-CoV-2 antiviral drug Paxlovid is “an analog of GC373”, and some of these have spilled over into mainstream media. These reports, as is often the case, arise from superficial Googling, which leads people to an inaccurate Wikipedia article.
So let’s do a Nirmatrelvir tear-down.
To understand how this misconception arises, it is firstly necessary to speak briefly about protease biology. Protease enzymes catalyse the hydrolysis of peptide bonds. Viral protease enzymes break down the functional viral proteins from a large precursor polyprotein. In the case of coronaviruses, these come from the open reading frame 1a (ORF1a), towards the 5′ end of the genome.
Some terminology may also be required. The classic Schechter and Berger paper describes amino acid side chains (P4-P1 on left of cleavable bond (red), P1′-P3′ on right of cleavable bond). The P4-P3′ amino acid side chains occupy corresponding binding pockets S4-S3′ of the enzyme.
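The numbering convention can be sketched in a few lines of code. This is just an illustration of the Schechter and Berger labelling as described above; the function name and the example heptapeptide are my own choices for the sketch, not taken from any particular paper or library.

```python
def label_subsites(sequence, cleavage_index):
    """Label residues around a cleavable bond using Schechter-Berger names.

    cleavage_index is the number of residues on the N-terminal side of the
    cleaved bond, so the bond sits between sequence[cleavage_index - 1]
    and sequence[cleavage_index].
    """
    if not 0 < cleavage_index < len(sequence):
        raise ValueError("cleavage site must lie inside the sequence")
    labels = {}
    # Non-primed positions count outwards from the bond towards the N-terminus.
    for offset, residue in enumerate(reversed(sequence[:cleavage_index]), start=1):
        labels[f"P{offset}"] = residue
    # Primed positions count outwards from the bond towards the C-terminus.
    for offset, residue in enumerate(sequence[cleavage_index:], start=1):
        labels[f"P{offset}'"] = residue
    return labels


# Illustrative heptapeptide cleaved between the 4th and 5th residues.
print(label_subsites("AVLQSGF", 4))
```

Each Pn residue is then expected to sit in the matching Sn pocket of the enzyme, and each Pn′ residue in the Sn′ pocket.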
GC373 is an older (2012) inhibitor of a related coronavirus responsible for feline infectious peritonitis (FIPV). Saying that PF-07321332 is an analog of GC373 is akin to saying that a Bugatti Veyron is an analog of a Volkswagen Beetle. They might both have 4 wheels, but they’re very different.
So let’s break it down chemically.
GC373 is a dipeptide, with an N-terminal Cbz cap, which spans the active site of the FIPV 3CL protease enzyme (also known as Mpro) from the S3 to S1 pockets. It binds covalently to the catalytic cysteine residue of the protease through an electrophilic aldehyde group.
PF-07321332 is a modified tripeptide with an N-terminal trifluoroacetyl cap, spanning the SARS-CoV-2 enzyme’s active site from the S4 to S1 pockets. Instead of an aldehyde group, PF-07321332 uses an electrophilic nitrile to covalently bind to the enzyme.
As PF-07321332 covers a greater area of the enzyme than does GC373, it is more specific, tailored for the SARS-CoV-2 enzyme.
Breaking them down into their components we see that GC373 breaks down into three components. Firstly a cyclised analog of glutamine, the pyrrolidone aldehyde. Then comes a leucine residue, and finally the benzyloxycarbonyl (Cbz) moiety.
Figure: GC373 binding residues. The three components, in the order listed above, bind in the S1, S2 and S3 pockets respectively.
PF-07321332 on the other hand breaks down into 4 components. Again the pyrrolidone but this time as a nitrile, then a specialised bicyclic group that was previously used in the Hepatitis C protease inhibitor boceprevir. Then a tert-leucine residue (sometimes confusingly also called tert-butyl glycine), also seen in boceprevir, and finally the trifluoroacetyl cap.
Figure: PF-07321332 binding residues. The four components, in the order listed above, bind in the S1, S2, S3 and S4 pockets respectively.
Normally when medicinal chemists talk about analogs, they mean a series of molecules which contain small changes in order to determine which one is likely to be the best candidate to be taken into the clinic. As can be seen from the above breakdowns, PF-07321332 and GC373 share the same P1 residue but are otherwise quite different.
The binding of PF-07321332 to the SARS-CoV-2 protease has also been studied using X-ray crystallography. Interestingly the P3 tert-leucine residue doesn’t occupy a discrete pocket, with the side chain being directed more into solvent. It does however contribute two critical hydrogen bonds with the protease enzyme residue Glu166.
Postscript. Neither of these molecules looks anything like that other social media conspiracy theory: Ivermectin. No, Nirmatrelvir is not repackaged Ivermectin either.
One of the most notable things about the way that science is pushing back against the current Coronavirus pandemic is the way that research results are being pushed out at extraordinary rates in a mostly open manner, often with the attendant raw data being also available. New results have pushed out nearly all other content on my Twitter timeline with commentary, new results, links to breaking preprints, and a healthy dose of cautious scepticism coming from the scientific community. And in the small subsection of it that I call my virtual home, #Chemtwitter, people are kicking around ideas and contributing to a number of collaborative projects such as the COVID MoonShot, starting with fragment binding data from the UK’s Diamond Light Source, and now using the combined expertise of the world’s medicinal chemists.
From a seemingly random observation on Twitter, Peter Kenny and I have been discussing the implications of some of the new crystal structures of the SARS-CoV-2 protease (3CLpro, aka Main protease Mpro) on future inhibitor design. And with open platforms such as Figshare it is easy for us to share those opinions and discussions with everyone who’s interested.
Crystal structure of the SARS-CoV-2 protease 3CLpro with a bound inhibitor. Source PDB:6lu7
Note: I have replaced links to the dodgy website with links to Snopes and Wikipedia instead.
Google Scholar is linking to alkaline quackery
Pseudoscience has no business being included in scholarly research so I was most alarmed to see this alert turn up in one of my weekly Google Scholar email alerts.
The link title promises an article called “The Annual Conference on Bacterial, Viral and Infectious Diseases held in Dubai, UAE and published in The Journal of Infectious Diseases & Therapy”. I get hundreds of similar alerts a week, and most are only tangentially related to my own work. Mostly I just review the titles quickly and add the interesting ones to my Endnote databases. But something made me pause on this one. That name seemed somehow familiar…
Whoop, whoop. Danger Will Robinson!
That Google Scholar link actually resolves to a WordPress blog, not a scholarly journal!
And finally to a URL https://phoreveryoung.wordpress.com/page/6/. It’s an alkaline diet quackery website by the notorious Dr. Robert O. Young. And of course it includes a horrific collection of links to more bunk and woo-woo in the sidebar. A Universal Cure for Cancer! The Truth Shall Set You Free! Are YOU Prepared for the BioHazardous Effects from 5G EMF Radiation? In case you don’t know, Young is a convicted fraudster, sentenced in 2017 to three years and eight months in jail for practising medicine without a license. In 2018 he had a US$105 million settlement awarded against him in a lawsuit brought by a woman who claims he advised her not to undergo surgery and to forego traditional chemotherapy treatment for her breast cancer. Instead he advised her to undertake alternative therapies including “pH injections”, and subsequently her cancer progressed to incurable stage IV.
“Dr” Young’s alkaline website
As it turned out the original URL was a dead link, which is an inherent problem with blog URLs. A quick WordPress search turned up the original article from September 2018 announcing that the site’s owner Dr. Robert O. Young was going to be presenting his breakthrough alkaline diet “research” in a keynote at a workshop on the “New Biology” at this Dubai conference.
So what exactly is this conference? Tracking his screenshots gives the conference website as http$://b@cterialdisease$.infectiou$conference$.com which is part of a “global” collection of >3000 “events” of what I’m guessing are mostly predatory conferences, run by a known dodgy group called ME Conferences. Hilariously, attempting to visit their website failed while I was connected to our network:
So this is a known quack and woo-spammer, who’s also notorious for spamming social media. Spamming Facebook or Instagram isn’t particularly unusual, but what’s highly troubling to me is that Google Scholar’s algorithm thought it noteworthy to include a dodgy blog post from 2018 in my weekly alerts. Worse still there is no known reporting mechanism to have this corrected. Google is very opaque about how its algorithm ranks or rates a scholarly source, saying only:
“Google Scholar aims to rank documents the way researchers do, weighing the full text of each document, where it was published, who it was written by, as well as how often and how recently it has been cited in other scholarly literature.”
Google Scholar’s inclusion documentation says that the usual forms of scholarly output such as journal articles are the primary source of content, and that news and editorials are not appropriate. So where do blogs lie? I would call my own blog pages more opinion/editorial in nature so would never consider them worthy of inclusion in Google Scholar. Google Search sure, but not Google Scholar.
Personal Blogs and Pseudoscience do not belong on Google Scholar
I hope that this is an isolated incident. As there is no way to see how this made it through Google’s algorithm, nor any feedback form or other mechanism to report or stop it, it is difficult to guess how many other such pseudoscience articles, blogs, and opinion pieces have made their way into the Google Scholar database, and thus acquired an undeserved badge of legitimacy.
This is an alarming development, particularly at a time when engaging the public with evidence-based medicine is critical, rather than purveying modern-day snake oil. Science has built-in mechanisms like peer review in place to fact-check and build up a reliable body of knowledge, which is needed to gain and retain public trust. It is deeply problematic that pseudoscience and quackery have been allowed to by-pass these processes and be served up by Google as legitimate scholarly work.
If anyone else has seen this type of thing in their Google Scholar searches or alerts, let me know. Either here in the comments or on Twitter where I’m @MartinStoermer
Just a short post/hint about how to get your data back from DVD backups made with Apple’s Backup.app utility. Backup was introduced in 2002 as part of Apple’s .Mac service, later MobileMe, but languished relatively unloved by the Mac community, and was last updated to version 3.2 in 2010. I liked it however because it could split large backups over several DVDs. [Important: no data was lost in this exercise – multiple redundant backups y’all.] I haven’t had much reason to recover data from Backup backups lately, but it always worked smoothly when it needed to.
Until MacOS Sierra.
Backup.app fails to run under Sierra
As I found out today, while searching for some old NMR data, Backup no longer runs under Sierra, leaving me with a pile of DVDs with no easy way of recovering the data. Searching Google and Apple’s support forums gave no joy.
So what was I to do? In such circumstances my standard trick is to drag and drop the (in this case large) file onto a bare-bones text editor such as Smultron/Fraise in order to discover what sort of file I’m dealing with. In this case Smultron threw up this useful error dialog:
Error shown when dragging a Backup file into Smultron
So Control-clicking, or right-clicking on the first part of the backup enabled me to look inside the package structure and begin searching for my files.
The first disk contains a file named “NMR – 2008.08.05-10.25.18.540 – Part 1.FullBackup”. Drilling down through:
Contents -> Contents -> Backup.sparseimage
located the image file which contains all the actual data.
The image file could be mounted with DiskImageMounter and was simply named NMR. It contained a directory structure which ultimately contained my files in the desired folder, nested in their own project folders. Path = Users/u.sername/Documents/nmr_backup/etc.
This is not the end of the story however, as the data does not appear to be backed up in any particular order (that I could discern – perhaps creation date?), so every DVD had to be opened in this way, giving me ultimately 4 directory trees that needed to be merged by hand to get all the data into the right common subdirectories. Ultimately I was able to locate all the data, which matched another backup location I had, using a different backup protocol.
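I merged the trees by hand, but the same job could be scripted. Below is a minimal sketch, assuming each disk image has already been mounted (or copied out) so that every DVD’s directory tree is available as an ordinary folder. The function name is hypothetical. It deliberately never overwrites anything: if two disks contain different files at the same relative path, the clash is reported for manual inspection instead.

```python
import filecmp
import shutil
from pathlib import Path


def merge_trees(sources, destination):
    """Copy every file from each source tree into destination.

    Relative paths are preserved. Identical files that are already present
    are skipped; files that differ are left alone and returned as
    (source_file, existing_file) pairs so they can be checked by hand.
    """
    destination = Path(destination)
    conflicts = []
    for source in map(Path, sources):
        for item in sorted(source.rglob("*")):
            if not item.is_file():
                continue  # folders are created on demand below
            target = destination / item.relative_to(source)
            if target.exists():
                # Byte-for-byte comparison, not just size and timestamp.
                if not filecmp.cmp(item, target, shallow=False):
                    conflicts.append((item, target))
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, target)  # copy2 keeps modification times
    return conflicts
```

Note that empty folders are not recreated, which was fine for my purposes since I only cared about the data files.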
So to lessons learned. 1) Backup apps that use a proprietary format are probably a bad idea. In this case I was able to eventually dig all the files out, which ultimately didn’t matter because I also had simple file-copies in at least two other “cloud” and two hard disk locations to compare the file lists to. 2) Verify your backups.
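On lesson 2, one way to verify a recovered backup against another copy is to compare checksums rather than just file lists. The following is a rough sketch of that idea, not the procedure I actually followed; the function names are made up for illustration and both copies are assumed to be reachable as ordinary folders.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for item in sorted(root.rglob("*")):
        if item.is_file():
            sha = hashlib.sha256()
            with item.open("rb") as handle:
                # Read in 1 MB chunks so large data files don't fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    sha.update(chunk)
            digests[item.relative_to(root).as_posix()] = sha.hexdigest()
    return digests


def compare_backups(original, backup):
    """Report files missing from, extra in, or changed in the backup copy."""
    a, b = tree_digests(original), tree_digests(backup)
    return {
        "missing": sorted(set(a) - set(b)),
        "extra": sorted(set(b) - set(a)),
        "changed": sorted(p for p in set(a) & set(b) if a[p] != b[p]),
    }
```

If all three lists come back empty, the two copies hold the same files with the same contents.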
In the continuing saga of broken round trip editing with the latest (17.1) version of ChemDraw, and Microsoft Office 2016/365 I thought I’d share with you, with permission, some communication I have had with Pierre Morieux (aka the ChemDraw Wizard, @ChemDrawWizard) from the ChemDraw team at PerkinElmer. It contains details about what exactly has broken in the last update.
There is an explanation for what you observed: the reason is a change in the way Microsoft decides what to paste from the clipboard. The way SMILES was registered as textual data meant that MS Office preferred to paste it instead of the PDF image from the document. We worked around this by removing the problematic registration from the SMILES type. The problem is that if there are multiple versions of ChemDraw installed on the system, the old registration will conflict with the newer cleaned up one. Users need to remove old versions entirely from their system and clean up their launch services database using the command line.
The following command line can be used to remove an old version:
/System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister -kill -r -domain local -domain system -domain user
If you think it may help your readers, could you convey the information?
Please let me know,
I don’t know much about what else resetting the Launch Services database does on your system, so I would strongly advise you to back everything up properly before trying this fix. I haven’t been brave enough to delete the old versions of ChemDraw on my main systems yet, as I keep them to guard against future problems. And I certainly don’t know if the problem would recur if you tried to go back and use an older version in between 17.1 sessions, but I may go and do so on a separate external USB High Sierra boot disk that I have set up for testing. I’ll let you know how that pans out. – Martin
Just got my hands on the 17.1 Release of ChemOffice Professional (Mac) and ran through my normal initial sequence of tests, i.e. Word round-trip editing. There are a bunch of changes listed in the version history as part of the online manual which I won’t cover here yet, but there are changes to the File and Edit menus, Hotkeys, HELM support, and the introduction of ChemDraw Add-ins. – Martin 5th April, 2018.
Update 5th April, 2018. Doesn’t contain a fix for round-trip editing with Pages, Keynote.
Round-trip editing with Word, Powerpoint 2011. This works as previously, no surprises here:
Round Trip Editing, ChemDraw 17.1, Word 2011
Round-trip editing with Word 2016. Here something odd happened.
Opening the Word 2011 file in Word 2016, objects could be copied and pasted back into ChemDraw 17.1, but when those objects were then edited, and copy-pasted back into Word 2016, they were pasted as a SMILES string:
This also occurs when there is no round-trip involved. Brand-new objects drawn in ChemDraw 17.1 and pasted into a new Word 2016 document paste as SMILES. However, using Word’s Paste Special as PDF function, it inserts the graphic correctly:
Round trip editing, ChemDraw17.1, Word 2016
The good news is that as long as you do the Paste Special in Word 2016, you can go back and forth several times successfully. Clearly the two apps are using different sections of the Clipboard as supplied by ChemDraw. Summary: in this task, ChemDraw 17.1 is performing more poorly than ChemDraw 17.0.
ChemDraw 17.1 with PowerPoint 2016
PowerPoint 2016 had no such SMILES issues, using structures from either old 2011 documents, or all brand new items from ChemDraw 17.1, copy-paste worked as you’d expect. (Using Paste or Paste Special – PDF works identically over multiple back-and-forths).
Round trip editing, ChemDraw17.1, PowerPoint 2016
Multiple sequential edits in ChemDraw 17.1 and Powerpoint 2016
The observant reader will have noticed the alert in the Word windows captured above suggesting that I should install a Word 2016 update. To date I have avoided this, as a previous update of Word 2011 has been shown to break round-trip editing, and I don’t want to mess with my production machine. In addition, there are often discrepancies in version number within Office depending on whether the product was purchased as a standalone license, a volume license, or as part of an Office 365 subscription. My copy is an Office 365 subscription and contains the subversions below. So it is possible that a later upgrade may work better.
Test platform: ChemDraw Professional 17 (22.214.171.124), iMac 2011 8GB RAM, MacOS 10.12.6 Sierra; Microsoft Office 2011 (Word 14.7.7, PowerPoint 14.7.7), 2016 (Word 15.37, PowerPoint 15.38)
This year’s theme #ChemTogether is all about our chemistry community. I’ve chosen this year to talk about our recent paper and how it came to be, and the people involved. This paper covers some work that was done over the course of 25 years. That’s a heck of a long time you say. The second thing you’re likely to say is “how can this work that is so old still be relevant?” And that’s a fair question. So buckle up, settle in and let me tell you a tale. Also depending on your timezone and inclination, you may want to make yourself a cup of tea or coffee, or maybe a beer. Or you might want to snag a copy of the paper here.
“Stereoelectronic effects on dienophile separation influence the Diels-Alder synthesis of molecular clefts” Martin J. Stoermer, Wasantha A. Wickramasinghe, Karl A. Byriel, David C. R. Hockless, Brian W. Skelton, Alexandre N. Sobolev, Alan H. White, Jeffrey Y. W. Mak, David P. Fairlie, European Journal of Organic Chemistry, 2017, in press.
This project was only ever supported by one grant, at the very beginning. The grant application was submitted in 1990 while we were employed by Bond University on the Gold Coast, and with an overseas collaborator who was visiting on sabbatical. At the time Bond was Australia’s only private university, but by the time that the grant was successful, Bond was in trouble financially, and by Christmas the science department had been shut, the academics sacked, and the postgraduate students cast adrift.
By the time the 3D Centre had found a new home at the University of Queensland (which eventually, after waiting for the political fallout to subside, took over administration of the grant), I had begun working on other, more medchem-based projects. These were to be the core focus of the new 3D Centre. And so, while we worked on our long-running HIV protease program, this project was marginalised. Nevertheless I was still tinkering with the system in my spare time.
This pattern continued for a number of years, with the research not getting any more government funding, including, for me personally, several postdoctoral fellowship applications. Occasionally we’d get spurred into action and do a little bit of work and have a minor success or two, but then we’d hit the wall and I’d go back to doing what I was paid to do. Some of this earlier work got published in 2003 and we thought we’d be able to wrap the newer results up quickly. We did, but the paper got rejected several times in one form or another. Mostly the reviewers thought there wasn’t a complete package.
From the outset we knew that this project would need X-ray crystallography to help guide us in our chemistry, and also to see if we’d accomplished our goals. For this we grew our own crystals and sent them to the team at UWA to get the structures solved. Alan White and his team were a delight to work with in those early days, and we are all the poorer for his passing last year. I can still remember when the fax machine would spring into life when Alan sent through the structures. Yes, by fax. On more than one occasion I had the joyful job of re-typing XYZ coordinate data by hand as we couldn’t seem to get the data electronically. Times have certainly changed. As is often the case when projects are not officially funded, but manage to move along in fits and starts, some of the crystal structures were done a little closer to home, by Karl Byriel at UQ’s own chemistry department.
And so to the chemistry. Fundamentally we wanted to build a new class of rigid hydrophobic molecules, which we grandly defined in the original grant application as “enzyme mimics”. In phase one of the work we wanted to build simple systems based on Diels-Alder reactions of templates such as (1) above to create U-shaped structures, capable of binding very small guests. One such molecular cleft (below) was shown to bind chloroform inside its narrow “binding site”. Others bound units of pyridine, cyclohexane, dichloromethane and water.
A rigid synthetic molecular receptor which binds small molecules
Sadly, the lack of granting support in later years meant we couldn’t move along to more complex systems which we had envisaged as catalysts and artificial enzymes.
One of the more interesting aspects of chemistry in this series of molecules was the relative lack of reactivity at the carbonyl carbon of templates like (2). Whilst we could relatively easily reduce them to the corresponding diols, they were completely inert to reductive amination, which we wanted to use to make better diamine chelators. In addition, instead of reacting with Grignard reagents or alkyllithiums in the expected manner, the diene-dione instead gets alkylated four times with, in this case, methyl groups.
Novel tetra alpha-alkylation of the diene dione
The steric constraint of this tetra methylation pushes the two alkene moieties even closer together. I won’t go into more details here, but pushing those two alkenes back and forth by changing the central ring has interesting effects on reaction rates.
The last hurdle is always to get the work published. And as I’ve said above, without funding, it’s particularly hard to get time and money to do that one last experiment that gets it over the line. I’ve lost count of the number of rejections we’ve had on this over the years, both on this paper and a bunch of related work that still hasn’t found a home. But over the years, the negative and positive comments of reviewers have helped us along, so a big shout out to the often maligned reviewers, who are after all our peers and part of the chemistry community. As many of you know my ill health has enforced my early retirement from the lab, so it has been additionally hard to get this work published. And the final push in this case came when a colleague Jeff Mak came on board with new ideas, perspectives, and importantly, a pair of lab hands. And with the support of my longtime boss Professor David Fairlie, we finally got this one done. There’s more to do, but for now I can look happily at those 5 beautiful crystal structures, and say yes! They’ve been set free.
Test platform: ChemDraw Professional 17 (126.96.36.199), iMac 2011 8GB RAM, MacOS 10.12.6 Sierra; Microsoft Office 2011, 2016; and the latest Apple Pages 6.3, Keynote 7.3.
[Update 4 October: Added a short Youtube video showing the Drawing and Reaction Hotkeys in action]
I’ve spent about a week with the new version of ChemDraw now and have to say that this is probably for me the biggest feature leap for the product since Scifinder integration. My test machine is comparatively old kit and ChemDraw has run perfectly well on it. I should point out again, as I have noted before, that when these new ChemDraw versions come out, they are checked and certified for compatibility with the latest shipping version of the Windows and Mac operating systems. On Mac, this means Sierra, as High Sierra was only released last week. Whilst I can’t foresee any reason why ChemDraw 17 wouldn’t run under High Sierra, it is not supported at this time. I normally expect formal certification for new OS’s in a *.0.1, *.1 or *.2 release.
The headline features of the new version of ChemDraw are new enhanced Hotkeys, HELM support, Document Tagging (metadata), and compatibility with the latest 64 bit Windows systems. But of course you all want to know one thing: Does it support round-trip editing with Word? In a word – Yes (but more on that later). This short review just covers the new features in the ChemDraw Professional application, which forms just a part of the ChemOffice Suite. As with previous versions which I have covered before, various packages are available for Macintosh and Windows PCs at different price points and features lists. The list of versions at the PerkinElmer SciStore is to be found here.
The feature that I’m most excited by is the new enhanced Hotkeys. The first thing you’re going to want to do is take a look at the ways Hot Keys have been improved to make drawing of molecules much, much faster. Helpfully, PerkinElmer have provided a handy cheat sheet. It’s available under File -> Open Samples -> Enhanced Hot Keys Cheat Sheet.
The Hotkey cheat sheet provided with ChemDraw 17
These enhanced shortcuts will, with practice, make drawing structures so much faster. I recommend printing that Cheat Sheet out, laminating it and sticking it to the top of your monitor or pinning it above your desk. Take some time to learn the most common ones such as “3” for a benzene ring and “6” for cyclohexyl. If a significant chunk of your life is spent drawing and editing structures (think thesis writing!) then this will save your hands from repetitive back-and-forth mousing to click on template icons. PerkinElmer and Chemistry World are running a Webinar on the new ChemDraw 17 on October 24th at 4pm (UK time), so I’d check that out if you want to see this in action.
Actually, while you’re looking at that Samples menu take some time to look at some of the others that are included there. Most of these have been there before but they are worth reiterating as really useful starting points for many types of complicated figures. I particularly like the Grignard reaction summary slides made by Roman Valiulin (@RomanValiulin), and the GPCR pathway templates which you can use as a starting point for any signalling pathway you care to draw.
Also new in ChemDraw 17 is a really neat new Reaction Hotkey feature. Let’s say you have a simple 3 step synthesis with a core component. To generate a quick reaction scheme, simply select any object, hold down the command key and use an arrow key to automatically copy-paste a duplicate molecule in the direction you chose, including an arrow. So in the example below I drew the initial biphenyl, and then used “Command-Right Arrow” to create the first reaction, then “Command-Down” and “Command-Left” to generate the other two elements of the scheme. I then went back and, using the hotkey “1”, added a methyl group which was to become the changing element at the para position. The amino and iodo groups can be added without clicking and editing the newly created atom by simply using the “n” and “i” hotkeys respectively in mouse-over mode. The nitro group was the only thing that had to be edited manually.
HELM (Hierarchical Editing Language for Macromolecules) Support
What is HELM? Think of it as a kind of SMILES for biomolecules. Chemists use SMILES strings as a convenient way to move chemical structures from one format to another, or from database to database. Typically SMILES notation is used for small molecules, although it can also be used for larger biomolecules. The HELM project was started by scientists at Pfizer as a way of getting complicated biomolecules and their derivatives into a searchable form for corporate compound databases. It is trying to provide a way of systematically representing complicated biomolecules such as post-translationally modified proteins, peptide-DNA/RNA conjugates, glycosylated proteins, cyclic peptides and all of the above containing a range of synthetic and semi-synthetic amino acid monomers, and fluorescent tags.
Hexaleucine in HELM editor notation, HELM string, and expanded structure
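To give a feel for what a HELM string looks like, here is a minimal sketch that writes one for a plain linear peptide. It assumes the simplest case only: a single chain of the 20 natural amino acids in one-letter code, no connections, no modified monomers and no annotations, which is why the four trailing sections separated by “$” are left empty. The function name is my own, and the exact string ChemDraw displays may be formatted differently.

```python
def peptide_to_helm(sequence, polymer_id="PEPTIDE1"):
    """Build a simple HELM string for a linear peptide of natural amino acids.

    Monomers are separated by dots inside curly braces; the trailing "$$$$"
    marks the empty connection, grouping and annotation sections.
    """
    natural = set("ACDEFGHIKLMNPQRSTVWY")
    sequence = sequence.upper()
    unknown = sorted(set(sequence) - natural)
    if not sequence or unknown:
        raise ValueError(f"unsupported or empty sequence: {unknown}")
    monomers = ".".join(sequence)
    return polymer_id + "{" + monomers + "}$$$$"


print(peptide_to_helm("LLLLLL"))  # prints PEPTIDE1{L.L.L.L.L.L}$$$$
```

Anything beyond this, such as a cyclic peptide or a fluorescent tag, needs additional polymer entries and connection records, which is exactly the bookkeeping the HELM toolbar is there to handle.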
ChemDraw now provides a way of creating and sharing HELM strings for these systems. If you’ve ever used ChemDraw’s peptide or DNA drawing tool (The BioPolymer Toolbar), the new HELM tool will feel familiar. The tool now provides tabs with which you can create the components of your molecule, let’s say a short peptide sequence and a fluorescent tag, as separate objects. In the sequence of pictures below I have drawn hexaleucine and an Alexafluor tag (Step 1), and then connected the objects using normal ChemDraw bonding methods (Step 2). Then using the Expand Label tool (Step 3) you can see the full molecule (Step 4).
There are a few quirks in the HELM database currently. For example in the CHEM tab of the HELM toolbar below, if you look closely you will see a few items that are named somewhat differently. For example the shortcut for adding an Alexafluor tag to a peptide has the abbreviation Alex and the mouseover name calls it Alexa Fluor 488 NHS ester, which as peptide chemists know is the reagent used to introduce an AlexaFluor tag (the N-hydroxysuccinimide ester), not the name of the tag itself. There are also, somewhat confusingly, two chem entities abbreviated chol.
Step 1: Using the HELM toolbar to draw the components as separate objects
Step 2: Draw the bond between your desired positions
When you expand the abbreviations, ChemDraw also leaves a small label adjacent to each element (which I normally immediately delete)
Step 3: Using the Expand Label and Clean up tools.
Step 4: The tagged peptide structure after clean up
Document tagging – User-defined metadata
Document tagging is a way of adding user-defined metadata to your ChemDraw documents, to enhance their searchability. It is invoked in ChemDraw 17 from the Edit -> Document Properties menu. If you have started from an empty document and not a corporate template then everything you see in the initial dialog is empty. There are no predefined fields so you can add items to the list pretty much as you want. I can imagine that in corporate settings there may be fields for user, compound identifiers, Project codes, company and division name etc. When you add a property, you can define it as being either optional, recommended, or compulsory to have the field filled.
Customisable metadata entry in Document Properties dialog
In academic settings this feature may be less rigorously adhered to as I can see that it would take a great deal of compliance. I can see the potential benefits though in chemistry groups where more than one person is working on a project. You could set project keywords, and you of course want to know who created the document in the first place. On their product page, Perkin Elmer note that these metadata are searchable by third party software including Attivio and Elastic. It will be interesting to see if they will also be searchable from within MacOS (Spotlight Search) or Windows Desktop Search.
Once again, PerkinElmer and Chemistry World are running a Webinar on the new ChemDraw 17 on October 24th at 4pm (UK time), during which they are sure to be showcasing these features live.
ChemDraw 2017 Review Part 2: Round-trip editing, a deeper look.
Whenever a new version of ChemDraw comes out the first question I get asked is always “Does it break round-trip editing with Word?” If I may steal for a moment a paragraph of my initial review of the 2016 release:
Much of this review is concerned with its interoperability with Microsoft Office 2011 (specifically Word), as probably the vast majority of theses and papers in (organic) chemistry written on Macs would be written with this or older versions of Word as leapfrogging software package upgrades have a long, and sometimes chequered history together on the Mac. For my previous posts on this subject see here, here, here, and here.
I am happy to report that this new version of ChemDraw does support round-trip editing with both Office 2011 and 2016 (as part of Office365). It behaves exactly as expected. Objects you’ve pasted in previously can be copy-pasted back into ChemDraw and edited, then pasted back into Word or Powerpoint.
Round Trip editing with Word 2016 and ChemDraw 17
Round trip editing with Powerpoint
Not such great news is that round-trip editing with Apple’s suite of productivity apps, Pages and Keynote, is still broken. It was broken during the beta cycle with an earlier version of Pages (v5.6.1) as well, so it’s not just the High Sierra-ready versions that have broken something. Unfortunately it seems that, due to the way that objects get transferred to the clipboard in Pages/Keynote, those objects are not editable when they get pasted back into ChemDraw.
That’s the short version but for those who are interested, I dug a bit deeper and uncovered a few clues about what parts of the process do work.
1) Open ChemDraw 17
2) Open ChemDraw file “2NaphKKRaldehyde.cdx”
3) Select structure
4) Copy to Clipboard
5) Open Pages 6.3
6) Paste item
7) Save Pages document “CD17Pages6test.pages”
8) Quit Pages
9) Open “CD17Pages6test.pages”
10) Select the structure
11) Copy to clipboard
12) Open ChemDraw 17
13) Paste item into new empty ChemDraw 17 document
14) Structure is not editable, see object with blue boundaries only
Once again, the blue bounding box means it’s no longer editable.
Interestingly however, the required vector graphic information is present in the Pages file, as evidenced by the fact that whilst within Pages if you right-click on the ChemDraw image you can choose the Edit Mask function:
Applying the Pages Edit Mask function
And you are then provided with the Pages Mask popup, which includes a resize by slider function. Somewhat weirdly when you drag the slider to increase the size the structure initially expands beyond the bounding box, but when you finish, the popup disappears and you’re left with a truncated, but scaled up portion of the structure.
This truncated chemical structure was then pasted back into ChemDraw and unfortunately was still not editable, but did appear nicely scaled back in ChemDraw, and printed with no blurring of the lines.
Therefore it seems apparent that all the vector graphic information is present in the Pages document, but only some parts of it are transferring to the clipboard. Investigating further within ChemDraw 17, this newly pasted truncated structure behaves in some ways like a regular ChemDraw object. The initial structure above was created using the ACS template, but if you take the new truncated, but scaled up fragment back to ChemDraw and paste it in, it pastes at the scaled size. BUT, if you then use the “Apply Object Settings from…” function and choose a different template, in this case the RSC two-column format, it attempts a rescale, but gets it wrong. So there’s sufficient information placed back in the Mac clipboard from Pages that it knows it has scalable bonds, but can’t convert the information into an editable structure.
Long time Pymol users will be familiar with the fetch command, which retrieves the selected file from the Protein Data Bank (PDB). Up until version 1.8, Pymol fetched the PDB version of the file by default rather than the mmCIF version, which the biomolecular crystallography community has long been advocating due to some deficiencies in the legacy PDB file format. Now however, Pymol fetches the mmCIF version by default and has a new default behaviour as well. When viewing the sequence bar,* Pymol now shows the residues that were missing in the crystal structure, whether due to insufficient resolution or poor electron density. These are now displayed in grey lettering in the sequence bar (Figure 1).
Figure 1. Pymol imports mmCIF files now by default with the fetch command, and displays missing residues as described in the metadata
There are also significant differences in how water molecules are listed. Previously water molecules were associated with each subunit or chain within the PDB file. In Figure 2 I have loaded a second copy of the 3e90 structure which I had downloaded as a .pdb file. I had to rename it to 3e90a.pdb, otherwise Pymol would have assumed it was a second state of the molecule, which I did not want. You can see clearly here that for the PDB version, the water molecules associated with chain A are listed after the amino acids.
Figure 2. Second copy of crystal structure manually loaded from a previously downloaded PDB file.
Similarly the water molecules and ligands associated with the PDB file are shown at the end of chain B (Figure 3), whereas the initially loaded mmCIF file puts them all at the end of the line as separate objects, along with the extracted ligands and salts (Figure 3).
Figure 3. In mmCIF files, ligands and water molecules are listed as separate chains, whereas with PDB files they are associated with a protein chain
I have not yet worked out a way to undisplay those missing residues short of deleting them, and they can cause issues when trying to align different structures. For example in Figure 4, the alignment of the missing residues in Box A with all the water molecules of chain A doesn’t make a lot of sense. On the other hand, the small Box B shows how the mmCIF file conveys useful information about the residues that are missing in that portion of the PDB version of the structure.
Figure 4. Alignments in Pymol can be problematic when missing residues “align” with water molecules
This way of displaying missing residues quickly gets confusing when overlaying multiple structures, in the case of Figure 5, multiple versions of the NS2B/NS3 protease from West Nile Virus. For the moment I would recommend downloading each structure individually as a PDB file, and holding off switching to mmCIF if Pymol is your principal viewer.
Figure 5. Using mmCIF versions for multiple alignments with several deletions represented by grey lettering can become confusing
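If you do want to grab the legacy PDB versions individually, something like the following Python sketch would do it. It assumes the RCSB file server's `https://files.rcsb.org/download/` URL pattern, and the function names are my own, purely for illustration. As far as I know you can also ask Pymol itself for the old format with `fetch 3e90, type=pdb`, but treat that as an untested suggestion on my part.

```python
# Illustrative sketch: download a structure from the RCSB as a legacy
# .pdb (or .cif) file. Helper names are hypothetical; the URL pattern is
# assumed to be https://files.rcsb.org/download/<ID>.<format>
import urllib.request


def rcsb_download_url(pdb_id: str, file_format: str = "pdb") -> str:
    """Build the download URL for a four-character PDB entry code."""
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB code: {pdb_id!r}")
    if file_format not in ("pdb", "cif"):
        raise ValueError("file_format must be 'pdb' or 'cif'")
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.{file_format}"


def download_structure(pdb_id: str, file_format: str = "pdb") -> str:
    """Save the structure into the current directory; return the file name."""
    filename = f"{pdb_id.lower()}.{file_format}"
    urllib.request.urlretrieve(rcsb_download_url(pdb_id, file_format), filename)
    return filename


if __name__ == "__main__":
    # Fetch the legacy PDB version of the structure used in the figures.
    print(download_structure("3e90"))
```

The saved file can then be opened in Pymol with the load command, remembering to rename it first if an object with the same name is already loaded.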
Test Platform: MacBook Pro (2011), MacOS El Capitan 10.11.3, Pymol v. 188.8.131.52
*To turn the sequence bar on you can either press the S button third from the right at the bottom of the screen, or use the seq_view command.