CREST
This is the webpage for containing everything to do with CREST.
CREST is an acronym for Conformer Rotamer Ensemble Sampling Tool and it uses semi-emperical methods from xTB in addition to genetic algorithms to find the lowest energy conformer for an input structure.
CREST Documentation
The CREST Documentation is really informative and descriptive. I would recommend you take a look at it. Also you should check out the xTB documentation which is the main QM/MM calculator CREST uses. Good examples on how CREST and xTB operate are on the Grimme-Lab Workshops webpage linked below.
- CREST: https://crest-lab.github.io/crest-docs/
- xTB: https://xtb-docs.readthedocs.io/
- Grimme-Lab Workshops: https://grimme-lab.github.io/workshops/
How to Install CREST using conda
-
If you are on Windows and have Anaconda/miniconda installed, go to search bar, look for "Anaconda Prompt" and open it. We can run CREST locally in this terminal.
-
If you are on Mac or Linux and you have conda/miniconda installed, open the terminal. We can run CREST locally in this terminal.
-
If you are on the HPCC, make sure you have conda installed. When you run crest, you need to run it either in an SLURM script (that you sbatch to run) or on an interactive node which you can do so by running the command below
salloc -p quanah -N 1 -n 36 -t 60
. This salloc command will request and give you a full interactive node on quanah (36 ntasks) for 1 hour (60 minutes). DO NOT RUN CREST ON THE LOGIN NODES ON THE HPCC!!! IF YOU DO THIS, 1) IT IS ILLEGAL and 2) IT MAKES IT SLOW FOR EVERYONE ELSE USING THE HPCC.
If you need to install conda, read my install conda tutorial Here.
For all operating systems, run the commands below. This command creates a new conda environment named crest with CREST and xTB downloaded in it. To activate the conda environment and load CREST you type conda activate crest
.
To know if you crest is loaded. You should see (crest) on the left of the terminal prompt line, such as the example below.
How to use CREST for Conformational Sampling
Calculation Setup and Running CREST
There are two example molecules we can try, Hexane (\(\mathrm{C_{6}H_{12}}\)) or Squalene (\(\mathrm{C_{30}H_{50}}\)). Hexane is a hydrocarbon and Squalene is an organic molecule with two possible ways to draw it on a 2D surface. Both molecules have many rotatable C-C bonds with each bond rotation being multiple possible different conformations. The Hexane CREST calculation will finish alot faster than the Squalene CREST calculation.
SMILE String for Hexane:
SMILE String for Squalene:
For CREST to run, we need an initial .xyz structure. Lets generate it using the RDKit library and the SMILE string of Squalene. Run the below code either in a python file or the python kernel.
from rdkit import Chem
from rdkit.Chem import AllChem
import os
# For Squalene
smile_string = r"CC(=CCC/C(=C/CC/C(=C/CC/C=C(/CC/C=C(/CCC=C(C)C)\C)\C)/C)/C)C" #we placed r before the string to make it a raw string because of the backslashes present in the smile string. without the r and python could interpret them as special characters.
xyz_filename = "squalene_initial.xyz" #file you want the molecule to be saved to.
# For Hexane
smile_string = "CCCCCC"
xyz_filename = "hexane_initial.xyz" #file you want the molecule to be saved to.
#Main RDkit code for SMILE generation to a .xyz file
rdkit_mol = Chem.AddHs(Chem.MolFromSmiles(smile_string)) #generates a 2d rdkit_mol object and adds Hydrogens where it is not specified in the SMILE string
AllChem.EmbedMolecule(rdkit_mol) #makes the 2d rdkit_mol object into a 3d rdkit_mol object
Chem.MolToXYZFile(rdkit_mol, os.path.join(os.getcwd(), xyz_filename)) #saves rdkit_mol object to a file in an .xyz format
CREST documentation reccomends to pre optimize the initial structure using the same level of theory we will use in CREST so, lets first run xTB to optimize squalene_initial.xyz. Let's run this command in a new directory. Also xTB, is downloaded when we install CREST so we can use the same conda environment.
Now lets run CREST! Run the command below in a new directory or sbatch the SLURM script below. We can add > crest.out
to the command to direct the stdout to a file so we can save the CREST terminal output. The default algorithim CREST uses is the iMTD-GC algorithim which is explained HERE. We can add a solvent model with this flag --gbsa hexane
. You can specify the amount of CPU Threads (on HPCC we call it ntasks) by using flag -T 36
for 36 ntasks on quanah for example.
- NOTE: If you see an OpenBLAS Warning, it is OKAY, xTB has this warning and I believe it comes from how we installed xTB through conda. The xTB and CREST calculation will still work fine to my knowledge.
If you use the SLURM script, replace ERAIDER
in the source and export lines with your eraider on the HPCC.
#!/bin/bash
#SBATCH --job-name=crest
#SBATCH --partition quanah
#SBATCH --nodes=1
#SBATCH --ntasks=36
#SBATCH --time=04:00:00
#SBATCH --mem-per-cpu=1G
#set up environment for crest calculation
source /home/ERAIDER/conda/etc/profile.d/conda.sh
export PATH=/home/ERAIDER/conda/bin:$PATH
conda activate crest
#check to see if initial geometry file exists
if [ ! -f initial_opt.xyz ]
then
echo "NO initial_opt.xyz file exists. exiting!!!"
exit
fi
#run crest
crest initial_opt.xyz --gfn2 -T 36 > crest.out
Looking at the Output Files CREST Generates
The list below is what files CREST generates for a conformational sampling. Important files are listed below and the rest are in the table:
- crest_best.xyz - the lowest energy conformer that CREST found.
- crest_conformers.xyz - has all of the conformers listed that CREST found.
- crest_rotamers.xyz - has all of the rotamers listed that CREST found. It contains all of the conformers in crest_conformers.xyz and all of the rotamers CREST found for each conformer.
- crest.out - the terminal output of CREST.
confcross.xyz | coord | cre_members |
crest_0.mdrestart | crest_dynamics.trj | crest.energies |
crest_input_copy.xyz | crestopt.log | crest.restart |
crest_rotamers.xyz | ensemble_energies.log | gfnff_adjacency |
gfnff_topo | initial_opt.xyz | wbo |
What CREST Does
Looking at our example of Hexane. CREST takes the initial structure we give it and determines the bonds and connectivity seen in the .xyz file. The algorithms CREST uses are described in more detail in this paper linked TTU Library Link. The below slideshow shows the initial structure and the best structure CREST found.
The Below Slideshow shows all of the different Conformations CREST produced for Hexane. All of these structures are from the crest_conformers.xyz file in the original order it produced.
Conformer 1
\(\mathrm{E_Total}\) = -19.993838 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{1}-E_1}\)= 0.000000 \(\mathrm{E_h}\) = 0.000000 \(\mathrm{eV}\) = 0.000000 \(\mathrm{\frac{kcal}{mol}}\) = 0.000000 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 2
\(\mathrm{E_Total}\) = -19.992963 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{2}-E_1}\)= 0.000875 \(\mathrm{E_h}\) = 0.023629 \(\mathrm{eV}\) = 0.551332 \(\mathrm{\frac{kcal}{mol}}\) = 2.275338 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 3
\(\mathrm{E_Total}\) = -19.992925 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{3}-E_1}\)= 0.000913 \(\mathrm{E_h}\) = 0.024638 \(\mathrm{eV}\) = 0.574881 \(\mathrm{\frac{kcal}{mol}}\) = 2.372526 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 4
\(\mathrm{E_Total}\) = -19.992111 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{4}-E_1}\)= 0.001726 \(\mathrm{E_h}\) = 0.046615 \(\mathrm{eV}\) = 1.087682 \(\mathrm{\frac{kcal}{mol}}\) = 4.488848 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 5
\(\mathrm{E_Total}\) = -19.992034 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{5}-E_1}\)= 0.001804 \(\mathrm{E_h}\) = 0.048703 \(\mathrm{eV}\) = 1.136407 \(\mathrm{\frac{kcal}{mol}}\) = 4.689932 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 6
\(\mathrm{E_Total}\) = -19.991982 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{6}-E_1}\)= 0.001856 \(\mathrm{E_h}\) = 0.050113 \(\mathrm{eV}\) = 1.169299 \(\mathrm{\frac{kcal}{mol}}\) = 4.825678 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 7
\(\mathrm{E_Total}\) = -19.991885 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{7}-E_1}\)= 0.001953 \(\mathrm{E_h}\) = 0.052718 \(\mathrm{eV}\) = 1.230088 \(\mathrm{\frac{kcal}{mol}}\) = 5.076552 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 8
\(\mathrm{E_Total}\) = -19.991154 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{8}-E_1}\)= 0.002684 \(\mathrm{E_h}\) = 0.072464 \(\mathrm{eV}\) = 1.690838 \(\mathrm{\frac{kcal}{mol}}\) = 6.978062 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 9
\(\mathrm{E_Total}\) = -19.990123 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{9}-E_1}\)= 0.003715 \(\mathrm{E_h}\) = 0.100303 \(\mathrm{eV}\) = 2.340406 \(\mathrm{\frac{kcal}{mol}}\) = 9.658818 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 10
\(\mathrm{E_Total}\) = -19.990069 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{10}-E_1}\)= 0.003769 \(\mathrm{E_h}\) = 0.101760 \(\mathrm{eV}\) = 2.374401 \(\mathrm{\frac{kcal}{mol}}\) = 9.799114 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 11
\(\mathrm{E_Total}\) = -19.989594 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{11}-E_1}\)= 0.004244 \(\mathrm{E_h}\) = 0.114593 \(\mathrm{eV}\) = 2.673846 \(\mathrm{\frac{kcal}{mol}}\) = 11.034920 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 12
\(\mathrm{E_Total}\) = -19.989445 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{12}-E_1}\)= 0.004393 \(\mathrm{E_h}\) = 0.118606 \(\mathrm{eV}\) = 2.767464 \(\mathrm{\frac{kcal}{mol}}\) = 11.421280 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 13
\(\mathrm{E_Total}\) = -19.989391 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{13}-E_1}\)= 0.004447 \(\mathrm{E_h}\) = 0.120066 \(\mathrm{eV}\) = 2.801541 \(\mathrm{\frac{kcal}{mol}}\) = 11.561914 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 14
\(\mathrm{E_Total}\) = -19.989332 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{14}-E_1}\)= 0.004506 \(\mathrm{E_h}\) = 0.121660 \(\mathrm{eV}\) = 2.838742 \(\mathrm{\frac{kcal}{mol}}\) = 11.715444 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 15
\(\mathrm{E_Total}\) = -19.989165 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{15}-E_1}\)= 0.004673 \(\mathrm{E_h}\) = 0.126168 \(\mathrm{eV}\) = 2.943921 \(\mathrm{\frac{kcal}{mol}}\) = 12.149514 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 16
\(\mathrm{E_Total}\) = -19.989108 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{16}-E_1}\)= 0.004730 \(\mathrm{E_h}\) = 0.127702 \(\mathrm{eV}\) = 2.979705 \(\mathrm{\frac{kcal}{mol}}\) = 12.297194 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 17
\(\mathrm{E_Total}\) = -19.988352 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{17}-E_1}\)= 0.005486 \(\mathrm{E_h}\) = 0.148125 \(\mathrm{eV}\) = 3.456243 \(\mathrm{\frac{kcal}{mol}}\) = 14.263860 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 18
\(\mathrm{E_Total}\) = -19.988347 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{18}-E_1}\)= 0.005490 \(\mathrm{E_h}\) = 0.148242 \(\mathrm{eV}\) = 3.458990 \(\mathrm{\frac{kcal}{mol}}\) = 14.275196 \(\mathrm{\frac{kJ}{mol}}\)
Conformer 19
\(\mathrm{E_Total}\) = -19.986849 \(\mathrm{E_h}\)
\(\Delta\mathrm{E_{gs}}\) = \(\mathrm{E_{19}-E_1}\)= 0.006989 \(\mathrm{E_h}\) = 0.188705 \(\mathrm{eV}\) = 4.403114 \(\mathrm{\frac{kcal}{mol}}\) = 18.171582 \(\mathrm{\frac{kJ}{mol}}\)
All of the Structures shown above are viable Hexane conformations and you can notice that the \(\Delta\mathrm{E_{gs}}\) increases across the conformers.
So now we can be confident in continuing with the crest_best.xyz structure to use in further studies and calculations.