Molecular Docking with GNINA 1.0

David Ryan Koes

Royal Society of Chemistry Chemical Information & Computer Applications Group

May 27, 2021

Get Started: https://colab.research.google.com/drive/1GXmk1v8C-c4UtyKFqIm9HnsrVYH0pI-c

In [1]:
%%html

<style>
div.prompt {display:none}
div.output_subarea  {max-width: 100%}
</style>

<script>

$3Dmolpromise = new Promise((resolve, reject) => { 
    require(['https://3dmol.org/build/3Dmol-nojquery.js'], function(){       
            resolve();});
});


require(['https://cdnjs.cloudflare.com/ajax/libs/Chart.js/2.2.2/Chart.js'], function(Ch){
 Chart = Ch;
});

$('head').append('<link rel="stylesheet" href="https://bits.csb.pitt.edu/asker.js/themes/asker.default.css" />');


//the callback is provided a canvas object and data 
var chartmaker = function(canvas, labels, data) {
  var ctx = $(canvas).get(0).getContext("2d");
     var dataset = {labels: labels,                     
    datasets:[{
     data: data,
     backgroundColor: "rgba(150,64,150,0.5)",
         fillColor: "rgba(150,64,150,0.8)",    
  }]};
  var myBarChart = new Chart(ctx,{type:'bar',data:dataset,options:{legend: {display:false},
        scales: {
            yAxes: [{
                ticks: {
                    min: 0,
                }
            }]}}});
};

$(".input .o:contains(html)").closest('.input').hide();

</script>

<script src="https://bits.csb.pitt.edu/asker.js/lib/asker.js"></script>

Acknowledgements

Andrew McNutt, Paul Francoeur, Rishal Aggarwal, Tomohide Masuda, Rocco Meli, Matthew Ragoza, Jocelyn Sunseri

In [2]:
%%html
<div id="whydock" style="width: 500px"></div>
<script>
$('head').append('<link rel="stylesheet" href="https://bits.csb.pitt.edu/asker.js/themes/asker.default.css" />');

    var divid = '#whydock';
	jQuery(divid).asker({
	    id: divid,
	    question: "Why do you most want to dock?",
		answers: ['Predict pose','Virtual screening','Affinity prediction',"I don't know"],
        server: "https://bits.csb.pitt.edu/asker.js/example/asker.cgi",
		charter: chartmaker})
    
 $(".input .o:contains(html)").closest('.input').hide();


</script>

What is molecular docking?

Predict the most likely conformation and pose of a ligand in a protein binding site.

  • Sample conformational space
  • Score poses
    • Ideally score equals affinity or can be used to productively rank compounds
    • Score $\ne$ Free Energy
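
The two steps above (sample, then score) can be illustrated with a deliberately tiny toy. This is a minimal sketch, not how GNINA searches: the "pose" is only a 3D position, `toy_score` is a made-up function with an arbitrary optimum, and plain random sampling stands in for a real search strategy.

```python
import math
import random


def toy_score(pose, target=(1.0, -2.0, 0.5)):
    """Hypothetical score: squared distance to an arbitrary 'ideal' position.

    Lower is better. This is not a physical energy.
    """
    return sum((p - t) ** 2 for p, t in zip(pose, target))


def toy_dock(n_samples=2000, box=5.0, seed=0):
    """Sample random positions inside a cubic box and keep the best-scoring one."""
    rng = random.Random(seed)
    best_pose, best_score = None, math.inf
    for _ in range(n_samples):
        # Sample: draw a candidate pose uniformly from the search box
        pose = tuple(rng.uniform(-box, box) for _ in range(3))
        # Score: evaluate the candidate and remember the best one so far
        score = toy_score(pose)
        if score < best_score:
            best_pose, best_score = pose, score
    return best_pose, best_score


best_pose, best_score = toy_dock()
print(best_pose, best_score)
```

The sketch also makes the last bullet concrete: the pose that wins depends entirely on the scoring function, so the search is only as good as the score it optimizes.
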
In [3]:
%%html
<iframe width="560" height="315" src="https://3dmol.org/tests/docking.html" title="docking" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
<script>
 $(".input .o:contains(html)").closest('.input').hide();
</script>

Inherent limitations of docking

Docking is intended to be high-throughput, and the approximations made to achieve that speed fundamentally limit its accuracy.

  • Receptor usually kept rigid or mostly rigid (limited side-chain flexibility)
  • Ligand flexibility usually limited to torsions
  • No explicit solvent model
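
To make "flexibility limited to torsions" concrete: a torsion move rotates part of the ligand about a bond axis, so bond lengths and bond angles never change. With n rotatable bonds the ligand then has 6 + n degrees of freedom (translation, rotation, torsions). The following is a minimal sketch using Rodrigues' rotation formula on a hypothetical four-atom chain; the coordinates and function names are illustrative and not taken from any docking program.

```python
import math


def rotate_about_bond(point, a, b, angle_deg):
    """Rotate `point` about the axis through bond atoms a -> b (Rodrigues' formula)."""
    theta = math.radians(angle_deg)
    axis = [bi - ai for ai, bi in zip(a, b)]
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]                      # unit vector along the bond
    v = [pi - bi for pi, bi in zip(point, b)]         # point relative to atom b
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    rotated = [v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1 - cos_t)
               for i in range(3)]
    return [r + bi for r, bi in zip(rotated, b)]


def dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Hypothetical chain A-B-C-D in a cis arrangement; rotate D about the B-C bond
A, B, C, D = [0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.0, 0.0]
D_new = rotate_about_bond(D, B, C, 180.0)

print("C-D bond length before/after:", dist(C, D), dist(C, D_new))
print("A-D distance before/after:  ", dist(A, D), dist(A, D_new))
```

The C-D bond length is unchanged while the A-D distance changes, which is exactly the kind of conformational change a torsion-only search can explore.
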

Software Lineage

AutoDock Vina

Designed and implemented by Dr. Oleg Trott at the Scripps Research Institute.

Shared no code with AutoDock.

Focus on performance. Created new scoring function optimized for pose prediction.

Open Source Apache License

Published 2009, last update (version 1.1.2) 2011

Software Lineage

smina
Scoring and minimization with AutoDock Vina

We forked Vina to make it easier to use, especially for custom scoring function development and ligand minimization.

(Almost) identical behavior to AutoDock Vina (just easier to use).

Apache/GPL2 Open Source License

Very stable source code. In maintenance mode. Features are a subset of GNINA's.

Software Lineage

GNINA
A deep learning framework for molecular docking

A fork of smina that supports using convolutional neural networks to score protein-ligand poses.

Does not promise results identical to AutoDock Vina or smina.

Requires a lot more dependencies (including CUDA).

In [4]:
!wget https://downloads.sourceforge.net/project/smina/smina.static
--2021-05-26 22:45:24--  https://downloads.sourceforge.net/project/smina/smina.static
Resolving downloads.sourceforge.net (downloads.sourceforge.net)... 216.105.38.13
Connecting to downloads.sourceforge.net (downloads.sourceforge.net)|216.105.38.13|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://versaweb.dl.sourceforge.net/project/smina/smina.static [following]
--2021-05-26 22:45:24--  https://versaweb.dl.sourceforge.net/project/smina/smina.static
Resolving versaweb.dl.sourceforge.net (versaweb.dl.sourceforge.net)... 162.251.232.173
Connecting to versaweb.dl.sourceforge.net (versaweb.dl.sourceforge.net)|162.251.232.173|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9853920 (9.4M) [application/octet-stream]
Saving to: ‘smina.static’

smina.static        100%[===================>]   9.40M  3.37MB/s    in 2.8s    

2021-05-26 22:45:28 (3.37 MB/s) - ‘smina.static’ saved [9853920/9853920]

In [5]:
!wget https://github.com/gnina/gnina/releases/download/v1.0.1/gnina
--2021-05-26 22:45:28--  https://github.com/gnina/gnina/releases/download/v1.0.1/gnina
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-releases.githubusercontent.com/45548146/47de2300-8bd4-11eb-8355-430c51e07fae?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20210527%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210527T024528Z&X-Amz-Expires=300&X-Amz-Signature=6b7e83aaead5347dbedbb339144d0b968b158ad46f25ad3a9b660244011605c7&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=45548146&response-content-disposition=attachment%3B%20filename%3Dgnina&response-content-type=application%2Foctet-stream [following]
--2021-05-26 22:45:28--  https://github-releases.githubusercontent.com/45548146/47de2300-8bd4-11eb-8355-430c51e07fae?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20210527%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210527T024528Z&X-Amz-Expires=300&X-Amz-Signature=6b7e83aaead5347dbedbb339144d0b968b158ad46f25ad3a9b660244011605c7&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=45548146&response-content-disposition=attachment%3B%20filename%3Dgnina&response-content-type=application%2Foctet-stream
Resolving github-releases.githubusercontent.com (github-releases.githubusercontent.com)... 185.199.111.154, 185.199.110.154, 185.199.109.154, ...
Connecting to github-releases.githubusercontent.com (github-releases.githubusercontent.com)|185.199.111.154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 562802104 (537M) [application/octet-stream]
Saving to: ‘gnina’

gnina               100%[===================>] 536.73M  31.0MB/s    in 14s     

2021-05-26 22:45:42 (38.8 MB/s) - ‘gnina’ saved [562802104/562802104]

In [6]:
!du -sh smina.static gnina
9.4M	smina.static
537M	gnina

I wasn't kidding about the extra dependencies!

However, if you are going to use gnina frequently, you should build it from source so that it uses the versions of libraries installed on your system (especially CUDA). This results in a much smaller executable.

In [7]:
!du -sh /usr/local/bin/gnina
43M	/usr/local/bin/gnina

Running GNINA

In [8]:
!chmod +x ./gnina #make executable
In [9]:
!./gnina
Missing receptor.

Correct usage:

Input:
  -r [ --receptor ] arg            rigid part of the receptor
  --flex arg                       flexible side chains, if any (PDBQT)
  -l [ --ligand ] arg              ligand(s)
  --flexres arg                    flexible side chains specified by comma 
                                   separated list of chain:resid
  --flexdist_ligand arg            Ligand to use for flexdist
  --flexdist arg                   set all side chains within specified 
                                   distance to flexdist_ligand to flexible
  --flex_limit arg                 Hard limit for the number of flexible 
                                   residues
  --flex_max arg                   Retain at at most the closest flex_max 
                                   flexible residues

Search space (required):
  --center_x arg                   X coordinate of the center
  --center_y arg                   Y coordinate of the center
  --center_z arg                   Z coordinate of the center
  --size_x arg                     size in the X dimension (Angstroms)
  --size_y arg                     size in the Y dimension (Angstroms)
  --size_z arg                     size in the Z dimension (Angstroms)
  --autobox_ligand arg             Ligand to use for autobox
  --autobox_add arg                Amount of buffer space to add to 
                                   auto-generated box (default +4 on all six 
                                   sides)
  --autobox_extend arg (=1)        Expand the autobox if needed to ensure the 
                                   input conformation of the ligand being 
                                   docked can freely rotate within the box.
  --no_lig                         no ligand; for sampling/minimizing flexible 
                                   residues

Scoring and minimization options:
  --scoring arg                    specify alternative built-in scoring 
                                   function: ad4_scoring default dkoes_fast 
                                   dkoes_scoring dkoes_scoring_old vina vinardo
  --custom_scoring arg             custom scoring function file
  --custom_atoms arg               custom atom type parameters file
  --score_only                     score provided ligand pose
  --local_only                     local search only using autobox (you 
                                   probably want to use --minimize)
  --minimize                       energy minimization
  --randomize_only                 generate random poses, attempting to avoid 
                                   clashes
  --num_mc_steps arg               number of monte carlo steps to take in each 
                                   chain
  --num_mc_saved arg               number of top poses saved in each monte 
                                   carlo chain
  --minimize_iters arg (=0)        number iterations of steepest descent; 
                                   default scales with rotors and usually isn't
                                   sufficient for convergence
  --accurate_line                  use accurate line search
  --simple_ascent                  use simple gradient ascent
  --minimize_early_term            Stop minimization before convergence 
                                   conditions are fully met.
  --minimize_single_full           During docking perform a single full 
                                   minimization instead of a truncated 
                                   pre-evaluate followed by a full.
  --approximation arg              approximation (linear, spline, or exact) to 
                                   use
  --factor arg                     approximation factor: higher results in a 
                                   finer-grained approximation
  --force_cap arg                  max allowed force; lower values more gently 
                                   minimize clashing structures
  --user_grid arg                  Autodock map file for user grid data based 
                                   calculations
  --user_grid_lambda arg (=-1)     Scales user_grid and functional scoring
  --print_terms                    Print all available terms with default 
                                   parameterizations
  --print_atom_types               Print all available atom types

Convolutional neural net (CNN) scoring:
  --cnn_scoring arg (=1)           Amount of CNN scoring: none, rescore 
                                   (default), refinement, all
  --cnn arg                        built-in model to use, specify 
                                   PREFIX_ensemble to evaluate an ensemble of 
                                   models starting with PREFIX: 
                                   crossdock_default2018 crossdock_default2018_
                                   1 crossdock_default2018_2 
                                   crossdock_default2018_3 
                                   crossdock_default2018_4 default2017 dense 
                                   dense_1 dense_2 dense_3 dense_4 
                                   general_default2018 general_default2018_1 
                                   general_default2018_2 general_default2018_3 
                                   general_default2018_4 redock_default2018 
                                   redock_default2018_1 redock_default2018_2 
                                   redock_default2018_3 redock_default2018_4
  --cnn_model arg                  caffe cnn model file; if not specified a 
                                   default model will be used
  --cnn_weights arg                caffe cnn weights file (*.caffemodel); if 
                                   not specified default weights (trained on 
                                   the default model) will be used
  --cnn_resolution arg (=0.5)      resolution of grids, don't change unless you
                                   really know what you are doing
  --cnn_rotation arg (=0)          evaluate multiple rotations of pose (max 24)
  --cnn_update_min_frame           During minimization, recenter coordinate 
                                   frame as ligand moves
  --cnn_freeze_receptor            Don't move the receptor with respect to a 
                                   fixed coordinate system
  --cnn_mix_emp_force              Merge CNN and empirical minus forces
  --cnn_mix_emp_energy             Merge CNN and empirical energy
  --cnn_empirical_weight arg (=1)  Weight for scaling and merging empirical 
                                   force and energy 
  --cnn_outputdx                   Dump .dx files of atom grid gradient.
  --cnn_outputxyz                  Dump .xyz files of atom gradient.
  --cnn_xyzprefix arg (=gradient)  Prefix for atom gradient .xyz files
  --cnn_center_x arg               X coordinate of the CNN center
  --cnn_center_y arg               Y coordinate of the CNN center
  --cnn_center_z arg               Z coordinate of the CNN center
  --cnn_verbose                    Enable verbose output for CNN debugging

Output:
  -o [ --out ] arg                 output file name, format taken from file 
                                   extension
  --out_flex arg                   output file for flexible receptor residues
  --log arg                        optionally, write log file
  --atom_terms arg                 optionally write per-atom interaction term 
                                   values
  --atom_term_data                 embedded per-atom interaction terms in 
                                   output sd data
  --pose_sort_order arg (=0)       How to sort docking results: CNNscore 
                                   (default), CNNaffinity, Energy

Misc (optional):
  --cpu arg                        the number of CPUs to use (the default is to
                                   try to detect the number of CPUs or, failing
                                   that, use 1)
  --seed arg                       explicit random seed
  --exhaustiveness arg (=8)        exhaustiveness of the global search (roughly
                                   proportional to time)
  --num_modes arg (=9)             maximum number of binding modes to generate
  --min_rmsd_filter arg (=1)       rmsd value used to filter final poses to 
                                   remove redundancy
  -q [ --quiet ]                   Suppress output messages
  --addH arg                       automatically add hydrogens in ligands (on 
                                   by default)
  --stripH arg                     remove hydrogens from molecule _after_ 
                                   performing atom typing for efficiency (on by
                                   default)
  --device arg (=0)                GPU device to use
  --no_gpu                         Disable GPU acceleration, even if available.

Configuration file (optional):
  --config arg                     the above options can be put here

Information (optional):
  --help                           display usage summary
  --help_hidden                    display usage summary with hidden options
  --version                        display program version
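
The "Search space" options above can be given explicitly or derived from a ligand with `--autobox_ligand`. As a rough sketch of the idea, the code below assumes the auto-generated box is the axis-aligned bounding box of the ligand atoms, padded by `autobox_add` on all six sides (the help text states the default of 4; the bounding-box details are an assumption here, not taken from the GNINA source). It then assembles a command line using only flags listed in the help output. The coordinates and the file names `rec.pdb`, `lig.pdb`, and `docked.sdf` are placeholders.

```python
import shlex


def autobox(coords, autobox_add=4.0):
    """Return (center, size) of a padded axis-aligned box around the coordinates."""
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    # Padding is added on both sides of each dimension
    size = tuple((hi - lo) + 2 * autobox_add for lo, hi in zip(mins, maxs))
    return center, size


def gnina_command(receptor, ligand, center, size, out, exe="./gnina"):
    """Build an argument list with an explicit search box (flags from the help output)."""
    cmd = [exe, "-r", receptor, "-l", ligand]
    for axis, c, s in zip("xyz", center, size):
        cmd += [f"--center_{axis}", f"{c:.3f}", f"--size_{axis}", f"{s:.3f}"]
    cmd += ["-o", out]
    return cmd


# Placeholder ligand coordinates (Angstroms), not from a real structure
coords = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]
center, size = autobox(coords)
cmd = gnina_command("rec.pdb", "lig.pdb", center, size, "docked.sdf")
print(shlex.join(cmd))
# To run it for real: import subprocess; subprocess.run(cmd, check=True)
```

In practice passing `--autobox_ligand` directly is simpler; building the box by hand is mainly useful when the search space should differ from what a reference ligand implies.
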

In [10]:
%%html
<div id="gnsucc" style="width: 500px"></div>
<script>
$('head').append('<link rel="stylesheet" href="https://bits.csb.pitt.edu/asker.js/themes/asker.default.css" />');

    var divid = '#gnsucc';
	jQuery(divid).asker({
	    id: divid,
	    question: "Were you able to run gnina in colab?",
		answers: ['Yes','No','Eh'],
        server: "https://bits.csb.pitt.edu/asker.js/example/asker.cgi",
		charter: chartmaker})
    
 $(".input .o:contains(html)").closest('.input').hide();


</script>

How does it work?

Setup Example

In [11]:
!wget http://files.rcsb.org/download/3ERK.pdb
--2021-05-26 22:45:44--  http://files.rcsb.org/download/3ERK.pdb
Resolving files.rcsb.org (files.rcsb.org)... 128.6.158.70
Connecting to files.rcsb.org (files.rcsb.org)|128.6.158.70|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/octet-stream]
Saving to: ‘3ERK.pdb’

3ERK.pdb                [ <=>                ] 270.37K  --.-KB/s    in 0.07s   

2021-05-26 22:45:44 (3.68 MB/s) - ‘3ERK.pdb’ saved [276858]

In [12]:
!grep ATOM 3ERK.pdb > rec.pdb
In [13]:
!obabel rec.pdb -Orec.pdb  # "sanitizing" receptor for openbabel
==============================
*** Open Babel Warning  in PerceiveBondOrders
  Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is rec.pdb)

1 molecule converted
In [14]:
!grep SB4 3ERK.pdb > lig.pdb
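
The two `grep` calls match text anywhere in a line, so they can also pick up header records that merely mention `ATOM` or `SB4`. A stricter alternative, sketched below, filters on the fixed PDB columns instead: the record name in columns 1-6 and the residue name in columns 18-20. The function name and the example records are hypothetical; the sample lines are made up and are not copied from 3ERK.

```python
def split_pdb(pdb_text, ligand_resname):
    """Split PDB text into receptor ATOM records and HETATM records of one ligand."""
    receptor, ligand = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()      # columns 1-6: record name
        resname = line[17:20].strip()  # columns 18-20: residue name
        if record == "ATOM":
            receptor.append(line)
        elif record == "HETATM" and resname == ligand_resname:
            ligand.append(line)
    rec_text = "".join(l + "\n" for l in receptor)
    lig_text = "".join(l + "\n" for l in ligand)
    return rec_text, lig_text


# Made-up records for illustration only
example = "\n".join([
    "REMARK   3   PROTEIN ATOMS            : 2800",
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "HETATM 2800  C1  SB4 A 400       1.000   2.000   3.000  1.00  0.00           C",
    "HETATM 2900  O   HOH A 500       4.000   5.000   6.000  1.00  0.00           O",
])
rec_text, lig_text = split_pdb(example, "SB4")
print(rec_text, end="")
print(lig_text, end="")
```

For the real file this would be called as `split_pdb(open('3ERK.pdb').read(), 'SB4')`, with the two results written to `rec.pdb` and `lig.pdb`.
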
In [15]:
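import py3Dmol
v = py3Dmol.view(height=400)
# model 0: receptor as cartoon plus thin sticks
v.addModel(open('rec.pdb').read())
v.setStyle({'cartoon':{},'stick':{'radius':0.15}})
# model 1: crystal ligand (SB4) as sticks with green carbons
v.addModel(open('lig.pdb').read())
v.setStyle({'model':1},{'stick':{'colorscheme':'greenCarbon'}})
# center the view on the ligand
v.zoomTo({'model':1})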
