Filtering ligands for novelty
Revision as of 22:14, 1 October 2018
Written by Chase Webb 09-01-2018
After a large-scale docking campaign, it is important to remove prospective ligands that are too similar to compounds already known to modulate the receptor. This lets us focus on assessing new chemical interactions. Filtering is best done after clustering, as described in Processing Results from LSD (http://wiki.bkslab.org/index.php/How_to_process_results_from_a_large-scale_docking).
This process proceeds in the following steps:
Make a new directory to do similarity filtering.
Make a symbolic link to the location where clustering occurred.
1. Generate a list of SMILES for the known compounds. The simplest way to do this is to download them from ZINC. For the mu opioid receptor (OPRM1), for instance, go to ZINC15 Genes (https://zinc15.docking.org/genes/home/).
2. Generate fingerprints for the known compounds. Run the following script written by TEB and JKL. The inputs are the name of the knowns file and the name of the output fingerprint file.
python ~jklyu/zzz.github/ChemInfTools/utils/teb_chemaxon_cheminf_tools/generate_chemaxon_fingerprints.py knowns_list.smi knowns
3. Convert the fingerprints from binary to unsigned integers. Run the following script written by TEB and JKL. The inputs are the bitstring file generated by the script above, the SMILES file used to generate those fingerprints, and the prefix of the output file. Do this for both the knowns and the clusterheads that were calculated in the previous tutorial, Processing Results from LSD (http://wiki.bkslab.org/index.php/How_to_process_results_from_a_large-scale_docking).
python ~jklyu/zzz.github/ChemInfTools/utils/convert_fp_2_fp_in_16unit/convert_fp_2_fp_in_uint16 knowns.fp knowns_list.smi knowns
python ~jklyu/zzz.github/ChemInfTools/utils/convert_fp_2_fp_in_16unit/convert_fp_2_fp_in_uint16 extract_all.topN.sort.uniq.fp extract_all.topN.zincid.sort.uniq.smi topN_clusterhead
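The internals of convert_fp_2_fp_in_uint16 are not described on this page. As a rough illustration of what converting a binary fingerprint to unsigned integers can mean, the sketch below packs a string of '0' and '1' characters into 16-bit words. The function name pack_bits_uint16, the right-padding with zeros, and the most-significant-bit-first ordering are all assumptions made for the example and may not match the actual script.

```python
def pack_bits_uint16(bitstring):
    """Pack a '0'/'1' string into a list of 16-bit unsigned integers.

    Hypothetical helper for illustration only; the real script's
    padding and bit order are not documented here.
    """
    if set(bitstring) - {"0", "1"}:
        raise ValueError("bitstring may contain only '0' and '1'")
    # Pad on the right with zeros so the length is a multiple of 16.
    padded = bitstring + "0" * (-len(bitstring) % 16)
    # Read each 16-character chunk as a base-2 number (first bit is the MSB).
    return [int(padded[i:i + 16], 2) for i in range(0, len(padded), 16)]
```

Packing bits this way lets a similarity calculation work on whole integers (bitwise AND plus a bit count) instead of comparing characters one at a time.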
4. Calculate an all-by-all Tanimoto coefficient (Tc) matrix for the knowns against the clusterheads.
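To make the idea of a knowns-versus-clusterheads Tc matrix concrete, here is a minimal pure-Python sketch. It assumes each fingerprint is a list of unsigned integers of equal length; the function names and the similarity cutoff passed to novel_indices are hypothetical and are not taken from the lab's scripts.

```python
def popcount(words):
    """Count the set bits across a list of unsigned integers."""
    return sum(bin(w).count("1") for w in words)


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits / on-bits in either fingerprint."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    shared = sum(bin(a & b).count("1") for a, b in zip(fp_a, fp_b))
    total = popcount(fp_a) + popcount(fp_b) - shared
    return shared / total if total else 0.0


def tc_matrix(knowns, clusterheads):
    """Rows are knowns, columns are clusterheads."""
    return [[tanimoto(k, c) for c in clusterheads] for k in knowns]


def novel_indices(matrix, max_tc):
    """Indices of clusterheads whose Tc to every known is at most max_tc."""
    n_heads = len(matrix[0]) if matrix else 0
    return [j for j in range(n_heads)
            if all(row[j] <= max_tc for row in matrix)]
```

For example, with knowns [[0b1100], [0b0011]] and clusterheads [[0b1100], [0b1000], [0b0000]], the first clusterhead is identical to a known (Tc of 1.0), the second shares half its bits with one (Tc of 0.5), and only the third would pass a cutoff of 0.4. The cutoff itself is a project-specific choice.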