Difference between revisions of "Google sheets hit picking"

From DISI
Jump to: navigation, search
(more details on how to sort by zinc number)
Line 1: Line 1:
 
To use Google sheets for hit-picking
 
To use Google sheets for hit-picking
  
1. '''Put zinc_ids in sorted order in a column, e.g. B2-B100'''
+
1. '''Sort zinc_ids numerically'''. This is really important because otherwise the results won't match up with the right compound.
 +
a. Put zinc_ids in a column, e.g. B2-B100'
 +
b. If you try to sort them directly, it will sort them lexigraphically, eg. put ZINC100 before ZINC34. So extract the zinc number from the zinc id by making a column 'ZINC_Number" by putting in cell C2
  
2. To depict the molecules copy and paste the following in each cells in new column. E.g. put this in cell C2 to depict the molecule in B2
+
    =VALUE(SUBSTITUTE(B2, 'ZINC', ''))
 +
 
 +
c. sort sort the cells by the ZINC_Number column by selecting the whole sheet and then data --> sort range... --> ZINC_Number
 +
 
 +
2. To depict the molecules copy and paste the following in each cells in new column. E.g. put this in cell D2 to depict the molecule in B2
  
 
     =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))
 
     =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))
  
3. Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put the first cell D1, the next . Look up additional columns of interest here `http://zinc15.docking.org/substances/help/`
+
3. Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put the first cell D1, the next . Look up additional columns of interest here `http://zinc15.docking.org/substances/help/`. As a sanity check include the zinc_id in the output column and verify that it corresponds with the input column
  
     Put in cell D1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=smiles mol_formula logp"))
+
     Put in cell E1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=zinc_id smiles mol_formula logp"))
     Put in cell G1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity"))
+
     Put in cell I1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity"))
     Put in cell K1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes"))
+
     Put in cell L1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes"))

Revision as of 09:38, 27 February 2019

To use Google sheets for hit-picking

1. Sort zinc_ids numerically. This is really important because otherwise the results won't match up with the right compound. a. Put zinc_ids in a column, e.g. B2-B100' b. If you try to sort them directly, it will sort them lexigraphically, eg. put ZINC100 before ZINC34. So extract the zinc number from the zinc id by making a column 'ZINC_Number" by putting in cell C2

   =VALUE(SUBSTITUTE(B2, 'ZINC', ))

c. sort sort the cells by the ZINC_Number column by selecting the whole sheet and then data --> sort range... --> ZINC_Number

2. To depict the molecules copy and paste the following in each cells in new column. E.g. put this in cell D2 to depict the molecule in B2

   =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))

3. Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put the first cell D1, the next . Look up additional columns of interest here `http://zinc15.docking.org/substances/help/`. As a sanity check include the zinc_id in the output column and verify that it corresponds with the input column

   Put in cell E1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=zinc_id smiles mol_formula logp"))
   Put in cell I1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity"))
   Put in cell L1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes"))