Google sheets hit picking: Difference between revisions

From DISI
Jump to navigation Jump to search
No edit summary
No edit summary
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
To use Google sheets for hit-picking
To use Google sheets for hit-picking


1. '''Put zinc_ids in sorted order in a column, e.g. B2-B100'''
<ol>
<li>'''Sort zinc_ids numerically'''. This is really important because otherwise the results won't match up with the right compound.
<ol>
<li>Put zinc_ids in a column, e.g. B2-B100'</li>
<li>If you try to sort them directly, it will sort them lexigraphically, eg. put ZINC100 before ZINC34. So extract the zinc number from the zinc id by making a column 'ZINC_Number" by putting in cell C2</li>


2. To depict the molecules copy and paste the following in each cells in new column. E.g. put this in cell C2 to depict the molecule in B2
    =VALUE(SUBSTITUTE(B2, "ZINC", ""))
 
<li>sort sort the cells by the ZINC_Number column by selecting the whole sheet and then data --> sort range... --> ZINC_Number</li>
</li>
</ol>
<li>To depict the molecules copy and paste the following in each cells in new column. E.g. put this in cell D2 to depict the molecule in B2</li>


     =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))
     =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))


3. Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put the first cell D1, the next . Look up additional columns of interest here `http://zinc15.docking.org/substances/help/`
<li>Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put the first cell D1, the next . Look up additional columns of interest here http://zinc15.docking.org/substances/help/. As a sanity check include the zinc_id in the output column and verify that it corresponds with the input column</li>


     Put in cell D1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=smiles mol_formula logp"))
     Put in cell E1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=zinc_id smiles mol_formula logp"))
     Put in cell G1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity"))
     Put in cell I1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity"))
     Put in cell K1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes"))
     Put in cell L1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes"))
</ol>

Revision as of 16:57, 13 June 2019

To use Google sheets for hit-picking

  1. Sort zinc_ids numerically. This is really important because otherwise the results won't match up with the right compound.
    1. Put zinc_ids in a column, e.g. B2-B100'
    2. If you try to sort them directly, it will sort them lexigraphically, eg. put ZINC100 before ZINC34. So extract the zinc number from the zinc id by making a column 'ZINC_Number" by putting in cell C2
    3. =VALUE(SUBSTITUTE(B2, "ZINC", ""))
    4. sort sort the cells by the ZINC_Number column by selecting the whole sheet and then data --> sort range... --> ZINC_Number
  2. To depict the molecules copy and paste the following in each cells in new column. E.g. put this in cell D2 to depict the molecule in B2
  3. =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))
  4. Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put the first cell D1, the next . Look up additional columns of interest here http://zinc15.docking.org/substances/help/. As a sanity check include the zinc_id in the output column and verify that it corresponds with the input column
  5. Put in cell E1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=zinc_id smiles mol_formula logp")) Put in cell I1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity")) Put in cell L1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes"))