Google sheets hit picking: Difference between revisions

From DISI
Jump to navigation Jump to search
No edit summary
No edit summary
 
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
To use Google sheets for hit-picking
To display substance 2D structure by smiles
<ol>
<li> Put the smiles in column e.g. B2-B100 and then put this formula in the adjacent column</li>


    =IMAGE(CONCATENATE("http://zinc15.docking.org/apps/mol/draw.png?smiles=", ENCODEURL(B59)))
<li> I sometimes have issues with the compounds loading, so it may be useful to save them once they load. Copy column and paste in adjacent column and select "by value" from the drop-down list</li>
</ol>
To Displaying substance 2D structure from ZINC_ID
<ol>
<ol>
<li>'''Sort zinc_ids numerically'''. This is really important because otherwise the results won't match up with the right compound.
<li>'''Sort zinc_ids numerically'''. This is really important because ZINC returns the values sorted numerically and if they aren't pre-sorted results won't match up with the right compound.
<ol>
<ol>
<li>Put zinc_ids in a column, e.g. B2-B100'</li>
<li>Put zinc_ids in a column, e.g. B2-B100'</li>
Line 16: Line 24:
     =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))
     =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))


<li>Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put the first cell D1, the next . Look up additional columns of interest here `http://zinc15.docking.org/substances/help/`. As a sanity check include the zinc_id in the output column and verify that it corresponds with the input column</li>
<li>Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put ZINC_Number in column B and to retrieve data from ZINC into columns E-H, make those columns empty an put into cell E1 the IMPORTDATA function call</li>


     Put in cell E1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=zinc_id smiles mol_formula logp"))
     Put in cell E1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=zinc_id smiles mol_formula logp"))
Then to fill in I-K:
     Put in cell I1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity"))
     Put in cell I1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity"))
And to fill in L-M:
     Put in cell L1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes"))
     Put in cell L1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes"))
To figure out what column Look up additional columns of interest here http://zinc15.docking.org/substances/help/. As a sanity check include the zinc_id in the output column and verify that it corresponds with the input column</li>
</ol>
</ol>

Latest revision as of 17:06, 5 October 2020

To display substance 2D structure by smiles

  1. Put the smiles in column e.g. B2-B100 and then put this formula in the adjacent column
  2. =IMAGE(CONCATENATE("http://zinc15.docking.org/apps/mol/draw.png?smiles=", ENCODEURL(B59)))
  3. I sometimes have issues with the compounds loading, so it may be useful to save them once they load. Copy column and paste in adjacent column and select "by value" from the drop-down list

To Displaying substance 2D structure from ZINC_ID

  1. Sort zinc_ids numerically. This is really important because ZINC returns the values sorted numerically and if they aren't pre-sorted results won't match up with the right compound.
    1. Put zinc_ids in a column, e.g. B2-B100'
    2. If you try to sort them directly, it will sort them lexigraphically, eg. put ZINC100 before ZINC34. So extract the zinc number from the zinc id by making a column 'ZINC_Number" by putting in cell C2
    3. =VALUE(SUBSTITUTE(B2, "ZINC", ""))
    4. sort sort the cells by the ZINC_Number column by selecting the whole sheet and then data --> sort range... --> ZINC_Number
  2. To depict the molecules copy and paste the following in each cells in new column. E.g. put this in cell D2 to depict the molecule in B2
  3. =IMAGE(CONCATENATE("http://zinc15.docking.org/substances/", B2, "-small.png"))
  4. Use IMPORTDATA to populate additional columns of interest in the header of the next columns. It looks like it can only load 3 columns at a time, so for example put ZINC_Number in column B and to retrieve data from ZINC into columns E-H, make those columns empty an put into cell E1 the IMPORTDATA function call
  5. Put in cell E1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=zinc_id smiles mol_formula logp")) Then to fill in I-K: Put in cell I1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=mwt num_chiral_centers reactivity")) And to fill in L-M: Put in cell L1: =IMPORTDATA(CONCATENATE("http://zinc15.docking.org/substances.csv?zinc_id-in=", JOIN("+", B2:B100),"&output_fields=predicted_gene_names purchasability supplier_codes")) To figure out what column Look up additional columns of interest here http://zinc15.docking.org/substances/help/. As a sanity check include the zinc_id in the output column and verify that it corresponds with the input column