ZINC:Errata: Difference between revisions




Revision as of 22:35, 6 December 2007

Here are errata as reported for ZINC:


  • For the SIGMA propiophenone P51605, the ZINC entry 1671385 does not show its ring as aromatic.
  • Many molecules are reported with ZINC01278699. Sorry about this case; it will be removed in the next version.


  • I downloaded the Asinex and Sigma-Aldrich databases from version 7 of ZINC in both SMILES and MOL2 formats. For both databases, I found that the archives do not contain the same molecules: some molecules are present in the multi-mol2 file but not in the SMILES file, and vice versa. Is this possible, or did I make an error in the comparison?

No, you are quite correct. I just did:

>  zmore sial_p0.smi.gz | awk '{print $2}' | sort -u > smiles_codes  
>  zcat sial_p0.?.mol2.gz | grep ZINC | sort -u > mol2_p0_codes
>  wc -l smiles_codes mol2_p0_codes
114763 smiles_codes
112069 mol2_p0_codes
> diff smiles_codes  mol2_p0_codes  |wc -l
4265

I agree that there are a little over 2,500 differences between the mol2 and SMILES files for Sigma-Aldrich in ZINC version 7 (the two code lists above differ in length by 2,694 lines), a little over 2% of the library.
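
The `diff | wc -l` count above includes diff's own change markers as well as the differing codes, so it overstates the number of mismatched molecules. As a minimal sketch of an alternative (not the procedure used above), `comm` can count the codes unique to each list directly. The two small files and the ZINC codes in them are invented for illustration; the real `smiles_codes` and `mol2_p0_codes` files would take their place, assuming each holds one code per line.

```shell
#!/bin/sh
# Sketch: count codes unique to each list with comm(1) instead of diff(1).
# The files and ZINC codes below are made-up stand-ins for the real
# smiles_codes and mol2_p0_codes lists.
LC_ALL=C
export LC_ALL   # comm expects input sorted in the current collation order

tmp=$(mktemp -d)
printf 'ZINC00000001\nZINC00000002\nZINC00000003\n' | sort -u > "$tmp/smiles_codes"
printf 'ZINC00000002\nZINC00000003\nZINC00000004\nZINC00000005\n' | sort -u > "$tmp/mol2_codes"

# -23 keeps lines found only in the first file,
# -13 keeps lines found only in the second file.
only_smiles=$(comm -23 "$tmp/smiles_codes" "$tmp/mol2_codes" | wc -l | tr -d ' ')
only_mol2=$(comm -13 "$tmp/smiles_codes" "$tmp/mol2_codes" | wc -l | tr -d ' ')

echo "only in SMILES: $only_smiles"
echo "only in MOL2:   $only_mol2"

rm -rf "$tmp"
```

Dropping the `wc -l` stage lists the codes themselves, which is what would be needed to track down the missing molecules.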