ZINC:Errata: Difference between revisions
Revision as of 22:35, 6 December 2007
Here are errata as reported for ZINC:
- For Sigma propiophenone P51605, ZINC has entry 1671385, and the ring in it does not show as aromatic.
- Many molecules are reported under ZINC01278699. Sorry about this case. It will be removed in the next version.
- I downloaded the Asinex and Sigma-Aldrich databases from version 7 of ZINC in both SMILES and MOL2 formats. For both databases I found a difference in the molecules present in the archives: some molecules are present in the multi-mol2 file but not in the SMILES file, and vice versa. Is this possible, or did I make an error in the comparison?
No, you are quite correct. I just did:
> zmore sial_p0.smi.gz | awk '{print $2}' | sort -u > smiles_codes
> zcat sial_p0.?.mol2.gz | grep ZINC | sort -u > mol2_p0_codes
> wc -l smiles_codes mol2_p0_codes
114763 smiles_codes
112069 mol2_p0_codes
> diff smiles_codes mol2_p0_codes | wc -l
4265
I agree that there are a little over 2,500 differences between the mol2 and SMILES files for Sigma-Aldrich in ZINC version 7 (114,763 versus 112,069 unique codes, a gap of 2,694), a little over 2% of the library.
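The `diff ... | wc -l` figure above also counts diff's own change markers (lines such as `12a13` and `---`), so it overstates the number of differing codes. A minimal sketch of a more direct count, using `comm` on the two code lists, might look like the following. The function name `compare_codes` is hypothetical, and the sketch assumes each input file holds exactly one ZINC code per line, which may not be true of the raw `grep ZINC` output from a mol2 file.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZINC): report how many codes appear only in
# the first list, only in the second list, and in both.
# Assumption: each input file holds one ZINC code per line.
compare_codes() {
    tmpdir=$(mktemp -d) || return 1

    # comm needs both inputs sorted in the same collation order, so re-sort
    # under the C locale rather than trusting how the files were produced.
    LC_ALL=C sort -u "$1" > "$tmpdir/a"
    LC_ALL=C sort -u "$2" > "$tmpdir/b"

    # comm -23: lines only in the first file; -13: only in the second;
    # -12: lines common to both. tr strips the padding some wc versions add.
    only_a=$(LC_ALL=C comm -23 "$tmpdir/a" "$tmpdir/b" | wc -l | tr -d ' ')
    only_b=$(LC_ALL=C comm -13 "$tmpdir/a" "$tmpdir/b" | wc -l | tr -d ' ')
    both=$(LC_ALL=C comm -12 "$tmpdir/a" "$tmpdir/b" | wc -l | tr -d ' ')

    rm -rf "$tmpdir"
    echo "only_in_first=$only_a only_in_second=$only_b in_both=$both"
}

# Example call, assuming the files created by the commands above exist:
#   compare_codes smiles_codes mol2_p0_codes
```

Reporting the two one-sided counts separately shows whether the mismatch is mostly SMILES entries missing from the mol2 archive or the reverse, which a single diff line count cannot distinguish.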