Basic Tutorial

From DISI
Revision as of 23:16, 11 October 2012

See SEA

To run SEA we need, for both the query and the reference, three files with the same basename and different extensions:

  • SMILES (.smi): SMILES strings and compound IDs are separated by a semicolon. To make sure the file has the same characteristics as the reference, run the following command:
  > sea-molecule-clean
  • fingerprints (.fp): The separator is also a semicolon. The file can be generated with the following command:
  > sea-molecule-fingerprint
  • sets (.set): Fields are separated by semicolons. The first field is the set code, the second is the set description, and the third is the list of compound IDs in the set, separated by colons.
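The file layouts described above can be illustrated with a minimal Python sketch. It is not part of SEA: the helper names and the example records are made up, and the sketch assumes that in a .smi line the SMILES string comes first and the compound ID second.

```python
from typing import List, NamedTuple, Tuple


class SetRecord(NamedTuple):
    code: str
    description: str
    compound_ids: List[str]


def parse_set_line(line: str) -> SetRecord:
    """Split one .set line of the form code;description;id1:id2:..."""
    code, description, ids = line.rstrip("\n").split(";", 2)
    # Compound IDs inside the third field are separated by colons.
    return SetRecord(code, description, [i for i in ids.split(":") if i])


def parse_smi_line(line: str) -> Tuple[str, str]:
    """Split one .smi line into (smiles, compound_id).

    Assumption: the SMILES string is the first field and the ID the second.
    """
    smiles, compound_id = line.rstrip("\n").split(";", 1)
    return smiles, compound_id


# Made-up example records, for illustration only.
record = parse_set_line("SET1;Example set;CMPD1:CMPD2:CMPD3")
smiles, compound_id = parse_smi_line("CCO;CMPD1")
```

A quick check like this can catch a wrong separator before the files are passed to the SEA tools.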

Additionally, we will need the model fit file generated from SEA.

The actual command to run SEA is:

  > sea-run -f model_fit_file model_set_file query_set_file
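As a rough sketch of how the pieces fit together, the Python helpers below check that a basename has all three files and assemble the argument list for the command above without executing it. The helper names, the basenames "reference" and "query", and the file name "reference.fit" are hypothetical; the sketch also assumes the set files are passed with their .set extension.

```python
from pathlib import Path
from typing import List


def missing_files(basename: str) -> List[str]:
    """Return the expected .smi/.fp/.set files that do not exist."""
    return [basename + ext for ext in (".smi", ".fp", ".set")
            if not Path(basename + ext).is_file()]


def build_sea_run_command(fit_file: str, model_base: str, query_base: str) -> List[str]:
    """Assemble the sea-run argument list shown above (does not run it)."""
    # Assumption: the set files are given with their .set extension.
    return ["sea-run", "-f", fit_file, model_base + ".set", query_base + ".set"]


# Hypothetical file names, for illustration only.
command = build_sea_run_command("reference.fit", "reference", "query")
```

Calling missing_files on both basenames before running the command gives an early warning if a .smi, .fp, or .set file has not been generated yet.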