Repackaging DB2 DOCK38

From DISI

Revision as of 18:05, 16 March 2023

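The following script repackages 3D pipeline results. It finds every .tar.gz under a source directory, groups the tarballs into batches (100 per batch by default), extracts the db2.gz files from each batch, and writes one combined tarball per batch to a destination directory. First, here is the script: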

#!/bin/bash
# make_tarballs.bash

# required parameters
TARBALL_SOURCE=$1
TARBALL_REPACK_DEST=$2

[ -z "$TARBALL_SOURCE" ] && echo "need to provide TARBALL_SOURCE as 1st arg!" && exit 1
[ -z "$TARBALL_REPACK_DEST" ] && echo "need to provide TARBALL_REPACK_DEST as 2nd arg!" && exit 1

# the script changes directory below, so make both paths absolute first;
# the destination must exist as a directory, or mv would rename each package onto it
mkdir -p "$TARBALL_REPACK_DEST"
TARBALL_SOURCE=$(realpath "$TARBALL_SOURCE")
TARBALL_REPACK_DEST=$(realpath "$TARBALL_REPACK_DEST")

# optional parameters (override by setting them in the environment)
WORKING_DIRECTORY=${WORKING_DIRECTORY-/tmp/$(whoami)}
PACKAGES_PER_PACKAGE=${PACKAGES_PER_PACKAGE-100}
PACKAGE_TYPE=${PACKAGE_TYPE-db2.gz}
PACKAGE_TYPE_SHORT=$(echo "$PACKAGE_TYPE" | cut -d'.' -f1)

echo WORKING_DIRECTORY=$WORKING_DIRECTORY
mkdir -p "$WORKING_DIRECTORY" && cd "$WORKING_DIRECTORY" || exit 1
mkdir -p working tarball_split_list

echo finding
find "$TARBALL_SOURCE" -name '*.tar.gz' > tarball_list.txt
echo splitting
split -l "$PACKAGES_PER_PACKAGE" tarball_list.txt tarball_split_list/
echo working
cd working
for f in ../tarball_split_list/*; do
        package=$(basename "$f").$PACKAGE_TYPE_SHORT.tar.gz
        for tb in $(cat "$f"); do
                # extract only the $PACKAGE_TYPE files, stripping their directory components
                # (--wildcards makes GNU tar treat the member name as a pattern)
                tar --wildcards --transform='s/^.*\///' -xf "$tb" '*.'"$PACKAGE_TYPE" 2>/dev/null
        done
        tar -czf "$package" *."$PACKAGE_TYPE"
        mv "$package" "$TARBALL_REPACK_DEST"
        rm *."$PACKAGE_TYPE"
        echo "$(basename "$f")"
done
cd ..
echo "Done! Results in $TARBALL_REPACK_DEST"
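
The extraction step relies on GNU tar's --transform option to strip directory components, so that every matching file lands directly in the working directory. Below is a minimal sketch of that behaviour on a throwaway archive. It assumes GNU tar; the function name flatten_extract and the demo file names are made up for illustration. The sketch passes --wildcards, which GNU tar needs in order to treat the member argument as a pattern when extracting.

```shell
#!/bin/bash
# flatten_extract TARBALL SUFFIX DEST  (hypothetical helper, assumes GNU tar)
# Extracts every member whose name ends in .SUFFIX into DEST,
# discarding the directory part of each member name.
flatten_extract() {
    local tarball=$1 suffix=$2 dest=$3
    mkdir -p "$dest"
    tar --wildcards --transform='s/^.*\///' -C "$dest" -xf "$tarball" "*.$suffix"
}

# Demo: build a small nested archive in a temporary directory.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src/a/b"
echo ligand > "$demo_dir/src/a/b/mol1.db2.gz"   # stand-in file, not real db2 data
echo notes  > "$demo_dir/src/a/readme.txt"
tar -czf "$demo_dir/in.tar.gz" -C "$demo_dir/src" a

# Only mol1.db2.gz should be extracted, with no a/b/ prefix.
flatten_extract "$demo_dir/in.tar.gz" db2.gz "$demo_dir/flat"
ls "$demo_dir/flat"
```

Because directory components are dropped, two files with the same base name in different source tarballs would overwrite each other in the working directory.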

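Now, an example usage (the second argument, /path/to/repacked, is a placeholder for your own destination directory):

[user@gimel5 ~] bash make_tarballs.bash /nfs/exb/zinc22/tarballs/H17P200_H19P400.smi.batch-3d.d/out/aaa.d /path/to/repacked
finding
splitting
working
aa
ab
ac
ad
ae
af
ag
ah
ai
aj
Done! Results in /path/to/repacked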
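
The two-letter names printed during the run (aa, ab, ...) are the chunk files that split creates from tarball_list.txt; each chunk becomes one package, named like aa.db2.tar.gz. Here is a small sketch of how the chunking behaves, using a made-up list of seven paths and a chunk size of 3 in place of the default 100:

```shell
#!/bin/bash
# Demo only: the list contents and the chunk size of 3 are illustrative.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/tarball_split_list"
for i in 1 2 3 4 5 6 7; do
    echo "/some/source/$i.tar.gz"
done > "$demo_dir/tarball_list.txt"

# Same form of call as in the script: the trailing slash on the prefix makes
# split write its chunks (aa, ab, ac, ...) inside that directory.
split -l 3 "$demo_dir/tarball_list.txt" "$demo_dir/tarball_split_list/"

# Seven lines at three per chunk give three chunks; the last one holds the remainder.
chunk_names=$(ls "$demo_dir/tarball_split_list" | tr '\n' ' ')
echo "chunks: $chunk_names"
last_chunk_lines=$(wc -l < "$demo_dir/tarball_split_list/ac" | tr -d ' ')
echo "lines in last chunk: $last_chunk_lines"
```

With the default of 100 tarballs per chunk, the ten chunks aa through aj in the run above correspond to between 901 and 1000 source tarballs.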

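For docking ligands built using our pipeline with default options, running this script unmodified is sufficient to create appropriately sized packages for docking. You may wish to change WORKING_DIRECTORY if your files total more than 50G in size, since /tmp is usually no larger than this. The script reads WORKING_DIRECTORY, PACKAGES_PER_PACKAGE and PACKAGE_TYPE from the environment, so you can set them on the command line instead of editing the script.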
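
One way to decide whether the default working directory is large enough is to compare the size of the source tarballs with the free space on the candidate filesystem. The following is a rough sketch only: has_room is a hypothetical helper, and /scratch and /path/to/repacked are placeholder locations rather than paths known to exist on our cluster.

```shell
#!/bin/bash
# has_room SOURCE_DIR CANDIDATE_DIR  (hypothetical helper)
# Succeeds when the filesystem holding CANDIDATE_DIR has more free space
# (in KiB) than the total size of everything under SOURCE_DIR.
has_room() {
    local source_dir=$1 candidate_dir=$2
    local needed_kb free_kb
    needed_kb=$(du -sk "$source_dir" | cut -f1)
    # -P forces the POSIX one-line-per-filesystem format; column 4 is "Available"
    free_kb=$(df -Pk "$candidate_dir" | awk 'NR==2 {print $4}')
    [ "$free_kb" -gt "$needed_kb" ]
}

# Example use (placeholder destination and scratch paths):
# src=/nfs/exb/zinc22/tarballs/H17P200_H19P400.smi.batch-3d.d/out/aaa.d
# if has_room "$src" /tmp; then
#     bash make_tarballs.bash "$src" /path/to/repacked
# else
#     WORKING_DIRECTORY=/scratch/$(whoami) bash make_tarballs.bash "$src" /path/to/repacked
# fi
```

The check is deliberately conservative: it compares against the size of the whole source tree, which is the case the 50G guideline above describes.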