dm2mrc.sh Script

Under Construction!

In the following description of the dm2mrc.sh script, the script name and its content are shown in bold. Interspersed throughout the script are comments not contained in the actual script that describe what various pieces of the script are doing and that refer to the page that describes the overall action of this script. In these comments, things such as text that a user might type or text that is also contained in the script itself are also presented in bold.

NOTE: The script shown here may differ from a current copy of the script obtained from someplace else. Spacing and indentation in particular may easily differ between the script shown here and a script running on a Linux machine. It is also possible that the script on a given computer has been changed to reflect changes in the available software or the operating system. Such differences can be ignored when using what is shown here as an explanation of scripting.

This is a bash script and so it must start with #!/bin/bash. The script then has a very short explanation (marked using the comment delimiter #) of what the script does in order to give a reader of the script an idea of what should happen.

          #!/bin/bash
          # simple script to convert a series of
          #       Digital Micrograph (dm3) files
          #       into MRC format for further image
          #       processing operations

The next line defines a variable, ECHO, that holds the echo command together with its -e flag. The flag tells echo to interpret backslash escapes such as \n (newline) and \t (tab), which the script uses to format its output. This is not necessary in many cases, but is used in most of the EMC's scripts on Karst.

          ECHO="/bin/echo -e"
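As a small illustration of what the -e flag changes, the following sketch captures the same string with and without the flag. It assumes the GNU version of /bin/echo found on typical Linux systems. The printf line is not used in dm2mrc.sh; it is shown only as a common alternative that interprets the same escapes.

```shell
#!/bin/bash
# Same variable definition as in the script.
ECHO="/bin/echo -e"

# With -e, \t is turned into a real tab character.
with_flag=$($ECHO "\tdm2mrc.sh")

# Without -e, GNU echo prints the backslash and the t literally.
without_flag=$(/bin/echo "\tdm2mrc.sh")

# printf (an alternative, not used in the script) interprets the escape too.
with_printf=$(printf '\tdm2mrc.sh\n')

$ECHO "with -e    : [$with_flag]"
/bin/echo "without -e : [$without_flag]"
/bin/echo "printf     : [$with_printf]"
```

Storing the command and its flag in one variable means every message in the script can be written as $ECHO "..." without repeating the flag.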

The following block is a nested set of if-then-else tests that allows the user to type dm2mrc.sh help in order to obtain some information about the script's operation. The same construction also gives the user a standard how-to-use message if anything else is typed after the script name (e.g., dm2mrc.sh filename triggers an error message that tells the user the proper way to invoke the script). Whichever branch of this if-then-else construction is taken, the script exits with an exit status of 1 (see the comment for the exit 0 statement for an explanation of the use of exit status values). All the actions in this part of the script are referred to elsewhere as step 1.

          if ( test $# -ne 0 ) then
            if ( !(test `echo $1 | wc -w ` -ne 1 ) ) then
                if ( test $1 = "help" ) then
                  $ECHO "\n\tdm2mrc.sh\n"
                  $ECHO "\t    This script uses the EMAN1 command proc2d to"
                  $ECHO "\t    convert DigitalMicrograph (dm3) files into"
                  $ECHO "\t    MRC format.  The output files are MRC mode 2"
                  $ECHO "\t    (floating point numbers) and for most of the"
                  $ECHO "\t    images collected by the 3200FS, the images"
                  $ECHO "\t    will actually be '16-bit integers stored as"
                  $ECHO "\t    floats.'\n"
                  $ECHO "\t    This script also performs an automatic step of"
                  $ECHO "\t    'extreme outlier and negative value removal'"
                  $ECHO "\t    using a custom written program for that purpose.\n"
                  $ECHO "\t    Such outliers and negative values are generally"
                  $ECHO "\t    the result of X-ray/cosmic ray hits during image"
                  $ECHO "\t    acquisition.   If the program finds anything to"
                  $ECHO "\t    eliminate, the log file from the operation will"
                  $ECHO "\t    be saved in a MetaData sub-directory.  The log"
                  $ECHO "\t    file shows the min, max and mean before and after"
                  $ECHO "\t    removing unusual values, and also shows both the"
                  $ECHO "\t    total number of removed pixels, the number that"
                  $ECHO "\t    were negative and the average of those negative"
                  $ECHO "\t    values.  X-ray hits in the dark reference will"
                  $ECHO "\t    produce very negative values while for low dose"
                  $ECHO "\t    images, there is a finite probability that small"
                  $ECHO "\t    negative values can occur under normal imaging"
                  $ECHO "\t    conditions.\n"
                  $ECHO "\t    There are a number of ways to identify where the"
                  $ECHO "\t    values have been changed in the image.  Talk to"
                  $ECHO "\t    the staff of the EM facility if you want to know"
                  $ECHO "\t    more about this.\n"
                  $ECHO "\tSimple instructions for using this script follow:\n"
                 else
                  $ECHO "\n\tImproper number of arguments ($#)!\n "
                fi
              else
                  $ECHO "\n\tImproper number of arguments ($#)!\n "
              fi
              $ECHO "\tProper usage is simply:\n"
              $ECHO "\t    dm2mrc.sh  \n"
              $ECHO "\t       which will convert all the dm3 files in this"
              $ECHO "\t       directory into MRC files\n"
              $ECHO "\tto convert a single dm3 file to MRC format, use\n"
              $ECHO "\t    proc2d myImage.dm3 myImage.mrc\n"
              exit 1
          fi
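The decision made by the block above can be condensed into a short sketch. The function name classify_args is hypothetical and is not part of dm2mrc.sh; the function only reports which branch would be taken instead of printing the messages and exiting. Quoting "$1" is a variation on the script: with the quotes, an argument that is empty or contains several words cannot break the comparison, so the separate wc -w word count is not needed here.

```shell
#!/bin/bash
# classify_args is a hypothetical helper used only for this illustration.
classify_args() {
    if [ "$#" -eq 0 ]; then
        echo "run"            # no arguments: go on and convert the files
    elif [ "$1" = "help" ]; then
        echo "help"           # first argument is help: long description, then usage
    else
        echo "usage-error"    # anything else: "Improper number of arguments"
    fi
}

classify_args
classify_args help
classify_args a.dm3 b.dm3
```

In the real script, both the help branch and the usage-error branch end at the same exit 1 statement; only the "run" case continues past this block.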

The following block stores the list of all dm3 files in a variable called LIST and proceeds into an if-then-else construction. In the \ls *dm3 2> /dev/null construction, the backslash ensures that the standard ls command is used for this operation (i.e., no user-specific aliased versions of ls are to be used), and 2> /dev/null discards any error message that ls produces (see the note at the end of the script).

The first part of the if tells the user what the script is going to do and ends by saving only the filenames (without the .dm3 extension) in the LIST variable ( LIST=` \ls *dm3 | cut -f1 -d. `, where the back quote pair tells the script to execute the commands within the back quotes and then assign that result to the variable LIST). Because cut -f1 -d. keeps only the text before the first period, a filename that contains additional periods would be cut off at the first one. The else clause makes certain that if, for any reason, the list of dm3 files is empty, the script will relay that to the user and will exit with an exit status of 2 (instead of the usual exit status of 0 when a script terminates).

In terms of the description of this script found elsewhere, the creation of the LIST variable is step 2, the actual test (" [ "$?" == "0" ] ") is step 3, the first part of the if clause is step 5 and the second part of the if (the else part) is step 4.

          LIST=` \ls *dm3 2> /dev/null`
          if [ "$?" == "0" ] ; then
             $ECHO "\n\tConverting all dm3 files to MRC format.\n"
             $ECHO   "\t   Remember that this program removes"
             $ECHO   "\t   all negative image values and also"
             $ECHO   "\t   extreme positive outliers (greater"
             $ECHO   "\t   than 15 std deviations above avg)."
             $ECHO   "\t   Useful log files will be saved in"
             $ECHO   "\t   a MetaData sub-directory.\n"
             LIST=` \ls *dm3 | cut -f1 -d. `
            else
             $ECHO "\n\tNo dm3 files found.  Type\n\n\t\tdm2mrc.sh help"
             $ECHO "\n\t  for more information\n"
             exit 2
          fi
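The following sketch tries out the two pieces of this block in scratch directories: stripping the extension from each filename, and detecting a directory with no dm3 files. The filenames are invented for the illustration. The ${f%.dm3} form is an alternative that is not used in dm2mrc.sh; it removes only a trailing .dm3, so it is shown next to the script's cut approach for comparison. Testing the ls command directly in the if is also a variation on the script's separate test of $?.

```shell
#!/bin/bash
# Scratch directories with invented filenames, so nothing real is touched.
workdir=$(mktemp -d)
emptydir=$(mktemp -d)
touch "$workdir/alignedTEM.dm3" "$workdir/firstImages_0001.dm3" "$workdir/map.v2.dm3"

# The script's approach: keep everything before the first period.
cut_names=$(cd "$workdir" && \ls *dm3 | cut -f1 -d.)

# Alternative (not in dm2mrc.sh): strip only a trailing .dm3 from each name.
suffix_names=$(cd "$workdir" && for f in *.dm3; do echo "${f%.dm3}"; done)

# In a directory with no dm3 files, ls fails and the else branch is taken.
if LIST=$(cd "$emptydir" && \ls *dm3 2> /dev/null); then
    empty_result="found"
else
    empty_result="none"
fi

echo "cut:    " $cut_names
echo "suffix: " $suffix_names
echo "empty directory: $empty_result"

rm -rf "$workdir" "$emptydir"
```

The invented file map.v2.dm3 comes out as map with cut and as map.v2 with the suffix form, which is the only case in which the two approaches differ.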

The heart of the script occurs next (the complicated step 6 described elsewhere). The for name in $LIST construction (ending with done) tells the script to run through every filename contained in the variable LIST, assign each filename in turn to a new variable called name, and perform all the operations between do and done for each filename.

Various things happen inside this for/done loop. A variable called TimeStamp (containing a numeric representation of the date, accurate to the second during which the script starts to work on a new dm3 file) is created and then used as part of the filename assigned to the variable TMP. The EMAN program proc2d is used to convert the original dm3 file into an MRC file with the name stored in TMP. This new file is fed into a program called removeXrays2 that replaces negative values in the image, and values that are most likely X-ray hits ("much too bright," defined here as larger than the image average + 15. times the image standard deviation), with the average of the values surrounding such pixels. The output of removeXrays2 is a new MRC file whose name ( ${name}.mrc ) is related to the original dm3 file (i.e., input_file_1.dm3 is converted to input_file_1.mrc).

There is also another if-then-else construction that deals with output ( ${name}.removeXrays2.log ) from the removeXrays2 program: if the program actually replaces any image values (the if ( ! test -z ` grep -l fixed ${name}.removeXrays2.log ` ) then construction), the script makes a directory called MetaData if it needs to (using a test for the non-existence of the directory: if ( ! test -d MetaData ) then and making the directory if it does not exist) and then moves the log file from removeXrays2 into it. If removeXrays2 does not do anything (the final else clause), the log file is simply deleted.

The last step in the for/done loop is to clean up: the temporary file TMP is deleted. If variable LIST contains more filenames that have not been processed, the script loops back to the step that creates the TimeStamp variable and continues. If all the filenames have been processed, the script continues beyond the done statement.

          for name in $LIST ; do
             TimeStamp=` date +%m%d%y-%H.%M.%S`
             TMP=TMP_${TimeStamp}.mrc
             proc2d ${name}.dm3 $TMP > /dev/null 2>&1
             removeXrays2 $TMP ${name}.mrc 15. > ${name}.removeXrays2.log 2>&1
             if ( ! test -z ` grep -l fixed ${name}.removeXrays2.log ` ) then
                 if ( ! test -d MetaData ) then
                   mkdir MetaData
                 fi
                 mv ${name}.removeXrays2.log ./MetaData
               else
                 rm -f ${name}.removeXrays2.log
             fi
             rm -f $TMP
          done
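The log-handling part of the loop can be tried without proc2d or removeXrays2 by using a stand-in. In the sketch below, fake_removeXrays2 is a hypothetical function and the log lines it writes are invented; the only assumption taken from the script is that a log containing the word fixed means pixels were replaced. The sketch also uses grep -q and mkdir -p, which are variations on the script's test -z and test -d constructions, not what the script itself does.

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Hypothetical stand-in for removeXrays2: the log text is invented.
fake_removeXrays2() {
    if [ "$1" = "noisy" ]; then
        echo "12 pixels fixed"
    else
        echo "nothing to do"
    fi
}

(
    cd "$workdir" || exit 1
    for name in clean noisy ; do
        fake_removeXrays2 "$name" > "${name}.removeXrays2.log" 2>&1
        # grep -q prints nothing and succeeds only if the word is present.
        if grep -q fixed "${name}.removeXrays2.log" ; then
            mkdir -p MetaData      # no error if the directory already exists
            mv "${name}.removeXrays2.log" ./MetaData
        else
            rm -f "${name}.removeXrays2.log"
        fi
    done
)

kept_logs=$(\ls "$workdir/MetaData")
echo "logs kept in MetaData: $kept_logs"
```

Only the log for the invented image noisy is kept; the log for clean is deleted, just as the final else clause of the script does.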

The script cleans up after itself by deleting the log file ( .emanlog ) created when the EMAN program proc2d was run.

          rm -f .emanlog

It also lists all the MRC files it can find in this directory.

          ls *mrc

The final echo command simply inserts a blank line ( " " ) at the very end of everything in order to make the output look a bit nicer. The previous two lines are referred to as step 7 elsewhere.

          $ECHO " "

Finally, the script exits with the explicit exit status of 0. If this script happened to be run inside another script, it would be possible to test for this value as a way of ensuring that the dm2mrc.sh script had run properly. Such a script could also report a non-zero exit status to the user and could even have different output for exit status 1 (an indication that the script never really started) or 2 (an indication that there were not any dm3 files where the script was trying to execute).

          exit 0
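A calling script of the kind described above might look roughly like the following sketch. Both function names are hypothetical: report_status turns an exit status into a message, and fake_dm2mrc stands in for dm2mrc.sh so that the example can be run anywhere. The meanings attached to 0, 1 and 2 are the ones given in the text.

```shell
#!/bin/bash
# report_status is a hypothetical helper that interprets an exit status.
report_status() {
    case "$1" in
        0) echo "conversion finished" ;;
        1) echo "help or usage message shown; nothing converted" ;;
        2) echo "no dm3 files found" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}

# Hypothetical stand-in for dm2mrc.sh: returns whatever status it is given.
fake_dm2mrc() {
    return "$1"
}

for code in 0 1 2 ; do
    if fake_dm2mrc "$code" ; then
        status=0
    else
        status=$?      # exit status of the command tested by the if
    fi
    report_status "$status"
done
```

In real use, the call to fake_dm2mrc would be replaced by a call to dm2mrc.sh itself.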

NOTE: There are several output redirection constructs used in the above script. For example, the > /dev/null 2>&1 at the end of the proc2d command says to send ( > ) std out (1) to /dev/null, a special file that discards everything written to it, and to combine ( >& ) std err (2) into std out (i.e., this part of the operation is 2>&1) so that it goes to the same place. This same construction is used to put all the output from removeXrays2 into a log file ( ${name}.removeXrays2.log ). In addition, at the very beginning of the script, the \ls command that associates the dm3 filenames with the variable LIST is told to send only its error output (std err, 2) to /dev/null; its normal output is still captured in LIST. Redirection of output to /dev/null results in nothing being sent to the terminal, and cuts down drastically on output that doesn't need to be seen in the process of running many scripts.
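The three redirection patterns described in this note can be seen side by side in a short sketch. The function noisy is hypothetical; it simply writes one line to std out and one line to std err so that the effect of each redirection is visible.

```shell
#!/bin/bash
# noisy is a hypothetical command that writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

logfile=$(mktemp)

# Both streams into a file, as done for the removeXrays2 log.
# Order matters: std out is pointed at the file first, then std err joins it.
noisy > "$logfile" 2>&1
both=$(cat "$logfile")

# Only std err discarded, as done for the \ls command: std out is still captured.
only_out=$(noisy 2> /dev/null)

# Everything discarded, as done for proc2d: nothing is left to capture.
nothing=$(noisy > /dev/null 2>&1)

echo "log file contains : $both"
echo "captured std out  : $only_out"
echo "captured nothing  : [$nothing]"

rm -f "$logfile"
```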

Here is a series of example outputs from the dm2mrc.sh script. Each example begins with the command a user would type, shown after the $ prompt:


        $ dm2mrc.sh help

        dm2mrc.sh

            This script uses the EMAN1 command proc2d to
            convert DigitalMicrograph (dm3) files into
            MRC format.  The output files are MRC mode 2
            (floating point numbers) and for most of the
            images collected by the 3200FS, the images
            will actually be '16-bit integers stored as
            floats.'

            This script also performs an automatic step of
            'extreme outlier and negative value removal'
            using a custom written program for that purpose.

            Such outliers and negative values are generally
            the result of X-ray/cosmic ray hits during image
            acquisition.   If the program finds anything to
            eliminate, the log file from the operation will
            be saved in a MetaData sub-directory.  The log
            file shows the min, max and mean before and after
            removing unusual values, and also shows both the
            total number of removed pixels, the number that
            were negative and the average of those negative
            values.  X-ray hits in the dark reference will
            produce very negative values while for low dose
            images, there is a finite probability that small
            negative values can occur under normal imaging
            conditions.

            There are a number of ways to identify where the
            values have been changed in the image.  Talk to
            the staff of the EM facility if you want to know
            more about this.

        Simple instructions for using this script follow:

        Proper usage is simply:

            dm2mrc.sh

               which will convert all the dm3 files in this
               directory into MRC files

        To convert a single dm3 file to MRC format, use

            proc2d myImage.dm3 myImage.mrc

        $ dm2mrc.sh *dm3

        Improper number of arguments (11)!

        Proper usage is simply:

            dm2mrc.sh

               which will convert all the dm3 files in this
               directory into MRC files

        To convert a single dm3 file to MRC format, use

            proc2d myImage.dm3 myImage.mrc

        $ dm2mrc.sh

        Converting all dm3 files to MRC format.

           Remember that this program removes
           all negative image values and also
           extreme positive outliers (greater
           than 15 std deviations above avg).
           Useful log files will be saved in
           a MetaData sub-directory.

        EDX_waffle_10_spectrum_12.mrc  firstImages_0002.mrc   thicknessMap_0002.mrc
        alignedSTEM.mrc                firstImages_0003.mrc   thicknessMap_0003.mrc
        alignedTEM.mrc                 firstImages_0004.mrc   thicknessMap_0004.mrc
        firstImages_0001.mrc           thicknessMap_0001.mrc