Marco Morais                                                    08/20/08

This file contains a procedure for generating the automated observation
reports from proposal XML.

--------------------------------------------------------------------------------
                    Automated Observation Reports Procedure
--------------------------------------------------------------------------------

1.  Create a top-level directory for the proposal XML. By convention
    the name of the directory is 'props-MM-NNNN' where MM is the
    GI proposal cycle number and NNNN is the version number of
    the proposal during the current cycle.  For instance, 'props-05-0003'
    is the top-level directory for the cycle 5 proposal XML version 3. 

    The full path to this directory is:

      /home/linux_en/ssldocs/gitr/props-05-0003

    A note on proposal version numbers:
    The proposal version number concept is used only at Caltech.
    GSFC may decide to release multiple versions of the proposal XML
    within a cycle as problems and questions with the XML are resolved.
    In general, the proposal XML with the highest version number
    is the most recent copy received from GSFC.  By convention,
    proposal version numbers > 9000 are used to refer to 'dummy'
    proposals.  Dummy proposals exist for testing purposes only
    and are not bound by the same disclosure and NDA restrictions
    as the real proposals.
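
    The naming convention above can be sketched as a small shell
    helper.  The function names props_dir_name and make_props_dir are
    hypothetical and are not part of the GALEX tools; pass the cycle
    and version numbers without leading zeros.

```shell
# Build the conventional directory name 'props-MM-NNNN' from a cycle
# number and a version number. Pass plain decimal numbers (5, not 05),
# because printf treats a leading zero as octal.
props_dir_name() {
    printf 'props-%02d-%04d\n' "$1" "$2"
}

# Create the top-level directory under PARENT and print its full path.
# Hypothetical helper for illustration only.
make_props_dir() {
    parent=$1
    dir="$parent/$(props_dir_name "$2" "$3")"
    mkdir -p "$dir" && printf '%s\n' "$dir"
}
```

    For example, 'make_props_dir /home/linux_en/ssldocs/gitr 5 3'
    would create the directory shown above.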

1b. (optional)
    Create a symbolic link named 'props', at the same level as the
    top-level directory, that points to the top-level directory.
    Although this procedure does not depend upon finding a directory
    named 'props', the proposal entry and review pages by default
    search a directory named 'props' for a proposal XML file named
    'all.xml'.  You can override the default directory that the
    proposal entry and review pages use to look for this file by
    appending '&propsdir=' and the directory path to the entry page URL.

    Here is what a URL with an overridden 'propsdir' looks like:

  https://forsete.caltech.edu/gitr/bin/prop_entry.cgi?&propsdir=../props-05-0997

    Here is what the contents of the directory containing the
    top-level directories for multiple proposal versions might look
    like (note the symbolic link from 'props' to 'props-05-0003'):

      mmorais@lorax.srl.caltech.edu;rhe4(dev)% pwd
      /home/linux_en/ssldocs/gitr
      mmorais@lorax.srl.caltech.edu;rhe4(dev)% ls -l
      total 10
      drwxrwxr-x  2 mmorais galex   512 Aug 18 17:24 bin
      drwxrwx---  2 nobody  gitech  512 Jul 24  2006 htpass
      -rwxrwxr-x  1 mmorais users4  384 Jul 10 13:28 mkpropsdir.sh
      lrwxrwxrwx  1 mmorais galex    13 Aug  7 17:54 props -> props-05-0003
      drwxrwxrwx  3 mmorais galex   512 Jul 17 10:25 props-05-0001
      drwxrwxrwx  5 mmorais galex  1024 Aug 15 14:22 props-05-0003
      drwxrwxrwx  5 mmorais galex   512 Aug  4 16:31 props-05-0997
      drwxrwxrwx  3 mmorais galex   512 Jul 31 10:37 props-05-0999
      drwxrwxrwx  7 mmorais galex   512 Aug 15 16:19 webdocs
      drwxrwxrwx  3 mmorais galex   512 Aug 14 22:37 webmisc
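
    A minimal sketch of creating or repointing the 'props' link
    follows.  The helper name point_props_at is hypothetical, and the
    '-n' option assumes GNU or BSD ln.

```shell
# Point the 'props' link in PARENT at the given version directory.
# Hypothetical helper for illustration only.
point_props_at() {
    parent=$1
    version_dir=$2
    if [ ! -d "$parent/$version_dir" ]; then
        echo "no such directory: $parent/$version_dir" >&2
        return 1
    fi
    # Relative target, as in the listing above (props -> props-05-0003).
    # -f replaces an existing link; -n keeps ln from following an
    # existing 'props' link into the directory it points to.
    ln -sfn "$version_dir" "$parent/props"
}
```

    For example, 'point_props_at /home/linux_en/ssldocs/gitr props-05-0003'
    would produce the link shown in the listing.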

2.  Copy the proposal XML into the top-level proposal directory
    created in step #1.

2b. (optional)
    Create a symbolic link named 'all.xml' in the top-level directory
    that points to the proposal XML file.  Although this procedure
    does not depend upon finding a file named 'all.xml', the proposal
    entry and review pages expect to find the proposal XML file under
    this name.

    Here is what the contents of the directory might look like:

      mmorais@lorax.srl.caltech.edu;rhe4(dev)% pwd
      /home/linux_en/ssldocs/gitr/props-05-0003
      mmorais@lorax.srl.caltech.edu;rhe4(dev)% ls -l *.xml
      -r--r--r--  1 mmorais galex 2795069 Jul 17 10:12 GI5-0003.xml
      lrwxrwxrwx  1 mmorais galex      12 Jul 24 01:43 all.xml -> GI5-0003.xml
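
    Steps 2 and 2b can be sketched together as one helper that copies
    a file into the top-level directory and links it under a
    conventional name.  The helper name install_with_alias is
    hypothetical.

```shell
# Copy SRC into DIR, then create a link named ALIAS in DIR that points
# to the copy. Hypothetical helper for illustration only.
install_with_alias() {
    src=$1
    dir=$2
    alias_name=$3
    base=$(basename "$src")
    cp "$src" "$dir/$base" || return 1
    # Relative target, as in the listing above (all.xml -> GI5-0003.xml).
    ln -sf "$base" "$dir/$alias_name"
}
```

    For example, with the XML file in the current directory:
    'install_with_alias GI5-0003.xml /home/linux_en/ssldocs/gitr/props-05-0003 all.xml'.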

3.  If you plan on running the 'Recommended Predecessor Test', then
    copy the recommended predecessor CSV file into the top-level 
    proposal directory created in step #1.

    The recommended predecessor CSV file is generated by Tom using
    a program that he wrote that reproduces the logic used by the
    pipeline to find a recommended imaging predecessor for a grism
    observation.  Tom will need a CSV file of new grism observations
    as input in order to generate the CSV file of recommended 
    predecessors as output.  Use the mknewgrismcsv.sh script to 
    generate the input file for Tom.  The mknewgrismcsv.sh script is
    not installed in one of the standard places on the system so
    you will have to check it out of cvs to run it.

    The mknewgrismcsv.sh script is in cvs:
      CVSROOT/soda/tim/src/perl/gitr/mknewgrismcsv.sh

3b. (optional)
    Create a symbolic link named 'rec_pred.csv' in the top-level
    directory that points to the recommended predecessor file.
    Although this procedure does not depend upon finding a file
    named 'rec_pred.csv', the automated observation reports script
    looks for a file of this name in the top-level directory by
    default (you can override this behavior on the command line).
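
    Before moving on to step #4, it may be worth confirming that the
    conventional file names resolve to readable files.  The sketch
    below assumes the 'all.xml' and 'rec_pred.csv' names from steps 2b
    and 3b; the helper name check_inputs is hypothetical.  A missing
    'rec_pred.csv' is treated as a warning only, on the assumption
    that it matters only for the 'Recommended Predecessor Test'.

```shell
# Check the conventional input files under DIR. Returns non-zero if
# all.xml is missing or unreadable (a dangling link counts as missing).
# Hypothetical helper for illustration only.
check_inputs() {
    dir=$1
    status=0
    if [ ! -r "$dir/all.xml" ]; then
        echo "missing: $dir/all.xml" >&2
        status=1
    fi
    if [ ! -r "$dir/rec_pred.csv" ]; then
        echo "warning: $dir/rec_pred.csv not found" >&2
    fi
    return "$status"
}
```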

4.  Run the automated observation report from the top-level directory.

    The name of the automated observation report script is
    gitr_gen_prop_reports.pl.  As with other GALEX scripts,
    type 'gitr_gen_prop_reports.pl --help' to print the program's
    usage.  The script takes care of building the directory
    structure in which the reports are written.

    It makes sense to show the command line arguments with their
    default values to understand the way the program works.

      gitr_gen_prop_reports.pl \
        --xml_fn          props/all.xml \
        --propnums        '*' \
        --gi_base_dir     props/gi_reports \
        --auto_base_dir   props/auto_reports \
        --rec_pred_csv    props/rec_pred.csv \
        --verbose

    The 'xml_fn' parameter is the name of the input proposal
    XML file from step #2.  The 'propnums' parameter is the
    proposal number(s) for which reports will be generated.  The
    special value '*' means all proposals; quote it on the command
    line so that the shell does not expand it into file names.  Use
    the 'propnums' parameter to split the work of generating the
    reports across multiple processors.  The 'gi_base_dir' parameter
    is the
    directory where the GI reports will be written and should 
    be a subdirectory of the proposal XML directory created in 
    step #1.  The 'auto_base_dir' parameter is the directory 
    where the automated observation reports will be written and 
    should be a subdirectory of the proposal XML directory created 
    in step #1.  The 'rec_pred_csv' parameter is the path to the 
    recommended predecessor CSV file from step #3. Finally, the 
    'verbose' flag will echo the report generation process to the 
    console so that you can monitor progress.

    As noted above, you will probably want to split the work of
    generating the automated reports across multiple processors.
    For instance, let's say you have 3 machines { celery, cabbage,
    carrot } and the proposal XML contains proposal numbers 1-75,
    with observations distributed evenly enough across the proposals
    that it makes sense to split the work into 3 groups of 25
    proposals.

    Generate reports for proposals 1-25 on celery:

      gitr_gen_prop_reports.pl \
        --xml_fn          props/all.xml \
        --propnums        '1-25' \
        --gi_base_dir     props/gi_reports \
        --auto_base_dir   props/auto_reports \
        --rec_pred_csv    props/rec_pred.csv \
        --verbose

    Generate reports for proposals 26-50 on cabbage:

      gitr_gen_prop_reports.pl \
        --xml_fn          props/all.xml \
        --propnums        '26-50' \
        --gi_base_dir     props/gi_reports \
        --auto_base_dir   props/auto_reports \
        --rec_pred_csv    props/rec_pred.csv \
        --verbose

    Generate reports for proposals 51-75 on carrot:
 
      gitr_gen_prop_reports.pl \
        --xml_fn          props/all.xml \
        --propnums        '51-75' \
        --gi_base_dir     props/gi_reports \
        --auto_base_dir   props/auto_reports \
        --rec_pred_csv    props/rec_pred.csv \
        --verbose
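
    The ranges used above can be computed rather than typed by hand.
    The following is a minimal sketch; the helper name split_propnums
    is hypothetical, and it only prints ranges in the 'A-B' form shown
    above.  When a group contains a single proposal it prints 'A-A';
    whether the script accepts that form has not been checked here.

```shell
# Print N contiguous ranges covering FIRST..LAST, one per line, in the
# 'A-B' form used with --propnums above.
# Hypothetical helper for illustration only.
split_propnums() {
    first=$1
    last=$2
    n=$3
    total=$((last - first + 1))
    base=$((total / n))
    extra=$((total % n))
    start=$first
    i=0
    while [ "$i" -lt "$n" ]; do
        size=$base
        # Spread any remainder over the first ranges.
        if [ "$i" -lt "$extra" ]; then
            size=$((size + 1))
        fi
        # Skip empty groups when there are more groups than proposals.
        if [ "$size" -gt 0 ]; then
            end=$((start + size - 1))
            echo "$start-$end"
            start=$((end + 1))
        fi
        i=$((i + 1))
    done
}
```

    Running 'split_propnums 1 75 3' prints the three ranges used in
    the examples above, one per line.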

    As an alternative, you may find the 'slates' system to be 
    useful for automating this process.

    The gitr_gen_prop_reports.pl script is in cvs:
      CVSROOT/soda/tim/src/perl/gitr-exec/gitr_gen_prop_reports.pl

5.  Once the automated report scripts have completed, you will 
    find the files that have been generated by the script in the 
    directories that you specified with 'gi_base_dir' and 
    'auto_base_dir'.  The automated reports will now be visible 
    in the 'Automated Observation Reports' section of the review form.
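
    A quick sanity check is to count the files under each output
    directory.  The layout of the report directories is not described
    in this procedure, so the sketch below only confirms that
    something was written.  The helper name count_files is
    hypothetical.

```shell
# Print the number of regular files found anywhere under DIR.
# Hypothetical helper for illustration only.
count_files() {
    # tr strips the padding that some wc implementations add.
    find "$1" -type f | wc -l | tr -d ' '
}
```

    For example, with the default values from step #4, run
    'count_files props/gi_reports' and 'count_files props/auto_reports'.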