Here is a brief introduction to pass2: how to run it, which scripts
have been written for you, and how to use them.
unix> kinit -l #d   ( # stands for the number of days you want your ticket to be valid )
unix> (enter your password)
unix> krlogin -l pass2 solxxx   ( solxxx is the name of a valid machine )
-
What machines should I use ?
We have 120 machines to churn through the data. You can always find out
how many machines are available for pass2 by running the following:
pass2_sol575% qhowrl   ( show the resource list )
The following two machines are particularly useful:
- sol211 : This machine is used to submit jobs to the queueing system.
- sol410 : This machine is used for running interactive jobs.
-
What should I do before I begin a new federation ?
Before you begin a new data federation, you should run setUpNewFederation.pl.
This script expects a data set ( for example data6 ) as an argument. It
creates the required directories, then copies and modifies the various files
and scripts you need. Here is the output when you run this script:
pass2_sol575% setUpNewFederations.pl data6
- making directory /home/pass2/Pass2Production/pass2/Pass2ReproScripts/DATA6
- making sub-dir DBFilesDir
- copying and modifying scripts ....
...... done python script, the job submitting script
...... done shell scripts
...... done perl scripts
- making directory /home/pass2/public_html/repass2/data6
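The directory names in the output above follow a simple pattern. The sketch below only derives those paths from the data set name and creates nothing. It is not the actual Perl script: the function name federation_dirs is made up for illustration, and it assumes that DBFilesDir sits inside the upper-cased working directory and that the web directory uses the data set name as given.

```python
from pathlib import PurePosixPath

# Base locations copied from the sample output above.
SCRIPTS_ROOT = PurePosixPath("/home/pass2/Pass2Production/pass2/Pass2ReproScripts")
WEB_ROOT = PurePosixPath("/home/pass2/public_html/repass2")


def federation_dirs(dataset):
    """Return the directories reported in the sample output for a data set.

    Assumptions (inferred from the data6 example only): the working
    directory is the upper-cased data set name, DBFilesDir is created
    inside it, and the web directory uses the name as typed.
    """
    work_dir = SCRIPTS_ROOT / dataset.upper()
    return [work_dir, work_dir / "DBFilesDir", WEB_ROOT / dataset]


if __name__ == "__main__":
    for path in federation_dirs("data6"):
        print(path)
```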
-
How should I submit a job ?
To submit a job, we use submit_job_db_6.py or a variant thereof. This script,
which serves as an example, was used for processing data6. When you start
pass2ing a new data federation, you need to copy this script and edit it
slightly, which is routine. Apart from making a list of good_runs, which has
to be done once and for all before the beginning of a new federation,
job submission is a three-step procedure:
1. pass2_sol575% find_disk_space.pl
This script has to be run before we start submitting jobs. It picks up the
number of the run that is to be submitted and makes a list of the promising
disks that are capable of handling the output from this run. Once the script
has updated the list of disks ( or basepaths, as we call them ), we are ready
to submit a job. Make sure that the newly written file BASEPATHS.TXT in your
working directory is not empty. If it is blank, probably none of the disks
has enough space to handle the output of that run. Once this script has made
the list for the first time, you are all set to submit a job. The job
submission script and the disk-space-finding script work in tandem, so you
always have an updated list of basepaths before the next run is submitted.
2. pass2_sol211% source SetRepass2Env.sh
This script sets the environment. You might have to edit it as the
situation requires.
3. pass2_sol211% submit_job_db_6.py >& ./submit_log/submit_#.log &
The job submission script creates a Log# directory in your working directory;
the # depends on the directories that already exist. For example, if you run
the submission for the first time, it creates Log1. A few hours later, if you
run the script again, Log2 is created, and so on. If for any reason you had
removed Log5 while Log6 exists, the next execution would make Log5 and not
Log<highest_existing+1>.
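Two of the checks described in the steps above can be sketched in a few lines of Python. This is an illustration, not code taken from find_disk_space.pl or submit_job_db_6.py. The helper names basepaths_ready and next_log_dir are made up, and the numbering rule (take the lowest unused number) is an assumption inferred from the Log5/Log6 example.

```python
import os
import re


def basepaths_ready(path="BASEPATHS.TXT"):
    """Return True if the basepath list exists and has a non-blank line."""
    if not os.path.isfile(path):
        return False
    with open(path) as handle:
        return any(line.strip() for line in handle)


def next_log_dir(existing_names):
    """Return the first LogN name (N = 1, 2, ...) that is not in use.

    Assumption: the submission script fills the lowest gap, as in the
    example where Log5 is recreated although Log6 exists.
    """
    used = set()
    for name in existing_names:
        match = re.fullmatch(r"Log(\d+)", name)
        if match:
            used.add(int(match.group(1)))
    number = 1
    while number in used:
        number += 1
    return "Log%d" % number


if __name__ == "__main__":
    # Log5 was removed while Log6 still exists.
    print(next_log_dir(["Log1", "Log2", "Log3", "Log4", "Log6"]))  # Log5
```

Calling basepaths_ready() in the working directory before step 3 gives a quick way to spot an empty BASEPATHS.TXT.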
If you are running interactively on sol410, you do not need to create a list
of basepaths. You need to source the environment as:
pass2_sol410% source debug_SetRepass2Env.sh dataset run_number
and then run:
run_Pass2.sh &
-
The one step procedure:
Before we start pass2ing a data federation, we need to create some directo