Frequently Asked Questions about Condor

General questions

Machine-related questions

Job-related questions

Error messages


General questions


What is Condor?

Condor is a system that allows us to harness the spare computing capacity of a number of desktop computers across campus for useful work when they would otherwise be idle. Users submit their jobs to the Condor system which then places those jobs into a queue, chooses where and when to run them, carefully monitors their progress, and informs the user upon completion.

A Condor job is a command line Windows executable that reads data from one or more input files and writes results to one or more output files. When the user submits their job to Condor they need to tell Condor what to do and how to do it by writing a submit script.
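To give a feel for what a submit script contains, here is a small Python sketch that builds the text of a minimal one. The file names (myjob.bat, input0.dat and so on) and the helper function are made up for illustration, and the settings shown are only a plausible starting point; please check the Condor manual for the commands that suit your job.

```python
def submit_description(executable, input_pattern, jobs):
    """Return the text of a minimal submit script that queues `jobs` jobs."""
    lines = [
        "universe = vanilla",
        "executable = " + executable,
        # $(Process) is replaced by Condor with 0, 1, 2, ... for each job
        "transfer_input_files = " + input_pattern,
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "output = stdout.$(Process).txt",
        "error = stderr.$(Process).txt",
        "log = jobs.log",
        "queue %d" % jobs,
    ]
    return "\n".join(lines) + "\n"

# Hypothetical job: a batch file run three times, once per input file.
text = submit_description("myjob.bat", "input$(Process).dat", 3)
print(text)
```

Saving the printed text to a file and passing that file to condor_submit would queue the three jobs.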

What can Condor do?

Condor can run a broad range of applications. To give you an idea of that range, we have drawn up a list of the popular classes of problems that researchers run on the Condor pool. Please note that this list is in no way exhaustive.

Parameter Sweep : A Parameter Sweep problem is one in which one or more parameters sweep from a particular start value to an end value in regular increments, for example: I have three parameters, each taking P values, so I need to run P*P*P different jobs. Using Condor I can run N jobs at the same time instead of just one.

Embarrassingly Parallel : An Embarrassingly Parallel program is one for which no particular effort is needed to segment the problem into a very large number of parallel tasks, and there is no communication between those parallel tasks, for example: I can divide my data into N chunks. Using Condor I can run N chunks at the same time instead of just one.

Monte Carlo : A Monte Carlo problem is one that relies on repeated random sampling to compute a result. Monte Carlo methods are often used when simulating physical systems such as liquids, disordered materials, strongly coupled solids, and cellular structures, for example: I need to run the same job N times. Using Condor I can run N copies at the same time instead of just one.
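As a rough illustration of the Parameter Sweep case, the following Python sketch lists every combination of parameter values, one combination per job. The parameter names, ranges and the sweep helper are invented for the example.

```python
from itertools import product

def sweep(start, stop, step):
    """Return the values from start to stop (inclusive) in regular increments."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(count)]

# Hypothetical parameters: three parameters, each taking five values.
temperature = sweep(0.0, 1.0, 0.25)
pressure = sweep(10, 50, 10)
density = sweep(1, 5, 1)

# One job per combination of values.
jobs = list(product(temperature, pressure, density))
print(len(jobs), "jobs; first:", jobs[0], "last:", jobs[-1])
```

Each tuple in jobs could then be passed to one Condor job as its command line arguments.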

Can I use Condor?

If you are a researcher with a Cardiff University user account, and your problem can run on the Condor system, then yes, you can. There is currently no charge for the Condor service; however, we request that you write a paragraph or two about your research and send it to us so that we may keep a record of the kind of problems our researchers are running on the Condor pool.

How does Condor work?

Condor works on a principle called matchmaking. All machines in the Condor pool, Execute and Submit nodes alike, send regular updates to the Central Manager node. Those updates, which are called ClassAds, contain capability and status information about the node. When a user submits a job from their Submit node a ClassAd containing job requirements is sent to the Central Manager.  The Central Manager responds with a list of machines that match those requirements and the Submit machine then contacts the appropriate Execute machine to run the job.
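The idea of matchmaking can be pictured with a toy model in Python. This is only a sketch of the principle, not how Condor is implemented: the machine names, attribute values and functions below are invented, and real ClassAds carry many more attributes.

```python
# Toy machine ClassAds, as a Central Manager might hold them.
machine_ads = [
    {"Machine": "X001122334455", "Memory": 2048, "State": "Unclaimed"},
    {"Machine": "X00AABBCCDDEE", "Memory": 512, "State": "Unclaimed"},
    {"Machine": "X00FFEEDDCCBB", "Memory": 4096, "State": "Claimed"},
]

def job_requirements(ad):
    """Example job requirements: an unclaimed machine with at least 1024 MB."""
    return ad["State"] == "Unclaimed" and ad["Memory"] >= 1024

def match(requirements, ads):
    """Return the names of the machines whose ads satisfy the requirements."""
    return [ad["Machine"] for ad in ads if requirements(ad)]

matches = match(job_requirements, machine_ads)
print(matches)
```

Only the first machine matches here: the second has too little memory and the third is already claimed.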

How much does Condor cost?

There is currently no charge for the Condor service.

Where is the Condor manual?

You can find the manual here: Condor manual

Where are the Condor training materials?

You can find the training materials here: Condor training materials.



Machine-related questions


How many machines are in the Condor pool?

There are approximately 2500 machines in the Condor pool providing about 4500 slots. About 60% of the machines have Pentium IV 3.0 GHz processors, 30% have Core 2 Duo 2.1 GHz processors, and 10% have Core 2 Duo 3.0 GHz processors.

How much memory is installed in machines in the Condor pool?

Approximately 3% of the machines have less than 512 MB of memory, 19% have between 512 MB and 1024 MB, 46% have between 1024 MB and 2048 MB, and 32% have more than 2048 MB installed.

What is the speed of the network connection to the majority of execute nodes?

100 Mbit/s.

What is the speed of the network connection to the majority of submit nodes?

100 Mbit/s.

What is the speed of the network connection to the central managers?

1000 Mbit/s.

How do I find out the IP address of my machine?

Click on Start > Programs > Accessories > Command Prompt and type the following at the prompt: ipconfig. The IP address of the central manager is 131.251.239.147. The IP address of a machine on the layer 2 network will look something like 131.251.*.*. The IP address of a machine on a layer 3 subnet will look something like 10.*.*.*. The stars represent numbers in the range 0 to 254.
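If you want to check which kind of network an address belongs to, the patterns above can be written as a short Python sketch. It treats 131.251.*.* and 10.*.*.* as whole address blocks, which is a simplifying assumption, and the classify function is made up for the example.

```python
import ipaddress

# Address patterns from the answer above, written as network blocks.
LAYER2 = ipaddress.ip_network("131.251.0.0/16")   # 131.251.*.*
LAYER3 = ipaddress.ip_network("10.0.0.0/8")       # 10.*.*.*

def classify(address):
    """Say which of the two network types an IP address falls into."""
    ip = ipaddress.ip_address(address)
    if ip in LAYER2:
        return "layer 2 network"
    if ip in LAYER3:
        return "layer 3 subnet"
    return "unknown"

print(classify("131.251.239.147"))  # the central manager
print(classify("10.1.2.3"))
```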

How do I find out the MAC address of my machine?

Click on Start > Programs > Accessories > Command Prompt and type the following command at the prompt: ipconfig -all. The MAC address is the value listed as the Physical Address. The MAC address of the central manager is 00-50-56-00-00-46.

How do I find out the HOSTNAME of my machine?

Click on Start > Programs > Accessories > Command Prompt and type the following command at the prompt: hostname. This will give you the HOSTNAME used in eDirectory to identify a particular workstation. The HOSTNAME of a machine will look something like X001122334455. The HOSTNAME of the central manager is condorman7.

To find out the HOSTNAME stored in Socket for a particular workstation, type the following command at the prompt instead: nslookup 131.251.239.147. This example looks up the HOSTNAME of the machine with IP address 131.251.239.147.

The HOSTNAME of a machine that can submit Condor jobs will look something like *.condor.cf.ac.uk or *.school.condor.cf.ac.uk. The stars represent a sequence of alphanumeric characters.



Job-related questions


Can I run commercial applications on the Condor pool?

It depends on the license:

Firstly we must make sure that the legal requirements of the license are met.

Secondly we must make sure that the technical requirements of the licensing process are met. If the application requires access to a hardware dongle to run then it will not be possible to run the application on the Condor pool (unless you have a large number of dongles). If the application requires access to a license server to run then it should be possible to run the application on the Condor pool (if you have a large number of licenses). If the application does not require access to a hardware dongle or a license server then it should be possible to run the application.

Can I run open source applications on the Condor pool?

Yes.

Can I run applications compiled from source code on the Condor pool?

Yes.

Can I submit a job that runs a number of commands?

Yes.

You can do this by writing a batch file that contains a number of commands and telling Condor to run the batch file instead of a regular executable file. Batch files can contain commands to mount network shares, download input data files from the share, run one or more executables, upload output files to the share, and unmount network shares.

Can I submit a job that mounts a network share?

Yes.

You can do this by writing a batch file similar to this one. Lines beginning with rem are remarks and not batch file commands.

rem mount the network share
net use \\computername\sharename
rem download input files
copy \\computername\sharename\input*.dat .
rem every file matching input*.dat (e.g. input0.dat, input1.dat) is now downloaded
rem run one or more executables
program1.exe
program2.exe
rem upload output file
copy output.dat \\computername\sharename\
rem output.dat is now uploaded
rem unmount the network share
net use \\computername\sharename /delete

 Further information on this is available.

Can I target my jobs to run on a particular machine?

Yes.

If you want to target your jobs to run on a particular machine you will have to add the following to the requirements line in your submit script: requirements = MACHINE == "X001122334455.CF.AC.UK"

You should replace X001122334455 in the previous example with the name of the machine you want to target.

Can I target my jobs to run on machines donated by a particular school?

Yes.

If you are logged into a submit node you can find out which machines in the Condor pool belong to ASCHOOL using the following command at the prompt: condor_status -constraint IS_OWNED_BY==\"ASCHOOL\"

If you want to target your jobs to run on machines donated by ASCHOOL you will have to add the following to the requirements line in your submit script: requirements = IS_OWNED_BY == "ASCHOOL"

You should replace ASCHOOL in the previous examples with the name of the particular school you want to target.

The following schools have donated some machines: BIOSI, CARBS, CLAWS, COMSC, ENGIN, MATHS, MEDIC, OPTOM, PHYSX, and SOCSI.
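A submit script has a single requirements line, so if you need more than one condition they have to be combined into one expression with &&. The Python sketch below shows one way to build such a line. The helper function is invented for the example, COMSC is simply one of the schools listed above, and HAS_MATLAB_V78 is the attribute described under the Matlab question.

```python
def requirements(*clauses):
    """Join ClassAd clauses with && and return a submit script line."""
    return "requirements = " + " && ".join("(%s)" % clause for clause in clauses)

# Target machines donated by COMSC that also have the v78 Matlab runtime.
line = requirements('IS_OWNED_BY == "COMSC"', "HAS_MATLAB_V78 == TRUE")
print(line)
```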

Can I run Blast jobs on the Condor pool?

No.

Can I run Matlab jobs on the Condor pool?

Yes. Versions 7.8 (v78) and 7.10 (v710) of the Matlab Compiler Runtime are installed on a number of machines in the Condor pool.

The path to version 78 of the runtime on those machines is C:\Program Files\MATLAB\MATLAB Compiler Runtime\v78\bin\win32. If you are logged into a submit node you can find out which machines in the Condor pool have version 78 of the runtime installed using the following command at the prompt: condor_status -constraint "HAS_MATLAB_V78". If you want to submit a Matlab job you will have to compile your Matlab code into an executable and add the following to the requirements line in your submit script: requirements = HAS_MATLAB_V78 == TRUE.  You should also run your Matlab executable from a batch file that first sets the path variable correctly before running your executable e.g.

path=%path%;C:\Program Files\MATLAB\MATLAB Compiler Runtime\v78\bin\win32
myexecutable.exe

The path to version 710 of the runtime on those machines is C:\Program Files\MATLAB\MATLAB Compiler Runtime\v710\bin\win32. If you are logged into a submit node you can find out which machines in the Condor pool have version 710 of the runtime installed using the following command at the prompt: condor_status -constraint "HAS_MATLAB_V710". If you want to submit a Matlab job you will have to compile your Matlab code into an executable and add the following to the requirements line in your submit script: requirements = HAS_MATLAB_V710 == TRUE.  You should also run your Matlab executable from a batch file that first sets the path variable correctly before running your executable e.g.

path=%path%;C:\Program Files\MATLAB\MATLAB Compiler Runtime\v710\bin\win32
myexecutable.exe

Can I run Perl jobs on the Condor pool?

Yes. Version 5.10.1.1007 of Perl is available on a network share.

You can access the share by writing a batch file similar to this one. Lines beginning with rem are remarks and not batch file commands.

rem mount the network share
net use q: "\\Maincf2g\arcapps\Activestate Activeperl 5.10.1.1007\mount"
rem add the network application to the path
path=%path%;q:\bin
rem run the perl script
perl.exe script.pl
rem unmount the network share
net use q: /delete

Can I run Python jobs on the Condor pool?

Yes. Version 2.6.5.14 of Python is available on a network share.

You can access the share by writing a batch file similar to this one. Lines beginning with rem are remarks and not batch file commands.

rem mount the network share
net use q: "\\Maincf2g\arcapps\Activestate Activepython 2.6.5.14\mount"
rem add the network application to the path
path=%path%;q:\
rem run the python script
python.exe script.py
rem unmount the network share
net use q: /delete

Can I run R jobs on the Condor pool?

Yes. Version 2.12.0 of R is available on a network share.

You can access the share by writing a batch file similar to this one. Lines beginning with rem are remarks and not batch file commands.

rem mount the network share
net use q: "\\Maincf2g\arcapps\R 2.12.0\mount"
rem add the network application to the path
path=%path%;q:\bin
rem run the r script
Rscript.exe script.r
rem unmount the network share
net use q: /delete

What is the maximum practical job length?

The maximum practical run time per job is 5 hours. You should therefore aim to divide your work into multiple jobs, each of which takes no longer than 5 hours to run on a typical desktop computer which, as of this writing, has a 3.0 GHz dual-core processor and 2 GB of memory.

What is the maximum practical job size?

The maximum practical data size per job is 50 megabytes of input or output. You should therefore aim to divide your input data into multiple chunks, each no larger than 50 megabytes, and generate no more than 50 megabytes of output data per job.
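The two limits can be combined to estimate how many jobs a piece of work should be split into. The Python sketch below assumes, for simplicity, that run time scales evenly with the amount of input data; the workload figures and the function are invented for the example.

```python
import math

MAX_HOURS = 5        # practical run time limit per job
MAX_MEGABYTES = 50   # practical input size limit per job

def jobs_needed(total_hours, total_megabytes):
    """Return the smallest number of equal jobs that respects both limits."""
    by_time = math.ceil(total_hours / MAX_HOURS)
    by_size = math.ceil(total_megabytes / MAX_MEGABYTES)
    return max(by_time, by_size, 1)

# Hypothetical workload: 120 hours of computation over 1800 megabytes of input.
n = jobs_needed(120, 1800)
print(n, "jobs")
```

In this example the data size is the tighter limit, so the work would be split by input size and each job would finish well inside 5 hours.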



Error messages


When installing Condor I get the error "ERROR: The process "condor_*.exe" not found"

This error appears only during installation and can be ignored.

When running condor_* I get the error "'condor_*' is not recognized as an internal or external command..."

This error appears when the Condor programs have not been added to the path variable of your current command prompt.  

To run a command prompt with the path variable set correctly click on Start > Networked Applications > Departmental Software > ARCCA > Condor > Condor Prompt.

When running condor_store_cred add I get the error "Operation failed: bad password"

This error appears when the password entered does not match your Novell password.

Run condor_store_cred add again and this time provide your Novell password.

When running condor_submit I get the error "No credential stored for user@hostname"

This error appears when you have not already stored a user credential.

Run condor_store_cred add to store one.

I get the error "You are running out of disk space on C"

If you do not submit jobs to the Condor pool then the space occupied by the optional installation packages on your machine is too great. Please contact us via the helpdesk and ask us to give you access to the "Remove Condor" application object. You will have to tell us the MAC address of your machine so that we can give you that access and delete the association with the appropriate "Install Condor" application object, which stops the problem recurring in the future, e.g. when Condor is upgraded.

If you submit jobs to the Condor pool then there are two possibilities: (1) the space occupied by the optional installation packages on your machine is too great, or (2) the space occupied by the results of your jobs on your machine is too great. Please contact us via the helpdesk to discuss how we can prune your C:\Condor directory.
