

Distributed Parallel Algorithms

Gurobi Compute Server supports distributed parallel parameter tuning and distributed concurrent MIP solving. These features allow you to distribute the associated computation among multiple machines. The optimization model of interest is copied to the available machines, and each machine solves the model using different parameter settings. In distributed tuning, one machine doles out parameter sets to the available machines, collects the results, compares runtimes against those of the best parameter sets found so far, and chooses new parameter settings to test. In distributed concurrent MIP, one machine doles out different parameter sets to the available machines, starts a single MIP solve on each machine, and keeps track of the best available lower and upper bounds for the model. When one machine finds a proven optimal solution, the computation is interrupted on the other machines and the optimal solution is reported.

Both of these distributed parallel algorithms are designed to be almost entirely transparent to the user. You simply modify a few parameters, and the work of distributing the computation to multiple machines is handled behind the scenes by the Gurobi library.

Specifying the Compute Server Pool

The first step when doing distributed tuning or distributed concurrent MIP is to provide a list of one or more Compute Servers that are running and available. You should follow the instructions for setting up and administering a Gurobi Compute Server to set these up.

Distributed tuning actually only requires a single licensed Compute Server, even if you want to use multiple machines for tuning. We allow you to start as many free restricted Compute Servers as you want. A restricted Compute Server is a machine that runs Gurobi Compute Server, but will only accept jobs that are associated with distributed parallel tuning. By contrast, distributed concurrent MIP will only distribute work among fully licensed Compute Servers. Again, please refer to the instructions for setting up and administering a Gurobi Compute Server for details on setting up a restricted or standard Compute Server.

Once you've set up one or more Compute Servers, you should list their names in the ServerPool parameter. You can provide either machine names or IP addresses, and they should be comma-separated. For example, you might use the following in your gurobi_cl command line:

gurobi_cl ServerPool=server1,server2,server3 ...
You can provide the Compute Server access password through the ServerPassword parameter. All servers in the server pool must have the same access password.
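The comma-separated format for the ServerPool value can be sketched with a tiny helper. Note that this function is purely illustrative and is not part of the Gurobi API; it just shows how a list of machine names or IP addresses maps onto the string the parameter expects:

```python
def build_server_pool(servers):
    """Join server names/IPs into the comma-separated string expected
    by the ServerPool parameter. Illustrative helper only; not part
    of the Gurobi API."""
    cleaned = [s.strip() for s in servers if s.strip()]
    if not cleaned:
        raise ValueError("at least one server is required")
    return ",".join(cleaned)

# Matches the gurobi_cl example above:
print(build_server_pool(["server1", "server2", "server3"]))
# → server1,server2,server3
```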

We should reiterate that distributed tuning only requires one of the machines in the server pool to be a licensed Compute Server. The rest can be restricted Compute Servers. For distributed concurrent MIP, all servers in the list must be licensed Compute Servers.

Note that providing a list of available servers is strictly a configuration step. It doesn't actually modify any algorithm behavior.

Requesting Distributed Algorithms

Once you've set up the server pool through the appropriate parameters, your next step is to set the TuneJobs or ConcurrentMIPJobs parameter. These parameters indicate how many distinct tuning or concurrent MIP jobs should be started on the available Compute Servers. For example, if you set TuneJobs to 2 in grbtune...

> grbtune ServerPool=server1,server2 TuneJobs=2 misc07.mps
...you should see the following output in the log:
Server capacity available on server1 - running now
Server capacity available on server2 - running now

Distributed tuning: launched 2 server jobs
This output indicates that two jobs have been launched, one on machine server1 and the other on machine server2. These two jobs will continue to run on these servers until your tuning run completes.

Similarly, if you launch distributed concurrent MIP...

> gurobi_cl ServerPool=server1,server2 ConcurrentMIPJobs=2 misc07.mps
...you should see the following output in the log:
Server capacity available on server1 - running now
Server capacity available on server2 - running now

Distributed concurrent MIP optimizer: launched 2 concurrent instances

If some of the servers in your server pool are running at capacity when you launch a distributed algorithm, the algorithm won't queue jobs. Instead, it will launch as many jobs as it can (up to the requested value) and run with those.

While it may be tempting to equate jobs with machines, note that there are situations where you may want to run multiple jobs on the same server. For example, if you have an 8-core server and a 4-core server, you may wish to limit the thread count on each job to 4 and allow two jobs to run on the 8-core server. You would achieve this by using the Threads parameter to limit the thread count per job, and the JOBLIMIT server configuration option to limit the number of jobs per Compute Server (refer to the section on Compute Server configuration for details). If you set the job limit to 2 on the 8-core machine (call it server1) and to 1 on the 4-core machine (call it server2), you might see output that looks like the following when you run distributed tuning:

Server capacity available on server1 - running now
Server capacity available on server2 - running now
Server capacity available on server1 - running now
Since Gurobi Compute Server assigns a new job to the machine with the most available capacity, then assuming that the two servers are otherwise idle, the first job would be assigned to server1, the second to server2, and the third to server1.
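One plausible way to model this assignment policy is to place each new job on the server with the lowest fraction of its job limit currently in use. The actual Compute Server scheduler may use a more sophisticated notion of capacity; this sketch, with the hypothetical server names and job limits from the example above, is only meant to reproduce the sequence described in the text:

```python
def assign_jobs(job_limits, num_jobs):
    """Place each job on the least-loaded open server (lowest fraction
    of its job limit in use, ties broken by name order). Jobs beyond
    total capacity are dropped rather than queued, mirroring the
    no-queueing behavior described above. Illustrative model only."""
    running = {s: 0 for s in job_limits}
    placements = []
    for _ in range(num_jobs):
        open_servers = [s for s in job_limits if running[s] < job_limits[s]]
        if not open_servers:
            break  # pool is full; distributed algorithms don't queue jobs
        server = min(sorted(open_servers),
                     key=lambda s: running[s] / job_limits[s])
        running[server] += 1
        placements.append(server)
    return placements

# JOBLIMIT 2 on the 8-core server1, JOBLIMIT 1 on the 4-core server2:
print(assign_jobs({"server1": 2, "server2": 1}, 3))
# → ['server1', 'server2', 'server1']
```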

As for the actual behavior of the distributed tuning and concurrent MIP algorithms, they are designed to be nearly indistinguishable from the single-machine versions. In particular, distributed tuning respects all of the usual tuning parameters, including TuneTimeLimit, TuneTrials, and TuneOutput. Similarly, concurrent MIP respects all of the usual MIP parameters. Like the version that runs on a single machine, distributed concurrent MIP can be controlled using concurrent environments. The only major difference you should notice between the distributed and single-machine versions is in the aggregate performance.

Note that distributed algorithms can be affected by differences in machine speed. Consider tuning, for example. If one machine in your server pool is much slower than the others, any parameter sets that are run on the slower machine will appear to be less effective than if they were run on a faster machine. For this reason, we recommend that you use machines with similar performance when running distributed parallel algorithms. Similarly, if your machines have different core counts, we suggest that you use the Threads parameter to make sure that all tests use the same number of cores (since, by default, Gurobi will use all cores in the machine).
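A simple way to pick a uniform Threads value for a mixed pool is to take the smallest core count across the machines, so every run uses the same number of cores no matter where it lands. The server names and core counts below are hypothetical:

```python
# Hypothetical pool: an 8-core machine and a 4-core machine.
core_counts = {"server1": 8, "server2": 4}

# Use the smallest core count as the uniform Threads setting so that
# every tuning trial or concurrent solve gets the same resources.
threads = min(core_counts.values())
print(f"Threads={threads}")  # value to pass on the gurobi_cl/grbtune command line
# → Threads=4
```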
