Simulating on Remote Clusters

This page helps users run jobs (e.g. EMA3D® simulations) on remote compute clusters. It covers both Windows and Linux clusters, as well as common issues users may encounter. The instructions assume users are running an EMA3D simulation, but much of the information applies to any executable.

General Instructions

These tasks must be completed whether the remote cluster is Linux or Windows based (mixed-OS clusters are not supported).

  • On your workstation, make sure you have access to either a command prompt or a Linux-style terminal

  • Identify a shared network drive location that can be accessed by all compute nodes

  • Place required files in this shared location; for EMA3D this will be the *.emin file and the *.dat file that describes the input signal

  • Create a host file (e.g. a .txt file) that will identify the remote machines and the number of processes to be run (an example is shown below). Each machine must be placed on its own line, and the specified number of processes must add up to the total number of blocks as defined by the *.emin file (in this case that would be 8)

    MACHINE_NAME_1:4

    MACHINE_NAME_2:4
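The host file rule above (one machine per line, with process counts that add up to the total number of blocks) can be checked before launching. The following is a minimal Python sketch and is not part of EMA3D: the function names `parse_hostfile` and `check_total` are hypothetical, and the `MACHINE:COUNT` line format is assumed from the example above.

```python
def parse_hostfile(text):
    """Parse 'MACHINE:COUNT' lines into a list of (machine, count) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between machines
        machine, sep, count = line.rpartition(":")
        if not sep or not machine:
            raise ValueError(f"expected MACHINE:COUNT, got {line!r}")
        entries.append((machine, int(count)))
    return entries


def check_total(entries, total_blocks):
    """Return True when the process counts add up to the block count."""
    return sum(count for _, count in entries) == total_blocks


# Same two machines as the example host file, with 8 blocks in total.
hosts = parse_hostfile("MACHINE_NAME_1:4\nMACHINE_NAME_2:4\n")
print(check_total(hosts, 8))
```

The total number of blocks still has to be read from the *.emin file by hand; this sketch only checks the arithmetic.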

Please note the following information about the commands given in the sections below:

  • The "<some instruction>" notation indicates that users should replace the entire placeholder, including the angle brackets, with the value it describes (e.g. <2+2> means users should enter 4)

  • The "-v" flag indicates the output should be verbose

  • The "-genvnone" flag indicates environment variables won't be propagated to remote machines

  • The use of "working directory" indicates the directory on your local machine that the *.emin and *.dat files for the simulation originated from

  • The use of "*" before a file extension indicates the actual filename should be specified by the user

Windows → Windows Cluster
  • Run the following command in your command prompt or terminal:

    mpiexec -v -np <number of total blocks> -wdir <path to shared network drive> -hostfile <path to hostfile> -genvnone ema3d.exe *.emin

As an example we will assume that the shared network drive is "\\share_drive\directory\example", that the hostfile (named "hfile.txt") was created in the working directory, the number of total processes is 8, and the input file is "test.emin".

    mpiexec -v -np 8 -wdir \\share_drive\directory\example -hostfile hfile.txt -genvnone ema3d.exe test.emin
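If the command is launched repeatedly with different inputs, it may help to assemble it from its parts. The sketch below is one possible way to do that in Python; `build_mpiexec_args` is a hypothetical helper, and it uses only the flags already shown in the command above.

```python
def build_mpiexec_args(total_blocks, shared_dir, hostfile, emin_file,
                       executable="ema3d.exe"):
    """Return the mpiexec command as an argument list."""
    return [
        "mpiexec", "-v",
        "-np", str(total_blocks),   # total number of blocks
        "-wdir", shared_dir,        # shared network drive
        "-hostfile", hostfile,      # machines and process counts
        "-genvnone",                # do not propagate environment variables
        executable, emin_file,
    ]


# Values taken from the example above.
args = build_mpiexec_args(8, r"\\share_drive\directory\example",
                          "hfile.txt", "test.emin")
print(" ".join(args))

# To launch for real (requires mpiexec and ema3d.exe on the PATH):
#   import subprocess
#   subprocess.run(args, check=True)
```

Passing the command as a list rather than a single string avoids quoting problems if a path contains spaces.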

Windows → Linux Cluster

When using a remote Linux cluster, users must ssh into one of the remote machines and launch the process from there. Pick one of the target machines to use as the "<remote launcher>" and make sure you know its password. Also note that in this case "<path to shared network drive>" must be given from the Linux perspective (i.e. with forward slashes instead of backslashes).

  • Place the hostfile on the "<remote launcher>" machine (the shared network drive is the easiest choice)

  • Run the following command in your command prompt or terminal:

    ssh <remote launcher> mpiexec -v -np <number of blocks> -wdir <path to shared network drive> -hostfile <path to hostfile> -genvnone ema3d *.emin

  • Enter the password for the remote machine when prompted

As an example, we will assume that the shared network drive is "\\share_drive\directory\example", which was mounted into "/directory", that the hostfile is named "hfile.txt" and was placed in the shared drive, the number of total processes is 8, the input file is "test.emin", and the "<remote launcher>" is "username@computername".

    ssh username@computername mpiexec -v -np 8 -wdir /directory/example -hostfile /directory/example/hfile.txt -genvnone ema3d test.emin
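ssh joins the arguments that follow the host name with spaces and passes the result to the remote shell, so a path containing spaces would be split on the remote side. The following Python sketch shows one way to quote the remote command first; `build_ssh_launch` is a hypothetical helper, and the values come from the example above.

```python
import shlex


def build_ssh_launch(remote_launcher, remote_args):
    """Wrap a remote command so it reaches the remote shell as one quoted string."""
    return ["ssh", remote_launcher, shlex.join(remote_args)]


remote_args = [
    "mpiexec", "-v", "-np", "8",
    "-wdir", "/directory/example",
    "-hostfile", "/directory/example/hfile.txt",
    "-genvnone", "ema3d", "test.emin",
]
command = build_ssh_launch("username@computername", remote_args)
print(command[2])

# To launch for real (ssh will prompt for the password):
#   import subprocess
#   subprocess.run(command, check=True)
```

`shlex.join` requires Python 3.8 or later and quotes for POSIX shells, which matches the Linux launcher machine assumed here.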

Linux → Linux Cluster

The instructions here are almost identical to those in the preceding section (Windows → Linux Cluster), but do not require the use of the "ssh <remote launcher>" command. The resulting change can be seen in the example below, which considers the same setup as the example in the previous section.

    mpiexec -v -np 8 -wdir /directory/example -hostfile /directory/example/hfile.txt -genvnone ema3d test.emin

Finishing Up
  • The process will run with its output displayed on your screen; it should be quite clear whether it was successful or not

  • Assuming success, look in the shared network drive to find the output files which can be copied back to your working directory for analysis

  • If the process failed, consult the section immediately below, debug the problem yourself using the error output, or reach out for more help
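Copying the results back can also be scripted. The sketch below copies every file in the shared location that was modified after the run started; it is an illustration only. The helper `copy_new_files` and the file name "result.out" are hypothetical (the actual output file names depend on the simulation), and the demo uses a temporary directory in place of a real shared drive.

```python
import os
import shutil
import tempfile
from pathlib import Path


def copy_new_files(shared_dir, working_dir, started_at):
    """Copy files modified at or after started_at (epoch seconds)."""
    dest = Path(working_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(shared_dir).iterdir()):
        if path.is_file() and path.stat().st_mtime >= started_at:
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied


# Demo with stand-in directories and fixed timestamps.
with tempfile.TemporaryDirectory() as tmp:
    shared, work = Path(tmp, "shared"), Path(tmp, "work")
    shared.mkdir()
    (shared / "test.emin").write_text("input")
    os.utime(shared / "test.emin", (1000, 1000))    # written before the run
    (shared / "result.out").write_text("output")
    os.utime(shared / "result.out", (2000, 2000))   # written during the run
    copied = copy_new_files(shared, work, started_at=1500)
    print(copied)
```

In practice, `started_at` could be recorded with `time.time()` just before launching mpiexec.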

Known Issues

Many of the issues that crop up can be solved by making sure everything (i.e. mpiexec and ema3d) works as expected from the remote machines. The following list uses the format "(Potential Error) : Fix", but note that it may be incomplete.

  • Process hangs on license initialization: Run standard ema3d from the remote machine and manually check out a license

  • Process hangs in *.emin file check: Make sure that ema3d is the same version across remote machines, or place the ema3d executable in the shared drive and prepend ".\" (Windows) or "./" (Linux) to the ema3d executable name in the command

  • Process can't find ema3d executable: Make sure that ema3d is on the PATH so that it is runnable from anywhere

  • Process can't connect to remote machine: Make sure that the machine is available

  • Process can't locate working directory: Make sure the path is correct, or on Linux that the drive is mounted (see Useful Commands below)
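Two of the checks above (executable on the PATH, working directory reachable) are easy to automate. The following Python sketch is meant to be run on each remote machine, since the results only describe the machine it runs on; `preflight` is a hypothetical helper and "/directory/example" is the path from the earlier Linux example.

```python
import os
import shutil


def preflight(executables, working_dir):
    """Return a list of problems found on this machine (empty if none)."""
    problems = []
    for exe in executables:
        if shutil.which(exe) is None:
            problems.append(f"{exe} not found on PATH")
    if not os.path.isdir(working_dir):
        problems.append(f"working directory not reachable: {working_dir}")
    return problems


for problem in preflight(["mpiexec", "ema3d"], "/directory/example"):
    print(problem)
```

An empty result does not guarantee a successful run; licensing and version mismatches, for example, are not covered by this sketch.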

Useful Commands
  • Mount a shared network drive on Linux

    sudo mount -v -t cifs <path to shared network drive> <location to mount to> -o username=<username>,password=<password>,domain=<domain>,vers=3.0,uid=<uid>

  • Generate a key to use ssh without a password for the machine "<remote host>"

    ssh-keygen

    ssh-copy-id -i ~/.ssh/id_rsa.pub <remote host>

Other Resources

Intel MPI Documentation: see the global and local Hydra options

EMA3D - © 2025 EMA, Inc. Unauthorized use, distribution, or duplication is prohibited.