Last modified: July 21, 2017.

hpchelp@iitd.ac.in

Frequently Asked Questions

  • How to Cite/acknowledge

    Acknowledgement text: The authors thank IIT Delhi HPC facility for computational resources.
  • What are the best practices for users?

    Best practices are listed here.
  • Accessing the HPC Cluster from outside IITD

    For IITD users: Currently, the HPC cluster is not directly accessible from outside the campus. You can, however, request a VPN connection.
    For non-IITD users: Please have a look at the Usage Charges.
  • Using "LOW" queue for Job submission

    Within your job submission script, add the line:
    #PBS -q low

    Or specify the queue on the command line at submission time:
    qsub -q low <job script>

    If your job is already in the queued (Q) state, move it with:
    qmove low <job id>
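    For reference, a minimal job script sketch (the resource values and program name are illustrative; "cc" is the example project name used elsewhere in this FAQ):
    #!/bin/bash
    #PBS -N myjob
    #PBS -P cc
    #PBS -q low
    #PBS -l select=1:ncpus=16
    #PBS -l walltime=01:00:00
    # change to the directory the job was submitted from
    cd $PBS_O_WORKDIR
    ./my_program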
  • How to check disk quota

    Currently, users are limited to 30GB of space on /home and 200TB on /scratch. Users can check their quota usage with:
    lfs quota -hu $USER /home
    lfs quota -hu $USER /scratch
    Users can request more disk space for their home directories via their supervisors. The current upper limit is 900GB.
  • How to check for old files

    The following commands list all regular files in your home and scratch directories that are more than 30 days old, along with their sizes:
     lfs find $HOME -mtime +30 -type f -print | xargs du -ch 
     lfs find $SCRATCH -mtime +30 -type f -print | xargs du -ch 
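    If file names may contain spaces, a NUL-separated variant is safer (a sketch, assuming your lfs build supports -print0 as GNU find does):
     lfs find $SCRATCH -mtime +30 -type f -print0 | xargs -0 du -ch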
  • How To Set Up SSH Keys

    Step One

    Create the RSA Key Pair: The first step is to create the key pair on the client machine. In the case of the HPC cluster, this will be any one of the login nodes.
    ssh-keygen -t rsa

    Step Two

    Store the Keys and Passphrase: Once you have entered the keygen command, you will see a few more prompts:
    Enter file in which to save the key (/home/demo/.ssh/id_rsa): You can press Enter here to accept the default location in your home directory (in this example the user is called demo).

    Enter passphrase (empty for no passphrase):
    The entire key generation process looks like this:
    ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/demo/.ssh/id_rsa): 
    Enter passphrase (empty for no passphrase): 
    Enter same passphrase again: 
    Your identification has been saved in /home/demo/.ssh/id_rsa.
    Your public key has been saved in /home/demo/.ssh/id_rsa.pub.
    The key fingerprint is:
    4a:dd:0a:c6:35:4e:3f:ed:27:38:8c:74:44:4d:93:67 demo@a
    The key's randomart image is:
    +--[ RSA 2048]----+
    |          .oo.   |
    |         .  o.E  |
    |        + .  o   |
    |     . = = .     |
    |      = S = .    |
    |     o + = +     |
    |      . o + o .  |
    |           . o   |
    |                 |
    +-----------------+
    The public key is now located in /home/demo/.ssh/id_rsa.pub, and the private key (identification) in /home/demo/.ssh/id_rsa.

    Step Three

    Copy the Public Key to the server: Once the key pair is generated, place the public key on the server you want to log in to. You can copy it into the remote machine's authorized_keys file with the ssh-copy-id command. Make sure to replace the example username below with your own.
    ssh-copy-id user@hpc.iitd.ac.in

    Alternatively, you can append the key over SSH:
    cat ~/.ssh/id_rsa.pub | ssh user@hpc.iitd.ac.in "mkdir -p ~/.ssh && cat >>  ~/.ssh/authorized_keys"
    Whichever command you choose, you should see something like:
    The authenticity of host '12.34.56.78 (12.34.56.78)' can't be established.
    
    RSA key fingerprint is b1:2d:33:67:ce:35:4d:5f:f3:a8:cd:c0:c4:48:86:12.
    
    Are you sure you want to continue connecting (yes/no)? yes
    
    Warning: Permanently added '12.34.56.78' (RSA) to the list of known hosts.
    
    user@12.34.56.78's password:
    Now try logging in to the machine with ssh user@hpc.iitd.ac.in and check:
    ~/.ssh/authorized_keys
    to make sure no keys were added that you weren't expecting. You can now log in to user@hpc.iitd.ac.in without being prompted for a password. However, if you set a passphrase, you will be asked to enter it whenever you log in. In the case of the HPC cluster, you will land on a login node; to verify, simply run ssh user@hpc.iitd.ac.in.
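    If key-based login still asks for a password, file permissions on the server are a common cause; the conventional settings are:
    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys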
  • Compiling and testing GPU and Xeon Phi programs

    Two dedicated login nodes are available for GPU and Xeon Phi development. For GPUs, users can log in to gpu.hpc.iitd.ac.in; for Xeon Phi, to mic.hpc.iitd.ac.in. Each of these nodes has two accelerator cards.
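    For example, to work on the GPU login node (replace user with your own username):
    ssh user@gpu.hpc.iitd.ac.in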
  • Accessing HPC facility using Windows/Linux

    Please see How to access.
  • Environment Modules


    What are Environment Modules?

    The Environment Modules package provides for the dynamic modification of a user's environment via modulefiles. Each modulefile contains the information needed to configure the shell for an application. Once the Modules package is initialized, the environment can be modified on a per-module basis using the module command, which interprets modulefiles. Typically, modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, etc. Modulefiles may be shared by many users on a system, and users may have their own collection to supplement or replace the shared ones.

    Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. All popular shells are supported, including bash, ksh, zsh, sh, csh and tcsh, as well as some scripting languages such as Perl and Python. Modules are useful for managing different versions of applications, and can be bundled into metamodules that load an entire suite of applications. Examples of usage:
    • List of available modules:
      $ module avail
    • Load a specific module:
      $ module load apps/lammps/gpu-mixed
    • Provide a brief description of the module:
      $ module whatis apps/lammps/gpu-mixed
      apps/lammps/gpu-mixed: LAMMPS MIXED PRECISION GPU 9 Dec 2014
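    • A few other standard subcommands of the module command (part of the Environment Modules package) that are often useful:
      $ module list
      $ module unload apps/lammps/gpu-mixed
      $ module purge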

  • Gtk-WARNING **: cannot open display
    -OR-
    Error: Can't open display:

    If you are using a Linux system, connect with ssh -X user@hpc.iitd.ac.in. Windows users: please see "X11 Forwarding in Windows" below.
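    A quick way to verify that forwarding works is to launch a simple X client such as xclock (assuming it is installed on the login node); a small clock window should appear on your local display:
    ssh -X user@hpc.iitd.ac.in
    xclock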
  • X11 Forwarding in Windows


    You can run graphical programs remotely on the IITD HPC Linux machines and display them on your desktop computer running Windows. This is done by running two applications together on your Windows machine: Xming and PuTTY.

    What is Xming?

    Xming is an X Window server for Windows PCs. It enables programs running remotely to be displayed on your local desktop. Download and run the installation program from: http://sourceforge.net/projects/xming/


    Navigate to the Files section and download:
     a) the Xming setup program from the Xming folder
     b) the fonts package installer from the Xming-fonts folder

    Note:

    1.) By default, both programs install into the same location, so don't worry about overwriting files. Both packages are required.
    2.) Once installed, it is a good idea to run All Programs > Xming > XLaunch to see what the configuration looks like. In most cases, the default options are fine.
    3.) Finally, run All Programs > Xming > Xming to start the PC X server. The "X" icon should be visible on the Windows taskbar. The X server must be started before setting up an SSH connection to the HPC facility.

    What is PuTTY?


    PuTTY is a free SSH client, which you will use to connect to the HPC facility. Download the single Windows executable from: http://www.putty.org

    Configuring PuTTY


    1. Under Session, enter the hostname you want to connect to: hpc.iitd.ac.in on port 22. Make sure the connection type is SSH.
    2. Next, scroll to Connection > SSH > X11. Check the box next to Enable X11 Forwarding. By default, the X display location is empty; you can enter localhost:0. The remote authentication protocol should be set to MIT-Magic-Cookie-1.
    3. Go back to Session. You can also save your session and load it each time you want to connect.
    4. Finally, click Open to bring up the terminal and log in with your username and password.
  • SCP not functional

    Sometimes scp (or rsync) breaks "suddenly". Here is a list of things to check:
    1. Do you have enough disk space?
    2. Are you copying to the correct path?
    3. Does anything in your ~/.bashrc print output during login? Commands that write to the terminal can break scp and rsync. (A common fix is sketched below.)
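    A common fix for point 3 is to guard output-producing commands in ~/.bashrc so they run only in interactive shells; a minimal sketch:
    # near the top of ~/.bashrc: stop here for non-interactive shells (scp/rsync/sftp)
    case $- in
        *i*) ;;
        *) return ;;
    esac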
  • Old Nodes

    The old K20 nodes can be requested with the flag "K20GPU=true", e.g.:
    1. CPU job: 2 nodes, 16 cores each from the "cc" project
      qsub -IP cc -l select=2:ncpus=16:K20GPU=true
    2. GPU job: 2 nodes, 16 cores, 2 GPUs each from the "cc" project
      qsub -IP cc -l select=2:ncpus=16:ngpus=2:K20GPU=true
  • Large Jobs

    When submitting jobs spanning multiple nodes, you can be assigned any node type: CPU, GPU, Xeon Phi, or the old K20 node(s). If you want to explicitly exclude the old nodes from your submission, use the K20GPU=false flag:
    1. CPU job: 2 nodes, 16 cores each from the "cc" project, exclude old nodes
      qsub -IP cc -l select=2:ncpus=16:K20GPU=false
    2. GPU job: 2 nodes, 16 cores, 2 GPUs each from the "cc" project, exclude old nodes
      qsub -IP cc -l select=2:ncpus=16:ngpus=2:K20GPU=false
  • Bad Interpreter

    Issue:
     -bash: /var/spool/PBS/mom_priv/jobs/58524.hn1.hpc.iitd.ac.in.SC: /bin/bash^M: bad interpreter: No such file or directory 
    This happens when a job script written on a Windows system is submitted to the HPC (Linux) environment: the DOS line endings (the ^M in the error) confuse the interpreter. Use dos2unix, the program that converts plain text files from DOS format to UNIX format, to fix the script in place.
    Example: dos2unix submit.sh
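    To confirm the problem before converting, inspect the script with the file command; for DOS-format files it reports "with CRLF line terminators":
    file submit.sh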
    
  • Accessing Internet

    By default, HPC users have access to the IITD intranet from the login nodes. The following procedure obtains internet access from the login02 node using the lynx web browser; it works the same way on any IITD HPC login node.

    The IITD proxy login page can be accessed via the terminal-based lynx web browser. First, set the SSL_CERT_FILE variable to the path of your IITD CA certificate:
    [user1@login02]$ export SSL_CERT_FILE=$HOME/mycerts/CCIITD-CA.crt
    
    
    Then access the proxy login URL via the lynx browser (or via firefox, if you logged in with ssh -X):
    [user1@login02]$ lynx https://proxy82.iitd.ernet.in/cgi-bin/proxy.cgi
                                                       
    
                                              IIT Delhi Proxy Login
    
    
                                          User ID:  ____________________
    
                                          Password: ____________________
    
                                                    Log on
    
    
    NOTE: The URL varies per user basis. For staff the URL is https://proxy21.iitd.ernet.in/cgi-bin/proxy.cgi

    After successful authentication, you should be able to see the following output on your terminal :-
    
                                              IIT Delhi Proxy Login
    
    
    
                         You are logged in successfully as user1 from xx.xx.xx.x
    
    
    
                                    Date and Time for your last Kerberos

                   Password Change        Successful Authentication   Unsuccessful Authentication

                 10-11-2015 10:22:04      18-03-2016 10:52:56         16-03-2016 10:27:34
    
                             *Please change your password at least once in three months*
    
                             Click to continue browsing: http://www.cc.iitd.ernet.in/
    
                                         Check your Proxy Usage/Quota here
    
         For non-browser Applications (Proxy_Name: proxy82.iitd.ernet.in Proxy_IP: 10.10.79.29 Proxy_port: 3128)
    
                                         * Click "Log out" to log out:
    
    
                                                     Log out
    
    
                           Please keep this page open and browse from another window/tab
    
    
    Note down the proxy IP and port (Proxy_IP is 10.10.79.29, Proxy_port is 3128) and the login node's hostname (login02).

    From a new terminal, log in to your HPC account, go to the same login node (login02) where lynx is running, and set the http_proxy and https_proxy environment variables in that terminal:
    [user1@login02]$ export http_proxy=10.10.79.29:3128
    [user1@login02]$ export https_proxy=10.10.79.29:3128
    
    
    Now you can use commands like wget and git clone to access the internet.
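    For example, a quick connectivity check from the same shell (www.example.com is only an illustrative target):
    [user1@login02]$ wget -q -O /dev/null https://www.example.com && echo "internet access OK"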
  • How to install python packages

    Set up internet connectivity (see the FAQ entry "Accessing Internet") and load a Python module into your environment, for example:
    module load compiler/python/2.7.10/compilervars
    
    Then install packages with pip, specifying the installation directory:
    pip install --ignore-installed --install-option="--prefix=${HOME}/MYPYTHONMODULES" package_name
    
    Set the environment variable PYTHONPATH as:
    export PYTHONPATH=${HOME}/MYPYTHONMODULES/lib/python2.7/site-packages:$PYTHONPATH
    
    Now you should be able to import and use the installed Python modules.
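    For example, you can verify an installation by importing the package (numpy is only an illustrative package name here):
    python -c "import numpy; print(numpy.__version__)"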
    
  • How to check Budget Status

    LC_ALL=en_IN /opt/alloc_db/user_scripts/budget_status -P cc -p 2016.q2 -v
    Where:
    -P is the project name
    -p is the year and quarter
  • How to check Project Summary

    LC_ALL=en_IN /opt/alloc_db/user_scripts/project_summary -P cc -p 2016.q2 -s hn1
    Where:
    -P is the project name
    -p is the year and quarter
    -s is the system
  • Out of memory / segmentation fault

    It is possible that the program runs out of the default memory limits. Users are advised to raise the stack size limit with the following command:
    $ ulimit -s unlimited
    
    If this resolves the issue, the same command should be added to your ~/.bashrc file.
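    For example, to append it to your ~/.bashrc:
    echo "ulimit -s unlimited" >> ~/.bashrc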
  • PBS Error: Alloc DB reservation failed

    ERROR: Not Running: PBS Error: Alloc DB reservation failed, holding job
    Resolution:
    1. Submit the job to the low queue (e.g. qsub -q low scriptname.sh)
    2. Or move an already queued job to the low queue (e.g. qmove low jobid)
    
  • Non-availability of slots and jobs on hold

    ERROR: Not Running: Either request fewer slots for your application, or make more slots available for use
    Resolution: specify the "mpiprocs" resource in your job submission script, as shown below.
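    For example, to request 2 nodes with 16 CPU cores and 16 MPI ranks each (the values are illustrative):
    #PBS -l select=2:ncpus=16:mpiprocs=16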
    
  • qsub: Error: Insufficient balance

    ERROR: Not Running: qsub:
    Error: insufficient balance
    The project has insufficient funds to run the requested job. Users should check the allocation budget and contact their HPC representative if the allocation is insufficient. Meanwhile, the job may be submitted to the "low" queue.