Kohinoor 4

Introduction: Kohinoor 4 is the fourth HPC cluster in the Kohinoor series of clusters at TIFR – TCIS, Hyderabad. It was installed and has been operational since May 2019.
The cluster consists of 90 nodes: two head nodes and 88 CPU-only execution nodes, each with 32 cores. The nodes are interconnected through a completely non-blocking Intel Omni-Path (OPA) switch fabric. Job scheduling and load balancing are handled by the open-source batch scheduler PBS Pro. Users log in to the head node to submit jobs to the cluster. A locally attached 200 TB ZFS file system, served to the nodes over the OPA fabric, is used for computational runs, and a further 400 TB of ZFS storage attached to the secondary head node is used for archiving and post-processing of data.

OEM – M/s. Acer Incorporated (Supplied and installed by Vendor M/s. Locuz Enterprise Solutions Ltd, Hyderabad)

Kohinoor 4 Overview

  1. Master node – 1 (Job submission node)
    • 2 × Intel Broadwell 8C E5-2620 v4 2.1 GHz 20M 8 GT/s
    • 64 GB DDR4 2133 MHz RAM
    • 2 × 900 GB Enterprise SATA SSDs configured in RAID 1 for the operating system
    • 1 × Intel 100 G Omni-Path port
  2. Master node – 2 (Interactive node + Archival storage)
    • 2 × Intel Skylake 8C Silver 4110 2.1 GHz 11M 9.6 GT/s
    • 384 GB DDR4 2666 MHz RAM
    • 2 × 900 GB Enterprise SATA SSDs configured in RAID 1 for the operating system
    • 2 × 480 GB Enterprise SATA SSDs configured in RAID 1 for the ZIL cache
    • 2 × 900 GB Enterprise SATA SSDs configured in RAID 1 for the L2ARC cache
    • 360 TB usable space using 10 TB Enterprise SATA hard disks configured in RAID-Z2
    • 1 × Intel 100 G Omni-Path port
  3. Compute nodes (CPU only) [88 Nos.]
    • 2 × Intel Skylake 16C Gold 6130 2.1 GHz 22M 10.4 GT/s
    • 96 GB DDR4 2666 MHz RAM
    • 1 × 120 GB Enterprise SATA SSD
    • 1 × Intel 100 G Omni-Path port
  4. Compute storage
    • 2 × Intel Broadwell 8C E5-2620 v4 2.1 GHz 20M 8 GT/s
    • 256 GB DDR4 2133 MHz RAM
    • 2 × 480 GB Enterprise SATA SSDs configured in RAID 1 for the operating system
    • 2 × 480 GB Enterprise SATA SSDs configured in RAID 1 for the ZIL cache
    • 2 × 900 GB Enterprise SATA SSDs configured in RAID 1 for the L2ARC cache
    • 192 TB usable space using 10 TB Enterprise SATA hard disks configured in RAID-Z2
    • 1 × Intel 100 G Omni-Path port
  5. Networking & Interconnect
    • Primary compute nodes communication network is through a completely non-blocking interconnect of 6 Nos of 48 port Intel Omnipath 100 G switch
    • Secondary communication network for cluster management is through a 48 port Dell Gigabit Ethernet switch
  6. System Software
    • Operating System – CentOS 7.6
    • Clustering tool – xCAT
    • Job Scheduler – PBS Pro (Open Source)
  7. Libraries (a minimal MPI usage sketch follows this list)
    • GNU Compiler Collection
    • MVAPICH 2.2.3
    • OpenMPI 3.1.3
  8. Application software/Libraries
    • LAMMPS, GROMACS, FFTW, MPI, Gerris, Quantum ESPRESSO, etc.
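
MPI usage sketch: applications on the compute nodes are typically built with the GNU compilers and one of the MPI stacks listed above (MVAPICH or OpenMPI) and launched through PBS Pro. The example below is a generic illustration, not a Kohinoor 4-specific recipe; the compiler wrapper (mpicc), the launcher (mpirun) and the core count shown are the usual conventions of those MPI stacks, and queue names or module setup on this cluster are not assumed here.

    /* hello_mpi.c - minimal MPI example (illustrative sketch only).
     * Build with an MPI compiler wrapper, e.g.:  mpicc -O2 hello_mpi.c -o hello_mpi
     * Run from a PBS Pro job script submitted with qsub, e.g.:  mpirun -np 32 ./hello_mpi
     * (32 matches the cores per compute node; queue names and module setup are site-specific.)
     */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank = 0, size = 0, name_len = 0;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                   /* start the MPI runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* rank of this process */
        MPI_Comm_size(MPI_COMM_WORLD, &size);     /* total number of ranks */
        MPI_Get_processor_name(name, &name_len);  /* hostname of the compute node */

        printf("Rank %d of %d running on %s\n", rank, size, name);

        MPI_Finalize();                           /* shut down the MPI runtime */
        return 0;
    }

Across the 88 execution nodes this pattern scales to at most 88 × 32 = 2816 MPI ranks per job; the exact launcher options depend on whether MVAPICH or OpenMPI is used.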

TCIS – Kohinoor 4 Cluster Document