Compute nodes

Compute nodes are where most of the work on the Acuario cluster is performed. Their resources are managed by Slurm and can only be accessed through it. The different compute nodes of the Acuario cluster are described below.
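
Because the nodes can only be reached through Slurm, a job that targets a particular hardware type has to name the corresponding hosts. As a minimal sketch, the Python snippet below builds (without running) an `sbatch` argument list for the Intel HNS2600WPF nodes, pez036 to pez043, using Slurm's bracketed hostlist notation. The helper names `hostlist` and `sbatch_command` and the script name `job.sh` are illustrative assumptions. No partition or account is set, since this page does not describe them.

```python
def hostlist(prefix, first, last, width=3):
    """Return a Slurm hostlist expression such as 'pez[036-043]'."""
    if first > last:
        raise ValueError("first must not exceed last")
    if first == last:
        return f"{prefix}{first:0{width}d}"
    return f"{prefix}[{first:0{width}d}-{last:0{width}d}]"


def sbatch_command(nodes, node_count, tasks_per_node, script):
    """Build an sbatch argument list; --nodelist requests every host named in `nodes`."""
    return [
        "sbatch",
        f"--nodelist={nodes}",
        f"--nodes={node_count}",
        f"--ntasks-per-node={tasks_per_node}",
        script,  # hypothetical batch script name
    ]


# All 8 Intel nodes, one task per core (20 cores per node).
args = sbatch_command(hostlist("pez", 36, 43), 8, 20, "job.sh")
print(" ".join(args))
# sbatch --nodelist=pez[036-043] --nodes=8 --ntasks-per-node=20 job.sh
```

The list form can be passed directly to `subprocess.run`. If the command is typed in a shell instead, the bracketed hostlist should be quoted so the shell does not try to expand it.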

 

Intel HNS2600WPF

Node list: from pez036 to pez043.

There are two Intel Server H2312WPQKR chassis with 4 compute nodes each, and every node works independently. Here are the tech specs of every node:

  • Model: Intel Server Board S2600WPQ (HNS2600WPF)
  • Motherboard: NUMA, 1 x PCIe x16 Gen 3 Low Profile
  • Total cores: 20 Ivy Bridge-EP
  • Total RAM: 128 GB
  • CPU: 2 x Intel Xeon E5-2660 v2
    • Family: Intel Xeon E5-2600 v2
    • Model: E5-2660 v2
    • Architecture: Ivy Bridge
    • Socket: Socket 2011 / LGA2011
    [Figure: HNS2600WPF functional block diagram]

    • Cores: 10
    • Core speed: 2.2 GHz (turbo 3 GHz)
    • L1 cache: 10 x 32 KB for instructions and 10 x 32 KB for data
    • L2 cache: 10 x 256 KB for data
    • L3 cache: 1 x 25 MB for data
    • QPI: 2 links at 8 GT/s (4000 MHz)
    • Memory channels: 4
  • RAM per CPU: 8 x 16 GB DDR3 @ 1600 MHz
  • Storage: 1 x SSD, 2.5-inch, 240 GB, HyperX
  • Connections:
    • Ethernet: 2 x 1 Gb/s
    • Infiniband: 1 x 40 Gb/s

 

Dell PowerEdge R815

Node list: pez035

  • Total cores: 64 Abu Dhabi
  • Total RAM: 256 GB
  • CPU: 4 x AMD Opteron 6376
    • Family: AMD Opteron 6300 series
    • Model: 6376
    • Architecture: Piledriver (v2 Bulldozer)
    • Socket: G34
    • Cores: 16 [1]
    [Figure: Module at R815]

    [Figure: Inside a NUMA node at R815]

    [Figure: NUMA nodes at R815]

    • Core speed: 2.3 GHz (turbo 2.6 GHz)
    • L1 cache: 8 x 64 KB for instructions and 16 x 16 KB for data
    • L2 cache: 8 x 2 MB for data
    • L3 cache: 2 x 6 MB for data
    • HT: 4 links at 6.4 GT/s [2]
  • RAM per CPU: 4 x 16 GB DDR3 @ 1600 MHz, RDIMM LV
  • Storage: 1 x SAS HDD, 6 Gb/s, 7.2K rpm, 2.5-inch
  • Connections:
    • Ethernet: 4 x Broadcom NetXtreme II BCM5709 1000Base-T (C0)
    • Infiniband: 1 x Mellanox Technologies MT27500 Family [ConnectX-3]

1. Each CPU contains 2 chips (as AMD calls them) that behave like 2 separate CPUs joined by 2 HT 3 links, one 16 bits wide and the other 8 bits wide, which together give more bandwidth than a typical HT link. As a result, each socket holds 2 NUMA nodes. Each NUMA node has its own integrated memory controller with dual-channel memory access (presumably with direct access to 2 RAM slots, since there are 4 per CPU) and a 6 MB L3 cache shared by all of its cores. The cores are grouped in pairs into a component called a module, whose two cores share a 2 MB L2 cache, a 64 KB L1 instruction cache, the floating-point unit and the instruction decoder. Each core has its own ALU, its own integer units and a 16 KB L1 data cache.

2. All NUMA nodes (or chips) are interconnected in a three-dimensional scheme. Each socket has an 8-wire HT 3 connection with the others. Because of this particular CPU design, the distance between NUMA nodes varies, and this must be taken into account when selecting the resources on which a code will run:

[Figure: NUMA nodes distribution at R815]

node   0   1   2   3   4   5   6   7 
  0:  10  16  16  22  16  22  16  22 
  1:  16  10  22  16  16  22  22  16 
  2:  16  22  10  16  16  16  16  16 
  3:  22  16  16  10  16  16  22  22 
  4:  16  16  16  16  10  16  16  22 
  5:  22  22  16  16  16  10  22  16 
  6:  16  22  16  22  16  22  10  16 
  7:  22  16  16  22  22  16  16  10
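
The distance table above can be queried programmatically when deciding where to place a job. The following is a minimal sketch, assuming (as is usual for tables of this form) that 10 means local memory and that smaller values mean closer nodes. The function names `nearest_nodes` and `closest_group` are illustrative, and the brute-force search is only one simple way to pick a compact set of NUMA nodes.

```python
from itertools import combinations

# Distance matrix copied from the table above (rows and columns are NUMA nodes 0-7).
DISTANCES = [
    [10, 16, 16, 22, 16, 22, 16, 22],
    [16, 10, 22, 16, 16, 22, 22, 16],
    [16, 22, 10, 16, 16, 16, 16, 16],
    [22, 16, 16, 10, 16, 16, 22, 22],
    [16, 16, 16, 16, 10, 16, 16, 22],
    [22, 22, 16, 16, 16, 10, 22, 16],
    [16, 22, 16, 22, 16, 22, 10, 16],
    [22, 16, 16, 22, 22, 16, 16, 10],
]


def nearest_nodes(node):
    """Return the remote NUMA nodes at the smallest distance from `node`."""
    row = DISTANCES[node]
    best = min(d for i, d in enumerate(row) if i != node)
    return [i for i, d in enumerate(row) if i != node and d == best]


def closest_group(size):
    """Return the first group of `size` NUMA nodes with the lowest summed pairwise distance."""
    def cost(group):
        return sum(DISTANCES[a][b] for a, b in combinations(group, 2))
    return min(combinations(range(len(DISTANCES)), size), key=cost)


print(nearest_nodes(0))   # [1, 2, 4, 6]
print(closest_group(4))   # (0, 2, 4, 6)
```

For example, a job that needs 4 NUMA nodes (32 cores) could be bound to nodes 0, 2, 4 and 6, where every pair is at distance 16, rather than to an arbitrary set that includes pairs at distance 22.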

 

Bull bullx B510

Node list: from pez017 to pez033.

All nodes are blades from a Bull bullx B500 chassis. Here are the tech specs of every node:

  • Model: Bull bullx B510

    [Figure: Bull bullx B510 node (block diagram)]

  • Motherboard: NUMA
  • Total cores: 16 Sandy Bridge-EP
  • Total RAM: 64 GB
  • CPU: 2 x Intel Xeon E5-2670
    • Family: Intel Xeon E5-2600
    • Model: E5-2670
    • Architecture: Sandy Bridge
    • Socket: Socket 2011 / LGA2011
    • Cores: 8
    • Core speed: 2.6 GHz (turbo 3.3 GHz)
    • L1 cache: 8 x 32 KB for instructions and 8 x 32 KB for data
    • L2 cache: 8 x 256 KB for data
    • L3 cache: 1 x 20 MB for data
    • QPI: 2 links at 8 GT/s (4000 MHz)
    • Memory channels: 4
  • RAM per CPU: 4 x 8 GB DDR3 @ 1600 MHz
  • Storage: 1 x SSD, 2.5-inch, 128 GB
  • Connections:
    • Ethernet: 1 x 1 Gb/s
    • Infiniband: 1 x 40 Gb/s
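
When sizing per-core memory requests, it can help to compare the three node types by installed RAM per core. The short sketch below does that arithmetic using only the "Total cores" and "Total RAM" figures listed on this page. The names `NODE_TYPES` and `ram_per_core_gb` are illustrative, and the results are upper bounds: the memory actually available to jobs is likely lower, since the operating system and the Slurm configuration reserve part of it.

```python
# Totals copied from the spec lists on this page: (total cores, total RAM in GB).
NODE_TYPES = {
    "Intel HNS2600WPF": (20, 128),
    "Dell PowerEdge R815": (64, 256),
    "Bull bullx B510": (16, 64),
}


def ram_per_core_gb(cores, ram_gb):
    """Installed RAM divided evenly across all cores of a node."""
    return ram_gb / cores


for name, (cores, ram_gb) in NODE_TYPES.items():
    print(f"{name}: {ram_per_core_gb(cores, ram_gb):.1f} GB per core")
# Intel HNS2600WPF: 6.4 GB per core
# Dell PowerEdge R815: 4.0 GB per core
# Bull bullx B510: 4.0 GB per core
```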