Improving networking performance of a Linux cluster

A. Bogdanov, V. Gaiduchok, N. Ahmed, P. Ivanov, I. Gankevich

Research output

Abstract

Networking is known to be a "bottleneck" in scientific computations on HPC clusters, and it can limit the scalability of systems with a cluster architecture. The problem is a worldwide one, since clusters are used almost everywhere. Expensive clusters usually have custom networks; such systems imply expensive and powerful hardware, custom protocols, and proprietary operating systems. The vast majority of up-to-date systems, however, use conventional hardware, protocols, and operating systems, for example an Ethernet network with Linux on the cluster nodes. This article is devoted to the problems of small and medium clusters, which are often used in universities, and focuses on Ethernet clusters running Linux. The topic is discussed through an example of implementing a custom protocol. The TCP/IP stack is used very often, including on clusters, although it was originally developed for the global network and can impose unnecessary overhead on a small cluster with a reliable network. We discuss different aspects of the Linux networking stack (e.g. NAPI) and modern hardware (e.g. GSO and GRO); compare the performance of TCP, UDP, and a custom protocol implemented with raw sockets and as a kernel module; and discuss possible optimizations. As a result, several recommendations on improving the networking performance of Linux clusters are given. Our main goal is to point out possible software optimizations, since software can be changed with ease and such changes can lead to performance improvements.

Original language: English
Pages (from-to): 140-144
Number of pages: 5
Journal: CEUR Workshop Proceedings
Volume: 1787
Publication status: Published - 2016

Fingerprint

Network protocols
Computer operating systems
Ethernet
Computer hardware
Computer systems
Computer networks
Scalability
Hardware
Linux

Scopus subject areas

  • Computer Science(all)

Cite this

Bogdanov, A. ; Gaiduchok, V. ; Ahmed, N. ; Ivanov, P. ; Gankevich, I. / Improving networking performance of a Linux cluster. In: CEUR Workshop Proceedings. 2016 ; Vol. 1787. pp. 140-144.
@article{2080dde0d1394caeb0a8cd4719611500,
title = "Improving networking performance of a Linux cluster",
abstract = "Networking is known to be a {"}bottleneck{"} in scientific computations on HPC clusters, and it can limit the scalability of systems with a cluster architecture. The problem is a worldwide one, since clusters are used almost everywhere. Expensive clusters usually have custom networks; such systems imply expensive and powerful hardware, custom protocols, and proprietary operating systems. The vast majority of up-to-date systems, however, use conventional hardware, protocols, and operating systems, for example an Ethernet network with Linux on the cluster nodes. This article is devoted to the problems of small and medium clusters, which are often used in universities, and focuses on Ethernet clusters running Linux. The topic is discussed through an example of implementing a custom protocol. The TCP/IP stack is used very often, including on clusters, although it was originally developed for the global network and can impose unnecessary overhead on a small cluster with a reliable network. We discuss different aspects of the Linux networking stack (e.g. NAPI) and modern hardware (e.g. GSO and GRO); compare the performance of TCP, UDP, and a custom protocol implemented with raw sockets and as a kernel module; and discuss possible optimizations. As a result, several recommendations on improving the networking performance of Linux clusters are given. Our main goal is to point out possible software optimizations, since software can be changed with ease and such changes can lead to performance improvements.",
keywords = "Computational clusters, GRO, GSO, Kernel, Linux, NAPI, Networking, Networking protocols, Sockets",
author = "A. Bogdanov and V. Gaiduchok and N. Ahmed and P. Ivanov and I. Gankevich",
year = "2016",
language = "English",
volume = "1787",
pages = "140--144",
journal = "CEUR Workshop Proceedings",
issn = "1613-0073",
publisher = "RWTH Aachen University",

}

Bogdanov, A, Gaiduchok, V, Ahmed, N, Ivanov, P & Gankevich, I 2016, 'Improving networking performance of a Linux cluster', CEUR Workshop Proceedings, vol. 1787, pp. 140-144.

Improving networking performance of a Linux cluster. / Bogdanov, A.; Gaiduchok, V.; Ahmed, N.; Ivanov, P.; Gankevich, I.

In: CEUR Workshop Proceedings, Vol. 1787, 2016, p. 140-144.

TY - JOUR

T1 - Improving networking performance of a Linux cluster

AU - Bogdanov, A.

AU - Gaiduchok, V.

AU - Ahmed, N.

AU - Ivanov, P.

AU - Gankevich, I.

PY - 2016

Y1 - 2016

N2 - Networking is known to be a "bottleneck" in scientific computations on HPC clusters, and it can limit the scalability of systems with a cluster architecture. The problem is a worldwide one, since clusters are used almost everywhere. Expensive clusters usually have custom networks; such systems imply expensive and powerful hardware, custom protocols, and proprietary operating systems. The vast majority of up-to-date systems, however, use conventional hardware, protocols, and operating systems, for example an Ethernet network with Linux on the cluster nodes. This article is devoted to the problems of small and medium clusters, which are often used in universities, and focuses on Ethernet clusters running Linux. The topic is discussed through an example of implementing a custom protocol. The TCP/IP stack is used very often, including on clusters, although it was originally developed for the global network and can impose unnecessary overhead on a small cluster with a reliable network. We discuss different aspects of the Linux networking stack (e.g. NAPI) and modern hardware (e.g. GSO and GRO); compare the performance of TCP, UDP, and a custom protocol implemented with raw sockets and as a kernel module; and discuss possible optimizations. As a result, several recommendations on improving the networking performance of Linux clusters are given. Our main goal is to point out possible software optimizations, since software can be changed with ease and such changes can lead to performance improvements.

AB - Networking is known to be a "bottleneck" in scientific computations on HPC clusters, and it can limit the scalability of systems with a cluster architecture. The problem is a worldwide one, since clusters are used almost everywhere. Expensive clusters usually have custom networks; such systems imply expensive and powerful hardware, custom protocols, and proprietary operating systems. The vast majority of up-to-date systems, however, use conventional hardware, protocols, and operating systems, for example an Ethernet network with Linux on the cluster nodes. This article is devoted to the problems of small and medium clusters, which are often used in universities, and focuses on Ethernet clusters running Linux. The topic is discussed through an example of implementing a custom protocol. The TCP/IP stack is used very often, including on clusters, although it was originally developed for the global network and can impose unnecessary overhead on a small cluster with a reliable network. We discuss different aspects of the Linux networking stack (e.g. NAPI) and modern hardware (e.g. GSO and GRO); compare the performance of TCP, UDP, and a custom protocol implemented with raw sockets and as a kernel module; and discuss possible optimizations. As a result, several recommendations on improving the networking performance of Linux clusters are given. Our main goal is to point out possible software optimizations, since software can be changed with ease and such changes can lead to performance improvements.

KW - Computational clusters

KW - GRO

KW - GSO

KW - Kernel

KW - Linux

KW - NAPI

KW - Networking

KW - Networking protocols

KW - Sockets

UR - http://www.scopus.com/inward/record.url?scp=85016208520&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:85016208520

VL - 1787

SP - 140

EP - 144

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

SN - 1613-0073

ER -