Page 1: Christian Mohrbacher christian.mohrbacher@itwm.fraunhofer · Schmallenberg Dortmund Potsdam Berlin Rostock Lübeck Itzehoe Braunschweig Hannover Bremen Bremerhaven Jena Leipzig Chemnitz

FhGFS – Performance at the maximum

Christian Mohrbacher

Page 2:

Introduction

Overview on FhGFS

Benchmarks

Page 3:

The Fraunhofer Gesellschaft (FhG)

Fraunhofer is based in Germany

Largest organization for applied research in Europe

Annual research volume of 1.6 billion euros

17,000 employees

~ 60 Fraunhofer institutes with different business fields

[Map: Fraunhofer locations in Germany: München, Holzkirchen, Freiburg, Efringen-Kirchen, Freising, Stuttgart, Pfinztal, Karlsruhe, Saarbrücken, St. Ingbert, Kaiserslautern, Darmstadt, Würzburg, Erlangen, Nürnberg, Ilmenau, Schkopau, Teltow, Oberhausen, Duisburg, Euskirchen, Aachen, St. Augustin, Schmallenberg, Dortmund, Potsdam, Berlin, Rostock, Lübeck, Itzehoe, Braunschweig, Hannover, Bremen, Bremerhaven, Jena, Leipzig, Chemnitz, Dresden, Cottbus, Magdeburg, Halle, Fürth, Wachtberg, Ettlingen, Kandern, Oldenburg, Freiberg, Paderborn, Kassel, Gießen, Erfurt, Augsburg, Oberpfaffenhofen, Garching, Straubing, Bayreuth, Bronnbach, Prien]

Page 4:

The Fraunhofer ITWM

Institute for Industrial Mathematics

Located in Kaiserslautern, Germany

Staff: ~ 150 employees + ~ 70 PhD students

Page 5:

ITWM’s Competence Center HPC

FhGFS

Photorealistic RT rendering

Interactive seismic imaging

Green IT, Smart Grids

Programming models / tools

Research

Page 6:

Introduction

Overview on FhGFS

Benchmarks

Page 7:

FhGFS - Overview

Maximum Scalability

Flexibility

Easy to use

Free to use

Support by Fraunhofer

http://www.fhgfs.com

Page 8:

FhGFS – Key concepts (1)

Maximum Scalability

Distributed file contents & metadata

Initially optimized especially for HPC

Native Infiniband / RDMA

Page 9:

FhGFS - Key concepts (2)

Flexibility

Add clients and servers without downtime

Multiple servers on the same machine

Clients and servers can run on the same machine

Servers run on top of local FS

On-the-fly storage init => suitable for temporary “per-job” PFS

Flexible striping (per-file/per-directory)

Multiple networks with dynamic failover
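
The per-file/per-directory striping idea can be illustrated with a generic round-robin layout. This is a minimal sketch of striping in general, not FhGFS's actual placement logic; the helper name `locate_chunk`, the 512 KiB chunk size, and the target names are assumptions made only for illustration.

```python
def locate_chunk(offset: int, chunk_size: int, targets: list[str]) -> tuple[str, int]:
    """Map a byte offset in a file to (target, offset on that target).

    Generic round-robin striping; hypothetical helper, not an FhGFS API.
    """
    if offset < 0 or chunk_size <= 0 or not targets:
        raise ValueError("need offset >= 0, chunk_size > 0 and at least one target")
    chunk_index = offset // chunk_size              # which chunk of the file
    target = targets[chunk_index % len(targets)]    # round-robin over targets
    stripe_row = chunk_index // len(targets)        # full stripes before this chunk
    local_offset = stripe_row * chunk_size + offset % chunk_size
    return target, local_offset


if __name__ == "__main__":
    targets = ["storage01", "storage02", "storage03", "storage04"]  # assumed names
    chunk = 512 * 1024                                              # assumed chunk size
    for off in (0, chunk, 4 * chunk + 10):
        print(off, locate_chunk(off, chunk, targets))
```

With more targets in the list, consecutive chunks of one file land on more servers, which is why a wider stripe can raise single-file throughput.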

Page 10:

FhGFS - Key concepts (3)

Easy to use

Servers: userspace

Client: Kernel module w/o kernel patches

Graphical system administration & monitoring

Simple setup/startup mechanism

No specific Linux distribution

No special hardware requirements

Page 11:

Partners / Vendors

Page 12:

Customers (Examples)

Servers  Clients  Capacity  Throughput
2        2        8 TB      800 MB/s
12       900      1 PB      20 GB/s
12       1200     300 TB    6 GB/s
5        100      200 TB    5 GB/s

Page 13:

Current development

Integrated High Availability

No shared storage needed

Flexible mirroring

RAID10 available in 2012.10-beta1

Internal speed improvements

e.g. metadata format (available in 2012.10-beta1)

HSM integration

Grau Data and Fraunhofer collaborate

Providing a fast archiving solution

Built-in benchmarking tools (available in 2012.10-beta1)

Quotas

Page 14:

Introduction

Overview on FhGFS

Benchmarks

Page 15:

File Statistics

Dice PFS comparison project surveyed HPC data center representatives to find the most important metrics 1)

Multi-stream performance

Large block I/O

Metadata performance

File size statistics by Johannes Gutenberg University Mainz 2)

Large files are common (>100 GB)

Very small files (<=4k) are the most common

90% of files => 10% disk capacity

1) PFS Survey Report; http://www.avetec.org/appliedcomputing/dice/projects/pfs/docs/PFS_Survey_Report_Mar2011.pdf

2) A Study on Data Deduplication in HPC Storage Systems; Dirk Meister et al.; Johannes Gutenberg Universität; SC12

Page 16:

Benchmarks – server hardware

20 servers for metadata and storage

2x Intel Xeon X5660 @ 2.8 GHz

48 GB RAM

4x Intel 510 Series SSD (RAID 0), Ext4

QDR Infiniband

Scientific Linux 6.3; Kernel 2.6.32-279

FhGFS 2012.10-beta1

Page 17:

Streaming Throughput

[Chart: sequential read/write throughput in MB/s vs. number of storage servers (0 to 20); up to 20 servers, 160 client procs; series: Write, Read]

Page 18:

Streaming Throughput (2)

Single node local performance

Write: 1332 MB/s

Read: 1317 MB/s

20 nodes (theoretical)

Write: 26640 MB/s

Read: 26340 MB/s

FhGFS

Write: 26247 MB/s (98.5%)

Read: 24789 MB/s (94.1%)

[Chart: throughput in MB/s vs. number of storage servers (0 to 20); series: Write, Read; data labels: 25247, 24789]
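
The theoretical 20-node figures and the percentages listed on this slide follow from simple arithmetic, sketched here. The numbers are the ones stated in the slide text; the function and constant names are made up for this example.

```python
SINGLE_NODE_MBPS = {"write": 1332, "read": 1317}   # single node local performance
MEASURED_MBPS = {"write": 26247, "read": 24789}    # FhGFS figures as listed on the slide
NUM_SERVERS = 20


def theoretical_mbps(single_node_mbps: float, servers: int) -> float:
    """Upper bound if every server delivered its full local throughput."""
    return single_node_mbps * servers


def efficiency_percent(measured_mbps: float, single_node_mbps: float, servers: int) -> float:
    """Measured aggregate throughput as a percentage of the theoretical bound."""
    return 100.0 * measured_mbps / theoretical_mbps(single_node_mbps, servers)


if __name__ == "__main__":
    for op in ("write", "read"):
        bound = theoretical_mbps(SINGLE_NODE_MBPS[op], NUM_SERVERS)
        eff = efficiency_percent(MEASURED_MBPS[op], SINGLE_NODE_MBPS[op], NUM_SERVERS)
        print(f"{op}: bound {bound:.0f} MB/s, efficiency {eff:.1f}%")
```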

Page 19:

Streaming Throughput (3)

[Chart: sequential read/write throughput in MB/s vs. number of clients (6 to 768); 20 servers, up to 768 client procs; series: Write, Read; data labels: 25409, 26649]

Page 20:

Shared file access (1)

[Chart: sequential I/O on 1 shared file, 600k block size; throughput in MB/s vs. number of servers (0 to 20); up to 20 servers, 192 client procs; series: Write, Read]

Page 21:

Shared file access (2)

[Chart: sequential write to 1 shared file; throughput in MB/s vs. number of clients (12 to 768); 20 servers, up to 768 client procs]

Page 22:

IOPS

[Chart: IOPS (random 4k writes) vs. number of storage servers (2 to 20); up to 20 servers, 160 client procs; data labels: 109992, 1126963]

Page 23:

Metadata performance

[Chart: file creates per second vs. number of MDS (1 to 20); data labels: 34693, 539724]

[Chart: file stats per second vs. number of MDS (1 to 20); data labels: 93007, 1381339]

File create / stat: up to 20 servers, up to 640 client procs (32*#MDS)
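
The scaling of metadata throughput can be estimated from the data labels on the create and stat charts. The sketch below assumes that the smaller label on each chart (34693, 93007) belongs to 1 MDS and the larger one (539724, 1381339) to 20 MDS; the slide does not state this explicitly, and the function names are chosen only for this example.

```python
def speedup(rate_one_server: float, rate_n_servers: float) -> float:
    """How many times faster the n-server run is than the 1-server run."""
    if rate_one_server <= 0:
        raise ValueError("rate_one_server must be positive")
    return rate_n_servers / rate_one_server


def scaling_efficiency_percent(rate_one_server: float, rate_n_servers: float, n: int) -> float:
    """Speedup relative to ideal linear scaling over n servers, in percent."""
    return 100.0 * speedup(rate_one_server, rate_n_servers) / n


if __name__ == "__main__":
    # Assumed mapping of chart labels to 1 MDS and 20 MDS (see lead-in).
    labels = {"create": (34693, 539724), "stat": (93007, 1381339)}
    for op, (one_mds, twenty_mds) in labels.items():
        s = speedup(one_mds, twenty_mds)
        e = scaling_efficiency_percent(one_mds, twenty_mds, 20)
        print(f"{op}: speedup {s:.2f}x on 20 MDS, {e:.1f}% of linear")
```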

Page 24:

Metadata performance (2)

> 500,000 file creates per second

Creation of 1,000,000,000 files: ~ 33 minutes
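
The ~33 minute figure follows directly from the create rate. A minimal check, assuming a sustained rate of exactly 500,000 creates per second (the slide only gives this as a lower bound); the function name is made up for this example.

```python
def creation_time_minutes(num_files: int, creates_per_second: float) -> float:
    """Time to create num_files at a constant create rate, in minutes."""
    if creates_per_second <= 0:
        raise ValueError("creates_per_second must be positive")
    return num_files / creates_per_second / 60.0


if __name__ == "__main__":
    minutes = creation_time_minutes(1_000_000_000, 500_000)
    print(f"{minutes:.1f} minutes")
```

Any rate above 500,000 creates per second shortens this time accordingly.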

Page 25:

Questions?

http://www.fhgfs.com

http://wiki.fhgfs.com

Fraunhofer Booth #643