Copyright 2015, Oracle and/or its affiliates. All rights reserved.

OpenWorld 2015
200 Million QPS on Commodity Hardware
Getting Started with MySQL Cluster 7.4
Frazer Clement
MySQL Cluster Technical Lead
Bernd Ocklin
Director, MySQL Cluster Engineering
October 26th, 2015

200 Million QPS on Commodity Hardware


Getting started with MySQL Cluster 7.4

Users, Features and Releases

Design for Availability and Scale

Performance, getting to 200M queries/second

How to get started with MySQL Cluster

Safe Harbor Statement


The following is intended to outline our general product direction. It is
intended for information purposes only, and may not be incorporated
into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing
decisions. The development, release, and timing of any features or
functionality described for Oracle's products remains at the sole
discretion of Oracle.

Keynote: Monday, 4.00-6.00 pm, YBCA Theater


State of the Dolphin

Rich Mason, SVP & General Manager MySQL GBU, Oracle


Tomas Ulin, VP MySQL Engineering, Oracle

Customer Experiences

Hari Tatrakal, Director of Database Services, Live Nation


Olaniyi Oshinowo, MySQL & Open Source Technologies Leader, GE
Ernie Souhrada & Rob Wultsch, Database Engineers, Pinterest

MySQL Cluster content @ OpenWorld


- Fully Elastic Real-Time Services with MySQL Cluster
  Bernd Ocklin, Oracle
  Conference Session
  Tuesday 11am, Moscone South, 262

- MySQL Server and MySQL Cluster at India's Financial Inclusion Gateway Service
  NEC et al
  Conference Session
  Tuesday 5.15pm, Moscone South, 250

- Get Started with MySQL Cluster
  Benedita Vasconcelos, Oracle
  Hands On Lab
  Thursday, 9.30am, Hotel Nikko - Peninsula

MySQL Community Reception @ OpenWorld


Celebrate, Have Fun and Mingle with Oracle's MySQL Engineers
& Your Peers

Tuesday, October 27th, 7 pm

Jillian's at Metreon: 175 Fourth Street, San Francisco, CA 94103

At the corner of Howard and 4th St.; only a 2-min walk from Moscone Center
(same place as last year)

Join us!

MySQL Cluster deployments


Web
- High volume OLTP
- eCommerce
- On-Line Gaming
- Digital Marketing
- User Profile Management
- Session Management & Caching

Telecoms
- Service Delivery Platforms
- Mobile Content Delivery
- VAS: VoIP, IPTV & VoD
- Mobile Payments

Other
- Online gaming : AAA + profile management
- Payment fraud detection
- DBMS research
- Many more, some unknown

Who's using MySQL Cluster?

MySQL Cluster highlights


High Throughput Reads & Writes
- Distributed, Parallel architecture
- Transactional, ACID-compliant relational database

Carrier-Grade Availability
- Shared-nothing design, synchronous data replication
- Sub-second failover & self-healing recovery

Real-Time Responsiveness
- Data structures optimized for RAM. Real-time extensions
- Predictable low latency, bounded access times

On-Line, Linear Scalability
- Incrementally scale out, scale up and scale on-line
- Linearly scale with distribution awareness

Low TCO, Open platform
- GPL & Commercial editions, scale on COTS
- Flexible APIs: SQL, C++, Java, OpenJPA, LDAP & HTTP


MySQL Cluster highlights


HA, High performance, Relational, Transactional, Distributed,
Parallel, SQL, NoSQL, Shared-Nothing, Commodity ...

SQL
- Joins, Foreign Keys, Transactions, Row locks, Triggers, Views, Stored procedures, Blobs, keyless tables, newSQL, MySQL compatible... connectors for most languages, ORMs etc...

NoSQL
- Full C++ API for best control and performance (MySQLD SE built on top). Other APIs : Java, JPA, Node.js, Memcache....

HA
- 99.999% uptime systems (five nines), No single point of failure (SPOF), Heartbeating, cluster membership, automatic failover + recovery, automatic client failover, transactional DDL, CP, async replication, advanced exception logging...

Performance and parallelism
- High throughput, low bounded latency (200M read tx/s). Batching, optimised protocols, Intra and Inter query parallelism, pushed parallel filters, pushed parallel joins, non-blocking event driven multithreaded....


MySQL Cluster highlights


HA, High performance, Relational, Transactional, Distributed,
Parallel, SQL, NoSQL, Shared-Nothing, Commodity ...

Scalability
- Scale-out nodegroups or stateless API clients online, Scale-up data nodes and clients online with multithreading, scale up hardware online

Replication
- Synchronous two phase commit internally, Transactional HA async replication between clusters, conflict detection+resolution...

Storage
- Data transparently distributed and balanced by hash, Indexed columns in memory, others on disk or memory, Secondary unique and ordered indexes, Redundant Redo logs and periodic checkpoints...

Manageability
- Online add + drop (index, column), Online consistent backup, Online upgrade, Online OS or hardware upgrade, consolidated cluster logs, C management API for tooling...

Shared nothing, Commodity
- No need for shared storage, In-memory data uses disk frugally, TCP over Ethernet / Infiniband etc, No special layer 2 requirements. Open source.


MySQL Cluster Releases


Regular fixes and improvements

Timeline: 2012, 2013, 2014, 2015

7.2
- Distributed parallel joins
- Multi-TC
- Active-Active
- Memcached
- MySQL Server 5.5

7.3
- Foreign keys
- Client lib performance
- node.js
- MySQL Server 5.6

7.4
- Restart performance
- Active-Active
- Internal reporting
- MySQL Server 5.6

...

MySQL Cluster is built on top of and tracks GA MySQL Server releases, gaining their features, optimisations and bug fixes.

MySQL Cluster 7.4

- Performance optimisations in the data node kernel
- Active-Active replication enhancements
- System restart and maintenance activities parallelised
- Improved observability and manageability

More detail and download links at dev.mysql.com

MySQL Cluster Architecture

[Diagram: Application Nodes (e.g. JPA, REST) connect over the NdbApi protocol to the Data Nodes, which are arranged in Node Groups 1, 2 and 3 alongside two Cluster Mgmt nodes. Table 1 is split into fragments F1 to F6, which are spread over the node groups.]

Most Application Nodes are themselves Servers for various client protocols.

Tables and Indices are horizontally partitioned, distributed across and replicated within the NodeGroups. Application Nodes, including MySQLD, use NdbApi to perform transactional operations and queries on data.
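
The partitioning described on this slide can be sketched roughly as follows. This is a minimal illustration and not the actual NDB placement algorithm: the hash function, the node names, the two-nodes-per-group layout and the fragment-to-group mapping are all assumptions made for the example.

```python
import hashlib

# Hypothetical layout: 3 node groups of 2 data nodes each, 6 fragments.
NODE_GROUPS = {
    1: ["node1", "node2"],
    2: ["node3", "node4"],
    3: ["node5", "node6"],
}
NUM_FRAGMENTS = 6


def fragment_for(key: str) -> int:
    """Map a primary key to a fragment number 1..NUM_FRAGMENTS by hashing it."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_FRAGMENTS + 1


def node_group_for(fragment: int) -> int:
    """Spread fragments round-robin over the node groups (assumed mapping)."""
    return (fragment - 1) % len(NODE_GROUPS) + 1


def replicas_for(key: str) -> list:
    """Every node in the owning node group stores a copy of the row."""
    return NODE_GROUPS[node_group_for(fragment_for(key))]
```

For example, `replicas_for("user:42")` returns the two hypothetical nodes of whichever group owns that key's fragment; losing one of them leaves the other copy available.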

MySQL Cluster Architecture for Availability


Redundancy for availability
- All nodes in each nodegroup store the same data
- Can survive data node failures so long as one node per nodegroup is available.
- Load balanced, Synchronous 2PC, heartbeating, automatic failover, recovery

[Diagram: cluster layout (Application Nodes such as JPA and REST, NdbApi protocol, Node Groups 1 to 3 holding fragments F1 to F6 of Table 1, two Cluster Mgmt nodes) labelled "Redundant components", with a C / A / P triangle.]

MySQL Cluster is a CP system in that consistency is favoured over availability. Async replication between clusters gives AP properties.
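
The survival rule on this slide (at least one node per nodegroup must remain) can be written as a small check. A minimal sketch, assuming a hypothetical dictionary of node groups and made-up node names; it ignores arbitration and network partitions.

```python
def cluster_survives(node_groups: dict, failed: set) -> bool:
    """Return True if every node group still has at least one live node."""
    return all(
        any(node not in failed for node in nodes)
        for nodes in node_groups.values()
    )


# Hypothetical 2 x 2 layout used only for illustration.
groups = {1: ["a", "b"], 2: ["c", "d"]}
one_per_group_lost = cluster_survives(groups, {"a", "c"})   # each group keeps a node
whole_group_lost = cluster_survives(groups, {"a", "b"})     # node group 1 is gone
```

With this layout, half of the data nodes can fail without data loss as long as the failures do not take out both members of the same group.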


MySQL Cluster Architecture for Availability


Redundancy for availability
- Two (or more) management servers.
- Used for configuration, node startup/shutdown, triggering backups, logging + 'split-brain' arbitration
- Not critical, not involved in transaction processing / querying

[Diagram: cluster layout (Application Nodes such as JPA and REST, NdbApi protocol, Node Groups 1 to 3 holding fragments F1 to F6 of Table 1, two Cluster Mgmt nodes) labelled "Redundant components".]

Management nodes act as lightweight arbitrators, avoiding the cost of odd-sized data node quorums to cope with single failures.
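
How an arbitrator lets an even number of data nodes cope with a split can be sketched as below. This is a simplified reading of the split-brain handling, not the data node implementation: the function name, the return strings and the `wins_arbitration` flag (standing in for the arbitrator's answer) are assumptions for the example.

```python
def partition_decision(node_groups: dict, reachable: set, wins_arbitration: bool) -> str:
    """Decide what one side of a network split should do.

    node_groups: node group id -> list of data node names (hypothetical).
    reachable: data nodes this side of the split can still see.
    wins_arbitration: whether the arbitrator picked this side.
    """
    groups = list(node_groups.values())
    # Without at least one node from every group, part of the data is missing.
    if not all(any(n in reachable for n in g) for g in groups):
        return "shutdown"
    # Holding all nodes of some group means no other side can have complete data.
    if any(all(n in reachable for n in g) for g in groups):
        return "continue"
    # Otherwise another side might also be viable: the arbitrator breaks the tie.
    return "continue" if wins_arbitration else "shutdown"
```

With two replicas per group, a clean half-and-half split leaves both sides viable, and only the side chosen by the arbitrator carries on. No third data replica is needed to form a majority.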


MySQL Cluster Architecture for Availability


Redundancy for availability
- API nodes are stateless and consistent, can use n + m sparing with simple front end load balancing.
- NdbApi automatically balances, fails over and back on data node failures.
- Network needs no SPOF too: no single failure takes out > 1 cluster member.

[Diagram: cluster layout (Application Nodes such as JPA and REST, NdbApi protocol, Node Groups 1 to 3 holding fragments F1 to F6 of Table 1, two Cluster Mgmt nodes) labelled "Redundant components".]

Availability also comes from support for online operations : Schema changes, Hardware and OS upgrades, Software upgrades, Cluster scaling

MySQL Cluster Architecture for Scale Out

Performance + Capacity
- Online scale out of back end by adding whole node groups (Read + Write scaling)

[Diagram: Application Nodes (JPA, REST) connect over the NdbApi protocol to Data Nodes in Node Groups 1 to 3, with fragments F1 to F6 of Table 1 spread across the node groups, plus two Cluster Mgmt nodes.]

Data Nodes can be added online, while transactions and queries are running. Existing data is rebalanced across all nodegroups.

MySQL Cluster Architecture for Scale Out


Performance + HA
- Online scale out of front end / API nodes

Performance + Capacity
- Online scale out of back end by adding whole node groups (Read + Write scaling)

[Diagram: Application Nodes (REST, JPA) connect over the NdbApi protocol to Data Nodes in Node Groups 1 to 3, with fragments F1 to F6 of Table 1 spread across the node groups, plus two Cluster Mgmt nodes.]

Application Nodes can be added and removed online, all have equal, consistent access to the data stored by the data nodes.

MySQL Cluster Architecture for Scale Up


Data node (ndbmtd) threads
- Main thread
- Request processing threads
  - TC instances (shared nothing). TC: Transaction coordinator
  - LDM instances (shared nothing). LDM: Local data manager (Table + Index partitions)
- Replication thread
- Connect threads
- Send threads
- Receive threads
- Watchdog
- IO threads

TC and LDM threads do most work, must be well fed by Send + Receive threads

Generally no more than one request processing thread per [HT] core

MySQL Cluster Architecture for Scale Up


[Diagram: the ndbmtd data node thread layout (Main, TC, LDM, Replication, Connect, Send, Receive, Watchdog and IO threads).]

Configurable parallelism within a Data node

MySQL Cluster Architecture for Scale Up


Application node
- Clients connect using a client protocol (mysql, memcached, ldap...)
- Server processes: Mysqld, Memcached, Node.js*, Java, Slapd, ...

Layers inside an application node
- Protocol decoding
- Business logic / State machines
- Database / Persistence layer
- NdbApi calls into NdbApi (libndbclient)
- API connections to the data nodes ('Protocol 6')

Many* threads
- Can scale the number of threads to meet demand
- Can scale the number of NdbApi connections to avoid bottlenecks

MySQL Cluster Performance


Themes, across Scale Out and Scale Up: Distributed efficiency, Coordination avoidance, Local efficiency, Balance

- Protocol design, optimisation, packing, multiplexing
- Data Distribution awareness
- Locality: pushed down filtering and joining
- Non blocking reads
- Parallel commit
- OS call amortisation
- Non blocking execution
- Cache friendly data structures
- Lock free shared data structures
- Local data structures
- Multi granularity pools
- Hash partitioning

See MySQL Connect 2012 session 'Breakthrough performance with MySQL Cluster'

MySQL Cluster Performance


Optimisations build in layers

Optimisation layers, top to bottom
- MySQL Server SQL optimisations
- Distributed parallel filter + join
- Batching hints, distribution awareness, read removal
- Optimised 2PC, async APIs
- Low level efficiency, Coordination avoidance

Workloads, top to bottom
- SQL joins, aggregates (lower volume, more complex, bigger footprint)
- SQL R/W of multi rows
- NoSQL R/W of multi rows
- NoSQL R/W of single rows (higher volume, simpler, smaller footprint)

MySQL Cluster Performance


7.2
- Feb 2012: 1 billion NoSQL reads per minute
- Jul 2012: 1 billion writes per minute

7.3
- Jun 2013: 8.5x better performance per NdbApi connection

7.4
- Feb 2015: 200 million NoSQL reads per second
- 2.5 million SQL statements per second
- 50% better Sysbench read performance

MySQL Cluster Performance


Regular improvements compound over releases

7.2
- Data node: Multiple Transaction Coordinator (TC) threads

7.3
- NdbApi: Connection thread contention reduction

7.4
- Data node: Scan + PK lookup optimisations, Send + Recv optimisations

...

MySQL Cluster Performance


NoSQL Bulk benchmarks
- Getting to millions of requests per second on a
distributed system is often a matter of efficient
multiplexing and demultiplexing of individual
requests
- Modern hardware is very capable and so it is
important to keep out of the way, avoiding
context switches, threads, lock contention, small
messages, extra hops, and unnecessary
communication or coordination.
- Many small requests must be gathered
together and handled in bulk, without adversely
affecting latency or application semantics.
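
The gathering of many small requests into bulk messages can be illustrated with a toy example. A minimal sketch only: the request tuples, the four destination nodes and the function name are hypothetical, and real batching would also bound how long a request may wait.

```python
from collections import defaultdict


def batch_by_destination(requests):
    """Group (destination_node, payload) pairs into one batch per destination."""
    batches = defaultdict(list)
    for node, payload in requests:
        batches[node].append(payload)
    return dict(batches)


# 1000 small reads spread over 4 hypothetical data nodes.
requests = [(i % 4, f"read key {i}") for i in range(1000)]
batches = batch_by_destination(requests)

messages_unbatched = len(requests)  # one message per request
messages_batched = len(batches)     # one message per destination node
```

Here the same 1000 reads travel in 4 messages instead of 1000, which is the kind of multiplexing the bullets describe.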

MySQL Cluster Performance


NoSQL benchmark tool flexAsynch

- Delivered as part of source distribution
- Multithreaded C++ NdbApi application
- Uses the asynchronous features of NdbApi which allow a single thread to participate in multiple concurrent database transactions.
- Row operations using the full primary key
- Can make use of NdbApi Distribution Awareness hints to minimise communication
- Unlike e.g. MySQLD / Memcached, has no upstream clients to serve, so simpler
- Parameters : Number of API connections, Number of threads, Number of parallel transactions per thread, Number of rows per transaction, Number of columns, Size of each column, Lock mode, Distribution Awareness, Thread partitioning

Details : http://mikaelronstrom.blogspot.co.uk/2013/11/how-to-make-efficient-scalable-key.html

MySQL Cluster 200 million NoSQL reads/s


72 API client machines running flexAsynch
- 216 NdbApi connections
- 18,432 client threads
- > 10 million concurrent reads

32 Data node machines running ndbmtd
- 384 TC threads
- 384 LDM threads

1 Management node

- 100 bytes data / read
- 19 GB/s aggregate data read rate
- 6.4 M reads/s per data node
- 612 MB/s data node read rate
- 2.86 M reads/s per client
- 272 MB/s read per client
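
The figures above hang together, as a quick cross-check shows. This is a back-of-envelope sketch: the per-machine counts (3 connections, 256 threads, 12 TC and 12 LDM threads) come from the other slides in this deck, and treating "MB" and "GB" as powers of two is an assumption that happens to match the quoted numbers closely.

```python
CLIENT_MACHINES = 72
DATA_NODES = 32
BYTES_PER_READ = 100
MIB = 2 ** 20  # assumption: the slide's "MB" means 2**20 bytes
GIB = 2 ** 30  # assumption: the slide's "GB" means 2**30 bytes

ndbapi_connections = CLIENT_MACHINES * 3   # 3 NdbApi connections per client machine
client_threads = CLIENT_MACHINES * 256     # 256 flexAsynch threads per client machine
tc_threads = DATA_NODES * 12               # 12 TC threads per data node
ldm_threads = DATA_NODES * 12              # 12 LDM threads per data node

reads_per_data_node = 6.4e6
total_reads = reads_per_data_node * DATA_NODES               # about 205 million reads/s
node_rate_mib = reads_per_data_node * BYTES_PER_READ / MIB   # about 610 MiB/s
aggregate_gib = total_reads * BYTES_PER_READ / GIB           # about 19 GiB/s
reads_per_client = total_reads / CLIENT_MACHINES             # about 2.8 million reads/s
client_rate_mib = reads_per_client * BYTES_PER_READ / MIB    # about 271 MiB/s
```

The small differences from the slide (612 MB/s, 2.86 M reads/s, 272 MB/s) are consistent with the per-node rate having been rounded to 6.4 M.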


MySQL Cluster 200 million NoSQL reads/s


[Diagram: flexAsynch clients and ndbmtd data nodes connected through "The Infiniband Cloud (TM)".]

- 72 x 256 threads
- 72 x 3 API connections
- 10 million conc. reads
- 32 x 12 TC + LDM threads
- > 100 GB data

MySQL Cluster 200 million NoSQL reads/s


[Diagram: flexAsynch clients and ndbmtd data nodes connected through "The Infiniband Cloud (TM)".]

Not distribution aware, extra hop to data

MySQL Cluster 200 million NoSQL reads/s


[Diagram: flexAsynch clients and ndbmtd data nodes connected through "The Infiniband Cloud (TM)".]

- Distribution aware, minimal hops
- Batching of requests
- Partitioned client threads
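
The effect of distribution awareness on hop count can be modelled in a few lines. This is a toy model rather than NdbApi code: the hash-based placement, the 32-node list and the one-hop / two-hop costs are assumptions chosen to mirror the "extra hop to data" picture.

```python
import hashlib
import random

DATA_NODES = list(range(32))  # hypothetical node ids


def owner_node(key: str) -> int:
    """Assumed placement: hash the key and pick one of the data nodes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % len(DATA_NODES)


def hops(key: str, distribution_aware: bool, rng: random.Random) -> int:
    """1 hop if the transaction starts on the node holding the row, else 2."""
    if distribution_aware:
        coordinator = owner_node(key)          # client hints where the data lives
    else:
        coordinator = rng.choice(DATA_NODES)   # any node may coordinate
    return 1 if coordinator == owner_node(key) else 2


def average_hops(keys, distribution_aware: bool, seed: int = 0) -> float:
    rng = random.Random(seed)
    return sum(hops(k, distribution_aware, rng) for k in keys) / len(keys)
```

In this model, with 32 nodes an unaware client lands on the right node only about 1 time in 32, so almost every read pays the second hop, while an aware client never does.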

MySQL Cluster 200 million NoSQL reads/s


Intel hardware lab (Thanks!)

105 machines, each with 28 cores (56 HT threads)
- 2 sockets Intel Xeon 'Haswell' E5-2697 v3 processors
Each socket :
- 14 cores (28 HT threads)
- 2.6GHz base, 3.6GHz turbo
- 35MB LLC
- 64GB DDR4 memory
- Infiniband + Gig Ethernet

56 Gbps switched Infiniband network. ~1 Tbps bisection bandwidth

Software configuration

Data nodes :
- 12 LDM threads (non-HT)
- 12 TC threads (HT)
- 2 Send threads (non-HT)
- 8 Receive threads (HT)
- MaxSendDelay config

API nodes :
- 3 NdbApi connections per client machine
- 256 flexAsynch threads per client machine

Scripts : https://dev.mysql.com/downloads/benchmarks.html
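
The data node thread counts above could be expressed with the data node `ThreadConfig` parameter. The sketch below only builds such a line from the slide's numbers; the exact configuration used in the benchmark is not shown in the slides, CPU binding is omitted, and the core estimate assumes that "non-HT" means a thread has a physical core to itself while "HT" threads share a core in pairs. The MaxSendDelay value is not given, so it is left out.

```python
# Thread counts taken from the slide; the dictionary itself is illustrative.
thread_counts = {"ldm": 12, "tc": 12, "send": 2, "recv": 8}


def thread_config(counts: dict) -> str:
    """Render counts in ThreadConfig syntax, e.g. ldm={count=12},tc={count=12}."""
    return ",".join(f"{name}={{count={n}}}" for name, n in counts.items())


config_line = "ThreadConfig=" + thread_config(thread_counts)

# Assumed core budget: LDM and Send threads get a whole core each,
# TC and Receive threads share cores two at a time.
cores_needed = (
    thread_counts["ldm"]
    + thread_counts["send"]
    + (thread_counts["tc"] + thread_counts["recv"]) // 2
)
```

Under these assumptions the layout needs 24 of the 28 cores in each machine, leaving some headroom for the remaining data node threads.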


MySQL Cluster NoSQL Scale Out


Data node throughput scaling
[Plot: Million NoSQL reads/s as number of data nodes scales. X axis: Number of Data nodes. Y axis: Million reads/s.]
Near-linear scaling, 92% efficiency at 32 nodes

API connection scaling
[Plot: Million NoSQL reads/s as API connections scale @ 24 data nodes. X axis: Number of Api connections. Y axis: Million reads/s.]
API node scaling saturates Data nodes with Infiniband interrupts

Infiniband adapters configured for latency rather than throughput, but benchmarks reached within 10% of maximum throughput in any case

Getting started with MySQL Cluster


Try Cluster at OOW!
Benedita's Hands-on Lab on Thursday morning
Getting started video on YouTube
https://www.youtube.com/watch?v=4OixfzhOJoA

QuickStart whitepaper
http://downloads.mysql.com/tutorials/cluster/mysql_wp_cluster_quickstart.pdf

MySQL Cluster 'Getting Started' page


https://www.mysql.com/products/cluster/start.html

education.oracle.com MySQL Cluster courses


Getting started with MySQL Cluster


Tips

Start small and simple
- Minimal nodes + configuration
- (< 10M concurrent reads!)
- Start on localhost to rule out firewall issues

[Diagram: "My laptop" running a minimal cluster, with fragments F1 and F4 each stored twice.]

Get it up and running, then add complexity

Experiment with mysql / mysqld, node failures, applications
Consider using MySQL Cluster Manager (https://edelivery.oracle.com)
Ask for help : forums.mysql.com
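
A "start small on localhost" setup might look like the configuration generated below. This is a hedged sketch: the node ids, the data directory and the choice of one management node, two data nodes and two API slots are assumptions for illustration, so check the parameter names against the QuickStart whitepaper before relying on it.

```python
def minimal_config(data_dir: str = "/tmp/ndb_demo") -> str:
    """Build a small single-host config.ini as text (illustrative values only)."""
    lines = [
        "[ndbd default]",
        "NoOfReplicas=2",          # two copies of every fragment
        f"DataDir={data_dir}",
        "",
        "[ndb_mgmd]",              # one management node
        "NodeId=1",
        "HostName=localhost",
        f"DataDir={data_dir}",
        "",
        "[ndbd]",                  # first data node
        "NodeId=2",
        "HostName=localhost",
        "",
        "[ndbd]",                  # second data node, same node group
        "NodeId=3",
        "HostName=localhost",
        "",
        "[mysqld]",                # slot for a MySQL Server
        "NodeId=50",
        "",
        "[api]",                   # spare slot for another NdbApi client
        "NodeId=51",
    ]
    return "\n".join(lines) + "\n"


config_text = minimal_config()
```

Writing `config_text` to a file and passing it to the management server with its config file option would be the usual next step, followed by starting the two data nodes and a mysqld.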

Keep Learning with Oracle University

- Classroom Training
- Learning Subscription
- Live Virtual Class
- Training On Demand

Cloud, Technology, Applications, Industries

education.oracle.com

MySQL Cluster Performance Gains


Synchronous API
- Operation definition and execution are separated.
- Single user thread can define a batch of operations, then execute
them together, with only one API to DB round trip
- A transaction can contain one or more batches of operations.
- 1 user thread : 1 executing transaction
Asynchronous API adds :
- Single user thread can define, execute and wait for the results of
multiple independent transactions.
- 1 user thread : n executing transactions
Async Api allows the number of client threads to be reduced giving efficiency gains.
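
The "1 user thread : n executing transactions" idea can be illustrated with Python's asyncio as an analogy. This is not NdbApi code: `fake_transaction` is a hypothetical stand-in whose sleep models the round trip to the data nodes.

```python
import asyncio


async def fake_transaction(txn_id: int, round_trip_s: float = 0.01) -> int:
    """Stand-in for one independent transaction; the sleep models the round trip."""
    await asyncio.sleep(round_trip_s)
    return txn_id


async def run_concurrently(n: int) -> list:
    """One thread defines n transactions, starts them all, then waits for results."""
    return await asyncio.gather(*(fake_transaction(i) for i in range(n)))


def run_demo(n: int = 100) -> list:
    return asyncio.run(run_concurrently(n))
```

Because all n round trips overlap, a single thread finishes them in roughly the time of one, which is why fewer client threads are needed than with one transaction per thread.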