Archive for MySQL Cluster

MySQL Cluster : Delivering Breakthrough Performance (upcoming webinar)

MySQL Cluster partitioning key


I’ll be presenting a webinar covering MySQL Cluster performance on Thursday, July 26. As always, the webinar will be free but you’ll need to register here – you’ll then also receive a link to the charts and a recording of the session after the event.

The replay of this webinar is now available from here.

Here’s the agenda (hoping that I can squeeze it all in!):

  • Introduction to MySQL Cluster
  • Where does MySQL Cluster fit?
  • Benchmarks:
    • ANALYZE TABLE
    • Access patterns
    • AQL (fast JOINs)
    • Distribution aware
    • Batching
    • Schema optimisations
    • Connection pooling
    • Multi-threaded data nodes
    • High performance NoSQL APIs
    • Hardware choices
    • More tips
  • The measure/optimise loop
  • Techniques to boost performance
  • Scaling out
  • Other useful resources

The session starts at 9:00 am UK time / 10:00 am Central European time.





MySQL Cluster running on Raspberry Pi


I start a long weekend tonight and it’s the kids’ last day of school before the holidays, so last night felt like the right time to play a bit. This week I received my Raspberry Pi – if you haven’t heard of it then you should take a look at the Raspberry Pi FAQ – basically it’s a ridiculously cheap ($25, or $35 if you want the top-of-the-range model) ARM-based PC that’s the size of a credit card.

I knew I had to have one to play with, but what to do with it? Why not start by porting MySQL Cluster onto it? We always claim that Cluster runs on commodity hardware – surely this would be the ultimate test of that claim.

I chose the customised version of Debian – you have to copy it onto the SD memory card that acts as the storage for the Pi. Once up and running on the Pi, the first step was to increase the size of the main storage partition – it starts at about 2 Gbytes – using gparted. I then had to compile MySQL Cluster – ARM isn’t a supported platform and so there are no pre-built binaries. I needed to install a couple of packages before I could get very far:

sudo apt-get update
sudo apt-get install cmake
sudo apt-get install libncurses5-dev

Compilation initially got about 80% of the way through before failing, so if you try this yourself then save yourself some time by applying the patch from this bug report before starting. The build scripts wouldn’t work but I was able to just run make…

make
sudo make install

As I knew that memory was tight I tried to come up with a config.ini file that cut down on how much memory would be needed (note that 192.168.1.122 is the Raspberry Pi while 192.168.1.118 is an 8GByte Linux x86-64 PC – doesn’t seem a very fair match!):

[ndb_mgmd]
hostname=192.168.1.122
NodeId=1

[ndbd default]
noofreplicas=2
DataMemory=2M
IndexMemory=1M
DiskPageBufferMemory=4M
StringMemory=5
MaxNoOfConcurrentOperations=1K
MaxNoOfConcurrentTransactions=500
SharedGlobalMemory=500K
LongMessageBuffer=512K
MaxParallelScansPerFragment=16
MaxNoOfAttributes=100
MaxNoOfTables=20
MaxNoOfOrderedIndexes=20

[ndbd]
hostname=192.168.1.122
datadir=/home/pi/mysql/ndb_data
NodeId=3

[ndbd]
hostname=192.168.1.118
datadir=/home/billy/my_cluster/ndbd_data
NodeId=4

[mysqld]
NodeId=50

[mysqld]
NodeId=51

[mysqld]
NodeId=52

[mysqld]
NodeId=53

[mysqld]
NodeId=54
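As a rough sanity check, the explicitly sized pools in that config only add up to around 7 Mbytes per data node. This back-of-envelope calculation ignores the per-operation records, buffers and fixed overheads (which are what actually dominate on a box this small), but it shows how far the defaults have been cut down:

```shell
# Sum the explicitly configured per-data-node memory pools (values in KB)
data=$((2 * 1024))       # DataMemory=2M
index=$((1 * 1024))      # IndexMemory=1M
diskpage=$((4 * 1024))   # DiskPageBufferMemory=4M
shared=500               # SharedGlobalMemory=500K
longmsg=512              # LongMessageBuffer=512K

total_kb=$((data + index + diskpage + shared + longmsg))
echo "$((total_kb / 1024)) MB configured"   # prints "7 MB configured"
```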

Running the management node went smoothly but then I had problems starting the data nodes – checking how much memory I had available gave me a hint as to why!

pi@raspberrypi:~$ free -m
             total       used       free     shared    buffers     cached
Mem:           186         29        157          0          1         11
-/+ buffers/cache:         16        169
Swap:            0          0          0

OK – so 157 Mbytes of memory available and no swap space. Not ideal, so the next step was to use gparted again to create swap partitions on the SD card as well as a massive 1 Gbyte on my MySQL-branded USB stick (need to persuade marketing to be a bit more generous with those). A quick edit of /etc/fstab and a restart and things were looking in better shape:

pi@raspberrypi:~$ free -m
             total       used       free     shared    buffers     cached
Mem:           186         29        157          0          1         11
-/+ buffers/cache:         16        169
Swap:         1981          0       1981
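For reference, the two swap areas end up as a couple of extra lines in /etc/fstab, roughly like this – the device names here are assumptions, so check what your SD card partition and USB stick actually show up as (e.g. with fdisk -l) before copying anything:

```
# Swap partition on the SD card (assumed device name)
/dev/mmcblk0p3  none  swap  sw  0  0
# 1 Gbyte swap partition on the USB stick (assumed device name)
/dev/sda1       none  swap  sw  0  0
```

A sudo swapon -a (or the restart mentioned above) then brings them into use.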

Next, I started the management node and one data node on the Pi, as well as a second data node on the Linux server “ws2” (I want High Availability after all – OK, so running the management node on the same host as a data node is a single point of failure)…

pi@raspberrypi:~/mysql$ ndb_mgmd -f conf/config.ini --configdir=/home/pi/mysql/conf/ --initial
pi@raspberrypi:~/mysql$ ndbd
billy@ws2:~$ ndbd -c 192.168.1.122:1186

I could then confirm that everything was up and running:

pi@raspberrypi:~$ ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @192.168.1.122  (mysql-5.5.22 ndb-7.2.6, Nodegroup: 0, Master)
id=4    @192.168.1.118  (mysql-5.5.22 ndb-7.2.6, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.1.122  (mysql-5.5.22 ndb-7.2.6)

[mysqld(API)]   5 node(s)
id=50 (not connected, accepting connect from any host)
id=51 (not connected, accepting connect from any host)
id=52 (not connected, accepting connect from any host)
id=53 (not connected, accepting connect from any host)
id=54 (not connected, accepting connect from any host)

Cool!

The next step is to run a MySQL Server so that I can actually test the Cluster – running it on the Pi caused problems (157 Mbytes of RAM doesn’t stretch as far as it used to), so it runs on ws2 instead:

billy@ws2:~/my_cluster$ cat conf/my.cnf
[mysqld]
ndbcluster
datadir=/home/billy/my_cluster/mysqld_data
ndb-connectstring=192.168.1.122:1186

billy@ws2:~/my_cluster$ mysqld --defaults-file=conf/my.cnf&

Check that it really has connected to the Cluster:

pi@raspberrypi:~/mysql$ ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @192.168.1.122  (mysql-5.5.22 ndb-7.2.6, Nodegroup: 0, Master)
id=4    @192.168.1.118  (mysql-5.5.22 ndb-7.2.6, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.1.122  (mysql-5.5.22 ndb-7.2.6)

[mysqld(API)]   5 node(s)
id=50   @192.168.1.118  (mysql-5.5.22 ndb-7.2.6)
id=51 (not connected, accepting connect from any host)
id=52 (not connected, accepting connect from any host)
id=53 (not connected, accepting connect from any host)
id=54 (not connected, accepting connect from any host)

Finally, I just need to check that I can read and write data…

billy@ws2:~/my_cluster$ mysql -h 127.0.0.1 -P3306 -u root
mysql> CREATE DATABASE clusterdb;USE clusterdb;
Query OK, 1 row affected (0.24 sec)

Database changed
mysql> CREATE TABLE simples (id INT NOT NULL PRIMARY KEY) engine=ndb;
120601 13:30:20 [Note] NDB Binlog: CREATE TABLE Event: REPL$clusterdb/simples
Query OK, 0 rows affected (10.13 sec)

mysql> REPLACE INTO simples VALUES (1),(2),(3),(4);
Query OK, 4 rows affected (0.04 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM simples;
+----+
| id |
+----+
|  1 |
|  2 |
|  4 |
|  3 |
+----+
4 rows in set (0.09 sec)

OK – so is there any real application to this? Well, probably not, other than providing a cheap development environment – imagine scaling out to 48 data nodes; that would cost $1,680 (+ the cost of some SD cards)! A more practical use might be for management nodes – we know that they need very few resources. As a reminder – this is not a supported platform!
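For the record, the arithmetic behind that figure – 48 of the top-of-the-range $35 boards:

```shell
# 48 data nodes at $35 each (the top-of-the-range model)
echo $((48 * 35))    # prints 1680
```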





MySQL Cluster Manager 1.1.6 released

MySQL Cluster Manager 1.1.6 is now available to download from My Oracle Support.

Details of the changes will be added to the MySQL Cluster Manager documentation. Please give it a try and let me know what you think.

Note that if you’re not a commercial user then you can still download MySQL Cluster Manager 1.1.5 from the Oracle Software Delivery Cloud and try it out for free. Documentation is available here.





MySQL 5.6 Replication – webinar replay

MySQL 5.6 Replication - Global Transaction IDs


On Wednesday (16th May 2012), Mat Keep and I presented on the new replication features that are previewed as part of the latest MySQL 5.6 Development Release.

The replay for that webinar (together with the chart deck) is now available from here.

In addition, there were a huge number of great questions raised and we had a couple of key engineers answering them on-line – view the Q&A transcript here.

A reminder of the topics covered in the webinar…

MySQL 5.6 delivers new replication capabilities, which we discussed in the webinar:

  • High performance with Multi-Threaded Slaves and Optimized Row Based Replication
  • High availability with Global Transaction Identifiers, Failover Utilities and Crash Safe Slaves & Binlog
  • Data integrity with Replication Event Checksums
  • Dev/Ops agility with new Replication Utilities, Time Delayed Replication and more






MySQL Cluster 7.1.22 is available for download

The binary version for MySQL Cluster 7.1.22 has now been made available at https://www.mysql.com/downloads/cluster/7.1.html#downloads (GPL version) or https://support.oracle.com/ (commercial version).

A description of all of the changes (fixes) that have gone into MySQL Cluster 7.1.22 (compared to 7.1.21) is available from the 7.1.22 Change log.





MySQL Cluster 7.2.6 is available for download

The binary version for MySQL Cluster 7.2.6 has now been made available at http://www.mysql.com/downloads/cluster/ (GPL version) or https://support.oracle.com/ (commercial version).

A description of all of the changes (fixes) that have gone into MySQL Cluster 7.2.6 (compared to 7.2.5) is available from the 7.2.6 Change log.





On-line add-node with MCM; a more complex example

I’ve previously provided an example of using MySQL Cluster Manager to add nodes to a running MySQL Cluster deployment but I’ve since received a number of questions around how to do this in more complex circumstances (for example when ending up with more than 1 MySQL Server on a single host where each mysqld process should use a different port). The purpose of this post is to work through one of these more complex scenarios.

The starting point is an existing cluster made up of 3 hosts with the nodes (processes) as described in this MCM report:

mcm> SHOW STATUS -r mycluster;
+--------+----------+-------------------------------+---------+-----------+---------+
| NodeId | Process  | Host                          | Status  | Nodegroup | Package |
+--------+----------+-------------------------------+---------+-----------+---------+
| 1      | ndbmtd   | paas-23-54.osc.uk.oracle.com  | running | 0         | 7_2_5   |
| 2      | ndbmtd   | paas-23-55.osc.uk.oracle.com  | running | 0         | 7_2_5   |
| 49     | ndb_mgmd | paas-23-56.osc.uk.oracle.com  | running |           | 7_2_5   |
| 50     | mysqld   | paas-23-54.osc.uk.oracle.com  | running |           | 7_2_5   |
| 51     | mysqld   | paas-23-55.osc.uk.oracle.com  | running |           | 7_2_5   |
| 52     | mysqld   | paas-23-56.osc.uk.oracle.com  | running |           | 7_2_5   |
| 53     | ndbapi   | *paas-23-56.osc.uk.oracle.com | added   |           |         |
+--------+----------+-------------------------------+---------+-----------+---------+
7 rows in set (0.01 sec)

This same configuration is shown graphically in this diagram:

On-Line scalability with MySQL Cluster - starting point

Original MySQL Cluster deployment

 

Note that the ‘ndbapi’ node isn’t actually a process but is instead a ‘slot’ that can be used by any NDB API client to access the data in the data nodes directly – this could be any of:

  • A MySQL Server
  • An application using the C++ NDB API directly
  • A Memcached server using the direct NDB driver
  • An application using the ClusterJ, JPA or mod_ndb REST API
  • The MySQL Cluster backup restore command (ndb_restore)

This Cluster is now going to be extended by adding an extra host as well as extra nodes (both processes and ndbapi slots).

The following diagram illustrates what the final Cluster will look like:

MySQL Cluster after on-line scaling

MySQL Cluster after on-line scaling

The first step is to add the new host to the configuration and make it aware of the MySQL Cluster package being used (in this example, 7.2.5). Note that you should already have started the mcmd process on this new host (if not then do that now):

mcm> ADD HOSTS --hosts=paas-23-57.osc.uk.oracle.com mysite;
+--------------------------+
| Command result           |
+--------------------------+
| Hosts added successfully |
+--------------------------+
1 row in set (8.04 sec)

mcm> ADD PACKAGE -h paas-23-57.osc.uk.oracle.com --basedir=/home/oracle/cluster_7_2_5 7_2_5;
+----------------------------+
| Command result             |
+----------------------------+
| Package added successfully |
+----------------------------+
1 row in set (0.68 sec)

At this point the MCM agent on the new host is connected with the existing three but the host has not yet become part of the Cluster – this is done by declaring which nodes should be on that host; at the same time I add some extra nodes to the existing hosts. As there will be more than one MySQL server (mysqld) running on some of the hosts, I’ll explicitly tell MCM what port number to use for some of the mysqlds (rather than just using the default of 3306).

mcm> ADD PROCESS -R ndbmtd@paas-23-54.osc.uk.oracle.com,
ndbmtd@paas-23-55.osc.uk.oracle.com,mysqld@paas-23-56.osc.uk.oracle.com,
ndbapi@paas-23-56.osc.uk.oracle.com,mysqld@paas-23-57.osc.uk.oracle.com,
mysqld@paas-23-57.osc.uk.oracle.com,ndbapi@paas-23-57.osc.uk.oracle.com 
-s port:mysqld:54=3307,port:mysqld:57=3307 mycluster;
+----------------------------+
| Command result             |
+----------------------------+
| Process added successfully |
+----------------------------+
1 row in set (2 min 34.22 sec)

In case you’re wondering how I was able to predict the node-ids that would be allocated to the new nodes, the scheme is very simple:

  • Node-ids 1-48 are reserved for data nodes
  • Node-ids 49-256 are used for all other node types
  • Within those ranges, node-ids are allocated sequentially
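That allocation rule is simple enough to sketch in a few lines of shell – this is purely an illustration of the scheme described above, not MCM’s actual code:

```shell
# Illustration of the node-id allocation rule (not MCM's actual code):
# data nodes draw from 1-48, all other node types from 49-256, lowest free first.
next_node_id() {
  type=$1; shift
  used=" $* "
  case "$type" in
    ndbd|ndbmtd) id=1;  max=48  ;;
    *)           id=49; max=256 ;;
  esac
  while [ "$id" -le "$max" ]; do
    case "$used" in
      *" $id "*) id=$((id + 1)) ;;          # id taken, try the next one
      *) echo "$id"; return 0 ;;            # first free id in range
    esac
  done
  return 1   # no free slot of that type
}

# Ids 1, 2 and 49-53 are taken in the original cluster, so:
next_node_id ndbmtd 1 2 49 50 51 52 53   # prints 3 (first new data node)
next_node_id mysqld 1 2 49 50 51 52 53   # prints 54 (first new mysqld)
```

Applying the rule repeatedly gives exactly the ids seen in the SHOW STATUS output below: 3 and 4 for the new data nodes, then 54 onwards for everything else.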

If you look carefully at the results you’ll notice that the ADD PROCESS command took a while to run (2.5 minutes) – the reason for this is that behind the scenes, MCM performed a rolling restart – ensuring that all of the existing nodes pick up the new configuration without losing database service. Before starting the new processes, it makes sense to double check that the correct ports are allocated to each of the mysqlds:

mcm> GET -d port:mysqld mycluster;
+------+-------+----------+---------+----------+---------+---------+---------+
| Name | Value | Process1 | NodeId1 | Process2 | NodeId2 | Level   | Comment |
+------+-------+----------+---------+----------+---------+---------+---------+
| port | 3306  | mysqld   | 50      |          |         | Default |         |
| port | 3306  | mysqld   | 51      |          |         | Default |         |
| port | 3306  | mysqld   | 52      |          |         | Default |         |
| port | 3307  | mysqld   | 54      |          |         |         |         |
| port | 3306  | mysqld   | 56      |          |         | Default |         |
| port | 3307  | mysqld   | 57      |          |         |         |         |
+------+-------+----------+---------+----------+---------+---------+---------+
6 rows in set (0.07 sec)

At this point the new processes can be started and then the status of all of the processes confirmed:

mcm> START PROCESS --added mycluster;
+------------------------------+
| Command result               |
+------------------------------+
| Process started successfully |
+------------------------------+
1 row in set (26.30 sec)

mcm> SHOW STATUS -r mycluster;
+--------+----------+-------------------------------+---------+-----------+---------+
| NodeId | Process  | Host                          | Status  | Nodegroup | Package |
+--------+----------+-------------------------------+---------+-----------+---------+
| 1      | ndbmtd   | paas-23-54.osc.uk.oracle.com  | running | 0         | 7_2_5   |
| 2      | ndbmtd   | paas-23-55.osc.uk.oracle.com  | running | 0         | 7_2_5   |
| 49     | ndb_mgmd | paas-23-56.osc.uk.oracle.com  | running |           | 7_2_5   |
| 50     | mysqld   | paas-23-54.osc.uk.oracle.com  | running |           | 7_2_5   |
| 51     | mysqld   | paas-23-55.osc.uk.oracle.com  | running |           | 7_2_5   |
| 52     | mysqld   | paas-23-56.osc.uk.oracle.com  | running |           | 7_2_5   |
| 53     | ndbapi   | *paas-23-56.osc.uk.oracle.com | added   |           |         |
| 3      | ndbmtd   | paas-23-54.osc.uk.oracle.com  | running | 1         | 7_2_5   |
| 4      | ndbmtd   | paas-23-55.osc.uk.oracle.com  | running | 1         | 7_2_5   |
| 54     | mysqld   | paas-23-56.osc.uk.oracle.com  | running |           | 7_2_5   |
| 55     | ndbapi   | *paas-23-56.osc.uk.oracle.com | added   |           |         |
| 56     | mysqld   | paas-23-57.osc.uk.oracle.com  | running |           | 7_2_5   |
| 57     | mysqld   | paas-23-57.osc.uk.oracle.com  | running |           | 7_2_5   |
| 58     | ndbapi   | *paas-23-57.osc.uk.oracle.com | added   |           |         |
+--------+----------+-------------------------------+---------+-----------+---------+
14 rows in set (0.08 sec)

The enlarged Cluster is now up and running but any existing MySQL Cluster tables will only be stored across the original data nodes. To remedy that, each of those existing tables should be repartitioned:

mysql> ALTER ONLINE TABLE simples REORGANIZE PARTITION;
Query OK, 0 rows affected (0.22 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> OPTIMIZE TABLE simples;
+-------------------+----------+----------+----------+
| Table             | Op       | Msg_type | Msg_text |
+-------------------+----------+----------+----------+
| clusterdb.simples | optimize | status   | OK       |
+-------------------+----------+----------+----------+
1 row in set (0.00 sec)

You can safely perform the repartitioning while the Cluster is up and running (with your application sending in reads and writes) but there is a performance impact (it has been measured at up to 50%) and so you probably want to do this at a reasonably quiet time of day.
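If you have more than a handful of tables then it’s less error-prone to generate the statements than to type them. Here’s a quick sketch – the table names are hypothetical, and in practice you’d pull the real list from information_schema.tables (where engine='ndbcluster'):

```shell
# Emit the repartition + optimize statements for each table name given
gen_repartition_sql() {
  for table in "$@"; do
    echo "ALTER ONLINE TABLE $table REORGANIZE PARTITION;"
    echo "OPTIMIZE TABLE $table;"
  done
}

# Hypothetical table names, for illustration only
gen_repartition_sql simples towns countries
```

The output can then be piped straight into the mysql client against the appropriate database.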

As always, please post feedback and questions in the comments section of this post.





NoSQL and MySQL – free webinar, replay now available

Schema-free NoSQL Data

Update – the webinar replay is now available from here.

On Thursday, I’ll be presenting a webinar on NoSQL (of course with a MySQL twist!) – as always it’s free to attend but you need to register here in advance. Even if you can’t attend, it’s worth registering as you’ll be sent a link to the replay and the charts. The session will introduce the concepts and motivations behind the NoSQL movement and then go on to explain how you can get most of the same benefits with MySQL (including MySQL Cluster) while still getting the RDBMS benefits such as ACID transactions.

The official description:

The ever increasing performance demands of web-based services has generated significant interest in providing NoSQL access methods to MySQL – enabling users to maintain all of the advantages of their existing relational database infrastructure, while providing blazing fast performance for simple queries, using an API to complement regular SQL access to their data. This session looks at the existing NoSQL access methods for MySQL as well as the latest developments for both the InnoDB and MySQL Cluster storage engines. See how you can get the best of both worlds – persistence, consistency, rich queries, high availability, scalability and simple, flexible APIs and schemas for agile development.

When:

  • Thursday, March 29, 2012: 09:00 Pacific time (America)
  • Thu, Mar 29: 06:00 Hawaii time
  • Thu, Mar 29: 10:00 Mountain time (America)
  • Thu, Mar 29: 11:00 Central time (America)
  • Thu, Mar 29: 12:00 Eastern time (America)
  • Thu, Mar 29: 16:00 UTC
  • Thu, Mar 29: 17:00 Western European time
  • Thu, Mar 29: 18:00 Central European time
  • Thu, Mar 29: 19:00 Eastern European time




MySQL Cluster 7.2.5 available for download

The binary version for MySQL Cluster 7.2.5 has now been made available at http://www.mysql.com/downloads/cluster/ (GPL version) or https://support.oracle.com/ (commercial version).

A description of all of the changes (fixes) that have gone into MySQL Cluster 7.2.5 (compared to 7.2.4) will appear in the 7.2.5 Change log.





Getting the best performance out of MySQL Cluster – White Paper

There are a number of resources available to help get you up and running with MySQL Cluster so that you can start experimenting – including the quick start guides from the download page – but what happens when you get further down the line and need to get the best performance from your Cluster? One useful source is the Guide to Optimizing Performance of the MySQL Cluster Database, which covers how best to configure MySQL Cluster as well as what you should be doing with your application and schema to get the best performance.

Any feedback on this white paper would be greatly appreciated – whether it be a recommendation that you found gave good results, something that didn’t work, or a killer tip that we missed.