Database Administration

Friday, 16 November 2018

GTID implementation - Oracle vs MariaDB

Oracle MySQL has implemented GTID differently from MariaDB; this article walks through some of the key differences.
Before we look at the details, let’s look at the benefits of using global transaction ID:

· Easy to switch a slave server to replicate from a different replication master server
· Makes it simple to check the consistency of master and slaves servers.
· Ignore transactions already executed on slaves
· Simplifies complex replication topology
· Permits replicas to be pointed to masters at different level of hierarchy

GTID format

MariaDB	MySQL
A GTID is represented as 3 numbers separated with dashes (-). For example: 0-1-10. The first number is the domain ID, the second number is the server ID and the third number is transaction id.	A GTID is represented as a pair of coordinates, separated by a colon character (:), as shown below: source_id:transaction_id The source_id represents the originating server; the master’s server_uuid is normally used for this purpose.

Configuration

MariaDB	MySQL
Two servers are already running as master and slave, using old-style log position based replication
There is no special configuration needed of the server to start using global transaction ID. it can be switched to using GTID without any other actions needed: STOP SLAVE; CHANGE MASTER TO .., master_use_gtid=current_pos; START SLAVE;	Online procedure works with 5.7.6+, it goes through series of steps: · On each server, set enforce_gtid_consistency to WARN Let the servers run for a while, you must ensure that above step does not cause any warnings in MySQL error log. · On each server, set enforce_gtid_consistency to ON · On each server, set GTID_MODE to OFF_PERMISSIVE · Wait until all anonymous gets replicated: · SHOW STATUS LIKE 'ONGOING_ANONYMOUS_TRANSACTION_COUNT'; · On each server, execute: · SET @@GLOBAL.GTID_MODE = ON; · For more information, see online manual here
Crash-safe
It is crash-safe without any other actions needed. The state of the slave is crash-safe; slave keeps track of its current positions in the mysql.gtid_slave_pos system table. Updates to the state are done in the same transaction as the updates to the data.	The state of the slave is crash-safe with following settings: a) --relay-log-info-repository is set to TABLE so that slave can keep track of its current position in the mysql.slave_relay_log_info system table. Updates to the relay log info log table are committed together with the transactions. b) To ensure automatic relay log recovery immediately following server start-up, the `--relay-log-recovery` must be enabled

Errant transactions

Any writes applied directly to the replication slave is known as Errant transactions, such transaction/writes do not exists on other slaves connected to the same master. Table below lists the way they are handled in MariaDB vs MySQL;

MariaDB	MySQL
MariaDB may generate errors when the slave restarts due to GTIDs being present on the slave but absent from the master, for example: Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-102-786684, which is not in the master's binlog'	You may discover errant transactions when performing maintenance operation which verify replication topology and reports presence of errant transactions. However the procedure to fix errant requires injecting empty transaction(s). It is crucial to avoid errant transactions in MySQL, as it can prevent failover!

Skipping transactions

MariaDB	MySQL
`set global sql_slave_skip_counter = 1`
Still works and can be used moderately. However, it would not work when using parallel replication.	Not supported. If you need to skip transactions, use the value of the master's `gtid_executed` variable instead; see Injecting empty transactions, for more information.

Ignore transactions already executed

MariaDB	MySQL
The default behaviour is to allow such transaction more than once on the slave. However, when `gtid_stict_mode` is enabled , error is raised and replication stops e.g. Last_SQL_Error: An attempt was made to binlog GTID 0-101-786685 which would create an out-of-order sequen ce number with existing GTID 0-101-786685, and gtid strict mode is enabled.	Any attempt to execute a subsequent transaction with the same GTID is ignored by that server. No error is raised, and no statement in the transaction is executed.

Friday, 26 October 2018

Export to JSON from MySQL All Ready for MongoDB

This post walks through how to export data from MySQL® into JSON format, ready to ingest into MongoDB®. Starting from MySQL 5.7+, there is native support for JSON. MySQL provides functions that actually create JSON values, so I will be using these functions in this article to export to JSON from MySQL:

JSON_OBJECT
JSON_ARRAY

These functions make it easy to convert MySQL data to JSON e.g.

Shell

1

2

3

4

5

6

7

8

mysql> SELECT json_object('employee_id', emp_no, 'first_name', first_name ) AS 'JSON' FROM employees LIMIT 2;

+------------------------------------------------+

| JSON |

+------------------------------------------------+

| {"first_name": "Aamer", "employee_id": 444117} |

| {"first_name": "Aamer", "employee_id": 409151} |

+------------------------------------------------+

2 rows in set (0.00 sec)

In this article, I will be using the employees sample database available from here:
https://dev.mysql.com/doc/employee/en/employees-installation.html

The employees schema:

Employee schema from MySQL https://dev.mysql.com/doc/employee/en/images/employees-schema.png

When mapping relations with collections, generally there is no one to one mapping, you would want to merge data from some MySQL tables into a single collection.

Export data to JSON format

To export data, I have constructed the following SQL (the data is combined from 3 different tables: employees, salaries, and departments):

Shell

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

SELECT json_pretty(json_object(

'emp_no', emp.emp_no, 

'first_name', emp.first_name, 

'last_name', emp.last_name, 

'hire_date',

json_object("$date", DATE_FORMAT(emp.hire_date,'%Y-%m-%dT%TZ')),

'Department', JSON_ARRAY(json_object('dept_id', dept.dept_no, 'dept_name', dept.dept_name)), 

'Salary', s.salary)) AS 'json' 

FROM employees emp

INNER JOIN salaries s ON s.emp_no=emp.emp_no

INNER JOIN current_dept_emp c on c.emp_no = emp.emp_no

INNER JOIN departments dept on dept.dept_no = c.dept_no

LIMIT 1;

Output:

*************************** 1. row ***************************

json: {

"Salary": 60117,

"emp_no": 10001,

"hire_date": "1986-06-26",

"last_name": "Facello",

"Department": [

{

"dept_id": "d005",

"dept_name": "Development"

}

],

"first_name": "Georgi"

}

You can see from this that json_object did not convert ‘hire_date’ column value to be compatible with MongoDB. We have to convert date into ISODate format:

Shell

1

2

3

4

5

6

7

mysql> select json_object('hire_date', hire_date) AS "Original Date", json_object('hire_date', DATE_FORMAT(hire_date,'%Y-%m-%dT%TZ')) AS "ISODate" from employees limit 1;

+-----------------------------+---------------------------------------+

| Original Date | ISODate |

+-----------------------------+---------------------------------------+

| {"hire_date": "1985-01-01"} | {"hire_date": "1985-01-01T00:00:00Z"} |

+-----------------------------+---------------------------------------+

1 row in set (0.00 sec)

Next, we dump the output to a file (the above query is slightly modified) e.g.

Shell

1

2

3

4

5

6

7

8

9

10

11

12

SELECT json_object(

'emp_no', emp.emp_no,

'first_name', emp.first_name, 

'last_name', emp.last_name, 

'hire_date', json_object("$date", DATE_FORMAT(emp.hire_date,'%Y-%m-%dT%TZ')), 

'Department', JSON_ARRAY(json_object('dept_id', dept.dept_no, 'dept_name', dept.dept_name)),

'Salary', s.salary) as 'json' 

INTO OUTFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/employees.json' ## IMPORTANT you may want to adjust outfile path here

FROM employees emp

INNER JOIN salaries s ON s.emp_no=emp.emp_no

INNER JOIN current_dept_emp c on c.emp_no = emp.emp_no

INNER JOIN departments dept on dept.dept_no = c.dept_no

Importing data

To load the file employees.json into MongoDB, I use the mongoimport utility. It’s a multi-threaded tool that can load large files efficiently.

Shell

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

# mongoimport --db test --collection employees --drop < employees.json

2018-10-05T12:32:30.401+0100 connected to: localhost

2018-10-05T12:32:30.401+0100 dropping: test.employees

2018-10-05T12:32:33.400+0100 test.employees 34.0MB

2018-10-05T12:32:36.401+0100 test.employees 67.3MB

2018-10-05T12:32:39.399+0100 test.employees 100MB

2018-10-05T12:32:42.400+0100 test.employees 134MB

2018-10-05T12:32:45.401+0100 test.employees 168MB

2018-10-05T12:32:48.402+0100 test.employees 202MB

2018-10-05T12:32:51.402+0100 test.employees 235MB

2018-10-05T12:32:54.400+0100 test.employees 269MB

2018-10-05T12:32:57.400+0100 test.employees 303MB

2018-10-05T12:33:00.403+0100 test.employees 335MB

2018-10-05T12:33:03.404+0100 test.employees 368MB

2018-10-05T12:33:06.399+0100 test.employees 397MB

2018-10-05T12:33:09.400+0100 test.employees 430MB

2018-10-05T12:33:12.400+0100 test.employees 465MB

2018-10-05T12:33:15.403+0100 test.employees 499MB

2018-10-05T12:33:18.401+0100 test.employees 530MB

2018-10-05T12:33:18.589+0100 test.employees 533MB

2018-10-05T12:33:18.589+0100 imported 2844047 documents

Validate

Shell

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

> db.employees.find({}).pretty()

{

"_id" : ObjectId("5bb740cfd73e26bf45435181"),

"Salary" : 60117,

"emp_no" : 10001,

"hire_date" : ISODate("1986-06-26T00:00:00Z"),

"last_name" : "Facello",

"Department" : [

{

"dept_id" : "d005",

"dept_name" : "Development"

}

],

"first_name" : "Georgi"

}

We have successfully migrated some data from MySQL to MongoDB!

The content in this blog is provided in good faith by members of the open source community. The content is not edited or tested by Percona, and views expressed are the authors’ own. When using the advice from this or any other online resource test ideas before applying them to your production systems, and always secure a working back up.