Planet MySQL

  1. We had a really great session yesterday during the webinar on Amazon Redshift replication, but sadly ran out of time for questions. So, as promised, let's try to answer the questions now!

Are you going to support SAP IQ as a target?

This is the first request we've ever received, so there are no firm plans for supporting it. As a target, SAP IQ supports a JDBC interface, so it would not be hugely complicated to achieve compared to more complex, custom appliers.

Can you replicate from SAP ASE?

No, and there are no current plans.

What about the latin1 character set? Is it supported?

Yes, although the default mode is for us to use UTF-8 for all extract and apply jobs. You can certainly change the character set, although leaving it as UTF-8 should also work.

Is it a problem not to have primary keys on Redshift?

Yes and no. The replicator needs some way to identify a unique row within the target database so that we don't inadvertently corrupt the data. Because of the way the data is loaded, we also need enough information to run a suitable materialise process, and that relies on having primary key information to identify those rows. Obviously, primary keys provide us with the information we need to do that. In a future release, we will support the ability to add an explicit custom primary key to tables, even if the source/target table doesn't have explicit primary keys. This means that if you can use multiple columns to uniquely identify a row, we can use them to perform the materialisation.

What if someone runs a truncate on the source?

Currently, TRUNCATE is not supported, as it is a DDL operation that is normally filtered. The new DDL translation functionality gives us the ability to support this and to decide how it is treated, although TRUNCATE is not yet supported even there.
In a future release, we will process a TRUNCATE operation as we would any other DDL, and therefore choose whether it deletes or truncates the table, or creates a copy or archive version of the table.

Is it possible to filter out DML operations per table? For example, forbid DELETEs on a given table?

Yes. We have a filter called SkipEventByType, which allows you to select any table and decide whether operations apply to it. That means you can configure, by table and/or schema, whether you want to allow INSERT, UPDATE or DELETE operations.

Is replication multi-threaded?

Yes. The replicator already handles multiple threads internally (for example, the reading of the THL and the applying are two separate threads). You can also configure the applier to handle data with multiple threads, and this works with our 'sharding' system. This is not sharding of the data, but sharding of the THL stream within the replication process. To ensure we don't corrupt data that crosses multiple tables within transactions (the replicator is always transactionally consistent), the sharding is handled on a per-schema basis. So if you are replicating 10 schemas, you can configure 10 threads, and each thread will be handled and applied separately to the target. Also keep in mind that batch-based appliers like Redshift automatically handle multiple simultaneous targets and threads for each table, since each table is loaded individually into the target. If you are condensing multiple streams (i.e. multiple MySQL or Oracle sources), then each stream also has its own set of threads. However, even with multiple threads, it's worth remembering that both the MySQL binary log and the Oracle redo logs are sequential transaction logs, so we don't extract with multiple threads: there is only one source of transaction changes, and we only ever extract complete, committed transactions.

Hopefully this has answered all the questions (and provided some other information).
But if you have more, please feel free to ask.
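The primary-key-based materialise step described in the Q&A above can be illustrated with a small sketch. This is not Tungsten Replicator code, just a hypothetical Python/SQLite model of the general idea: a staged batch of row images is merged into the base table by deleting existing rows with matching primary keys and then inserting the new images, which is why the applier needs a reliable unique key.

```python
import sqlite3

# Hypothetical model of a batch "materialise" step: not the real
# Tungsten Replicator applier, just the delete-then-insert idea.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO base VALUES (?, ?)",
                 [(1, "old-1"), (2, "old-2")])

# A staged batch of the latest row images (id is the primary key).
staged = [(2, "new-2"), (3, "new-3")]

def materialise(conn, batch):
    # Delete any base rows the batch replaces, identified by primary key...
    conn.executemany("DELETE FROM base WHERE id = ?",
                     [(pk,) for pk, _ in batch])
    # ...then insert the latest row images in one pass.
    conn.executemany("INSERT INTO base VALUES (?, ?)", batch)
    conn.commit()

materialise(conn, staged)
rows = conn.execute("SELECT id, val FROM base ORDER BY id").fetchall()
print(rows)  # [(1, 'old-1'), (2, 'new-2'), (3, 'new-3')]
```

Without a primary key (or a user-supplied set of uniquely identifying columns), the DELETE step cannot be targeted safely, which is the corruption risk the answer refers to.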
  2. Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

The Percona Live Call For Papers closes on December 22, but why aren't you submitting already? Don't wait till the last minute! Look at our broad scope as well. Worth noting that the best prices for tickets are available from now until January 7.

FOSDEM is happening in Brussels, Belgium (as it has for the past many years). All I can say is that the schedule is out, and it was very hard to make a choice! See the schedule. In addition, don't forget to drop by the project table/stand for Percona — yes, we aim to showcase the ecosystem — we'll be located in Building H.

ACMUG wrapup

Over the weekend, I was in Beijing, China for the annual ACMUG gathering, now expanded to two whole days. Special thanks go out to Rikki and Yanwei for helping get me there, and to Lixun Peng for being my simultaneous translator. Most of the talks were in Mandarin, but we did have a bit of a foreigner contingent (from Oracle, Manyi Lu and Luis Soares; from Facebook, Yoshinori Matsunobu and Junyi Luke Lu; and me from Percona), so there were also talks in English. It's clear that there's a lot of mixed use these days — not just MySQL but also MongoDB and Hadoop. Another thing that folk are looking into? PingCap's TiDB. ScyllaDB has also been talked about, and I guess the other important thing from a RocksDB standpoint is that it's also being utilised as part of PolarDB.

One other major takeaway — learning from Chinese DBAs means you learn at scale. For them, a tiny instance may have 3 million users – that's the whole population of some countries. More has to be done to get exposed to their ideas and how they are solving big issues; expect to have some kind of track at Percona Live!
Releases

Percona Server for MySQL 5.6.38-83.0, with new features including InnoDB page fragmentation counters, support for multiple page asynchronous I/O requests (ported from the Facebook MySQL patch), and more.

Percona Server for MySQL 5.7.20-18, with improvements for InnoDB full-text search stopwords, as well as the features in the above 5.6 release.

Apache Hadoop v3.0.0 is out with multiple new features. I especially like the powered-by page, which I feel we need for Percona Server for MySQL/MongoDB and more in the ecosystem.

sysbench 1.0.11 released, with a change to make the PostgreSQL driver compatible with CockroachDB.

Link List

The RedMonk re:Invent 2017 Recap – love this chat-style format; also a lot of important insights from the RedMonk analysts.

Towards Bigger Small Data – great to see Yoshinori write a blog post again after a hiatus. Important read.

The Case for Learned Index Structures – paper making the rounds, on how machine-learned indexes can replace B-Trees or hash indexes. Requisite Twitter conversation.

Upcoming appearances

FOSDEM 2018 – Brussels, Belgium – February 3-4, 2018
SCALE16x – Pasadena, California, USA – March 8-11, 2018

Feedback

I look forward to feedback/tips via e-mail or on Twitter @bytebot.
  3. Tarantool, a Lua application server plus NoSQL DBMS, is now an SQL DBMS too, in alpha version 1.8. I was interested in how the combination "Lua + NoSQL + SQL" works. Disclaimer: I do paid work related to Tarantool, but it has nothing to do with this blog.

First, let's verify that it's really SQL. The illustrations are all unretouched screenshots from ocelotgui for Windows, connected to a Tarantool 1.8 server on Linux, which I built from a source download on GitHub.

Example of SELECT

Yes, the "history" below the query window shows successful CREATE and INSERT statements, and the "result set" at the bottom shows a successful SELECT statement's output. A close look at the clauses shows that there's support for constraints, foreign keys, triggers, and so on. In all, it's a reasonable subset of the SQL standard, pretty well the same as what I found for SQLite in an earlier post. That's not surprising, because Tarantool started with SQLite's parser; however, the storage layer is Tarantool's NoSQL.

Combine Lua with SQL

Now it's time for the first "combination". I want to store and retrieve pictures, which are in .png and .jpg files. With MySQL/MariaDB I'd use load_file(), but that's a built-in function that Tarantool doesn't have. Am I daunted? No, because I can write such a function in Lua -- or actually I can copy such a function, because it's one of the examples in the Tarantool manual.

-- Lua function to set a variable to a file's contents
function load_file(file_name)
  local fio = require('fio')
  local errno = require('errno')
  local f =, {'O_RDONLY'})
  if not f then
    error("Failed to open file: " .. errno.strerror())
  local data = f:read(1000000)
  return data

Not a difficult matter. As is clear from the display, the function is syntactically okay (there would be squiggly red lines under the error if my Lua was bad). To explain the function: it says "read a file and return either an error message or the whole contents".
I can't call Lua functions directly from SQL yet, but I can do so from NoSQL, and with NoSQL I can INSERT into the same table that I created with SQL...{1, load_file('shot1.jpg'), 'shot1'}

And then go back to SQL to handle the retrieval...

SELECT * FROM "timages";

and the .jpg image is at the bottom of the screen. So, although they're not standard SQL/PSM or Oracle-like PL/SQL, Tarantool does have server-side stored procedures.

Combine NoSQL with SQL

Now it's time for the second "combination". I want to read some NoSQL data that was not produced or populated with SQL in mind. Specifically, we've got a variable number of fields, some of which are unnamed, and they're not all scalar: there are arrays and structures. A typical tuple related to index metadata looks like this:

- [312, 1, 'owner', 'tree', {'unique': false}, [[0, 'unsigned']]]
- [312, 2, 'object', 'tree', {'unique': false}, [[2, 'string'], [3, 'unsigned']]]
- [313, 0, 'primary', 'tree', {'unique': true}, [[1, 'unsigned'], [2, 'string'], [3, 'unsigned']]]
- [313, 1, 'owner', 'tree', {'unique': false}, [[0, 'unsigned']]]
- [313, 2, 'object', 'tree', {'unique': false}, [[2, 'string'], [3, 'unsigned']]]
- [320, 0, 'primary', 'tree', {'unique': true}, [[0, 'unsigned']]]
- [320, 1, 'uuid', 'tree', {'unique': true}, [[1, 'string']]]

For this I wrote some C code in the client instead of on the server, but I don't think that's cheating -- it doesn't show a Tarantool feature, but it does show that one can transfer the data into an SQL table and go from there. The syntax added to the client looks like this:

CREATE SERVER id FOREIGN DATA WRAPPER ocelot_tarantool OPTIONS (PORT 3301, HOST 'localhost', USER 'guest');
CREATE TABLE convertee SERVER id LUA 'return';
SELECT * FROM convertee;

The selection after converting looks like this: (I'm changing the fonts and the window order now to make relevant parts look bigger.)
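The client-side conversion can be mimicked in a few lines. The sketch below is not the ocelotgui C code; it is a hypothetical Python version of the same idea, using the index-metadata tuples quoted above: each nested tuple is unnested into flat rows (one row per key part, with the scalar columns repeated) that an ordinary SQL table can hold.

```python
import sqlite3

# The index-metadata tuples from the text: [space_id, index_id, name,
# type, options, parts], where parts is an array of [field_no, field_type].
tuples = [
    [312, 1, 'owner', 'tree', {'unique': False}, [[0, 'unsigned']]],
    [313, 0, 'primary', 'tree', {'unique': True},
     [[1, 'unsigned'], [2, 'string'], [3, 'unsigned']]],
    [320, 1, 'uuid', 'tree', {'unique': True}, [[1, 'string']]],
]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE converted (
    space_id INT, index_id INT, name TEXT, type TEXT,
    is_unique INT, part_no INT, field_no INT, field_type TEXT)""")

# Unnest: one output row per key part, with scalar columns repeated.
for space_id, index_id, name, typ, opts, parts in tuples:
    for part_no, (field_no, field_type) in enumerate(parts):
        conn.execute("INSERT INTO converted VALUES (?,?,?,?,?,?,?,?)",
                     (space_id, index_id, name, typ,
                      int(opts['unique']), part_no, field_no, field_type))

rows = conn.execute(
    "SELECT name, part_no, field_type FROM converted "
    "WHERE space_id = 313 ORDER BY part_no").fetchall()
print(rows)
# [('primary', 0, 'unsigned'), ('primary', 1, 'string'), ('primary', 2, 'unsigned')]
```

Once the data is flat like this, plain SQL queries work on it, which is the point being made: schemaless data doesn't require a new query language.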
I wish I could call this "flattening", but that term has been hijacked for other concepts. Anyway, whatever it can be called, it's the reason that schemaless data doesn't need a new query language.

Things You Don't Always See Elsewhere

I'll mention a few things that are in Tarantool SQL that are not in MySQL/MariaDB, or are fairly new in MySQL/MariaDB. This short list does not mean Tarantool is "ahead"; I'm talking about an alpha where many things are to-be-announced. I like to look at what's new and different.

COLLATE

The point isn't that there's a COLLATE clause; the point is that the collation is ICU. I've talked about the advantages of ICU earlier. The collation name will change soon, probably to 'unicode_s1' or 'unicode_ci'.

WITH

A non-recursive WITH clause is the same thing as a temporary view that lasts only for the statement that encloses it. A recursive WITH clause is a temporary view that is populated by taking a seed (the query to the left of the UNION) and adding to it with a populator (the query to the right of the UNION), repeatedly, until some condition fails. I find it hard to understand; I suppose my problem is that this is procedural (a loop), and for procedural problems I prefer C or Lua or SQL/PSM.

EXCEPT and INTERSECT

SELECT * FROM "t" EXCEPT SELECT * FROM f;

These operators can fit in the same place as UNION, but have different effects. In the example, the EXCEPT would mean "take out the rows that match" instead of "add rows that do not match".

NaN and Inf

This is a differentiator, since in standard SQL and in some implementations these values are not supported, even though they're supported in IEEE 754.

CHECK

A constraint's effect is: "if the condition inside the parentheses is true or unknown, then and only then is it legal to have this row." This feature is also in MariaDB 10.2.1.

INDEXED BY

This is non-standard (and obviously always will be): you can force Tarantool to use a particular index, or no index at all, bypassing the optimizer.
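Because Tarantool's SQL began with SQLite's parser, SQLite (here via Python's sqlite3 module) is a convenient way to try the WITH RECURSIVE and EXCEPT behaviour described above. The table names and data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recursive WITH: a seed (left of UNION) plus a populator (right of
# UNION), repeated until the populator stops producing rows.
total = conn.execute("""
    WITH RECURSIVE seq(n) AS (
        SELECT 1                              -- seed
        UNION ALL
        SELECT n + 1 FROM seq WHERE n < 5     -- populator
    )
    SELECT SUM(n) FROM seq""").fetchone()[0]
print(total)  # 15

# EXCEPT: "take out the rows that match" rather than UNION's
# "add rows that do not match".
conn.execute("CREATE TABLE t (c INT)")
conn.execute("CREATE TABLE f (c INT)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO f VALUES (?)", [(2,)])
rows = conn.execute(
    "SELECT * FROM t EXCEPT SELECT * FROM f ORDER BY c").fetchall()
print(rows)  # [(1,), (3,)]
```

The populator here is just a counter, but the same shape handles hierarchies such as employee/manager trees, which is the usual motivation for recursive WITH.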
VALUES

This means: return a result set containing a row with three columns containing 1, 2, 3. MySQL and MariaDB have a non-standard way to do this: SELECT 1,2,3; I like the logic of VALUES, and the fact that I can say VALUES (1,2,3),(4,5,6); but Tarantool also supports the MySQL/MariaDB way, and I expect that it will always be more popular.

Game Changer?

Tarantool's aiming high, and Tarantool's owner has a $9-billion market cap, so the resources are there, but I'm not sure that Oracle sees them on its threat radar just yet. Tarantool SQL is not a drop-in replacement for all the code you've written for Oracle or MySQL/MariaDB, and the distinguishing features that I've mentioned are only going to cause a few people to migrate, at least in 2018. Other people will use Tarantool as an "add-on" or "engine". I do see that this is close enough to SQLite that it will probably be reasonable to switch from there, if people need the extra multi-user / replication capabilities and the Lua integration and the speed (the main engine is an in-memory DBMS). More dimly, I see some other NoSQL DBMSs looking bad compared to Tarantool because their SQL support is trivial by comparison. I'm thinking especially of the ones that already get beaten by Tarantool in YCSB tests of NoSQL performance. Tarantool's licence is BSD.

Alphas Are Not Betas

Everything I've described above might change before Tarantool SQL is ready for use in production. If you want to try to replicate the examples exactly, start with the old version-1.7 manual, then move up to the SQL tutorial for version-1.8. The ocelotgui client's additional instructions for connecting to Tarantool are here.
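The bare VALUES syntax discussed in this post can also be tried in SQLite (again via Python's sqlite3, and again because Tarantool's SQL derives from SQLite's parser), which accepts a standalone multi-row VALUES statement alongside the non-standard SELECT form:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A bare VALUES statement returns a result set directly.
rows = conn.execute("VALUES (1,2,3), (4,5,6)").fetchall()
print(rows)  # [(1, 2, 3), (4, 5, 6)]

# The non-standard MySQL/MariaDB equivalent for a single row.
row = conn.execute("SELECT 1,2,3").fetchone()
print(row)  # (1, 2, 3)
```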
  4. Percona announces the GA release of Percona Server for MySQL 5.7.20-18 on December 14, 2017. Download the latest version from the Percona web site or the Percona Software Repositories. You can also run Docker containers from the images in the Docker Hub repository.

Based on MySQL 5.7.20, and including all the bug fixes in it, Percona Server for MySQL 5.7.20-18 is the current GA release in the Percona Server for MySQL 5.7 series. Percona provides completely open-source and free software. Find release details in the 5.7.20-18 milestone at Launchpad.

New Features:

Percona Server packages are now available for Ubuntu 17.10 (Artful).

As part of InnoDB Full-Text Search improvements, a new innodb_ft_ignore_stopwords variable has been implemented, which controls whether InnoDB Full-Text Search should ignore the stopword list when building/updating an FTS index. This feature also fixes bug #1679135 (upstream #84420).

Percona Server has implemented InnoDB Page Fragmentation Counters.

Percona Server has implemented support for multiple page asynchronous I/O requests. This feature was ported from a Facebook MySQL patch.

Percona Server has implemented TokuDB integration with PERFORMANCE_SCHEMA.

As part of Data at Rest Encryption, Percona Server has implemented support for InnoDB general tablespace encryption and the Keyring Vault plugin. This feature is considered BETA quality.

Bugs Fixed:

Percona Server 5.7 Docker images did not include TokuDB. Bugs fixed #1682419 and #1699241.

If an I/O syscall returned an error during server shutdown with Thread Pool enabled, a mutex could be left locked. Bug fixed #1702330 (Daniel Black).

The dynamic row format feature to support BLOB/VARCHAR in MEMORY tables requires all key columns to come before any BLOB columns. This requirement, however, was not enforced, allowing the creation of MEMORY tables in unsupported column configurations, which then crashed or lost data in usage. Bug fixed #1731483.
After fixing bug #1668602, bug #1539504, and bug #1313901, CREATE/DROP TEMPORARY TABLE statements were incorrectly forbidden in transactional contexts, including function and trigger calls, even when they required no binary logging at all. Bug fixed #1711781.

Running ANALYZE TABLE while a long-running query is accessing the same table in parallel could lead to a situation where new queries on the same table are blocked in a Waiting for table flush state. Fixed by stopping ANALYZE TABLE from flushing affected InnoDB and TokuDB tables from the table definition cache. Bug fixed #1704195 (upstream #87065).

CREATE TABLE ... LIKE ... did not use the source row_format on the target TokuDB table. Bug fixed #76.

TokuDB would encode an already encoded database name for a directory name. Bug fixed #74.

Other bugs fixed: #1720810, #83, #80, and #75.

MyRocks Changes:

RocksDB has implemented a FlushWAL API which improves the performance of MySQL two-phase commit during the binary log group commit flush stage. This feature adds support for using the FlushWAL API in MyRocks and also matches rocksdb_flush_log_at_trx_commit behavior with innodb_flush_log_at_trx_commit. To implement this feature, the rocksdb_manual_wal_flush and rocksdb_concurrent_prepare variables have been implemented.

A new rocksdb_force_compute_memtable_stats_cachetime variable has been implemented that can be used to specify how long the cached value of memtable statistics should be used instead of computing it every time during query plan analysis.

A new rocksdb_large_prefix variable has been implemented which, when enabled, allows index key prefixes longer than 767 bytes (up to 3072 bytes). This option mirrors innodb_large_prefix. The value of this variable should be the same between master and slave.

A new rocksdb_max_background_jobs variable has been implemented to replace the rocksdb_base_background_compactions, rocksdb_max_background_compactions, and rocksdb_max_background_flushes variables.
This variable specifies the maximum number of background jobs; RocksDB automatically decides how many threads to allocate towards flush/compaction. It was implemented to reduce the number of (confusing) options users can tweak, and to push the responsibility down to the RocksDB level.

A new rocksdb_sim_cache_size variable has been implemented to enable the simulated cache. This can be used to figure out the hit/miss rate with a specific cache size without changing the real block cache.

Input can now be sorted by the primary key during bulk load by enabling the rocksdb_bulk_load_allow_unsorted variable.

A new rocksdb_ignore_unknown_options variable has been implemented which, when enabled (the default), allows RocksDB to receive unknown options and not exit.

The release notes for Percona Server for MySQL 5.7.20-18 are available in the online documentation. Please report any bugs on the launchpad bug tracker.
  5. Fill out a quick survey on your cloud data plans.

Today, keeping your enterprise agile and flexible is not just an advantage, it is a requirement. Many of the systems and processes once controlled by businesses onsite are moving offsite to "service" models. This includes Platform as a Service (PaaS), Software as a Service (SaaS), Infrastructure as a Service (IaaS), Database as a Service (DBaaS), etc. These services are usually referred to as being in the cloud. The enterprise using the service doesn't maintain or manage the infrastructure of the service in question.

Migrating database workloads to the cloud can be a vital part of improving customer experience, gaining deeper business insights and increasing efficiency. More enterprises are choosing to move data to the cloud in order to make scaling easy, offload resource overhead or control expenses.

Are you looking at moving your database to the cloud in the next year? What's your reason? Do you know where you want to go? If not, what are your reasons and concerns?

Percona wants to know about your cloud data plans. Help the community by filling out this quick survey on when, why and how you'd like to migrate data to the cloud. Add any thoughts or other options in the comments section. We'll post a follow-up blog with the results!

Fill out the survey now.