Sunday, March 30, 2014

Fun with Bugs #31 - what's new in MySQL 5.6.17

MySQL 5.6.17 will probably be announced loudly at or immediately before Percona Live MySQL Conference & Expo next week. But official release announcement via email was made on March 28, release notes and binaries to download are already available, so why not to check them carefully to find out what to expect from this 8th minor release of MySQL 5.6 GA...

First of all, it seems Oracle still does not hesitate to introduce new features and behavior in the process. Just check these major changes:
  • Starting with 5.6.17, MySQL now supports rebuilding regular and partitioned InnoDB tables using online DDL (ALGORITHM=INPLACE) for OPTIMIZE TABLES, ALTER TABLE ... FORCE and ALTER TABLE ... ENGINE=InnoDB. Rebuilding tables (to defragment them and free some disk space mostly) is probably as popular kind of ALTER as adding new columns or changing indexes, so it's great to see a built-in way to do these ALTERs without blocking DML (well, assuming the table does not have foreign keys at least). Previously one had to rely on third party tools like pt-online-schema-change for these cases.
  • "The AES_ENCRYPT() and AES_DECRYPT() functions now permit control of the block encryption mode and take an optional initialization vector argument ... These changes make statements that use AES_ENCRYPT() or AES_DECRYPT() unsafe for statement-based replication and they cannot be stored in the query cache." So, we should expect more warnings now for STATEMENT-based replication setups, but has more secure implementation now. Note that having the details of call to these functions visible in the logs make them less secure, as pointed out in recent Bug #72158.
Oracle also continues to push deprecation of features actively:
  • "The deprecated ERROR_FOR_DIVISION_BY_ZERO, NO_ZERO_DATE, and NO_ZERO_IN_DATE SQL modes now do nothing. Instead, their previous effects are included in the effects of strict SQL mode (STRICT_ALL_TABLES or STRICT_TRANS_TABLES)." - it seems it's time to check all the code and remove these modes, as they may just disappear in 5.6 even before MySQL 5.7 GA with this kind of activity.
  • "The IGNORE clause for ALTER TABLE is now deprecated and will be removed in a future version of MySQL." - as this deprecation happened in 5.6.x minor release probably the clause will be removed well before 5.8. So it's not like your code is safe for next 2+ years.
Speaking about fixes for public bugs (these I mostly care about), I consider the following as most important:
  • Bug #71197. "mysql slave sql thread memory leak". I've pointed out this bug this week (and not for the first time) as a possible explanation for the OOM conditions in big replication cluster. It's really nice to see it fixed!
  • Bug #70669. "Slave can't continue replication after master's crash recovery", even with sync_binlog=1. This is now fixed.
  • One more fix in replication: Bug #68429 - "When running the slave with --slave-parallel-workers at 1 or greater, setting --slave-skip-errors=all caused the error log to be filled with with instances of the warning Slave SQL: Could not execute Query event. Detailed error: ;, Error_code: 0."
  • Bug #69680. "Auto_inc value not properly generated with RBR and auto_inc column only on slave". It was reported by my colleague Ovais Tariq long time ago and is now fixed. 
  • Bug #71436. "Compressed Tables Regression from 5.6.13". Finally this performance regression when large InnoDB compressed tables are used is fixed. It was reported by my colleague Jervin Real.
  • Bug #70430. "Race condition between purge coordinator and InnoDB monitor crash on shutdown". It was reported by Laurynas Biveinis from Percona.
  • Bug #70063. "Create table is slower in 5.6". It was reported by my former colleague Arnaud Adant from Oracle and the problem was tracked down to inserts into the mysql.innodb_index_stats table (one of two to store InnoDB persistent statistics). Maybe this fix will help to run CREATE or ALTER for partitioned tables faster as well - the problem reported recently as a regression comparing to MySQL 5.5 as Bug #72115 - "Very slow create/alter table with partitions and InnoDB".
  • Bug #65225. "InnoDB miscalculates auto-increment after changing auto_increment_increment" - this one was reported by Elena Stepanova almost 2 years ago and was considered a regression comparing to... MySQL 5.0. Nice to see that we have one less case when old MySQL 5.0 can be considered "better" that recent greatest MySQL 5.6 :)
There are more important fixed in replication and InnoDB categories. Fixes in optimizer are less visible in release notes, but these are really important optimizer bugs that are fixed finally:
  • Bug #70657. "SELECT DISTINCT...GROUP BY returns wrong results in some cases".
  • Bug #69969. "Failing assertion: prebuilt->trx->conc_state == 1 from subselect". It was reported by my colleague Raghavendra Prabhu.
  • Bug #71244. "Wrong result computation using ALL() and GROUP BY".
The last but not the least, for a second release in a row Oracle engineers had fixed something I've reported (even though it's minor enough problem):
  • Bug #71554. Release notes says: " innodb_ft_result_cache_limit now has a hardcoded maximum value of 4294967295 bytes or (2**32 -1). The maximum value was previously defined as the maximum value of ulong." One of those cases when fixing the server code happens as a result of a simple documentation request.
To summarize, MySQL 5.6.17 seems to be a very good release for those who use new InnoDB and replication features in MySQL 5.6. It fixed several regressions and (assuming no new regression bugs introduced) should be implemented everywhere as soon as possible.

Sunday, March 9, 2014

What's wrong with MySQL Manual

I think that MySQL Manual is one of the reasons why MySQL became and still remains popular. I find a lot of useful information (at least references to share with customers, if not real insights) there even after 9+ years of working with MySQL every day and with all numerous articles and blog posts on most important topics available now. I still have MySQL manual page open at every browser instance on every laptop I use on a regular basis.

It's simply great, well indexed by Google and has meaningful human-readable URLs, so one can even guess them for the topics he need. I have http://dev.mysql.com/doc/refman/5.6/en/merge-storage-engine.html open right now and looking at it I clearly understand without any search that if I need a reference for SELECT syntax in 5.5 I have just replace "5.6" with "5.5" and "merge-storage-engine" with "select" to get there. It works that way and many users even had created some nice marcos or shortcuts  for their browser to end up at the proper manual page after typing just the topic they are interested it.

These being said, today I'd like to write about problems with MySQL manual that I care about. Some of them are not new and are hard to fix/avoid, others I consider recent enough. Today I'd like to discuss 4 of them:
  1. MySQL Manual does NOT have enough careful readers who are ready to report problems or missing details.
  2. Old documentation bug reports seem to be ignored without any obvious good reason.
  3. Some documentation pages are useless with their current content.
  4. "How to" kind of official MySQL documentation is missing, and this seems intentional.
The most important problem for MySQL Manual is the fact that it seems it has not enough careful readers who are interested in making it better. Great MySQL Documentation team works hard on improving documentation, but even now, in Oracle, with all the funds theoretically available and almost unlimited resources in general, it seems the team is hardly can keep up with documenting new features and releases in time, so they do not have either time or free pairs of eyes to do real "QA" for the manual other than proofreading maybe.

Also, it's easy to understand that authors who had written the text are not the best people to read it later with a hope to find something useful or some mistake there. They deal with their "ideas" and may have too many things assumed in mind comparing to average user/reader of the manual. They are also very experienced MySQL users and even developers in the past. I had written a book myself long time ago and translated dozens of technical books into Russian, so I know the feeling - for you, as a writer, your text is clear, useful and may be just great. While it is NOT, especially for readers with different technical or cultural background.

As a result we have manual pages that are less than clear and has a lot of useful things missing or improperly documented. When I started to read the manual carefully some time ago (after I stopped writing about MySQL bugs at Facebook I needed some other thing to do to keep my mind in peace) I had found out that I can easily report a bug (if not 5 or more) for any page I open. I see missing details or wrong statements everywhere. I had even "invented" separate "missing manual" tag to classify some of the documentation bugs I've reported. It's so easy to find these problems and report them as bugs in "Server: Documentation" category...

So, MySQL manual has bugs, but not so many it seems (I see 86 active ones at the moment). This is NOT because it's ideal, but more because too few people care to read the manual carefully and report anything they consider wrong or missing. Actually, besides MySQL Support engineers and few other Oracle employees, few of my colleagues at Percona and myself, I know only 3 Community members who report documentation bugs recently:
  • Daniel van Eeden. He is a well known Community bug reporter and used to be great Oracle customer (at least this is how I remember him from the days when I worked in Oracle MySQL Support). Out of his 78 active bug reports I see 11 for MySQL Server manual, but he also reports bugs in documentation for Oracle's "commercial" MySQL software, like MySQL Enterprise Backup.
  • Peter Laursen. He is also a well known and productive MySQL bugs reporter for a decade probably. Out of his 83 active bug reports 5 are related to MySQL Manual. They are not very recent though.
  • Federico Razzoli. He had started to report himself only recently because of privacy-related issues with information requested by Oracle to obtain SSO account for bugs reporting. Before that he asked me to report bugs sometimes, and I did. So, now of his 5 active bugs 2 are related to MySQL Manual.
So, just few reporters who care and, at the same time, problems and deficiencies easy to find everywhere. This is a main problem, IMHO, and the only way to get a hope to solve it is to start reading MySQL Manual and reporting bugs/problems noted! Surely, it's just the first step, somebody has to write/provide real missing content or fixes. If you can suggest some better text, please, do it. If you do not want to wait there are probably alternative ways: you have to document MySQL related things elsewhere. For example, this is what Federico did when he noted that no way to deal with materialized views is documented in usual sources of MySQL-related information. I plan to discuss alternatives to MySQL Manual later. Let's concentrate on what to do to make it better for now.

Other important problem of the manual related to bugs is the fact that some of the bugs reported are NOT fixed in time. Let me share my favorite example here, my Bug #68097. It was reported almost 14 months ago and asked to actually properly document PERFORMANCE_SCHEMA behavior that was intended and explained long time ago - the only way to enable instrumentation for (InnoDB) rwlocks and mutexes is upon server startup. If these instruments where not enabled at startup they are just not available, otherwise you can disable them and enable them back dynamically. I fail to understand what's so hard to write exactly this, as we can find statements about this from Oracle engineers in many places.

As a result, MySQL Manual stays misleading for years! By the way, the oldest documentation bug I see in "Verified" status, Bug #30538, remains in this state without further comments since October 19, 2010. Based on recent comments it may be even somehow fixed, but nobody cares to change status then. And this is for a bug reported back in 2007 by famous customer (probably Google at that time), Mark Callaghan. Obviously we see not enough care about older documentation requests. My recent ones are usually fixed fast though.

Some pages of MySQL Manual exist for years, but their content is almost useless without looking into the code or checking with developers. Great examples can be found in documents devoted to NDB Cluster internals, for example, this page (it's one of many): https://dev.mysql.com/doc/ndbapi/en/ndb-internals-dump-command-2401.html. Nobody even tried to explain the output here, interpretation of the fields, use cases for this DUMP command, nothing. Just few words and example of output that I can probably get myself. How this page is useful to anybody?

Unfortunately, the problem is NOT limited to NDB Cluster internals (that I am happy to honestly not care about at all at the moment). Same happens even to InnoDB. Check http://dev.mysql.com/doc/refman/5.6/en/innodb-monitors.html and tell me where it explains the details about any lock waits related information you can find out in TRANSACTIONS section of INNODB STATUS output. I see only this text:
"If this section reports lock waits, your applications might have lock contention. The output can also help to trace the reasons for transaction deadlocks."
That's all! Really. It does not even explain how "next-key lock" looks in the output vs "record lock"... As a result, one can not explain the way to troubleshoot even simple deadlock based on the manual or, if she tries, she can NOT prove the points with anything but experiments and quotes from the source code. And all that happens when InnoDB lock model does NOT change for decade(s) and source code seems commented in details.

The fourth problem, the lack of official "How to" or "FAQ" documents, is closely related to the above. Not only MySQL Manual sometimes misses a lot of details, you can not get them from http://dev.mysql.com/doc/ (the place to look for official MySQL guides) at all! How to read the output of SHOW ENGINE INNODB STATUS? Now to use PERFORMANCE_SCHEMA to troubleshoot different types of performance problems? You can find answers (of different quality, level of details and age) in great blog posts, books and presentations, maybe in Oracle's knowledge base articles (if you are lucky to be their customer and good in search there) but NOT in official MySQL documentation.

I do not consider this normal or acceptable, especially for the open source software. Check FreeBSD Documentation Project, for example, or Linux Documentation Project to get some ideas of how other open source software approaches the problem.

Surely, adding a collection of "How to" guides probably requires input from Community, but why at least not to try to collect links to all the good existing documents of this kind somewhere at http://dev.mysql.com? Having third party documents of this kind with "officially approved" status after a review by Oracle's MySQL Documentation and/or Support teams would help a lot, I think.

That's all I can complain about at the moment. One day I'll try to continue...

Sunday, March 2, 2014

Fun with Bugs #30 - quick review of my reports in February, 2014

I've got only one comment to my previous post about deadlock, and it was more like a hint based on a different use case, not a real explanation. So far there is nobody who wants to get free beer... Maybe this is even good, as I do not go to the conference and BOF I've submitted will be supervised by my colleague Przemysław Malkowski. But you still have entire month till the conference to get a chance for a beer from him (we'll arrange this somehow).

In the meantime I'd like to review bug reports for MySQL server (few) and fine manual (many) that I've submitted in February, 2014. 22 in total, one was just plain wrong and I closed it as "Not a bug" almost immediately. So, 21 to consider.

Let's start with serious problem. Based on our previous discussion on responsible bug reporting I had not shared the test case with the entire world immediately, but you should know that you probably can crash MySQL 5.5, 5.6 and 5.7 (at least on Windows, but I can not exclude older 5.5 on Linux also) with a single SELECT statement. This was reported as Bug #71858 (security bug from the very beginning, so you will not see it directly).

While working on a real customer problem I had to report a bug related to replication, Bug #71732, "Garbage value in output when MASTER_LOG_FILE='' is set". Having this kind of messages in the error log does not really help to find out fast what was the reason of the problem.





I've considered Bug #71818, "Type column in SHOW PERFORMANCE_SCHEMA STATUS is redundant" annoying at least, but it ended up as "Not a bug" for a reason that SHOW ENGINE INNODB STATUS also has "Type" column for its single row of results, so it's by design, "The column might not be very useful, but it works as intended".

I've spent some time trying to build recent MySQL 5.5+ on Linux using Solaris Studio 12.3 compiler, and mostly failed. In the meantime Bug #71847 was reported, "Predefined compiler flags for SunPro/Solaris Studio 12.x are missing in cmake". Using section from Solaris OS part helped a bit in the process.

I was annoying enough to complain about warnings two times. See  Bug #71640, "Somewhat misleading deprecation warning for innodb_lock_monitor" and Bug #71817, "Warnings text is somewhat misleading (mentions max_open_files and table_cache)" for the details.

Surely I don't forget PERFORMANCE_SCHEMA and added a feature request, Bug #71755, "Provide per partition summary information in PERFORMANCE_SCHEMA".

The rest were actually documentation requests. I plan to write a long post about MySQL documentation in general eventually, so let me point out only bug reports I've created while working on explanation for that deadlock:
  • Bug #71637 - "Manual mentions IS_GAP and IX_GAP locks, but never explains them"
  • Bug #71638 - "Manual does NOT explain "insert intention" lock mode properly"
  • Bug #71735 - "Manual does not explain locks set by SELECT ... FOR UPDATE properly"
  • Bug #71736 - "Manual does not explain locks set by UPDATE properly"
They are all "Verified" for now, so it seems my concerns are valid.

February was a really hot and sad month here in Ukraine, so I was not productive enough with MySQL bugs reporting. Sorry.