Friday, April 20, 2018

Fun with Bugs #67 - On Some Public Bugs Fixed in MySQL 8.0.11 GA

I stopped reviewing MySQL Release Notes for quite a some time, but major GA releases of MySQL do not happen often, so I decided to make an exception and write about some bugs from Community users fixed in MySQL 8.0.11 GA.

I'll start with good news about MySQL 8.0.11 GA! You can get sources at GitHub, and I had no problems to build on Fedora 27 on my good old QuadCore box, using the following cmake command line:
[openxs@fc23 mysql-server]$ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_CONFIG=mysql_release -DFEATURE_SET=community -DWITH_EMBEDDED_SERVER=OFF -DDOWNLOAD_BOOST=1 -DWITH_BOOST=/home/openxs/boost -DENABLE_DOWNLOADS=1 -DWITH_UNIT_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/8.0
[openxs@fc23 mysql-server]$ time make -j 4

[100%] Built target mysqld

real    33m52.791s
user    105m47.475s
sys     8m19.018s
Comparing to previous experience, I had minor problem with unit tests, so just skipped them with -DWITH_UNIT_TESTS=OFF option. There is no problem to run the resulting binaries, unless you try to use data directory from older 8.0.x. Then you'll end up with:
2018-04-19T15:36:35.165841Z 1 [ERROR] [MY-011092] [Server] Upgrading the data dictionary from dictionary version '80004' is not supported.
2018-04-19T15:36:35.166239Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2018-04-19T15:36:35.166310Z 0 [ERROR] [MY-010119] [Server] Aborting
I had to remove data directory and initialize it from scratch (it was testing instance anyway, last time used for real while I worked on this presentation):
[openxs@fc23 8.0]$ rm -rf data/*
[openxs@fc23 8.0]$ bin/mysqld --no-defaults --initialize-insecure --port=3308 --socket=/tmp/mysql.sock --basedir=/home/openxs/dbs/8.0 --datadir=/home/openxs/dbs/8.0/data --skip-log-bin
2018-04-19T15:43:55.324606Z 0 [Warning] [MY-010139] [Server] Changed limits: max_open_files: 1024 (requested 8161)
2018-04-19T15:43:55.324726Z 0 [Warning] [MY-010142] [Server] Changed limits: table_open_cache: 431 (requested 4000)
2018-04-19T15:43:55.325147Z 0 [System] [MY-013169] [Server] /home/openxs/dbs/8.0/bin/mysqld (mysqld 8.0.11) initializing of server in progress as process 20034
2018-04-19T15:44:14.438776Z 4 [Warning] [MY-010453] [Server] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
2018-04-19T15:44:29.625227Z 0 [System] [MY-013170] [Server] /home/openxs/dbs/8.0/bin/mysqld (mysqld 8.0.11) initializing of server has completed
[openxs@fc23 8.0]$ bin/mysqld_safe --no-defaults --port=3308 --socket=/tmp/mysql.sock --basedir=/home/openxs/dbs/8.0 --datadir=/home/openxs/dbs/8.0/data --skip-log-bin &
[1] 20080
[openxs@fc23 8.0]$ 2018-04-19T15:44:58.224816Z mysqld_safe Logging to '/home/openxs/dbs/8.0/data/fc23.err'.
2018-04-19T15:44:58.271255Z mysqld_safe Starting mysqld daemon with databases from /home/openxs/dbs/8.0/data

[openxs@fc23 8.0]$ bin/mysql -uroot --socket=/tmp/mysql.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.11 MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show variables like '%version%';
| Variable_name           | Value                        |
| innodb_version          | 8.0.11                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| tls_version             | TLSv1,TLSv1.1,TLSv1.2        |
| version                 | 8.0.11                       |
| version_comment         | MySQL Community Server (GPL) |

| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
| version_compile_zlib    | 1.2.11                       |
9 rows in set (0.00 sec)
So, you can build MySQL 8.0.11 right now and start using it to make your own conclusions about this release.

I still do not care about NoSQL, JSON, new cool features etc. You'll see megabytes of texts about these by the end of 2018. I am going to concentrate mostly on InnoDB, replication bugs and few others:
  •  I am happy to start with Bugt #89509 - "Valgrind error on innodb.blob_page_reserve, bundled zlib", reported by Laurynas Biveinis. See also his Bug #89597 - "Valgrind reporting memory leak on MTR test main.validate_password_component" and Bug #89433 - "NULL dereference in dd::tables::DD_properties:unchecked_get". Percona engineers spent a lot of efforts recently testing MySQL 8.0.x and reporting bugs noted. I think Oracle should explicitly admit the impact of Percona's QA effrots for the quality of this GA release.
  • Biug #89127 - "Optimize trx_rw_is_active() by tracking the lowest active transaction id". This bug was reported by Zhai Weixiang, who had suggested a patch also.
  • Bug #89129 - "create table+DML on innodb_ddl_log table=crash in". This bug was reported by Ramana Yeruva. Tables were made protected and DDL and DML operations on these tables are no longer permitted.
  • Bug #89087 - "Assertion `key->flags & 1' failed". This debug assertion (related to the way PRIMARY key was created based on UNIQUE one) was reported by Roel Van de Paar for 5.7.21, but we see the fix documented only for 8.0.x.
  • Bug #87827 - "Performance regression in "create table" speed and scalability in 8.0.3". It was reported by Alexander Rubin from Percona.
  • Bug #87812 - "Concurrent DDL operation in progress even after acquiring backup lock". Nice bug report from Debarun Banerjee.
  • Bug #87532 - "Replay log record cause mysqld crash during online DDL". I am happy to see impovements in "online ALTER" implementation that covers all GA versions, not just 8.0. I am also happy to see Oracle engineers (Ohm Hong in this case) reporting bugs in public!
  • Bug #88272 - "Assertion `new_value >= 0' failed.". Yet another debug assertion found by Roel Van de Paar, this time related to GTIDs and XA transactions. Check also his Bug #88262 - "ERROR 1598 (HY000): Binary logging not possible + abort".
  • Bug #84415 - "slave don't report Seconds_Behind_Master when running slave_parallel_workers > 0". Yet another contribution from Percona engineers. This bug was reported by Marcelo Altmann and patches were provided by Robert Golebiowski. This bug is also fixed in MySQL 5.7.22.
  • Bug #89793 - "INFORMATION_SCHEMA.STATISTICS field type change". Unexpected change in early 8.0.x versions was noted and reported by Mark Guinness.
  • Bug #89584 - "5.7->8.0 upgrade crash with default-time-zone set". Nice to see this bug (reported by Shane Bester) fixed in GA release.
  • Bug #89487 - "ALTER TABLE hangs in "Waiting for tablespace metadata lock" state". This regression bug was reported by Sveta Smirnova.
  • Bug #89324 - "main.comment_column2 fails with compression". This regression was noted and reported by Manuel Ung.
  • Bug #89122 - "Severe performance regression in server bootstrap". I am really happy to see this bug reported by Georgi Kodinov fixed. I noted it as soon as I started testing 8.0.x (see a duplicate by Roel Van de Paar, Bug #89444) and it was very annoying. I've already checked (see above) that the problem is gone!
  • Bug #89038 - "Add new column to 'mysql.routines' to accommodate the Polygot project". So, Oracle is planning to support stored programs in different languages! Thank you, Sivert Sørumgård, for reporting this in public! See also his Bug #89035 - "Reject LCTN changing after --initialize".
  • Bug #87836 - "XA COMMIT/ROLLBACK rejected by non-autocommit session with no active transaction". It would be sad if this bug is not fixed in MySQL 5.7.x, where it was originally found by Wei Zhao.
  • Bug #87708 - "MDL for column statistics is not properly reflected in P_S.METADATA_LOCKS". It was reported by Erik Frøseth.
  • Bug #85997 - "inplace alter table with foreign keys causes table definition mismatch". This bug was reported by Magnus Blåudd.
  • Bug #85561 - "Users can be assigned non-existing roles as default". Nice to see this bug reported by Giuseppe Maxia fixed in GA release.
  • Bug #33004 - "integer constants casted to bigints by unions". This bug was reported by Domas Mituzas more than 10 years ago!
Now I have to stop, as I found private bug in release notes, Bug #89512. Based on description:
"Window function row-buffer handling has been refactored to reduce the number of handler reads by 25%. (Bug #89512, Bug #27484133)"
I truly do not get why it remains private (or why it was reported in public for such a "sensitive" matter), so I better stop.

MySQL 8 is GA, finally! There are a lot more fixes there that I had not mentioned above. I am surely there is even more bugs to find. So, happy hunting!

Sunday, April 15, 2018

Fun with Bugs #66 - On MySQL Bug Reports I am Subscribed to, Part VI

I have some free time today, but I am still lazy enough to work on numerous planned and pending "ToDo" kind of posts, so why not to continue review of older MySQL bugs I am subscribed to. Today I am going to list 15 more bugs reported more than a year ago and still not fixed:
  • Bug #85805 - "Incorrect ER_BAD_NULL_ERROR after LOAD DATA LOCAL INFILE". This detailed bug report by Tsubasa Tanaka stays "Verified" for more than a year already. It's a great example of gdb use for MySQL troubleshooting. Setting a couple of breakpoints may really help to understand how MySQL works and why some weird errors happen.
  • Bug #85536 - "Build error on 5.5.54". It's clear that almost nobody besides Roel Van de Paar cares about build problem of MySQL 5.5.x(!) on Ubuntu 16.10(!). Anyway, it's strange that the bug remains "Verified" and not closed in any way if Oracle really does not intend to support MySQL 5.5 any longer. For now it seems MySQL 5.5 is still under extended support, so I hope to see this build problem fixed with some final 5.5.x release.
  • Bug #85501 - "Make all options settable as variables in configuration files". We usually see Umesh Shastry processing bugs reported by other, but this is a rare case when he reports something himself. It's a great feature request.
  • Bug #85447 - "Slave SQL thread locking issue on a certain XA workload on master". There are good reasons to think that this bug reported by Laurynas Biveinis may be fixed since MySQL 5.7.18, but no one cares to close it properly.
  • Bug #85382 - "Getting semi-sync reply magic number errors when slave_compressed_protocol is 1". This bug was reported by Jaime Sicam. Read also comments from other community members and make your own conclusions. It seems setting slave_compressed_protocol to 1 is a bad idea in general...
  • Bug #85191 - "performance regression with HANDLER READ syntax". Zhai Weixiang found clear performance regression in the way MySQL 5.7 uses metadata locking for HANDLER commands.
  • Bug #85016 - "better description for: OS error: 71". Clear and simple request from Shane Bester still stays "Verified". I am not that Oracle customer affected anyway, but this seems strange to me.
  • Bug #84958 - "InnoDB's MVCC has O(N^2) behaviors". This one bug report from Domas Mituzas could be a topic for a series of blog posts... It clearly states that:
    "if there're multiple row versions in InnoDB, reading one row from PK may have O(N) complexity and reading from secondary keys may have O(N^2) complexity"
    There is a patch that partially fixes the problem submitted by Laurynas Biveinis and created by Alexey Midenkov. While this bug is still "Verified" take carer when using secondary indexes in concurrent environments when the same data are often changed.
  • Bug #84868 - "Please make it possible to query replication information consistently". Great feature request (or bug report, if you consider inconsistency as a bug) from Simon Mudd.
  • Bug #84615 - "More steps in connection processlist state/ events_stages". Sveta Snirnova cared to ask to split some well known statement execution stages like "cleaning up" into more detailed ones. I think this is really important to simplify troubleshooting with performance_schema. Wrong/misleading/too generic stages forces to use other tools and may lead to wrong conclusions. I hit this with "statistics" also, see Bug #84858. Rare case when Sveta's request just stays "Open", for more than a year already.
  • Bug #84467 - "ALTERing KEY_BLOCK_SIZE keeps the old kbs in KEYs.". Jean-François Gagné and other well known bug reporters found several problems related to KEY_BLOCK_SIZE. It seems Oracle engineers decided NOT to fix them (see Bug #88220). But then why this bug still stays "Verified"? Consistency in bugs processing is one of my dreams...
  • Bug #84439 - "Table of row size of ~800 bytes does not compress with KEY_BLOCK_SIZE=1." Yet another bug report from Jean-François Gagné. Based on lack of activity, those looking for smaller data size, compression etc should look elsewhere and do not expect much from Oracle's InnoDB. Question is, what other engines with data compression will be supported by Oracle's MySQL 8 (or 9) GA? When you get tired wondering, consider MariaDB or Percona Server instead - they do support storage engines that are both transactional and were designed with write efficiency and space efficiency in mind. Hint: they rock...
  • Bug #84274 - "READ COMMITTED does not scale after 36 threads (in 5.6 after 16 threads)". Sveta Smirnova had a chance to run benchmarks on 144 cores (the largest box I ever had a chance to use for benchmarking had 12 cores, so what do I know...) and the result is clear - READ COMMITTED transaction isolation level does not scale well (comparing to default REPEATABLE READ). It's counter intuitive for many, but that's what we have. I doubt MySQL 8 is going to change this (unfortunate) situation.
  • Bug #84241 - "Potential Race Condition". This was found in MySQL 5.7 by Rui Gu with a little help from Helgrind.
  • Bug #84024 - "Optimizer thinks clustered primary key is not covering". This bug was reported by Manuel Ung. Let me quote a comment by Øystein Grøvlen:
    "I can agree that the cost model for join buffering is not perfect. If so, I think we should improve this model, not rely on heuristics about covering indexes versus table scan."
    I can not agree more! Let's hope this really happens in MySQL 9 at least.
You probably noted that we see mostly already famous bug reporters mentioned in this list. But names of reporters, their customer or partner status, known achievements, even clear regressions found or patches provided do not force Oracle to fix problems faster these days... They have their own agenda and great plans for MySQL, obviously.

I also have my own agenda, so I'll proceed with this glass of wine...

Tuesday, April 10, 2018

Fun with Bugs #65 - On MySQL Bug Reports I am Subscribed to, Part V

I think it's time to review some bugs I've subscribed to several months ago, those older than in the first post from this series. There are several really serious bugs in the list of 15 below:
  • Bug #87560 - "XA PREPARE log order error in replication and binlog recovery". This bug was reported by Wei Zhao, who also provided patches.
  • Bug #87526 - "The output of 'XA recover convert xid' is not useful". This bug reported by Sveta Smirnova is well known and is a real pain for DBAs who have to deal with incomplete XA transactions after some crash or unexpected restart. Check PS-1818 and MariaDB task MDEV-14593. The problem is resolved in MariaDB 10.3.3+ by a new XA RECOVER FORMAT='SQL' option.
  • Bug #87164 - "Queries running much slower in version 5.7 versus 5.6". It was reported by  Alok Pathak from Percona and stays "Verified" since August, 2017.
  • Bug #87130 - "XA COMMIT not taken as transaction boundary". Yet another XA bug report with a patch contributed by Wei Zhao.
  • Bug #87084 - "FK DELETE CASCADE does not honor innodb_lock_wait_timeout". Nice report by Elena Stepanova from MariaDB. As you can find out from MDEV-15219, it's properly fixed in MariaDB 10.2.13+.
  • Bug #87065 - "Release lock on table statistics after query plan created". Great feature request by Sveta Smirnova. The actual problem behind this feature request was resolved by Percona in versions 5.7.20-18+. This fix is one of few really good reasons to use recent Percona Server 5.7, so I opened MDEV-15101 for MariaDB also.
  • Bug #86926 - "The field table_name (varchar(64)) from mysql.innodb_table_stats can overflow.". I really wonder why this bug report by Jean-François Gagné still remains just "Verified".
  • Bug #86865 - "InnoDB does unnecessary work when extending a tablespace". Bug report and patch by Alexey Kopytov.
  • Bug #86705 - "Memory leak of Innodb". Great example of a leak (in MySQL 5.5.x only) found with a help of Valgrind/Massif. Qinglin Zhang also suggested a simple patch.
  • Bug #86475 - "Error with functions and group by with ONLY_FULL_GROUP_BY". Nice feature request from Arnaud Adant.
  • Bug #86462 - "mysql_ugprade: improve handling of upgrade errors". Simon Mudd asked for some better error messages at least, so that running under strace would not be needed.
  • Bug #86215 - "MySQL is much slower in 5.7 vs 5.6". This report from Mark Callaghan includes a lot of results and details on performance regressions at low concurrency starting from 5.0 and up to 8.0. For some cases studied the biggest drop in QPS is from 5.6 to 5.7.
  • Bug #86163 - "can't update temporary table when joined with table with triggers on read-only". I'd call this bug found by Bret Westenskow funny. The rest of them in this list are serious.
  • Bug #85970 - "Memory leak with transactions greater than 10% of the total redo log size". Nice corner case was studied by Joffrey MICHAÏE. Whenever you see
    The size of BLOB/TEXT data inserted in one transaction is greater than 10% of redo log size...
    error messages, take care, as memory allocated may not be released.
  • Bug #85910 - "Increased Performance_Schema overhead on Sending data". Seems to be a known problem fixed in MySQL 5.7. Still a nice report by Jervin R on the overhead one may expect from MySQL 5.6 on some systems.
So, we traveled back in time for a year. I am really sorry to see XA-related bugs and some reports with patches properly contributed still active.

Sunday, April 1, 2018

Fun with Bugs #64 - On MySQL Bug Reports I am Subscribed to, Part IV

I've subscribed to more than 15 new MySQL bug reports since the previous post in this series, so it's time for a new one. I am trying to follow important, funny or hard to process bug reports every day. Here is the list of the most interesting recent ones starting from the latest (with several still not processed properly):
  • Bug #90211 - "Various warnings and errors when compiling MySQL 8 with Clang".  Roel Van de Paar and Percona in general continue their QA efforts in a hope to make MySQL 8 better. Current opinion of Oracle engineers on this bug is the following:
    "First of all, these issues are in protobuf, not MySQL per se. There are some warnings with Clang 6, but since they're in third-party code, we have simply disabled them when compiling protobuf (will be part of 8.0.11). Optionally, -DUSE_SYSTEM_LIBS=1 will use system protobuf and thus not compile the files in question.
    As for the crash, we don't support prerelease compilers (more generally, we support platforms, not compilers). Given the stack trace, it is highly likely that the issue either is in the prerelease Clang, or in protobuf.
    Let's see how it may end up. Roel rarely gives up easily...
  • Bug #90209 - "Performance regression with > 15K tables in MySQL 8.0 (with general tablespaces)". Nice regression bug report from Alexander Rubin. It is still "Open".
  • Bug #90190 - "Sig=6 assertion in MYSQL_BIN_LOG::new_file_impl |". Yet another bug report from Percona employee, Ramesh Sivaraman.
  • Bug #89994 - "INDEX DIRECTORY shown as valid option for InnoDB table creation". Everybody knows how much I like fine MySQL manual. Even more I like when missing or wrong details are found there, like in this case reported by by colleague from MariaDB, Claudio Nanni.
  • Bug #89963  - "Slowdown in creating new SSL connection". Maybe it's comparing apples to oranges, as stated in one of comments, but I am surprised that this (performance regression) bug report by Rene' Cannao' is still "Open". It requires more attention, IMHO. Speed of connections matters a lot for MySQL.
  • Bug #89904 - "Can't change innodb_max_dirty_pages_pct to 0 to flush all pages". Good intentions to set better default value (applied a bit later than needed) led to the problem. As Simon Mudd put it:
    "innodb_max_dirty_pages_pct_lwm setting has existed since 5.6. This issue only comes up as by changing the default value to 10 those of us who have ignored it until now never noticed it existed. That is a shame as setting this value to a value other than 0 (e.g. 10 which is the new default) should be better and trigger some background flushing of dirty pages avoiding us hitting innodb_max_dirty_pages_pct which would trigger much more aggressive behaviour which is not really desirable."
  • Bug #89876 - "mysqladmin flush-hosts is not safe in GTID mode". Yet another bug report from Simon Mudd. See also Bug #88720 that highlights even more problems with various FLUSH statements and GTIDs.
  • Bug #89870 - "Group by optimization not used with partitioned tables". For some reason this report from Arnaud Adant is still "Open". As my colleague Richard Stracke stated:
    "The only solution would be, that the optimizer is able to check, if the condition in the where clause include the whole table (or partition) and in this case use group by optimization."
  • Bug #89860 - "XA may lost prepared transaction and cause different between master and slave." As this (and other, like Bug #88534) bug report from Michael Yang shows, there is still a long way to go until it would be safe to use XA transactions with MySQL.
  • Bug #89834 - "Replication will not connect on IPv6 - does not function in an IPv6 only environ". This bug report from Tim St. Pierre is still "Open".
  • Bug #89822 - "InnoDB retries open on EINTR error only if innodb_use_native_aio is enabled". We have patch contributed by Laurynas Biveinis from Percona.
  • Bug #89758 - "Conversion from ENUM to VARCHAR fails because mysql adds prefix index". This funny bug was found and reported by Monty Solomon.
  • Bug #89741 - "Events log Note level messages even for log_warnings=0". Nikolai Ikhalainen found that this problem happens only in versions 5.5.x and 5.6.x, so chances to see it fixed are low. But I still want to know if this ever happens.
  • Bug #89696 - "Cyclic dependencies are not resolved properly with cascade removal". Make sure to check nice discussion that my dear friend Sinisa Milivojevic had with a bug reporter, Andrei Anishchenko, before marking the bug as "Verified". This regression was most likely caused by a change in MySQL 5.7.21:
    "InnoDB: An iterative approach to processing foreign cascade operations resulted in excessive memory use. (Bug #26191879, Bug #86573)"
  • Bug #89625 - "please package the debug symbols *.pdb files!". Shane Bester always cared about having a way to debug on Windows. Recently I also started to care about this...
It's April Fools' Day today, so why not to make fool of myself assuming that anyone cares about the series of blog posts.

Sunday, March 25, 2018

Windows Tools for MySQL DBAs: Basic Minidump Analysis

"To a man with a hammer, everything looks like a nail."

Even though I had written many posts explaining the use of gdb for various MySQL-related tasks, I have to use other OS level troubleshooting tools from time to time. Moreover, as MySQL and MariaDB are still supported and used under Microsoft Windows in production by customers I have to serve them there, and use Windows-specific tools sometimes. So, I decided to start a series of posts (that I promised to my great colleague Vladislav Vaintroub (a.k.a Wlad) who helped me a lot over years and actually switched my attention from Performance Schema towards debuggers) about different Windows tools for MySQL DBAs (and support engineers).

Developers (and maybe even power users) on Windows probably know all I plan to describe and way more, by heart, but for me many things were not obvious and took some time to search, try or even ask for some advises... So, this series of posts is going to be useful at least for me (and mostly UNIX users, like me), as a source of hints and links that may save me some time and efforts in the future.

In this first post I plan to describe basic installation of "Debugging Tools for Windows" and use of cdb command line debugger to analyze minidumps (that one gets on Windows upon crashes when core-file option is added to my.ini and may get for hanging mysqld.exe process with minimal efforts using different tools) and get backtraces and few other details from them. I also plan to show simple command lines to share with DBAs and users whom you help, that allow to get useful details (more or less full backtraces, crash analysis, OS details etc) for further troubleshooting when/if dumps can not or should not be shared.
I have to confess: I use Microsoft Windows on desktops and laptops. I started from Windows 3.0 back in 1992 and ended with Windows 10 on my wife's laptop. I use Windows even for work. Today 2 of my 4 machines used for work-related tasks run Windows (64-bit XP on old Dell box I've got from MySQL AB back in 2005 and 64-bit Windows 7 on this Acer netbook). At the same time, most of work I have to do since 1992 is related to UNIX of all kinds (from Xenix and SCO OpenDesktop that I connected to from VT220 terminal in at my first job after the university, to recent Linux versions used by customers in production, my Fedora 27 box and Ubuntu 14.04 netbook used as build, Docker, VirtualBox, testing, benchmarking etc servers). I had never become a real powerful user of Windows (no really complex .bat files, PowerShell programming or even Basic macros in Word, domains, shadow copy services usage for backups, nothing fancy). But on UNIX I had to master shell, vi :), some Perl and a lot of command line tools.

I had to do some software development on Windows till 2005, built MySQL on Windows sometimes up to 2012 when I joined Percona (that had nothing to do with Windows at all), so I have old version of Visual Studio, some older WinDbg and other debugging tools here and there, but had not used them more than once a year, until recently... Last time I attached WinDbg to anything MySQL-related it was MariaDB 10.1.13, during some troubleshooting related to MDEV-10191.

Suddenly in March I've got issues from customers related to hanging upon startup/InnoDB recovery and under load, and crashing while using some (somewhat exotic) storage engine, all these - on modern versions of Microsoft Windows, in production. I had no other option but to get and study backtraces (of all threads or crashing threads) and check source code. It would be so easy to get them on Linux (just ask them to install gdb , attach it to hanging mysqld process or point out to the mysqld binary and core, and get the output of thread apply all backtrace, minor details aside). But how to do this on Winsdows, in command line if possible (as I hate to share screenshots and write long explanations on where to click and what to copy/paste)? I had to check in WinDbg, get some failures because of my outdated and incomplete environment (while customer with proper environment provided useful outputs anyway), then, eventually, asked Wlad for some help. Eventually I was able to make some progress.

To be ready to do this again next time with confidence, proper test environment and without wasting anybody else's time, I decided to repeat some of these efforts in clean environment and make notes, that I am going to share in this series of blog posts. Today I'll concentrate on installing current "Debugging Tools for Windows" and using cdb from them to process minidumps.

1. Installing "Debugging Tools for Windows"

There is a nice, easy to find document from Microsoft on how to get cdb and other debugging tools for Windows. For recent versions you just have to download Windows 10 SDK and then install these tools (and everything else you may need) from it. Proceed to this page, read the details, click on "Download .EXE" to get winsdksetup.exe , start it and select "Debugging Tools for Windows" when requested to select the features. Eventually you'll get some 416+ MB downloaded and installed by default in C:\Program Files (x86)\Windows Kits\10\Debuggers\. (on default 64-bit Windows installation with C: as system disk). Quick check shows I have everything I need:
C:\Program Files (x86)\Windows Kits\10\Debuggers\x64>dir
11/10/2017  11:55 PM           154,936 cdb.exe
11/10/2017  11:55 PM           576,312 windbg.exe
Here is the list of most useful cdb options for the next step:
C:\Program Files (x86)\Windows Kits\10\Debuggers\x64>cdb /?
cdb version 10.0.16299.91
usage: cdb [options]


  <command-line> command to run under the debugger
  -? displays command line help text
  -i <ImagePath> specifies the location of the executables that generated the
                 fault (see _NT_EXECUTABLE_IMAGE_PATH)
  -lines requests that line number information be used if present
  -logo <logfile> opens a new log file
  -p <pid> specifies the decimal process ID to attach to
  -pv specifies that any attach should be noninvasive
  -y <SymbolsPath> specifies the symbol search path (see _NT_SYMBOL_PATH)
  -z <CrashDmpFile> specifies the name of a crash dump file to debug
Environment Variables:

        Specify symbol image path.
Control Keys:

     <Ctrl-B><Enter> Quit debugger
Remember Crtl-B key combination as a way to quit from cdb. I looked as funny as the beginner vi user few times, clicking on everything to get out of that tool...

2. Basic Use of cdb to Process Minidump

Let's assume you've got mysqld.dmp minidump file (a kind of "core" file on UNIX, but better, at least smaller usually) created during some crash. Depending on binaries used, you may need to make sure you have .PDB files in some directory, for the mysqld.exe binary and all .dll files for plugins/extra storage engines used, in some directory. Default path to .PDB files is defined by the _NT_SYMBOL_PATH environment variable and may include multiple directories ad URLs.

Initially I've got advice to set this environment variable as follows:
set _NT_SYMBOL_PATH=srv*c:\symbols*
This assumes that I have a collection of .PDB files in c:\symbols on some locally available server and rely on Microsoft's symbols server for the rest. For anything missing we can always add -y option to point to some directory with additional .PDB files. Note that MariaDB provides .pdb files along with .exe in .msi installer, not only in .zip file with binaries.

So, if your mysqld.dmp file is located in h:\, mysqld.exe for the same version as generated that minidump is located in p:\software and all related .dll files and .pdb files for them all are also there, the command to get basic details about the crash in file h:\out.txt would be the following:
cdb -z h:\mysqld.dmp -i p:\software -y p:\software -logo h:\out.txt -c "!sym prompts;.reload;.ecxr;q"
You can click on every option underlined above to get details. It produces output like this:
C:\Program Files (x86)\Windows Kits\10\Debuggers\x64>cdb -z h:\mysqld.dmp -i p:\
software -y p:\software -logo h:\out.txt -c "!sym prompts;.reload;.ecxr;q"

Microsoft (R) Windows Debugger Version 10.0.16299.91 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [h:\mysqld.dmp]
User Mini Dump File: Only registers, stack and portions of memory are available

************* Path validation summary **************
Response                         Time (ms)     Location
OK                                             p:\software

************* Path validation summary **************
Response                         Time (ms)     Location
OK                                             p:\software
Deferred                                       srv*c:\symbols*http://msdl.micros
Symbol search path is: p:\software;srv*c:\symbols*
Executable search path is: p:\software
Windows 10 Version 14393 MP (4 procs) Free x64
Product: Server, suite: TerminalServer SingleUserTS
10.0.14393.206 (rs1_release.160915-0644)
Machine Name:
Debug session time: ...
System Uptime: not available
Process Uptime: 0 days X:YY:ZZ.000
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(1658.fd0): Access violation - code c0000005 (first/second chance not available)

00007fff`804a7d84 c3              ret
0:053> cdb: Reading initial command '!sym prompts;.reload;.ecxr;q'
quiet mode - symbol prompts on
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000006
rdx=000001cf0ac2e118 rsi=000001cf0abeeef8 rdi=000001cf0ac2e118
rip=00007fff5f313b0d rsp=000000653804e2b0 rbp=000001cf165c9cc8
 r8=0000000000000000  r9=00007fff5f384448 r10=000000653804ef70
r11=000000653804eb28 r12=0000000000000000 r13=000001cf0ab49d48
r14=0000000000000000 r15=000001cf0b083028
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
00007fff`5f313b0d 8b1498          mov     edx,dword ptr [rax+rbx*4] ds:00000000`

C:\Program Files (x86)\Windows Kits\10\Debuggers\x64>
that also goes to the file pointed out by the -logo option. Here we have some weird crash in Spider engine of MariaDB that  is not a topic of current post.

If you think the crash is related to some activity of other threads, you can get all unique stack dumps with the following options:
cdb -lines -z h:\mysqld.dmp -i p:\software -y p:\software -logo h:\out.txt -c "!sym prompts;.reload;!uniqstack -p;q"
This is how the backtrace of slave SQL thread may look like, note files with line numbers for each frame (-lines option):
. 44  Id: 1658.1584 Suspend: 0 Teb: 00000065`32185000 Unfrozen
      Priority: 0  Priority class: 32
Child-SP          RetAddr           Call Site
00000065`353fed08 00007fff`8046d119 ntdll!NtWaitForAlertByThreadId+0x14
00000065`353fed10 00007fff`7cbd8d78 ntdll!RtlSleepConditionVariableCS+0xc9
00000065`353fed80 00007ff6`2d7d62e7 KERNELBASE!SleepConditionVariableCS+0x28
00000065`353fedb0 00007ff6`2d446c8e mysqld!pthread_cond_timedwait(struct _RTL_CO
NDITION_VARIABLE * cond = 0x000001ce`66805688, struct _RTL_CRITICAL_SECTION * mu
tex = 0x000001ce`668051b8, struct timespec * abstime = <Value unavailable error>
)+0x27 [d:\winx64-packages\build\src\mysys\my_wincond.c @ 85]
(Inline Function) --------`-------- mysqld!inline_mysql_cond_wait+0x61 [d:\winx6
4-packages\build\src\include\mysql\psi\mysql_thread.h @ 1149]
00000065`353fede0 00007ff6`2d4b1718 mysqld!MYSQL_BIN_LOG::wait_for_update_relay_
log(class THD * thd = <Value unavailable error>)+0xce [d:\winx64-packages\build\
src\sql\ @ 8055]
00000065`353fee90 00007ff6`2d4af03f mysqld!next_event(struct rpl_group_info * rg
i = 0x000001ce`667fe560, unsigned int64 * event_size = 0x00000065`353ff008)+0x2b
8 [d:\winx64-packages\build\src\sql\ @ 7148]
00000065`353fef60 00007ff6`2d4bb038 mysqld!exec_relay_log_event(class THD * thd
= 0x000001ce`6682ece8, class Relay_log_info * rli = 0x000001ce`66804d58, struct
rpl_group_info * serial_rgi = 0x000001ce`667fe560)+0x8f [d:\winx64-packages\buil
d\src\sql\ @ 3866
00000065`353ff000 00007ff6`2d7d35cb mysqld!handle_slave_sql(void * arg = 0x00000
1ce`66803430)+0xa28 [d:\winx64-packages\build\src\sql\ @ 5145]
00000065`353ff780 00007ff6`2d852d51 mysqld!pthread_start(void * p = <Value unava
ilable error>)+0x1b [d:\winx64-packages\build\src\mysys\my_winthread.c @ 62]
(Inline Function) --------`-------- mysqld!invoke_thread_procedure+0xe [d:\th\mi
nkernel\crts\ucrt\src\appcrt\startup\thread.cpp @ 91]
00000065`353ff7b0 00007fff`80338364 mysqld!thread_start<unsigned int (void * par
ameter = 0x00000000`00000000)+0x5d [d:\th\minkernel\crts\ucrt\src\appcrt\startup
\thread.cpp @ 115]
00000065`353ff7e0 00007fff`804670d1 kernel32!BaseThreadInitThunk+0x14
00000065`353ff810 00000000`00000000 ntdll!RtlUserThreadStart+0x21
For crash analysis usually !analyze command is also used:
cdb -lines -z h:\mysqld.dmp -i p:\software -y p:\software -logo h:\out.txt -c "!sym prompts;.reload;!analyze -v;q"
It may give some details about the exception happened:
ha_spider!spider_db_connect+dd00007fff`5f313b0d 8b1498          mov     edx,dword ptr [rax+rbx*4]

ExceptionAddress: 00007fff5f313b0d (ha_spider!spider_db_connect+0x00000000000000
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000000000
Attempt to read from address 0000000000000000

PROCESS_NAME:  mysqld.exe

ERROR_CODE: (NTSTATUS) 0xc0000005 - <Unable to get error code text>
00000065`3804e2b0 00007fff`5f3132ad : 00000000`00000000 000001cf`1741ab68 000001
cf`0b083028 000001cf`0ac2e118 : ha_spider!spider_db_connect+0xdd
00000065`3804e330 00007fff`5f3117f8 : 000001ce`669107c8 000001cf`0ac2e118 000001
cf`1741ab68 00000000`00000001 : ha_spider!spider_db_conn_queue_action+0xad
00000065`3804ea20 00007fff`5f31a1ee : 00000000`00000000 000001cf`0abeeef8 000000
00`00000000 000001cd`c0b30000 : ha_spider!spider_db_before_query+0x108
00000065`3804eaa0 00007fff`5f31a0bf : 00000000`00000000 00000000`00000000 000000
65`3804ec70 00000000`00000038 : ha_spider!spider_db_set_names_internal+0x11e
00000065`3804eb30 00007fff`5f369b4e : 00000000`00000000 00000065`3804ec70 00007f
ff`5f387f08 00000000`00000000 : ha_spider!spider_db_set_names+0x3f
00000065`3804eb70 00007fff`5f32f5f1 : 00000000`00000001 00000065`00000000 41cfff
ff`00000001 00000000`00000001 : ha_spider!spider_mysql_handler::show_table_statu
00000065`3804ece0 00007fff`5f3222e8 : 00000000`00000001 00000065`00000000 000000
00`5ab175cd 000001cf`0b0886f8 : ha_spider!spider_get_sts+0x201
00000065`3804edb0 00007ff6`2d7d35cb : 00000000`00000057 000001cf`0ab49d48 000000
00`00000000 00007fff`5f321c10 : ha_spider!spider_bg_sts_action+0x6d8
00000065`3804fa30 00007ff6`2d852d51 : 000001cf`17008fe0 000001cf`0aa3fef0 000000
00`00000000 00000000`00000000 : mysqld!pthread_start+0x1b
00000065`3804fa60 00007fff`80338364 : 00000000`00000000 00000000`00000000 000000
00`00000000 00000000`00000000 : mysqld!thread_start<unsigned int (__cdecl*)(void
 * __ptr64)>+0x5d
00000065`3804fa90 00007fff`804670d1 : 00000000`00000000 00000000`00000000 000000
00`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0x14
00000065`3804fac0 00000000`00000000 : 00000000`00000000 00000000`00000000 000000
00`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21
Finally (for this post), this is how we can get information about a crashing thread, including details about local variables (like full backtrace in gdb). We apply !for_each_frame extension and use dv to "display variable":
cdb -z h:\mysqld.dmp -i p:\software -y p:\software -logo h:\out.txt -c "!sym prompts;.reload;.ecxr;!for_each_frame dv /t;q"
The result will include details about each frame, parameters and local variables, like this:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
00 00000065`3804e2b0 00007fff`5f3132ad ha_spider!spider_db_connect+0xdd
struct st_spider_share * share = 0x000001cf`165c9cc8
struct st_spider_conn * conn = 0x000001cf`0ac2e118
int link_idx = 0n0
int error_num = <value unavailable>
class THD * thd = 0x000001cf`0abeeef8
int64 connect_retry_interval = <value unavailable>
int connect_retry_count = <value unavailable>
int64 tmp_time = <value unavailable>
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
01 00000065`3804e330 00007fff`5f3117f8 ha_spider!spider_db_conn_queue_action+0xa
struct st_spider_conn * conn = 0x000001cf`0ac2e118
int error_num = 0n0
char [1532] sql_buf = char [1532] ""
class spider_string sql_str = class spider_string
class spider_db_result * result = <value unavailable>
struct st_spider_db_request_key request_key = struct st_spider_db_request_key
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
02 00000065`3804ea20 00007fff`5f31a1ee ha_spider!spider_db_before_query+0x108
struct st_spider_conn * conn = 0x000001cf`0ac2e118
int * need_mon = 0x000001cf`1741ab68
int error_num = 0n0
bool tmp_mta_conn_mutex_lock_already = true
class ha_spider * spider = <value unavailable>
bool tmp_mta_conn_mutex_unlock_later = <value unavailable>
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
03 00000065`3804eaa0 00007fff`5f31a0bf ha_spider!spider_db_set_names_internal+0x
struct st_spider_transaction * trx = 0x000001cf`0b083028
struct st_spider_share * share = 0x000001cf`0ab49d48
struct st_spider_conn * conn = 0x000001cf`0ac2e118
int all_link_idx = 0n0
int * need_mon = 0x000001cf`1741ab68
bool tmp_mta_conn_mutex_lock_already = true
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
04 00000065`3804eb30 00007fff`5f369b4e ha_spider!spider_db_set_names+0x3f
class ha_spider * spider = <value unavailable>
struct st_spider_conn * conn = <value unavailable>
int link_idx = <value unavailable>
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
05 00000065`3804eb70 00007fff`5f32f5f1 ha_spider!spider_mysql_handler::show_tabl
class spider_mysql_handler * this = 0x000001cf`0a15dd00
int link_idx = 0n0
int sts_mode = 0n1
unsigned int flag = 1
int error_num = 0n1
struct st_spider_share * share = 0x000001cf`0ab49d48
struct st_spider_conn * conn = 0x000001cf`0ac2e118
class spider_db_result * res = <value unavailable>
unsigned int64 auto_increment_value = 0
unsigned int pos = 0
struct st_spider_db_request_key request_key = struct st_spider_db_request_key
struct st_spider_db_request_key request_key = struct st_spider_db_request_key
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
06 00000065`3804ece0 00007fff`5f3222e8 ha_spider!spider_get_sts+0x201
struct st_spider_share * share = 0x000001cf`0ab49d48
int link_idx = 0n0
int64 tmp_time = 0n1521579469
class ha_spider * spider = 0x00000065`3804ef70
double sts_interval = 10
int sts_mode = 0n1
int sts_sync = 0n0
int sts_sync_level = 0n2
unsigned int flag = 0x18
int error_num = <value unavailable>
int get_type = 0n1
struct st_spider_patition_handler_share * partition_handler_share = <value unava
double tmp_sts_interval = <value unavailable>
struct st_spider_share * tmp_share = <value unavailable>
int tmp_sts_sync = <value unavailable>
class ha_spider * tmp_spider = <value unavailable>
int roop_count = <value unavailable>
int tmp_sts_mode = <value unavailable>
class THD * thd = <value unavailable>
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
07 00000065`3804edb0 00007ff6`2d7d35cb ha_spider!spider_bg_sts_action+0x6d8
void * arg = 0x000001cf`0ab49d48
int error_num = 0n0
class ha_spider spider = class ha_spider
unsigned int * conn_link_idx = 0x000001cf`1741ab78
unsigned char * conn_can_fo = 0x000001cf`1741ab80 "--- memory read error at addr
ess 0x000001cf`1741ab80 ---"
struct st_spider_conn ** conns = 0x000001cf`1741ab70
int * need_mons = 0x000001cf`1741ab68
int roop_count = 0n0
char ** conn_keys = 0x000001cf`1741ab88
class THD * thd = 0x000001cf`0abeeef8
class spider_db_handler ** dbton_hdl = 0x000001cf`1741ab90
struct st_spider_transaction * trx = 0x000001cf`0b083028
struct st_mysql_mutex spider_global_trx_mutex = <value unavailable>
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
08 00000065`3804fa30 00007ff6`2d852d51 mysqld!pthread_start+0x1b
void * p = <value unavailable>
void * arg = 0x000001cf`0ab49d48
<function> * func = 0x00007fff`5f321c10
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
09 (Inline Function) --------`-------- mysqld!invoke_thread_procedure+0xe
void * context = 0x000001cf`17008fe0
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
0a 00000065`3804fa60 00007fff`80338364 mysqld!thread_start<unsigned int (__cdecl
*)(void * __ptr64)>+0x5d
void * parameter = 0x00000000`00000000
<function> * procedure = 0x00007ff6`2d7d35b0
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
0b 00000065`3804fa90 00007fff`804670d1 kernel32!BaseThreadInitThunk+0x14
Unable to enumerate locals, Win32 error 0n87
Private symbols (symbols.pri) are required for locals.
Type ".hh dbgerr005" for details.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
0c 00000065`3804fac0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Unable to enumerate locals, Win32 error 0n87
Private symbols (symbols.pri) are required for locals.
Type ".hh dbgerr005" for details.

Stay tuned. I keep working on complex MySQL/MariaDB problems under Windows, so soon will have few more findings and links to share.

Saturday, March 24, 2018

Fun with Bugs #63 - On Bugs Detected by ASan

Among other things Geir Hoydalsvik stated in his nice post yesterday:
 "We’ve fixed a number of bugs detected by UBsan and Asan."
This is indeed true, I already noted many related bugs fixed in recent MySQL 8.0.4. But I think that a couple of details are missing in the blog post. First of all, there still a notable number of bugs detected by ASan or noted in builds with ASan that remain "Verified". Second, who actually found and reported these bugs?

I decided to do a quick search and present my summary to clarify these details. Let me start with the list of "Verified" or "Open" bugs in public MySQL bugs database, starting from the oldest one:
  • Bug #69715 - "UBSAN: Item_func_mul::int_op() mishandles 9223372036854775809*-1". The oldest related "Verified" bug I found was reported back in 2013 by Arthur O'Dwyer. Shane Bester from Oracle kindly keeps checking it with recent and upcoming releases, so we know that even '9.0.0-dmr-ubsan' (built on 20 October 2017) was still affected.
  • Bug #80309 - "some innodb tests fail with address sanitizer (WITH_ASAN)". It was reported by Richard Prohaska and remains "Verified" for more than two years already.
  • Bug #80581 - "rpl_semi_sync_[non_]group_commit_deadlock crash on ASan, debug". This bug reported by Laurynas Biveinis from Percona two years ago is still "Verified".
  • Bug #81674 - "LeakSanitizer-enabled build fails to bootstrap server for MTR". This bug reported by  Laurynas Biveinis affects only MySQL 5.6, but still, why not to backport the fix from 5.7?
  • Bug #82026 - "Stack buffer overflow with --ssl-cipher=<more than 4K characters>". Bug detected by ASan was noted by Yura Sorokin from Percona and reported by Laurynas Biveinis.
  • Bug #82915 - "SIGKILL myself when using innodb_limit_optimistic_insert_debug=2 and drop table". ASan debug builds are affected. This bug was reported by Roel Van de Paar from Percona.
  • Bug #85995 - "Server error exit due to empty datadir causes LeakSanitizer errors". This bug in MySQL 8.0.1 (that had to affect anyone who runs tests on ASan debug builds on a regular basis) was reported by Laurynas Biveinis and stay "Verified" for almost a year.
  • Bug #87129 - "Unstable test main.basedir". This test problem reported by Laurynas Biveinis affects ASan builds, among others. See also his Bug #87190 - "Test main.group_by is unstable".
  • Bug #87201 - "XCode 8.3.3+ -DWITH_UBSAN=ON bundled protobuf build error". Yet another (this time macOS-specific) bug found by Laurynas Biveinis.
  • Bug #87295 - "Test group_replication.gr_single_primary_majority_loss_1 produces warnings". Potential bug in group replication noted by Laurynas Biveinis in ASan builds.
  • Bug #87923 - "ASan reporting a memory leak on merge_large_tests-t". This bug by Laurynas Biveinis is still "Verified", while Tor Didriksen's comment states that it it resolved with the fix for Bug #87922 (that is closed as fixed in MySQL 8.0.4). Why not to close this one also?
  • Bug #89438 - "LeakSanitizer errors on xplugin unit tests". As Laurynas Biveinis found, X Plugin unit tests report errors with LeakSanitizer.
  • Bug #89439 - "LeakSanitizer errors on GCS unit tests". yet another bug report for MySQL 8.0.4 by Laurynas Biveinis.
  • Bug #89961 - "add support for clang ubsan". This request was made by Tor Didriksen from Oracle. It is marked as "fixed in 8.0.12". It means we may get MySQL 8.0.11 released soon. That's why I decided to mention the bug here.
There were also few other test failures noted on ASan debug builds. I skipped them to make this post shorter.

Personally I do not run builds or tests with ASan on a regular basis. I appreciate Oracle's efforts to make code warning-free, UBSan- and ASan-clean, and fix bugs found with ASan. But I'd also want them to process all/most of related bugs in public database properly before making announcements of new related achievement, and clearly admit and appreciate a lot of help and contribution from specific community members (mostly Laurynas Biveinis in this case).

Percona engineers seem to test ASan builds of MySQL 5.7 and 8.0 (or Percona's closely related versions) regularly, for years, and contribute back public bug reports. I suspect they found way more related bugs than internal Oracle's QA. I think we should explicitly thank them for this contribution that made MySQL better!

Tuesday, March 20, 2018

Fun With Bugs #62 - On Bugs Related to JSON Support in MySQL

Comparing to "online ALTER" or FULLTEXT indexes in InnoDB, I am not that much exposed to using JSON in MySQL. OK, there is EXPLAIN ... FORMAT=JSON statement that is quite useful, optimizer trace in MySQL is also in JSON format, but otherwise I rarely have to even read this kind of data since my lame attempt to pass one introductory MongoDB training couse back in 2015 (when I mostly enjoyed mongo shell). Even less I care about storing JSON in MySQL.

But it seems community users do use it, and there is a book for them... So, out of pure curiosity last week I decided to check known bugs reported by MySQL Community users that are related to this feature (it has a separate category "MySQL Server: JSON" in public bugs database. This is a (short enough) list of my findings:
  • Bug #89543 - "stack-overflow on gis.geojson_functions / json.json_functions_innodb". This is a recent bug report from Laurynas Biveinis. Based on last comment, it may have something to do with specific GCC compiler version and exact build option (WITH_UBSAN=ON) used. But for me it's yet another indication that community (Percona's in this case) extra QA still matters for MySQL quality. 
  • Bug #88033 - "Generated INT column with value from JSON document". As Geert Vanderkelen says: "would be great if JSON null is NULL".
  • Bug #87440 - "use union index,the value of key_len different with used_key_parts’ value." This is actually a bug in optimizer (or EXPLAIN ... FORMAT=JSON), but as it is still "Open" and nobody cared to check it, it still stays in the wrong category.
  • Bug #85755 - "JSON containing null value is extracted as a string "null"". Similar to Geert's case above, Dave Pullin seems to need an easy way to get real NULL in SQL when the value is null in JSON. It is stated in comments that now everything works as designed, and it works exactly the same in MariaDB 10.2.13. Some users are not happy though and the bug remains still "Verified".
  • Bug #84082 - "JSON_OBJECT("k", dateTime) adds milliseconds to time". In MariaDB 10.2.13 the result is different:
    MariaDB [(none)]> SELECT version(), JSON_OBJECT("k", cast("2016-11-19 17:46:31" as datetime(0)));
    | version()       | JSON_OBJECT("k", cast("2016-11-19 17:46:31" as datetime(0))) |
    | 10.2.13-MariaDB | {"k": "2016-11-19 17:46:31"}                                 |
    1 row in set (0.00 sec)
  • Bug #83954 - "JSON handeling of DECIMAL values, JSON from JSON string". Take care when you cast values to JSON type.
  • Bug #81677 - "Allows to force JSON columns encoding into pure UTF-8". This report is still "Open" and probably everything works as designed.
  • Bug #79813 - "Boolean values are returned inconsistently with JSON_OBJECT". Take care while working with boolean properties.
  • Bug #79053 - "Second argument "one" to JSON_CONTAINS_PATH is a misnomer". Nice request from Roland Bouman to use "some" instead of "one" in this case. Probably too late to implement it...
There also several related feature requests, of them the following are still "Open":
  • Bug #84167 - "Managing deeply nested JSON documents with JSON_SET, JSON_MERGE". As a comment says, "It would by nice if MySQL came up with a solution for this in a more constructive way".
  • Bug #82311 - "Retrieve the last value of an array (JSON_POP)". I am not 100% sure what the user wants here, but even I can get last array element in MongoDB using $slice, for example:
    > db.text.find();
    { "_id" : ObjectId("5aafe41c6bd09c09625aa3d1"), "item" : "test" }
    { "_id" : ObjectId("5ab0e8c208d6fe081ebb4f40"), "item" : null }
    { "_id" : ObjectId("5ab0f0f6cc695c62b4b0a301"), "item" : [ { "val" : 10 }, { "val" : 11 } ] }
    > db.text.find({}, { item: { $slice: -1 }, _id: 0});
    { "item" : "test" }
    { "item" : null }
    { "item" : [ { "val" : 11 } ] }
  • Bug #80545 - "JSON field should be allow INDEX by specifying table_flag". It seems some 3rd party storage engine developers want to have a way to crate and use some nice indexes of their engines to get efficient index-based search in JSON documents.
  • Bug #80349 - "MySQL 5.7 JSON: improve documentation and possible improvements". Simon Mudd asked for more documentation on the way feature is designed (for those interested in the details of JSON data type implementation in modern MySQL versions there is a detailed enough public worklog, WL#8132) and asked some more questions. Morgan Tocker answered at least some of them, but this report still stays "Open" since that times...
It would be great for remaining few "Open" reports from community users to be properly processed, and a couple of real bugs identified fixed. There are also features to work on, and in the meantime workarounds and best practices should be shared. Maybe there are enough of them for the entire book.

That's almost all I have to say about JSON in MySQL at the moment.