If you’re a member of the Magento community, you’ve probably heard that Magento has declared end of life for 2.1.x support to be June 2019 (that’s now!). As Magento 2.1 approaches end of life, we at SD have started the process of upgrading a number of our clients from 2.1 to 2.3, the latest major version of Magento 2.
Because of the number of core data changes that come along with the upgrade from 2.1 to 2.2 alone (i.e., the move from serialized strings to JSON), we ran into a few issues during the initial process of deploying a 2.3 instance to a staging environment. The most challenging issue came in the form of a MySQL error during the setup:upgrade process. If you’re unfamiliar with this particular Magento command, it’s used to apply the database changes that the application code requires. Essentially, modules can make changes to preexisting database tables, or even create new ones for their own use. Because of the conversion of serialized data to JSON in many of Magento’s core tables, including customer and sales data, this process can take a very, very long time to execute, depending on the amount of data in your database.
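To give a sense of what that conversion looks like, here’s a minimal sketch. The query below is purely illustrative, not part of Magento’s upgrade scripts; core_config_data is just one of the affected tables, and the exact payloads vary:

    -- Rows still holding PHP-serialized strings
    -- (serialized arrays start with 'a:', objects with 'O:')
    SELECT config_id, path, value
    FROM core_config_data
    WHERE value LIKE 'a:%' OR value LIKE 'O:%';

    -- After a successful setup:upgrade, the same values are
    -- stored as JSON, e.g. {"methods":{...}} instead of
    -- a:1:{s:7:"methods";a:1:{...}}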
Just to take a step back for a second, I was recently fortunate enough to have the opportunity to attend an online Oracle course on MySQL Performance Tuning. At the very core of understanding how to tune your MySQL instance’s performance lies the requirement that you establish a consistent baseline to work from. Along with setting this baseline comes the responsibility of making sure that test environments have a similar amount of data and load on them, so that any updates can be tested accurately. Clearly for us, it is of the utmost importance to have a near-identical replica of production data in a staging environment where we can test (and benchmark) the process of converting all of this data for these upgrades. For a number of our clients, this means making sure that we have enough anonymized customer and sales data to give us an accurate picture of how long these deployments will take, and whether we’ll run into issues in a production environment.
And run into issues we did! Aside from the usual errors one would expect, caused by corrupted serialized data or incompatible third-party modules, we ran into a rather bizarre (read: misleading) MySQL error during that setup:upgrade process on one of our Community Edition clients with a large amount of sales data. See the error below:
Warning: Error while sending QUERY packet. PID=14322 in /vendor/magento/zendframework1/library/Zend/Db/Statement/Pdo.php on line 228
And that’s all, folks! No other exceptions, errors, warnings, clues. Nada. Digging into the code a little further, we found that Magento was attempting to query the database, but when the MySQL instance unexpectedly threw an error during execution of the query, the application code wasn’t handling it. So all we were left with was this somewhat vague error message from MySQL.
If you’re unfamiliar with this error, a quick Google search will point you in the direction of increasing the max_allowed_packet variable on your MySQL instance. Which is what we did… again and again, until we maxed out the variable at 1G. So, what gives?
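For reference, here’s roughly what that looks like; 1G is MySQL’s hard upper limit for this setting:

    -- Check the current value (in bytes)
    SHOW VARIABLES LIKE 'max_allowed_packet';

    -- Raise it for new connections; 1073741824 bytes (1G) is the
    -- maximum MySQL allows for this variable
    SET GLOBAL max_allowed_packet = 1073741824;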
As it turns out, the MySQL instance on this particular staging environment had been heavily modified. Many variables were not only changed from their defaults, but drastically different from what was configured on production. Which might make sense, right? Production experiences heavier loads than staging, duh. But if we go back to proper performance tuning and the goals we mentioned above, we should have remembered to replicate not just the data in staging, but the MySQL configuration itself.
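One way to catch this kind of drift (on MySQL 5.7+, where variables are exposed through performance_schema) is to export the full variable list from each server and diff the two:

    -- Run on both staging and production, then diff the output
    SELECT VARIABLE_NAME, VARIABLE_VALUE
    FROM performance_schema.global_variables
    ORDER BY VARIABLE_NAME;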
One system variable in particular had been significantly decreased in staging, and that was the wait_timeout variable. As the official MySQL documentation puts it, this variable defines “the number of seconds the server waits for activity on a non-interactive connection before closing it.” We increased this variable as necessary, re-ran the deployment, and this time everything worked: the deployment, specifically setup:upgrade, was successful. So again, what gives?
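The fix itself is a one-liner; the value below is only an example (28800 seconds, or 8 hours, happens to be MySQL’s default):

    -- Check the current timeout, in seconds
    SHOW VARIABLES LIKE 'wait_timeout';

    -- Raise it for new non-interactive connections
    SET GLOBAL wait_timeout = 28800;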
We came to understand that one of the queries, the one referenced in the error above, was failing because its connection had been sitting idle, waiting on another very large query to finish, for longer than wait_timeout allowed, so the server closed it. We were able to replicate this behavior in a local environment by reducing the wait_timeout variable, running two competing queries at the same time, and constructing one of them to be very large and time consuming.
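A minimal sketch of that reproduction, with illustrative timings (SLEEP stands in for the large query):

    -- Session 1: simulate the long-running query
    SELECT SLEEP(120);

    -- Session 2: shrink the timeout well below session 1's runtime
    SET SESSION wait_timeout = 10;
    -- ...sit idle while session 1 runs; after 10 seconds the server
    -- closes this connection, and the next statement sent on it fails.
    -- From PHP this surfaces as the "Error while sending QUERY packet"
    -- warning seen above.
    SELECT 1;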
There is no way we would have caught this issue prior to a production deployment had we not made an attempt at replicating the live MySQL instance and its data in a test environment. Overall, I think our team learned a valuable lesson about both the scale of an upgrade from 2.1 to 2.3 and the importance of data replication in a test environment.