MySQL 8 is coming

OpenSource.com covers the upcoming release of MySQL 8.

What happened to 6 & 7?

Years ago, before the Sun Microsystems purchase of MySQL AB, there was a version of MySQL with the number 6. Sadly, it was a bit ambitious and the change of ownership left it to wither. The MySQL Cluster product has been using the 7 series for years. With the new changes for MySQL 8, developers feel they have modified it enough to bump the big number.

The new version brings a whole lot of changes: a reworked filesystem organization, index improvements, faster ALTER TABLE, and more.

GitHub to MySQL

GitHub to MySQL is a handy little app in PHP that pulls labels, milestones and issues from GitHub into your local MySQL database.  This is useful for analysis and backup purposes.

There are a few example queries provided that show issues vs. pull requests, average number of days to merge a pull request over the past weeks, average number of pull requests open every day, and total number of issues.
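For a taste of what these look like, here is a rough sketch of the issues vs. pull requests query (the table and column names here are hypothetical – check the app’s actual schema before running anything):

# hypothetical schema: an issues table with an is_pull_request flag
mysql -e "SELECT is_pull_request, COUNT(*) AS total FROM issues GROUP BY is_pull_request" github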

I think this tool can easily be extended to pull other information from GitHub, such as release notes, projects, and webhooks.  Also, if you are using multiple version control services, such as Bitbucket and GitLab, extending this tool can help with merging data from multiple sources and cross-referencing it with the company’s internal tools (bug trackers, support ticketing systems, CRM, etc).

This is not something I’ll be doing right now, but I’m sure that day is not too far away.

RethinkDB: why we failed

Startups are born and gone every single day, even more so in the technology sector.  Most of them just disappear into the ether.  RethinkDB at least leaves behind a useful analysis of what happened and why the company failed.

When we announced that RethinkDB is shutting down, I promised to write a post-mortem. I took some time to process the experience, and I can now write about it clearly.

In the HN discussion thread people proposed many reasons for why RethinkDB failed, from inexplicable perversity of human nature and clever machinations of MongoDB’s marketing people, to failure to build an experienced go-to-market team, to lack of numeric type support beyond 64-bit float. I aggregated the comments into a list of proposed failure reasons here.

Some of these reasons have a ring of truth to them, but they’re symptoms rather than causes. For example, saying that we failed to monetize is tautological. It doesn’t illuminate the reasons for why we failed.

In hindsight, two things went wrong – we picked a terrible market and optimized the product for the wrong metrics of goodness. Each mistake likely cut RethinkDB’s valuation by one to two orders of magnitude. So if we got either of these right, RethinkDB would have been the size of MongoDB, and if we got both of them right, we eventually could have been the size of Red Hat[1].

Thank you, guys.  There are valuable lessons in there.  And, of course, the three main points:

If you remember anything about this post, remember these:

  • Pick a large market but build for specific users.
  • Learn to recognize the talents you’re missing, then work like hell to get them on your team.
  • Read The Economist religiously. It will make you better faster.

Amazon RDS and Amazon Virtual Private Cloud (VPC)

Yesterday I helped a friend figure out why he couldn’t connect to his Amazon RDS database inside the Amazon VPC (Virtual Private Cloud).  It was the second time someone asked me to help with Amazon Web Services (AWS), and the first time I was actually helpful.  Yay!

While I do use quite a few of the Amazon Web Services, I don’t have any experience with Amazon RDS yet, as I’m managing my own MySQL instances.  It was interesting to get my toes wet troubleshooting it.

Here are a few things I’ve learned in the process.

Lesson #1: Amazon supports two different ways of running the RDS service – inside a VPC, or on the older EC2-Classic platform.  Make sure you know which one you are using and act accordingly.

[Diagram: gs-vpc-network]

If you run an Amazon RDS instance in a VPC, you’ll have to set up your networking and security access properly.  This page – Connecting to a DB Instance Running the MySQL Database Engine – will only be useful once everything else is taken care of.  It’s not your first and only manual to visit.
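If you are not sure which of the two you are dealing with, the stock AWS CLI can tell you (the query fields below come straight from the RDS API; adjust the output format to taste):

# list each instance with its VPC (empty for EC2-Classic) and public accessibility
aws rds describe-db-instances \
    --query 'DBInstances[*].[DBInstanceIdentifier,DBSubnetGroup.VpcId,PubliclyAccessible]' \
    --output table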

Lesson #2 (sort of obvious): Make sure that both your Network ACLs and Security Groups allow all the necessary traffic in and out.  Double-check the IP addresses in the rules.  Make sure you are not using a proxy server when looking up your external IP address on WhatIsMyIP.com or similar.
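Both are easy enough to eyeball in the console, but the AWS CLI works here too (the IDs below are made-up placeholders):

# inbound/outbound rules of the instance's security group
aws ec2 describe-security-groups --group-ids sg-0123456789abcdef0
# network ACL attached to the subnet the instance lives in
aws ec2 describe-network-acls --filters Name=association.subnet-id,Values=subnet-0123456789abcdef0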

Lesson #3: Do not use ICMP traffic (ping and such) as a troubleshooting tool.  It looks like Amazon RDS won’t be ping-able even if you allow it in your firewalls.  Try “telnet your-rds-end-point-server your-rds-end-point-port” (for example: “telnet 1.2.3.4 3306”) or a real database client, like the command-line MySQL one.
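In other words, something along these lines (the endpoint is a made-up example – use the one from your RDS console):

# raw TCP check against the MySQL port
telnet myinstance.abcdefgh12.us-east-1.rds.amazonaws.com 3306
# or, better yet, a real client connection
mysql -h myinstance.abcdefgh12.us-east-1.rds.amazonaws.com -u admin -p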

Lesson #4: Make sure your routing is set up properly.  Check that the subnet in which your RDS instance resides has the correct routing table attached to it, and that the routing table has the default gateway route (0.0.0.0/0) configured to either the Internet Gateway or to some sort of NAT.  Chances are your subnet only deals with the private IP range and has no way of sending traffic outside.
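The AWS CLI saves some console clicking here as well; the routes of the table attached to a given subnet can be pulled up like so (hypothetical subnet ID again) – look for a 0.0.0.0/0 entry pointing at an igw-… or a NAT:

aws ec2 describe-route-tables \
    --filters Name=association.subnet-id,Values=subnet-0123456789abcdef0 \
    --query 'RouteTables[*].Routes'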

Lesson #5: When confused, disoriented, and stuck, assume it’s not Amazon’s fault.  Keep calm and troubleshoot like any other remote connection issue.  Double-check your assumptions.

There’s probably a lesson #6 in there somewhere, about contacting support or something along those lines.  But in this particular case it didn’t come to that.  Amazon AWS support is excellent, though.  I had to deal with those guys twice in the last two-something years, and they were awesome.

Taking the Pain Out of MySQL Schema Changes

“Taking the Pain Out of MySQL Schema Changes” covers the following approaches to deploying MySQL schema changes:

  1. Schema Change in Downtime
  2. Role Swap (cluster setup)
  3. pt-online-schema-change

The last one uses the pt-online-schema-change tool developed by the Percona guys as part of their Percona Toolkit – an open source set of command-line tools for MySQL.
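For the flavor of it, a typical run looks roughly like this (the database, table, and column here are made up; the tool’s own documentation rightly insists on a dry run first):

# rehearse the change without touching the table
pt-online-schema-change --alter "ADD COLUMN notes TEXT" D=mydb,t=mytable --dry-run
# then do it for real
pt-online-schema-change --alter "ADD COLUMN notes TEXT" D=mydb,t=mytable --execute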

Database Engines Ranking

[Image: db-engines-ranking-table]

DB-Engines.com provides some insight into the most popular database engines (312 of them, to be precise).  Nothing too surprising there – Oracle and MySQL lead the charts – but it’s nice to have the numbers and trends.

[Image: db-engines-ranking]

There are, of course, many different ways to calculate popularity.  Their method is based on the popularity of each engine across a variety of online outlets, from Google Search to social networks:

  • Number of mentions of the system on websites, measured as number of results in search engines queries. At the moment, we use Google, Bing and Yandex for this measurement. In order to count only relevant results, we are searching for <system name> together with the term database, e.g. “Oracle” and “database”.
  • General interest in the system. For this measurement, we use the frequency of searches in Google Trends.
  • Frequency of technical discussions about the system. We use the number of related questions and the number of interested users on the well-known IT-related Q&A sites Stack Overflow and DBA Stack Exchange.
  • Number of job offers, in which the system is mentioned. We use the number of offers on the leading job search engines Indeed and Simply Hired.
  • Number of profiles in professional networks, in which the system is mentioned. We use the internationally most popular professional networks LinkedIn and Upwork.
  • Relevance in social networks. We count the number of Twitter tweets, in which the system is mentioned.

It seems objective and representative enough to me.

Magento database maintenance

If you are running a Magento-based website, make sure you add the database maintenance script to cron.  For example, append this to /etc/crontab:

# Magento log maintenance, as per
# https://docs.nexcess.net/article/how-to-perform-magento-database-maintenance.html
0 23 * * 0 root (cd /var/www/html/mysite.com && php -f shell/log.php clean)
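Before scheduling it, you can also run the same script by hand to see the current state of the log tables (if I remember correctly, the script supports a status action alongside clean):

cd /var/www/html/mysite.com
php -f shell/log.php status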

Thanks to this page, obviously.  You’ll be surprised how much leaner your database will be, especially if you get any kind of traffic to the site.  Your database backups will also appreciate the trim.


SQL Server in a Fedora Docker Container

[Image: MS SQL Server and Docker]

It’s a well-known fact that I am not the greatest fan of Microsoft and their technologies.  I’ve been bitten many a time over the years.  And not even their becoming a Platinum member of the Linux Foundation can change my attitude towards them.  It’s just been too much pain, and scars, and tears, and sweat.

But the way life is, once in a while I just have to work with or around them.  Recently at work, for example, we did a project that simply had to use MS SQL Server, and there was no way around it.  Thankfully, I managed to find just the right image on the Amazon AWS Marketplace and spin up a new EC2 instance for testing.  Local development was difficult, but at least we had a place to test things before sending them off to the customer.

If such a need arises in the future, I think I’ll give MS SQL Server for Linux a try.  And that’s when this article from Fedora Magazine might come in handy.  MS SQL + Docker + Fedora.  Hmm.
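Judging by the article, the whole thing boils down to a single docker run against Microsoft’s image (the SA password below is a placeholder; the image name is the one Microsoft published at the time of writing):

docker run -d --name mssql \
    -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=YourStrong!Passw0rd' \
    -p 1433:1433 \
    microsoft/mssql-server-linux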

Fixing “InnoDB: Error: log file ./ib_logfile0 is of different size”

For the last few days I’ve been moving MySQL databases around at work.  Being in a bit of a rush and overconfident (I have backups!), I was simply detaching the /var/lib/mysql volume on one host (running the Amazon Linux AMI and MySQL) and attaching it to another host (running CentOS 7 and MariaDB).

It’s not surprising that I got this error: “InnoDB: Error: log file ./ib_logfile0 is of different size”.  Thankfully, this ServerFault thread provided enough hints for me to solve the problem.  In a nutshell (a sketch in shell follows the list):

  1. Temporarily comment out the InnoDB log file size setting (e.g.: innodb_log_file_size = 64M) in /etc/my.cnf.
  2. Set innodb_fast_shutdown to 0 (read more).
  3. Restart the MySQL service once or twice.
  4. Uncomment the log file size setting.
  5. Set InnoDB fast shutdown back to default or remove it from your my.cnf altogether.
  6. Celebrate!
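In shell terms, the whole dance looks roughly like this (a sketch only, assuming a systemd box running MariaDB and the 64M example value; adjust paths and the service name to your setup):

# steps 1-2: edit /etc/my.cnf:
#   #innodb_log_file_size = 64M    <- temporarily commented out
#   innodb_fast_shutdown = 0       <- temporarily added
sudo systemctl restart mariadb     # step 3: restart once or twice
# steps 4-5: edit /etc/my.cnf again:
#   innodb_log_file_size = 64M     <- uncommented
#   (the innodb_fast_shutdown line removed or set back to default)
sudo systemctl restart mariadb
# step 6: celebrate!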

Knowing how little I learn from my own mistakes, I’m sure I’ll find this post useful in the future.

MySQL, PHP and “Integrity constraint violation: 1062 Duplicate entry”

Anna Filina blogs about an interesting problem she encountered when working on a PHP and MySQL project:

MySQL was complaining about “Integrity constraint violation: 1062 Duplicate entry”. I had all the necessary safeguards in my code to prevent duplicates in that column.

I gave up on logic and simply dumped the contents of the problematic column for every record. I found that there was a record with and without an accent on one of the characters. PHP saw each as a unique value, but MySQL did not make a distinction, which is why it complained about a duplicate value. It’s a good thing too, because based on my goal, these should have been treated as duplicates.

She also mentions two possible solutions to the problem:

My solution was to substitute accented characters before filtering duplicates in the code. This way, similar records were rejected before they were sent to the database.

and

As pointed out in the comments, a more robust and versatile solution would be to check the collation on the column.
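For the archives: checking the collation on a column is quick (the table and database names below are made up; look at the Collation column in the output):

mysql -e "SHOW FULL COLUMNS FROM users" mydb
# if accents should matter, switch to an accent-sensitive collation, e.g.:
mysql -e "ALTER TABLE users MODIFY name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin" mydb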

I’m sure this will come in handy one day.