Before I start, I want to give a little background. INAP operates a very comprehensive CRM (customer relationship management) system which generates a lot of information. We store this information in a MySQL database, and as the database has grown over time we have had trouble keeping up with our backup needs. Lucky for us, we have a lot of very smart and talented systems administrators on staff… Here’s the story of the challenge we faced, and how we solved it.
First, keeping ANY database without backups is a very bad idea. Keeping just one copy of the data (any data, for that matter) is a terrible idea. One should always, always, always have backups. For static data, such as word documents, or photos and music files, making a manual backup here and there is fine. However, for something like an active MySQL database it’s not. Database information changes constantly, in our case there are hundreds of queries at any given moment. Because of this the actual database changes, and keeping a backup from even a few hours ago is just not enough. To remedy this, MySQL (and most other database systems) has an option for replication.
Here is how it works:
For this configuration to work, one would need two servers. Ideally connected to each other using a cross over cable (a direct connection between the two machines), but being on the same internal network should be fine, too. Then configure MySQL in a master/slave configuration, where one of the servers is the primary and one is the slave. The slave would have a constant copy of the database that is updated in real time, so when data is written to the master, the master will write to the slave.
There are a lot of advantages to doing it this way. One, there is a hot spare that is constantly up-to-date. If something happens to the primary database server, then just switch to the hot spare (be careful, though, because the replication setup would then need to be reversed…). The other advantage is that there won’t be any locked tables during database dumps. Instead, one could dump off the replication server to a backup server, and the replication will pick up where it left off (bin logs), once the dump is completed.
So, now for the pitfalls. This type of configuration requires more than one server. I like to run this with three: one master, one slave, and one for hard dumps made nightly. This gives you at least two copies of your database that are current up-to-the-minute, and one nightly full dump backup made off the master.
The moral of the story: Dynamic, or user-generated data, is invaluable. Often impossible to recreate. Keeping proper backups is only half the battle because if the data changes frequently, the backups will be out of date by the time they are needed. Replication solves this by keeping a live second copy of the data — and also solves the problem with restoration time because the database servers can simply be flipped around (the slave becomes the master), and bam! You’re back in business.
Updated: January 2019