Wednesday, July 30, 2008

Quickly Populate a Table

There are a lot of times in MySQL when I want to populate a table with dummy data. I've used scripts to do this on many occasions; however, this stored procedure works just as well. I'm posting it here mainly for my own reference. Obviously the insert and while sentinel need to be modified for your particular uses.

delimiter //
create procedure populate_table()
begin declare i int default 0;
while i < 10000
insert into test values (i, 'test', 'test2');
set i = i+1;
end while;
delimiter ;

Window Management

If you've got a bunch of split windows in Vim and want to close them, here are two handy window management shortcuts.

Ctrl-w q (unsplit one window at a time)
Ctrl-w Ctrl-o (close all windows except current)

Thanks to Bill Pitre for the tips!

Tuesday, July 29, 2008

MySQL High Availability, Sandbox + Proxy

I've been doing a lot of research and testing lately in regard to getting a high availability MySQL solution in place. Namely, I've been looking into moving from a master/slave topology to a multi-master replication ring. This article is not intended to be a comprehensive howto; rather, a cursory overview of some of the available technologies and how they fit together.

Multi-master replication is one area where MySQL leads the pack of open-source DB solutions. Until 5.x, there wasn't really a satisfying solution in regard to avoiding primary-key collissions across masters. Fortunately, this key obstacle has since been resolved with the addition of a few simple commands:

server 1:
replicate-same-server-id = 0
auto-increment-increment = 2
auto-increment-offset = 1

server 2:
server-id = 2
replicate-same-server-id = 0
auto-increment-increment = 2
auto-increment-offset = 2

By defining a unique increment and offest value per server, primary key collisions on inserts are avoided when the binlog is processed. Telling MySQL not to replicate the same server ID means that in a circular replication topology, once an event has made it's way around the ring, it stops at the server from which it originated.

Setting up a test environment with multiple MySQL servers used to be a pain. Luckily, a relatively new project on the scene allows one to easily setup an arbitrary number of MySQL install instances on a single machine. Enter MySQL sandbox. Setting up a replication ring consisting of two masters is a matter of downlading the tarball of the version you want to sandbox and running the following command:

./make_replication_sandbox --master_master /path/to/mysql/tarball.gz

Passing the appropriate command-line arguments allows you to specify the number of servers you want sandboxed, concurrent versions you want installed, the topology, etc...

The last piece of software I want to mention is MySQL proxy. Although still in pre-beta, proxy is one of the more impressive pieces of software I've seen lately. As most proxies are, it's a piece of middleware, in this case, delegating wire requests between a client and MySQL server. An embedded lua interpreter allows fine-grained control over all aspects of communication. Creative uses include things like re-writing query syntax and providing a SQL interface to the Unix shell; however, the real attractiveness of proxy in my opinion is that it provides two very valuable facilities not baked into MySQL by default. The first is load balancing, and the second is failover. Consider an invocation such as the following:

LUA_PATH="/usr/share/mysql-proxy/?.lua" /usr/sbin/mysql-proxy \
--proxy-address=:3306 \
--proxy-lua-script=/usr/share/mysql-proxy/rw-splitting.lua \
--proxy-backend-addresses= \
--proxy-backend-addresses= \

This would tell proxy to listen on port 3306 in place of your standard MySQL server. Two read-write backends are given to proxy (ports 19101 and 19102) and one read-only backend (port 19103) is provided. Proxy is smart enough to delegate read-write requests to the masters and read-only requests to the slave. From my tests, requests seem to be sent to the first master by default until load picks up. From there, they seem to be load balanced in a round-robin type fashion. In the event that a server fails, proxy will gracefully remove a backend from the server pool until it comes back online. In regard to concurrency, it can scale well into 1000 connections from my own tests; although, real-world usage obviously depends on your hardware configuration among other things. Used in concert with MySQL sandbox, the possibilities are left to your imagination.

Obviously, I've only scratched the surface of each of these technologies. Really digging into their capabilities and learning their inherent limitations is left as an exercise to the reader. I leave you with the following links.

Dual-Master MySQL Replication Done Right

Advanced Replication Techniques

MySQL Sandbox

MySQL Proxy

Tuesday, July 22, 2008

Less? More? Vim

When viewing large files, people typically tend to use a pager program such as more or preferably less. Neither of these buffers the entire file into memory, so it's quick to open a large file and scroll around. If you're using more, you should try less. It lets you scroll backwards in the file as well as forwards.

Out of the box, Vim doesn't do so great for really big files. Features like syntax highlighting and undo history can really slow things down on huge buffers; however, there is a plugin that can ease the pain. The LargeFile plugin is available here, and it really does seem to speed up opening and moving around in files larger than a few megs. A word of caution though, if you're dealing with really huge files, a pager program is still the way to go due to the sheer amount of memory required to manipulate those files.

Monday, July 21, 2008

Untar a Single File

From time to time, I've had the need to extract a single file from a tar archive. How to do so is pretty poorly documented in the man page, but it's actually easy to do. If you have a tarball named "my.tar" with a group of files in it, and you want to extract "hello.txt", you would do the following:

tar -x hello.txt -vf my.tar

Before I extract anything from a tarball, I always preview it's contents to avoid "tarbombs" (tarballs that extract directly into the CWD).

tar -tvf my.tar

Or if it's gzipped:

tar tzvf my.tar.gz

Lastly, newer versions of Vim can browse a tarball out of the box and open any file contained within.

vim my.tar (displays a file browser)

Friday, July 18, 2008

Sorting Files By Size

In case anybody was wondering, I'm still alive. I've been inordinately busy lately and have been spending the majority of my free time online re-learning Ruby on Rails. I've also been digging into multi-master and circular replication schemes for MySQL and MySQL proxy. I plan on sharing some of the things I've learned sometime in the near future. Anyway, here's a handy tip that I just used this morning.

If you want to sort all the files in a directory by size, you can do the following:

ls -lsR | sort -nr

The sort command will order by the number of blocks in each file. The biggest files are at the top.