Bob Mitchell

bob-o-rama: The Art of Slack.

Linux Ramdisks

Making things go faster - less waiting on disks

18th August 2010 - 22:01 - bob

Thinking about using /dev/shm for fast storage (in my case for Apache) made me think about ramdisks in general.

The last time I tried anything to do with ramdisks was on my old Amstrad 5286, when I created a massive 1MB (ish) ramdisk under MS DOS 3.3 (probably). This time I was using a £500 server from ebay with 16GB of memory.

The idea was to keep a few GB for the OS and allocate the rest to a ramdisk - about 4GB for the OS and apps and whatever else was going and then the rest, about 12GB for the ramdisk.

I saw two options for persistence: either recreating the content of the ramdisk from scratch just after boot, or restoring it from disk.

I chose the latter option.

As I was using LVM for storage, I created a new LV of about 12G - lvcreate tends to round this up to the nearest extent, so being precise was tricky (and unnecessary).

Using fdisk on my new LV gave me the size (there are probably a million other ways, but this came to mind first):

fdisk /dev/VolGroup00/ramdiskbackup

And then 'p' to print the layout gives the exact size of the device (in bytes).

Then create the ramdisk - edit /etc/grub.conf and add the option

ramdisk_size=xxxxxxx

to create a line that looks like

kernel /vmlinuz-y.y.yyy-yyy ro root=LABEL=/ ramdisk_size=xxxxxxx

where xxxxxxx is the size in KiB (the byte count reported by fdisk, divided by 1024).
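The division can be scripted rather than done by hand. This is a minimal sketch, assuming the byte count comes from blockdev --getsize64 (part of util-linux, not something used in this post) and that the LV size divides evenly by 1024, which it should with LVM's default 4MiB extents. bytes_to_kib is a made-up helper name.

```shell
#!/bin/sh
# Convert a device size in bytes to the KiB figure that ramdisk_size= expects.
# bytes_to_kib is a hypothetical helper, not a standard tool.
bytes_to_kib() {
    echo $(( $1 / 1024 ))
}

# Example (needs root; blockdev is an assumption, fdisk works just as well):
#   size_bytes=$(blockdev --getsize64 /dev/VolGroup00/ramdiskbackup)
#   echo "ramdisk_size=$(bytes_to_kib "$size_bytes")"
```

The output of the example line can be pasted straight onto the kernel line in /etc/grub.conf.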

reboot

Checking /dev/ram* will reveal a bunch of ramdisk devices. I chose /dev/ram, which is a symlink to /dev/ram0.

Do not use more than one of these: ramdisk_size applies to every one of them, so each can grow to 12G and the box will quickly run out of memory! (Been there, done that.)

You can create a filesystem on the ramdisk using

mke2fs /dev/ram

ext2 is just fine - no journalling really needed.

But, as we need the LV we created to have the same content, I ran:

mke2fs /dev/VolGroup00/ramdiskbackup

and then copied from here using 'dd' to the ramdisk:

dd if=/dev/VolGroup00/ramdiskbackup bs=10M of=/dev/ram

Which, at about 50MB/s (the read speed of the actual hard drives being the limiting factor) takes a few minutes. (Time for tea!)
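Since everything on the ramdisk depends on this copy, it may be worth checking it afterwards. This is a rough sketch, assuming source and destination are exactly the same size (as arranged above). copy_and_verify is a hypothetical helper, and the cmp pass reads both devices again, so it roughly doubles the wait.

```shell
#!/bin/sh
# Copy SRC to DST in 10M blocks, then compare the two byte for byte.
# Returns non-zero if the copy fails or the contents differ.
# copy_and_verify is a made-up name for illustration.
copy_and_verify() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=10M || return 1
    cmp -s "$src" "$dst"
}

# Example (as root, ramdisk not mounted):
#   copy_and_verify /dev/VolGroup00/ramdiskbackup /dev/ram && echo "restore OK"
```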

Once complete, create a mountpoint and mount up the ramdisk

mkdir /ramdisk
mount /dev/ram /ramdisk

Then do what you want with it.

I moved the MySQL data directory and temp directory to this location:

service mysqld stop
mkdir -p /ramdisk/mysql/data
mkdir -p /ramdisk/mysql/tmp
vi /etc/my.cnf

Change datadir and tmpdir to point at your ramdisk:

datadir=/ramdisk/mysql/data
tmpdir=/ramdisk/mysql/tmp

Move the mysql database files to the ramdisk

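mv /var/lib/mysql/* /ramdisk/mysql/data/

The new directories were created by root, so they probably need handing over to the user mysqld runs as (usually mysql on Red Hat style systems; yours may differ):

chown -R mysql:mysql /ramdisk/mysql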

and startup mysql again

service mysqld start

I applied the same process to my application and to parts of httpd (mostly the docroot, which, in my case, was a bit disk-intensive).

Once all this is done and vaguely working, back it up to disk:

service mysqld stop
service httpd stop
umount /ramdisk
dd if=/dev/ram bs=10M of=/dev/VolGroup00/ramdiskbackup

This took about twice as long as before, as the write speed of the actual disks was only about 20MB/s (time for tea and cake!)
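As this backup will be repeated regularly, it could be wrapped in a small script. The following is only a sketch under a few assumptions: the paths are the ones used in this post, the remount and service restarts at the end are an addition not shown above, and DRY_RUN, run and backup_ramdisk are made-up names. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: the backup steps as one function.
# Paths are the ones used in this post; DRY_RUN is a made-up switch.
RAMDEV=/dev/ram
BACKUP=/dev/VolGroup00/ramdiskbackup
MOUNTPOINT=/ramdisk
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop services, unmount, copy the ramdisk to the LV, then (assumption:
# not part of the original steps) remount and restart the services.
# The && chain stops at the first command that fails.
backup_ramdisk() {
    run service mysqld stop &&
    run service httpd stop &&
    run umount "$MOUNTPOINT" &&
    run dd if="$RAMDEV" of="$BACKUP" bs=10M &&
    run mount "$RAMDEV" "$MOUNTPOINT" &&
    run service mysqld start &&
    run service httpd start
}

# Example: print the plan first, then run it for real as root.
#   backup_ramdisk
#   DRY_RUN=0; backup_ramdisk
```

Chaining with && means a failed umount (something still has files open, say) stops the script before dd copies a filesystem that is still in use.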

In my case, a simple benchmark that had been taking well over two minutes on physical disks, with ~30-40 percent iowait at times, suddenly took about forty-five seconds, with never more than 1 percent iowait. It is now completely bound by actual CPU performance.

Time to investigate faster processors.

It occurs to me that it may be possible to speed up the backup/restore operation (especially when the filesystem is quite empty) by ensuring that the empty bits of the filesystem are zeroed out and writing to a much smaller file instead of a painstakingly perfectly-sized block device.
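A minimal sketch of that idea, assuming gzip is good enough as the compressor and that the image goes to an ordinary file (the /backup path in the comments is made up). Zeroed blocks compress to almost nothing, so a mostly empty ramdisk should give a small image. save_compressed and restore_compressed are hypothetical helper names.

```shell
#!/bin/sh
# Write a gzip-compressed image of SRC (device or file) to IMAGE.
save_compressed() {
    dd if="$1" bs=10M | gzip -c > "$2"
}

# Restore a gzip-compressed IMAGE onto DST (device or file).
restore_compressed() {
    gunzip -c "$1" | dd of="$2" bs=10M
}

# Example (as root, with the ramdisk unmounted; /backup is a made-up path):
#   save_compressed /dev/ram /backup/ramdisk.img.gz
#   restore_compressed /backup/ramdisk.img.gz /dev/ram
```

To zero the free space first, one option is to fill the mounted filesystem with a file of zeros and delete it again (dd if=/dev/zero of=/ramdisk/zero.fill bs=1M; rm /ramdisk/zero.fill). Bear in mind that this touches every free block, so the ramdisk should be expected to occupy its full size in RAM afterwards.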

Of course, this really shouldn't be too much of a surprise. You take a heavily IO bound workload and improve the performance of the storage layer by at least 100x and you're no longer bound by IO. Still, it's wonderful to see and makes me happy.

This isn't for everyone. It will only work if you are IO-bound (DB writes in my case), have enough RAM for your whole dataset, can either take your app down to back it up or don't mind bringing a stale copy up to date, and have reasonably reliable hardware and power.
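The "enough RAM" condition is easy to check up front. This is a rough sketch, assuming du -sk gives a fair estimate of the dataset and that MemTotal from /proc/meminfo is the right figure to compare against. fits_in_ram is a made-up helper, and the 4GB reserve in the example matches the OS/ramdisk split described earlier.

```shell
#!/bin/sh
# Succeed if the data under DIR plus RESERVE_KIB (kept back for the OS)
# fits within total RAM. The optional third argument names a meminfo-style
# file and defaults to /proc/meminfo.
# fits_in_ram is a hypothetical helper for illustration.
fits_in_ram() {
    dir=$1
    reserve_kib=$2
    meminfo=${3:-/proc/meminfo}
    data_kib=$(du -sk "$dir" | awk '{print $1}')
    total_kib=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    [ $((data_kib + reserve_kib)) -le "$total_kib" ]
}

# Example: MySQL data directory, keeping 4GB (4194304 KiB) for the OS.
#   fits_in_ram /var/lib/mysql 4194304 && echo "should fit"
```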

Still - felt compelled to write about it.

Please let me know about any corrections / comments / ideas / whatever.