Zach Shepherd's WordPress Blog

Just another WordPress weblog

Sunday, October 28, 2007

Issue with etch images

There seems to be a fairly consistent problem on our virtual machines running Debian etch. Jeremy noticed it on cluster, then I noticed the same issue on time and docs. The symptom: when udev isn’t installed, /dev/null and /dev/false aren’t accessible to non-root users. The fix: installing udev. I’m not exactly sure what the underlying problem is, but it certainly causes various programs (e.g. bash) to throw a “few” errors about not being able to write to /dev/null.
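
A quick way to confirm the symptom on a given machine is to test the device node as an ordinary user. The following is only a minimal sketch: the function name check_dev_nodes is made up for illustration, and it checks nothing beyond "is this a character device the current user can write to".

```shell
#!/bin/sh
# check_dev_nodes NODE...
# Hypothetical helper: reports whether each device node exists as a
# character device and is writable by the current user. Run it as a
# non-root user, since root bypasses the permission check.
check_dev_nodes() {
    status=0
    for node in "$@"; do
        if [ ! -c "$node" ]; then
            echo "$node: not a character device"
            status=1
        elif [ ! -w "$node" ]; then
            echo "$node: not writable by $(id -un)"
            status=1
        else
            echo "$node: ok"
        fi
    done
    return "$status"
}
```

Calling "check_dev_nodes /dev/null" before and after installing udev should show whether the fix took effect.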

posted by Zach at 5:17 pm  

Sunday, October 28, 2007

The New Production Blade Setup

Because of the recent issues with 0001 and 0010, and because the only proposed fix was to update the image (which would involve recompiling the kernel), we decided it would be easier (and make more sense) to speed up development of a 64-bit dom0, which Jeremy already had in development on berylium [formerly 0100]. The goal is to get it to the point where it can be copied onto the other blades to get everything back up, instead of fixing the current dom0s and then switching to 64-bit in a week or two. A “rough draft” has been put on lithium [formerly 0011], and most of the VMs are up and running there. The game plan is as follows:

  • Get berylium cleaned up and everything installed and working (e.g. nagios plugin and live migration setup)
  • Boot into a live cd and tar it up (tar -cvvpf backups/tarnamehere.tar . --exclude=backups/* --exclude=dev/* --exclude=media/* --exclude=mnt/* --exclude=proc/* --exclude=tmp/* --exclude=xen/*)
  • Boot 0001 and 0010 into live cds and untar it there
    • Change dhclient.conf
    • Change hostname
    • Change hosts
    • Change sshd_config
    • Change motd
    • Change fstab UUIDs
    • Change macid in network-bladecenter-bridges
    • Remove udev rules for networking
  • Reboot the blades (which would, at that point, be hydrogen and helium)
  • Hope everything works (troubleshoot as necessary)
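
The tar step in the plan above can be sketched as a small function. This is an illustration rather than the exact command that was run: the function name archive_root is made up, the verbose flags are dropped, the exclude patterns are written relative to "." and placed before the path (which GNU tar prefers), and the archive is written wherever the caller asks.

```shell
#!/bin/sh
# archive_root ROOT OUT
# Hypothetical helper: archives the filesystem mounted at ROOT into OUT,
# skipping the contents of directories that should not be copied to
# another blade. The directories themselves are kept as empty mount points.
archive_root() {
    root=$1
    out=$2
    tar -cpf "$out" \
        --exclude='./backups/*' \
        --exclude='./dev/*' \
        --exclude='./media/*' \
        --exclude='./mnt/*' \
        --exclude='./proc/*' \
        --exclude='./tmp/*' \
        --exclude='./xen/*' \
        -C "$root" .
}
```

From the live cd this would be called with the mounted root of berylium as ROOT, and the resulting archive unpacked on the other blades with "tar -xpf".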
posted by Zach at 11:19 am  

Sunday, October 28, 2007

Renaming the Nagios Image

Because the blades now (or will shortly) have 64-bit support, I changed the name of the current Nagios image to nagios32 and created a new 64-bit nagios image for Matt to move stuff over to. The rename process is kind of long (not nearly as bad as the resize process), so I thought I’d document it here (although it’s already on the wiki).

  • Shut down the image
  • Kill the instance of vblade for the image
  • Move the physical file
  • Mount the filesystem
  • Edit the config files
    • /etc/hostname
    • /etc/dhcp3/dhclient.conf
    • /etc/ssh/sshd_config
  • Edit the masterlist on animal
  • Start the vblade daemon
  • Edit the hosts file on righteous
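
The "edit the config files" step above could be scripted roughly as follows. This is a sketch under assumptions: the function name rename_in_image is made up, it assumes the old hostname appears literally in those three files, and it assumes both names contain only letters, digits and hyphens (so they are safe inside a sed pattern). It does a plain substring replacement, so it should only be run once per rename.

```shell
#!/bin/sh
# rename_in_image MOUNTPOINT OLD NEW
# Hypothetical helper: replaces OLD with NEW in the three config files
# listed above, inside an image mounted at MOUNTPOINT. Files that do
# not exist are skipped.
rename_in_image() {
    mnt=$1
    old=$2
    new=$3
    for rel in etc/hostname etc/dhcp3/dhclient.conf etc/ssh/sshd_config; do
        file=$mnt/$rel
        [ -f "$file" ] || continue
        # Write through a temporary file, then copy the contents back so
        # the original file keeps its owner and permissions.
        sed "s/$old/$new/g" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    done
}
```

For the rename described here, the call would look like "rename_in_image /mnt/tmp nagios nagios32", where /mnt/tmp stands in for wherever the image's filesystem is mounted.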
posted by Zach at 11:08 am  

Saturday, October 27, 2007

Resizing the COSI image

The COSI image ran into an interesting issue: it ran out of space. For future reference, the resize process is as follows:

  • Shut Down Image
  • Create Backup Copy (cp /mnt/raidB/xenlib/images/imagename.disk /mnt/raidB/xenlib/temp/)
  • Turn Image back on if desired
  • Create image of new size
    • “dd if=/dev/zero of=/mnt/raidB/xenlib/temp/new_imagename.disk bs=1024k count=1 seek=X”
  • Partition the new image, mount the images, and copy the files, preserving permissions
    • “losetup -f” to get next open loopback device. (the following commands assume loop0 and loop1 are both free)
    • “losetup /dev/loop0 /mnt/raidB/xenlib/temp/new_imagename.disk”
    • “losetup /dev/loop1 /mnt/raidB/xenlib/temp/imagename.disk”
    • “fdisk /dev/loop0” (Create swap and root partitions)
    • “kpartx -av /dev/loop0”
    • “kpartx -av /dev/loop1”
    • “mkswap /dev/mapper/loop0p1” (assumes swap partition is 1)
    • “mkfs.ext3 /dev/mapper/loop0p2” (assumes root partition is 2)
    • “mkdir /mnt/{old,new}”
    • “mount /dev/mapper/loop0p2 /mnt/new” (assumes root partition is 2)
    • “mount /dev/mapper/loop1p2 /mnt/old” (assumes root partition is 2)
    • “cp -pr /mnt/old/* /mnt/new” (copies from the old image to the new one)
    • “umount /mnt/new”
    • “umount /mnt/old”
    • “kpartx -d /dev/loop0”
    • “kpartx -d /dev/loop1”
    • “losetup -d /dev/loop0”
    • “losetup -d /dev/loop1”
  • Shut down the current image
  • Shut down the vblade export
  • Put the new image in place (mv /mnt/raidB/xenlib/images/{,old_}imagename.disk; mv /mnt/raidB/xenlib/{temp/new_,images/}imagename.disk)
  • Start the vblade export
  • Start the image
  • If everything works, clean up after yourself (rm /mnt/raidB/xenlib/temp/imagename.disk /mnt/raidB/xenlib/images/old_imagename.disk)
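
One detail of the dd step above is easy to get wrong: with bs=1024k and count=1, dd skips X blocks and then writes one more, so the file ends up (X + 1) MiB long. The sketch below wraps that arithmetic so the caller can ask for the final size directly. The function name make_sparse_image is made up for illustration.

```shell
#!/bin/sh
# make_sparse_image PATH SIZE_MIB
# Hypothetical helper: creates a sparse image file that is exactly
# SIZE_MIB MiB long. dd seeks past (SIZE_MIB - 1) blocks of 1 MiB and
# writes a single block of zeros, so the skipped region takes no disk
# space until something is written into it.
make_sparse_image() {
    path=$1
    size_mib=$2
    dd if=/dev/zero of="$path" bs=1024k count=1 \
        seek=$((size_mib - 1)) 2>/dev/null
}
```

For a 10 GiB image the call would be "make_sparse_image /mnt/raidB/xenlib/temp/new_imagename.disk 10240".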
posted by Zach at 2:57 pm  

Saturday, October 27, 2007

Modification to Animal’s confgen Script

The confgen script on animal now has two additional boolean flags: an enabled/disabled flag for each network (so virtual machines can be on only one network if necessary).

posted by Zach at 11:59 am  

Tuesday, October 16, 2007

A note on using AoE with Virtual Machines

An issue I ran into recently (and was finally able to confirm was solved) is that when a virtual machine on the COSI setup mounts an AoE device, some special considerations need to be made. Because the modules directory is on a separate partition (shared between the images), modules listed in /etc/modules won’t be loaded, because the partition holding them hasn’t been mounted yet. By the time the devices in fstab are mounted, the modules are accessible, but because the AoE modules were never loaded, the mounting of the AoE devices fails.

The solution? Add the modprobe commands, followed by “mount -a” to the rc.local (which is run after the modules directory has been mounted).
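
As a rough sketch, the addition to rc.local looks something like the following. The only module loaded here is aoe, which is an assumption; list whatever modules the image actually needs. The RUN variable and the function wrapper are not part of the real rc.local: they are an illustrative dry-run switch so the commands can be previewed without touching the system.

```shell
#!/bin/sh
# Sketch of the lines added to /etc/rc.local.
# RUN defaults to "echo", so the commands are only printed.
# Set RUN to the empty string (RUN=) to actually execute them.
RUN=${RUN-echo}

load_aoe_and_mount() {
    # By the time rc.local runs, the shared modules partition is
    # mounted, so the AoE driver can finally be loaded.
    $RUN modprobe aoe || return 1
    # Retry everything in fstab; the AoE-backed entries failed
    # earlier in the boot because the driver was missing.
    $RUN mount -a
}
```

In the real file the two commands would simply appear on their own lines before the final "exit 0".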

If anyone knows of a better solution, I’d certainly be interested, but this appears to work for now.

posted by Zach at 9:02 pm  

Saturday, October 6, 2007

Web Accessibility and the Americans with Disabilities Act

It seems that lawsuits over inaccessible commercial websites are becoming more common. The outcome of these lawsuits (and the subsequent appeals) will determine whether the Americans with Disabilities Act will be enforced in the field of web accessibility. Currently there are a variety of governmental policies related to accessibility, but the outcome of the Target lawsuit will set the stage for the future.

Many web development “professionals” don’t give topics like accessibility (or standards compliance) the attention they deserve. (If you’re wondering why I stuck professionals in quotes, the article A web professional never stops learning may shed some light on the subject.) I’ve found that many people in the business world don’t understand the importance of standards compliance and accessibility in website development. They simply go for the “best” (cheapest, easiest, and fastest) solution without realizing that they’re being cheated; all they end up with is a poorly engineered site that eventually needs to be replaced. Web developers need to educate each other about the necessity of good development practices, and then they need to educate their clients (and potential clients).

posted by Zach at 8:55 pm  

Saturday, October 6, 2007

Copyright and Music

An open letter to the CIRA recently posted on Digg got me thinking about the wide leeway the Supreme Court gives Congress in its interpretation of Article I, Section 8, Clause 8 (also known as the Intellectual Property Clause).

To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

This short (and in my opinion, highly specific) piece of the Constitution is the only clause related to Intellectual Property Rights. As I read it, the clause specifically states that the purpose of protecting Intellectual Property Rights is to promote the progress of science and useful arts, and that the protection is to last only for a limited time.

How do copyright lengths on the order of lifetimes encourage progress? I just can’t imagine that the current interpretation is what the writers had in mind. The Supreme Court needs to step up and do its job; it needs to place some reasonable boundaries on what Congress can place under the umbrella of copyright.

posted by Zach at 2:33 am  

Friday, October 5, 2007

Fedora Images (rpmstrap)

I recently tried to set up a fedora Xen image with little success. Maybe I’ll be able to find something when I look around again, but I thought I’d at least make a post so I remembered where I ran into issues.

In theory the setup is basically the same as creating a Debian/Ubuntu image, except that you use rpmstrap (apt-get/yum install rpmstrap). You should then be able to run a command such as “rpmstrap --arch i386 fedora7 /mnt/tmp/”, but rpmstrap doesn’t seem to come with scripts for Fedora (it comes with quite a few, but sadly the Fedora ones were missing). I tried installing rpmstrap on a few platforms (Debian etch, Ubuntu feisty, Fedora 7) to see if the scripts were included on one of them, but I couldn’t find them if they were.

Update: It looks like rpmstrap supports up to Fedora Core 4, but since I’ve only used Core 6 and 7, I didn’t recognize the nicknames for the earlier versions. Sadly Core 4 is so out of date that there is no easy way to install it and upgrade from there, but now that I have something to work with, I might work with someone more familiar with Fedora and hammer out a script for 7 (I can’t be the only person wanting to use rpmstrap to install Fedora).

posted by Zach at 8:29 pm  

Friday, October 5, 2007

Migration to AoE

I’ve finally gotten around to making it happen. I thought it was going to be painful, but it actually wasn’t bad. It was just a matter of modprobing aoe and dm-multipath (and adding them to /etc/modules), changing the masterlist on animal, and exporting the images using the vblade daemon.
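
The steps in the paragraph above can be sketched roughly as follows. The function names are made up, the shelf, slot and interface values in the usage note are placeholders, and RUN is an illustrative dry-run switch (it defaults to "echo", so nothing is executed unless RUN is set to the empty string). vbladed is the wrapper from the vblade package that runs vblade in the background.

```shell
#!/bin/sh
# RUN defaults to "echo" (dry run). Set RUN= to execute for real.
RUN=${RUN-echo}

# load_aoe_modules: load the two modules on the machine that mounts
# the exports.
load_aoe_modules() {
    for mod in aoe dm-multipath; do
        $RUN modprobe "$mod" || return 1
    done
}

# persist_module FILE MODULE: append MODULE to FILE (normally
# /etc/modules) unless an identical line is already there.
persist_module() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# export_image SHELF SLOT NETIF IMAGE: export one image file over AoE.
export_image() {
    $RUN vbladed "$1" "$2" "$3" "$4"
}
```

A call such as "export_image 0 1 eth0 /mnt/raidB/xenlib/images/imagename.disk" would make the image show up on the clients as /dev/etherd/e0.1 once the aoe module is loaded there.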

I only ran into one issue. In the process of changing over to AoE, I started two images (comm and storage) on a blade (0010) that only had 896MB of memory for the dom0, bringing the memory for the dom0 down to 640MB (each image used 128MB). I then shut down the two images and started a different image (cosi). I thought the other two images had fully shut down, but apparently I jumped the gun. This lowered the memory for the dom0 to 128MB (cosi uses 512MB). You’d think xen would give the dom0 the extra 256MB of memory (from the shutdown of comm and storage), but it looks like it just sets it aside. I don’t know enough about xen to know if there is a command to get the memory back, and the dom0 was running dangerously slowly, so I decided to get the memory back before the network (and the network-mounted images for the vms) timed out. I did the only thing I knew would work: a reboot. It’s probably not the best solution, but it worked.

Now everything is set (including AoE on 0001) and I was able to reproduce the ‘issue’ of xen taking memory away from the dom0 when a vm was started, but just setting it aside (for use by other images if you create them) when the vm is shut down. If anyone knows anything about getting the memory back (or why the memory can’t be returned), I’d be interested in hearing about it.
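
The memory accounting described above can be worked through in a few lines of shell. The function names are made up for illustration, and the second function is only a possible avenue, not a confirmed fix: the xm toolstack does have a "mem-set" subcommand that changes a domain's memory target in MB, and pointing it at Domain-0 may let the dom0 balloon back up without a reboot. RUN is an illustrative dry-run switch that defaults to "echo".

```shell
#!/bin/sh
# dom0_after_start DOM0_MB GUEST_MB...
# Prints how much memory the dom0 has left after xen balloons it down
# to make room for each guest that is started.
dom0_after_start() {
    left=$1
    shift
    for guest in "$@"; do
        left=$((left - guest))
    done
    echo "$left"
}

# RUN defaults to "echo" (dry run). Set RUN= to execute for real.
RUN=${RUN-echo}

# restore_dom0 TARGET_MB
# Untested assumption: asks xen to raise the dom0's memory target.
restore_dom0() {
    $RUN xm mem-set Domain-0 "$1"
}
```

With the numbers from this post, "dom0_after_start 896 128 128" gives 640, and "dom0_after_start 640 512" gives 128, which matches what happened on 0010.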

posted by Zach at 7:24 pm  