digital archives and the “bit rot” problem

February 23rd, 2010

Today, I went to a presentation given by Vint Cerf and Robert Kahn. One of the problems they presented as still unsolved was the problem of retaining information in a readable format in the long term. Vint made a pretty funny joke about trying to open a PPT file from 1997 in the year 3000. Even using Windows 3000 with Office 2998, the file may not necessarily be readable.

This is a problem that many people have experienced first-hand. Have any old 5 1/4″ floppies lying around? Think you can still read them? And assuming you can, can you then read the file formats, which may be proprietary to software that no longer exists?

Lo and behold, a couple of hours after the talk, there’s a Slashdot story on the very same topic, pointing to an American Scientist article titled “Avoiding a Digital Dark Age”.

My thoughts on the matter: there are three separate layers here, namely media longevity, media format, and file format. Each needs to be designed with the same longevity goal in mind.

I will give an example here of how one software product handles this problem. That software product is Bacula. It uses an open, documented format for its file contents. So you can print out the specification on paper, if you like, and then sit down and re-implement the code and be able to read its files. It also uses the same format on different media, be it tape or disk. AFAIK, this design decision was made after seeing the evolution of ‘tar’ and GNU tar. Even with the same name, there are some versions of tar that produce incompatible files.

So the key is to use an open, documented format. Furthermore, it needs to be truly Free Software, not merely open source while still being encumbered by a patent, for example.

basic sysadmin troubleshooting part 2

January 29th, 2010
  • top
    You’ll probably want to learn some basic top commands. E.g. hit “1” to see the CPU break-out. Hit “z” to highlight processes that are in state “R”. Hit “O, n” to sort by memory usage. Hit “u” to type in a particular user name. Hit “c” to see full command lines. See also: 15 Practical Linux Top Command Examples
  • ps
    You’ll probably want to learn some basic ps switches. “ps -ef” gives you a listing with users and commands. “ps auxf” adds some CPU and memory information for the processes and shows them as a “forest”. “ps -efL” also shows you threads for multi-threaded processes. “man ps” will tell you way more than you want to know. Another, more useful example: ps -eo pmem,pcpu,rss,vsize,args|sort -k 1 -r -n|head
  • iostat
    Iostat will show you some information about the I/O subsystem. I like “iostat -k 5”; it’ll show you updates in kilobytes at 5-second intervals. The very first screen shows averages since boot; subsequent screens show only the last 5 seconds. Add “-x” to see information about queue lengths and average request size as well as “service time”, i.e. latency of I/O processing.
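
One small annoyance with the ps one-liner above is that the column titles get sorted along with the data and fall off the end. Here is a minimal sketch of a filter that keeps the header line and sorts the rest on the first column. The name top_by_first_column is made up for this example; it is not part of ps or any package.

```shell
# top_by_first_column is a hypothetical helper: it prints the header line
# unchanged, then sorts the remaining lines numerically (descending) on the
# first column and keeps the top N (default 10).
top_by_first_column() {
    n=${1:-10}
    # Read and print the header line, then sort the rest of the stream.
    IFS= read -r header
    printf '%s\n' "$header"
    sort -k 1,1 -r -n | head -n "$n"
}

# Example: the five biggest memory consumers, with column titles kept.
# ps -eo pmem,pcpu,rss,vsize,args | top_by_first_column 5
```

It reads from stdin, so it works with any ps column list as long as the column you care about comes first.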

basic sysadmin troubleshooting part 1

January 25th, 2010

There are a bunch of things that I look at almost any time I log into a machine.

  • date
    Is the output of this command what you expect? Time synchronization issues are often the cause of odd problems. If the time is wrong, check your ntpd config and output of ‘ntpq -p’.
  • w
    This will show you the uptime, first. Is that value roughly what you expect? It will show you load and users. If the machine has been up for less time than you expect, figure out why it rebooted. (And consider upgrading your sysadmin philosophy towards change management). If there are users you don’t expect to be logged in…
  • vmstat 2
    Check the swap I/O, regular I/O, check if any processes are blocked, check if memory usage changes drastically over a short time, check the CPU usage. collectl can give you more detail, but it needs installation.
  • dmesg|tail
    Are there any unusual messages here? What will you do about them?
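
If you want the whole list above in one go, something like the following rough sketch would do. The function name first_look is my own invention, and I have assumed three vmstat samples and the last ten lines of each output are enough for a first glance; adjust to taste.

```shell
# first_look is a hypothetical name; it runs the checks listed above in order.
first_look() {
    for cmd in "date" "w" "vmstat 2 3" "dmesg"; do
        tool=${cmd%% *}
        echo "== $cmd =="
        if command -v "$tool" >/dev/null 2>&1; then
            # "vmstat 2 3" means three reports, two seconds apart.
            # dmesg may need root on some systems; keep going if a tool fails.
            $cmd 2>&1 | tail -n 10 || true
        else
            echo "($tool not installed)"
        fi
    done
}

# Example:
# first_look
```

It does not replace actually reading the output, of course. It only saves the typing.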

puppet notes

January 14th, 2010

There is one note that I came across that probably explains some unusual intermittent problems I’ve seen with puppet. “If you update the config and do not restart puppetmasterd and that new config is invalid puppetmasterd appears to serve up the previous version of the config that it knew worked.” So the workflow looks like this: you update the config, restart puppet on the node, nothing changes, you get confused, mess around with the config, then maybe restart puppetmasterd, then eventually it works. The key is to tail the puppetmasterd logs when you see some unusual behavior, and maybe also tail the puppet log on the client. There will typically be some relevant error message, or at least an unexpected change in behaviour (’huh, why does the node think it no longer has a defined config?’).

I also came across a couple of ‘chef vs puppet’ blog posts. One was posted on the puppet-users mailing list and, not surprisingly, did not draw any criticism there. The blog commenters were less forgiving and did a good job of convincing me that the post was not well-written. Some of the comments are worth reading though. The second post is better written: Puppet vs Chef. It discusses the differing underlying philosophies of the two tools.

I have to say that I agree that the more advanced pre-requisites for Chef make it less appealing to a sysadmin, or even to a non-Ruby developer. It sucks having to know how to configure and run Merb and OpenID just to try the tool out. The comparison of Nagios and Puppet is an apt one, I think; we sysadmins are OK with arcane configuration syntax, so long as it’s well-documented and examples are easy to find.

A new tool that is intended to sound totally awesome is Foreman, but to me it sounds a lot like Cobbler + Puppet, which is what I used to use before I gave up on dealing with “automatically managing” my DNS, DHCP, TFTP configs. Except here you have the additional hassle of having a working Puppet stack and maybe a working Passenger install before you even start up the tool. I’m quite alright with just writing my /etc/ethers and /etc/hosts and /tftpboot/pxelinux.cfg/default by hand.

What is Google Wave?

December 5th, 2009

The big problem with Google Wave is the ambiguous naming. You see statements like “wave is like e-mail, but better”, or “wave is a protocol like SMTP”. Neither is quite right, and people get frustrated by the fact that the basic concepts of Google Wave are opaque. Let’s try to figure it out.

E-mail.
When we say “e-mail”, we often mean a set of technologies and pieces of software that implement those technologies. But sometimes we mean the “e-mail messages” themselves. “wave” is just that ambiguous.

With e-mail, we have the message itself, then the MUA, the MTA, the IMAP server, maybe the POP3 server and the mail store (mbox or Maildir or PST or Exchange). We have the “message headers”, we have the SMTP “message envelope”. And, of course, we have some supporting infrastructure like DNS MX records and URI schemes. The Wikipedia page on “e-mail” does a great job of describing the details.

Wave is at least that complicated.
With wave, we have the wave itself (which is kind of like a “message”), then we have the wave client (which is kind of like a MUA), then we have the wave server (which is kind of like an MTA), and the server is also storing the wave, so it’s also like a mail store and POP3/IMAP server. Google describes waves as “equal parts conversation and document”, and it called its server wave.google.com, and it called its client Google Wave, which doesn’t help!

Let’s start from the beginning again: a “wave” is a hosted XML document that lives on a server. The “wave” consists of “blips”. The “wave” has “participants” in that wave, and the participants may have access only to certain blips and not the whole wave. Each blip can be edited by the participants at any time, and the full revision history is kept.

The above paragraph doesn’t sound too bad, but there are still questions. What’s a “wavelet”? What’s “Wave Federation Protocol”? What wave clients are available? Wave servers? With that, we’re off to read the spec and the guide.

That’s not quite right!
Turns out, “waves” are comprised of “wavelets”. “Wavelets” are comprised of “blips”. The contents of “blips” are called “documents”. Each wave is hosted by a particular server. Each wavelet is hosted by a particular server (not necessarily the same one as the one that hosts the wave). “Federated” is a fancy word for “shared”. Not all wavelets are federated.

A “wave provider” operates a “wave service”. The service consists of a “wave store” and a “wave server”. And if they named it something other than ‘wave’, sentences like this would be easier to parse:

Typically, the wave service serves waves to users of the wave provider which connect to the wave service frontend

There are also “gadgets” and “robots”. A robot is an automated participant, and can do anything that a human participant can do. A gadget is a piece of code that participants can interact with, within the wave.

Here’s another great article that’s one level higher than what I wrote:
An Introduction to Google Wave - Google Wave: Up and Running

In conclusion, it’s important to separate the UI of the Google Wave client from the underlying concepts of the Google Wave platform. Just like your preview pane layout in Mozilla Thunderbird has nothing to do with how “e-mail” works, the way the Google Wave client shows bolding and wave structure is not helpful when trying to figure out the difference between a wavelet and a blip.

extract pages from a PDF

November 16th, 2009

Suppose you have a book or technical manual in PDF format, many hundreds of pages long. You want to send someone just a few relevant pages. The PDF toolkit (pdftk) to the rescue:


sudo aptitude show pdftk
sudo aptitude install pdftk
pdftk Desktop/sg246363.pdf cat 102 103 output Desktop/result.pdf
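
The cat operation also accepts page ranges such as 102-110 or 200-end, not just single page numbers. Here is a minimal sketch of a wrapper around that. The name extract_pages and the PDFTK override variable are made up for this example and are not part of pdftk itself.

```shell
# extract_pages is a hypothetical wrapper; PDFTK lets you point at another
# binary (it defaults to plain "pdftk" from the PATH).
extract_pages() {
    in=$1
    out=$2
    if [ "$#" -lt 3 ]; then
        echo "usage: extract_pages in.pdf out.pdf PAGES..." >&2
        return 2
    fi
    shift 2
    # Each remaining argument is a page or a range, e.g. 102, 102-110, 200-end.
    "${PDFTK:-pdftk}" "$in" cat "$@" output "$out"
}

# Example: pages 102 through 110, plus page 250.
# extract_pages Desktop/sg246363.pdf Desktop/result.pdf 102-110 250
```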

Ubuntu on HP Mini 2133

September 25th, 2009

I guess it’s officially called “HP Mini-Note PC 2133”: http://h40059.www4.hp.com/hp2133/

A couple of good sites dedicated to this netbook:

Both have active forums. And of course, ubuntuforums is always a good source of help.

I’ll try installing 9.10 alpha 6 on this machine shortly.

rename (1) is different on Debian vs RH

September 21st, 2009

Crazy! The Debian one is better, of course :)

debian-box# rename --help
Unknown option: help
Usage: rename [-v] [-n] [-f] perlexpr [filenames]

vs

rh-box# rename --help
call: rename from to files...

The Debian one is the one that takes standard Perl regexes (and is in fact a simple Perl script).
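
To make the difference concrete, here is the same job (changing .txt to .bak) in both syntaxes, plus a fallback that works on either box because it uses only plain shell. The function name rename_ext is made up for this example; it is not part of either tool.

```shell
# Debian (Perl script): the first argument is a Perl expression applied to
# each file name.
#   rename 's/\.txt$/.bak/' *.txt
# Red Hat (util-linux): replaces the first occurrence of "from" with "to".
#   rename .txt .bak *.txt

# Portable fallback: swap one extension for another using only POSIX sh.
# rename_ext is a hypothetical helper name.
rename_ext() {
    from=$1
    to=$2
    shift 2
    for f in "$@"; do
        case $f in
            # Only touch names that actually end in the "from" suffix.
            *"$from") mv -- "$f" "${f%"$from"}$to" ;;
        esac
    done
}

# Example:
# rename_ext .txt .bak *.txt
```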

need to disable “visual effects” to get some old X apps to work in Ubuntu

September 8th, 2009

This is Ubuntu 9.04 (x86_64). I had Visual Effects turned on because I have a newish nVidia card, using the proprietary nVidia drivers. When you try to use IBM’s “Storage Manager 10” aka SANtricity, some of its windows come up blank. The workaround is System -> Preferences -> Appearance -> Visual Effects -> None.

Now the windows show up in all their asstastic X11 motif glory.

what is “sustainability”?

April 21st, 2009

Treehugger has a story talking about the over-use of the word. “sustainability” seems to be going the way of “green” and “organic”, vague feel-good terms without a specific definition.

I started this blog with the intent of writing about “sustainable computing”, but of course, that is a meaningless term, thus the number of posts I’ve written on the topic (0).

If you go with the classic definition of “sustainability”

Development that meets the needs of the present without compromising the ability of future generations to meet their own needs.

then computing as we know it today cannot be sustainable until all of the components of our computing environment are fully recyclable and do not use more energy than necessary. OTOH, today’s dirty wasteful computers are a very valuable tool that can be used towards a “sustainable” goal.