Archive for the ‘Technical Articles’ Category

Function return value versus exit code


Wednesday, November 19th, 2008

In projects I’ve worked on in the past few years, I have noticed what seems like growing confusion about program exit codes and function return values. In particular, some programmers seem to feel these two separate things should be related somehow. They’re not related, nor should they be. In a nutshell: programs exit with 0 for success; boolean functions return 1 for success. In this article, I’ll discuss some of the motivation behind these separate and seemingly counter-intuitive conventions.

Function Return Values

As far as most computers (and their programming languages) are concerned, the value 0 is almost universally equivalent to “false”, while 1 (or, in C, any non-zero value) typically equates to “true”. This is basic Boolean logic. Here is a typical construct in C:

int is_even(int num) {
        /* The low bit is clear for even numbers, so negate the test */
        return !(num & 1);
}

/* Later... */
if (is_even(some_number))
        printf("some_number is even\n");

Most programming languages have similar constructs. Even in languages that support exceptions or other error mechanisms, the tried and true Boolean is still appropriate in a large number of situations–enough for many languages to define a dedicated “bool” or “Boolean” type. In the above code, for example, it would not be at all appropriate to throw an exception if the number is odd. All of this is well-known and seldom questioned.

Program Exit Codes

Program exit codes, however, are a completely different concept. Unlike return values, which occur within a program, exit codes are essentially part of the program’s output. While the semantics differ slightly between operating systems, the basic theory is the same:

  • An exit code of 0 means success
  • A non-zero exit code means some kind of failure

Hence, to preserve the 0 = false thinking, it may be easier to semantically think of exit codes as “error codes”. 0 means, “no, there was no error”. DOS actually got this right; it calls its exit status variable %errorlevel%.

Unix/Linux/Mac

Under Unix-like operating systems, exit codes are pretty well-defined by shell convention:

0 Success
1..125 Failure (the program itself will decide what the numbers mean)
126 Command found, but not executable
127 Command not found
128..254 The program did not exit normally. (E.g., it crashed, or received a signal)
255 Exit code out of range (e.g., exit(-1) is reported as 255)

From a Bourne shell, the variable $? holds the exit code of the last command run in that shell. (For csh, use $status). For example:

$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
$ echo $?
0

As you can see, grep actually found a match, which means it was successful. So, when I check the exit status ($?), the result was 0.

If a command isn’t successful, it goes like this:

$ grep no-such-animal /etc/passwd
$ echo $?
1

If I attempt to run a command which doesn’t exist:

$ grepadoodle foo
-bash: grepadoodle: command not found
$ echo $?
127

Finally, let’s say I ran the following command, and then was forced to kill it from another shell:

$ grep "Jimmy Hoffa" /dev/random
# This command will run forever if I let it, so at this point I go to another shell and kill it
Terminated
$ echo $?
143

Why 143? The TERM signal (the default signal sent with the kill(1) command) has a value of 15. 128 + 15 = 143.
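The 128 + N arithmetic can be reproduced without hunting for a second terminal. This is only a minimal sketch: status_after_signal is a hypothetical helper name, sleep stands in for the long-running grep, and the 128 + N result is what bash and dash report (some other shells, such as ksh93, use a different offset).

```sh
#!/bin/sh
# status_after_signal is a hypothetical helper: it starts a background
# command, sends it the given signal, and prints the status the shell reports.
status_after_signal() {
    sig=$1
    sleep 60 >/dev/null 2>&1 &   # stand-in for a long-running command
    pid=$!
    kill -s "$sig" "$pid"
    wait "$pid"                  # wait's own status is the child's exit status
    echo $?
}

status=$(status_after_signal TERM)
echo "exit status: $status, signal: $((status - 128))"
```

Subtracting 128 from the reported status recovers the signal number, which is the same calculation as above in reverse.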

Why This Matters

Unix shells have built-in Boolean operators like && (logical and) and || (logical or) that act on the exit status of a command. These operators assume 0 = success (true) and !0 = failure (false). This allows us to write constructs like the following (Bourne shell):

$ grep fluffy /etc/passwd && echo "Your user list may contain a poodle."

$ [ -f somefile.txt ] || cp somefile.txt.defaults somefile.txt

If a program’s exit code does not follow the 0 = success rule, the logic rapidly gets confusing.
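The same rule drives the shell's if statement, which may make the point clearer. The following is a rough sketch, not anything from a real tool: describe_status is a hypothetical helper that runs whatever command it is given and reports how the shell interpreted the result.

```sh
#!/bin/sh
# describe_status is a hypothetical helper: it runs the given command and
# prints how the shell interprets its exit status.
describe_status() {
    if "$@" >/dev/null 2>&1; then
        echo "success"
    else
        # Inside the else branch, $? still holds the command's exit status
        echo "failure ($?)"
    fi
}

describe_status true
describe_status false
describe_status sh -c 'exit 3'
```

A program that exits with 1 to mean "it worked" would land in the failure branch here, and in every &&, ||, and if that ever wraps it.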

Windows/DOS

Here the situation is a little bit different, but not by much. In DOS/Windows batch files (or a command shell) the variable %errorlevel% contains the exit code of the last command executed, similar to Bourne shell’s $?. The semantics are similar:

0 Success
Non-zero Error (again, meaning depends on the individual program; DOS limits the value to 1..255)

Examples:

C:\> echo Hello
Hello

C:\> echo %errorlevel%
0

C:\> type nonexistent-file.txt
The system cannot find the file specified.

C:\> echo %errorlevel%
1

Where People Tend to Get Mixed Up

While virtually every built-in or well-known 3rd party command follows the above exit code rules, I have seen my share of proprietary applications that flip this logic around for no good reason. When speaking with the developers, I almost universally hear “but in [insert programming language here], 0 is false and 1 is true”. If you find yourself falling into this trap, just remember that the exit code of your program is part of its output, not part of its logic. You wouldn’t think of dumping a raw struct or object to the user’s terminal, would you? Instead, you format the output in a way that makes sense. Exit codes are no different.
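One way to keep the two conventions apart is to convert explicitly at the boundary. As it happens, shell arithmetic uses the C-style truth value (1 for true) while the shell itself uses the exit-code convention, so both can be shown side by side. This is a sketch under that assumption; is_even_value and truth_to_exit are hypothetical names chosen for illustration.

```sh
#!/bin/sh
# is_even_value is a hypothetical helper that prints a C-style truth value:
# 1 when the number is even, 0 when it is odd.
is_even_value() {
    echo $(( $1 % 2 == 0 ))
}

# truth_to_exit converts a truth value into an exit status:
# non-zero (true) becomes 0 (success), zero (false) becomes 1 (failure).
truth_to_exit() {
    if [ "$1" -ne 0 ]; then
        return 0
    else
        return 1
    fi
}

truth=$(is_even_value 42)
echo "truth value: $truth"
if truth_to_exit "$truth"; then
    echo "exit status says: success"
fi
```

The conversion is one line of logic, but writing it out makes it obvious which convention applies on each side of the program boundary.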

Occasionally, I even see people flipping the boolean logic within programs for no good reason (and often without documentation), which leads to all sorts of confusing and error-prone constructs like this:

if (did_it_work())
        fprintf(stderr, "Error\n");  /* Are you sure? */

There are a few reasonable exceptions to this rule, such as strcmp(3), which returns 0 if the strings are equal, < 0 if s1 is less than s2, and > 0 if s1 is greater than s2. However, strcmp(3), fork(2), et al. are not boolean functions; they return a range of values instead of a simple truth value.

Summary

When in doubt, remember the following:

  1. Program exit codes are not function return values
  2. Boolean functions return 1 for true, 0 for false
  3. Programs exit with 0 on success, non-zero on failure

Keep it clean!

How to smell a phish


Monday, October 27th, 2008

Today, I received numerous forwards of an email that apparently comes from eNom. If you receive this email, delete it; it is a scam.

This is fine for me to tell you, but how would you have determined this for yourself?

The rest of this article will cover a few low-tech tips anyone can use to identify similar emails.

Dead Giveaways

This email appears to be legitimate; it comes from an enom.com email address (eNom is a major, reputable seller of Internet domain names), and has a sufficient amount of convincing technobabble. However, there are several highly suspicious bits here:

  1. They say the main site (www.enom.com) is going to be down; why would they give you a link to www.enom.com to access your account?
  2. Why would you need to access your account in this case anyway?
  3. The grammar in the email is spotless, except for the line about the account information: “For access your account follow this link” — most likely, the phisher (and apparently, bad grammarian) took a real email from eNom, and added this line.
  4. Most damning, but also least obvious (and the hallmark trait of a phishing email) is that the link appears to go to http://www.enom.com/, but if you hover over the link in most email programs, you’ll see the real destination:

[Image: Phishing hover]

Instead of going to www.enom.com, you will be taken to www.enom.comsys52.net. Most modern email programs and web browsers have technology to attempt to detect this, but the technology is not perfect, and phishers’ nefarious livelihood depends on exploiting any weakness in such systems. So, your best defense is always to be aware of the risks, and learn to pick up on these “phishy” characteristics.
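The trick works because a hostname is read from right to left: www.enom.comsys52.net belongs to comsys52.net, not to enom.com. For the curious, that check can be sketched in a few lines of shell. host_in_domain is a hypothetical helper written for this illustration, not part of any mail program, and it only compares names; it does nothing about look-alike characters or other tricks.

```sh
#!/bin/sh
# host_in_domain is a hypothetical helper: it succeeds only when the host
# is exactly the domain, or ends with a dot followed by the domain.
host_in_domain() {
    case "$1" in
        "$2" | *."$2") return 0 ;;
        *) return 1 ;;
    esac
}

for host in www.enom.com www.enom.comsys52.net; do
    if host_in_domain "$host" enom.com; then
        echo "$host: belongs to enom.com"
    else
        echo "$host: NOT enom.com"
    fi
done
```

Merely starting with www.enom.com is not enough; the name has to end with the real domain.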

If in doubt, always err on the side of caution; avoid clicking links in emails–instead, go to the company’s web site from your browser’s bookmarks, or look them up and give them a call if you feel they may actually need some information from you.

Perl vs. Joe Job


Sunday, October 26th, 2008

One of my domains has recently been suffering a spate of backscatter from an email Joe-job. Basically, spammers are sending out emails with forged From: fields that appear to be sent from, say, abcdef@ry.ca to random recipients. There is no practical way to prevent or stop these attacks, since those emails never even come near my server before the recipients see them.

A significant percentage of the forged emails are either caught by the recipient’s spam filter, or are simply sent to addresses that don’t exist. Those messages should die right then and there, but some servers will actually accept the message, and then bounce it back to the (apparent) sender, instead of rejecting it immediately at the SMTP level.

It turns out there are enough of these servers to subject my server to over a million bounces per week.

The server runs exim, which uses a maildir format for its local storage, meaning one file per email. When I returned from my holidays, the filesystem had nearly run out of inodes. (Linux/UNIX filesystems have a limit on the total number of files, or inodes, they may contain; in this case, about 4 million.)

I wrote some filter rules to drop the bounces to forged addresses, but I still had a directory containing a rather inconvenient number of files, to say the least.

Perl to the rescue

My goal was to prune the directory, but keep the most recent samples for analysis. Fortunately, exim stores each email with a filename beginning with the number of seconds since 1970 (a common timestamp method)–this eliminates the need to hit the disk again to get the actual timestamp of the file, which is a relatively costly operation. The following Perl script takes no parameters and gives no output; it simply deletes all exim email files that are more than 3 days old, from the current directory:

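#!/usr/bin/perl

use warnings;
use strict;

my $DAYS = 3;  # Keep this much history

# Anything with a timestamp prefix older than this cutoff gets deleted
my $older = time - 86400 * $DAYS;

opendir(my $dh, ".") or die "Cannot open current directory: $!";
while (defined(my $file = readdir $dh)) {
    # Only touch names that start with a ten-digit timestamp and a dot
    next unless ($file =~ /^(\d{10})\./);
    unlink $file if ($1 < $older);
}
closedir $dh;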

I hereby release this code to the public domain. Use at your own risk; after all, it’s designed to delete files by the millions, very efficiently. It will potentially delete any file that starts with ten digits followed by a dot.

To mitigate the risk of accidental deletions (notably, some maildir implementations, e.g., cyrus, use sequential integers instead of timestamps), the script will not delete any files with nine or fewer digits. Ten-digit timestamps begin at 1000000000, which corresponds to September 9, 2001 (UTC), or the evening of September 8 in North American time zones; that should be fine for most purposes.
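Before turning a script like this loose, it can be reassuring to preview what the filename rule would match. The following shell sketch applies the same test (ten digits, a dot, and a timestamp older than the cutoff) but only prints names. is_stale is a hypothetical helper, date +%s is assumed to be available (it is on GNU and BSD systems, but is not strictly POSIX), and the glob loop is meant for a small sample directory rather than millions of entries.

```sh
#!/bin/sh
# is_stale is a hypothetical helper: it succeeds when the name starts with
# exactly ten digits followed by a dot, and that number is below the cutoff.
is_stale() {
    name=$1
    cutoff=$2
    case "$name" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].*) ;;
        *) return 1 ;;
    esac
    stamp=${name%%.*}            # everything before the first dot
    [ "$stamp" -lt "$cutoff" ]
}

# Dry run: report what would be deleted with three days of history kept
cutoff=$(( $(date +%s) - 86400 * 3 ))
for f in *; do
    if is_stale "$f" "$cutoff"; then
        echo "would delete: $f"
    fi
done
```

Nothing is removed here; the output is just a list to eyeball before running the real thing.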

It took just shy of an hour to process ~4 million directory entries, and I was left with a far more manageable directory. Following this, I did some analysis on the remaining messages to help identify patterns and set up additional rules.

Partitioning and Formatting Blank Drives in Vista


Thursday, October 23rd, 2008

Recently, I feared that one of my USB drives might be failing, so, after moving the data off of it, I ran the gamut of diagnostics on it, including writing zeros to the drive, which, of course, blew away the data and partition table.

When I went to the Computer Management -> Storage -> Disk Management console in Vista, the drive was there, sure enough, showing as “Unallocated”. I right-clicked and went to create a New Simple Volume.

Normally, this works. I followed the prompts which asked me for the partition’s size, drive letter, and filesystem format, but to no avail. “The operation cannot be completed because the disk is not initialized.”

On closer inspection of the Disk Management screen, sure enough, the drive says Not Initialized, instead of the typical “Online”.

I could find no way to initialize this disk via the GUI. Since I have a genetic aversion to GUIs anyway (especially the useless ones), I bailed out to an Administrator command prompt.

Update

[2008-Nov-18]: As Brian wisely pointed out in the comments, it is actually possible to initialize a disk from the GUI, by right-clicking the Disk Name (i.e., the column heading to the left, in the above picture). This will actually spare you from DiskPart (below). Thanks Brian!

DiskPart to the rescue

DiskPart allows free editing of partitions. Standard disclaimer, here: if you’re not 100% sure of what you’re doing, don’t do it. It is quite possible to mess up your system and/or lose data using this tool.

So, here’s how to rebuild a disk that has been overwritten with zeros, random bits, or worse.

Step 1: Load DiskPart, and find the desired disk

C:\>diskpart

Microsoft DiskPart version 6.0.6001
Copyright (C) 1999-2007 Microsoft Corporation.
On computer: VISTA-PC

DISKPART> LIST DISK

Disk ###  Status      Size     Free     Dyn  Gpt
--------  ----------  -------  -------  ---  ---
Disk 0    Online       233 GB      0 B
Disk 1    Online       932 GB      0 B
Disk 2    Online       466 GB   466 GB
Disk 3    No Media        0 B      0 B
Disk 4    No Media        0 B      0 B

Here, we see that Disk 2 is the one we want (it is also the only one with free space, which is reassuring).

Step 2: Select the disk

This tells diskpart that any operations we want to run will affect Disk 2.

DISKPART> SELECT DISK 2

Disk 2 is now the selected disk.

Step 3: Create a primary partition

DISKPART> CREATE PARTITION PRIMARY

DiskPart succeeded in creating the specified partition.

DISKPART> ASSIGN LETTER=I

DiskPart successfully assigned the drive letter or mount point.

DISKPART>

And, seconds later, a format prompt window popped up.

Success! After a full format, the drive is back in business.

Microsoft TechNet has a good article on DiskPart here: http://technet2.microsoft.com/WindowsServer/f/?en/Library/19a9ac4d-d151-4fde-b187-9f8dfa09cb351033.mspx

DiskPart itself has some decent online help, by typing “HELP” at the prompt.

Happy formatting!

Transparent editing of GPG-encrypted files in Vim


Tuesday, October 21st, 2008

Markus Braun wrote an essential gnupg Vim plugin for Linux/UNIX users who regularly work with GPG encrypted text files at the command line level. (Or, perhaps, for those of you who probably should be working with GPG encrypted text files more often!) Best of all, installation is a one-liner, and it comes with some security benefits, while allowing transparent editing of encrypted files.

One-liner Installation

wget -O ~/.vim/plugin/gnupg.vim \
'http://www.vim.org/scripts/download_script.php?src_id=12200'

If wget complains that the ~/.vim/plugin directory doesn’t exist, type mkdir -p ~/.vim/plugin to create it.

Usage

Using this plugin really couldn’t be much easier. Simply edit any encrypted file with a .gpg, .pgp, or .asc extension, and you’ll see something like the following:

$ vim top-secret.txt.gpg

"top-secret.txt.gpg" [noeol][converted] 3L, 1045C
You need a passphrase to unlock the secret key for
user: "Ryan Thompson <email@example.org>"
2048-bit ELG-E key, ID 12345678, created 2008-05-26 (main key ID 09876543)

Enter passphrase:

Bingo! Once you key in your passphrase, you will have a normal Vim session with the unencrypted contents of the file. Upon closing the file, the plugin will re-encrypt the file.

It also supports creating new encrypted files. If you edit any nonexistent file with a .gpg, .pgp, or .asc extension, you will first be prompted (within Vim) for a list of recipients in its own buffer:

GPG: ----------------------------------------------------------
GPG: Please edit the list of recipients, one recipient per line
GPG: Unknown recipients have a prepended "!"
GPG: Lines beginning with "GPG:" are removed automatically
GPG: Closing this buffer commits changes
GPG: ----------------------------------------------------------

It’s even smart enough to detect whether recipients are in your public keyring, and will alert you if any errors arise.

Typing the command :GPGEditRecipients will allow you to edit the recipient list on-the-fly.

Security Considerations

No matter which installation method you choose, I highly recommend you verify the download, especially when dealing with software that you are about to trust with your encryption and key data!

This plugin actually takes some additional precautions that would be difficult to achieve manually. Specifically, it:

  • Does not use temp files. All editing is done in RAM; the plaintext is never written to disk
  • Automatically disables the Vim swapfile and viminfo, to prevent cached copies of the data from being saved on disk
  • Overrides Vim’s “write” command such that it writes back the encrypted file from the buffer

In short, this plugin makes encryption significantly easier to use on a daily basis, without compromising security (depending on your existing habits, it may arguably handle editing more securely). If that gives you the freedom to encrypt sensitive material that you previously couldn’t be bothered to encrypt, that’s a pretty big win (assuming, of course, that your Poodle doesn’t eat your private key–but that’s a topic for another article). My thanks to Markus for creating this.