yes(1)

Yes, yes(1) is built-in to Mac and Linux (at least OS X Lion and Ubuntu 11.04). And, as you might guess, it repeatedly prints a string of your choice (‘y’ by default) followed by a newline to stdout. Its sole purpose in life is to automate agreeing to prompts. I encountered it recently in a script that was automating RAID array deployment on EC2 ephemeral disks:

# mdadm doesn't let you automate by default, so pipe in 'y'!
yes | mdadm ...
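
If you want to see what yes actually emits before piping it into something destructive, here is a small sketch. The `confirm` function is a hypothetical stand-in for a program that asks one yes/no question on stdin; it is not part of mdadm or of the script mentioned above.

```shell
# Print the default string three times; head closes the pipe and yes exits.
yes | head -n 3

# Any argument replaces the default 'y'.
yes no | head -n 2

# Hypothetical stand-in for a program that reads one yes/no answer from stdin.
confirm() {
    read -r answer
    if [ "$answer" = "y" ]; then
        echo "confirmed"
    else
        echo "aborted"
    fi
}

# yes answers the prompt so that nobody has to type anything.
yes | confirm
```

Note that yes never stops on its own; it only exits because the program on the other end of the pipe closes it.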


If you find yourself with a terminal and you need a stopwatch:

$> time cat

By default, cat(1) reads from stdin when no arguments are provided, until it reaches EOF (Ctrl+D). time(1) waits until the command it runs terminates, so in effect, it’s a stopwatch that runs until you press Ctrl+D.

 

Recently I’ve been working on porting some code to Mac, and yesterday I ran into a bug that stumped me for a little bit. Compiling against Boost was raising a bunch of errors, specifically in lines that seemed pretty innocuous (from cstdint.hpp):

using ::int8_t;
using ::int_least8_t;
using ::int_fast8_t;
using ::uint8_t;
using ::uint_least8_t;
using ::uint_fast8_t;

G++ kept giving me errors for each of those lines: “error: expected unqualified-id before ‘signed’,” referring to the line “using ::int8_t.” I’m a little embarrassed that I couldn’t figure it out right away, but eventually I figured out that it was caused by int8_t being #define’d somewhere else. For those of you that don’t know, #define really just takes one term and substitutes every subsequent occurrence of that term with one you provide.

// If you define it as a macro
#define int8_t signed char
// Then this line will be interpreted very differently from how you expect
using ::int8_t;
// Gets interpreted as "using ::signed char"!!!

And this is what g++ had been complaining about. That is not legal C++ syntax; I’m sorry I doubted you, g++! But there still remained a larger question: where were these types getting defined? I didn’t want to be in the business of “patching” a library, and especially the largely impeccable Boost library. Typically these types (int8_t, uint8_t, etc.) are defined in cstdint or stdint.h, but looking at the system’s version, I found to my surprise that they were not macro-defined, but typedef’d, which is the right way to do it.

Sidebar: In general, you should use typedef instead of #define for this reason, and for another very good one. Because #define macros blindly replace text throughout the code, it can be difficult to trace the origin of a type. Typedefs survive preprocessing, so afterward you can still see the semantic meaning (that you wanted int8_t specifically, not just something that happens to be the same type), and that extra type information can be helpful when debugging. Similarly, you should generally use const rather than #define macros to define constants: while you might remember what a magic number means when you write it into your library, its meaning becomes unclear when you encounter the bare value while debugging. (If you haven’t, read Scott Meyers’s Effective C++.)

Getting back to the morality tale, the library I was porting wasn’t macro-defining int8_t, and stdint.h wasn’t, so where was the culprit? Clearly there are potentially hundreds of places it could be, and I was running out of good guesses. Luckily, SEOmoz C++ shaman Martin taught me a little ninja magic: use the -E flag with g++ to only run the preprocessor stage, and redirect the output into a file! When compiling with ‘make,’ it typically spits out the offending g++ command, so rerun it to just pass that one through the preprocessor, which reaches out and fetches the header files and gloms them in order into one giant input file. Then, step through this file to see where “define” and “int8_t” occur on the same line! In two minutes we were able to find the header that was causing all this trouble, where I had spent two hours learning and reading about where the problem might be.

In the end we found it in a very small library that we happened to use. On Mac we had just been using a slightly old version, and the problem had been fixed in subsequent releases. Still, I’m glad to have added this preprocessor trick to my toolkit.
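
For anyone who wants to try the preprocessor trick, here is a rough sketch of how the search could be automated. It goes a little beyond what we did by hand: plain -E normally drops the #define directives from its output, so this version adds GCC's -dD flag, which keeps them, and uses the preprocessor's line markers to report which header each definition came from. The function name `find_define` and the file names in the usage line are made up for illustration.

```shell
# find_define SOURCE MACRO [extra compiler flags...]
# Prints every #define of MACRO along with the file the preprocessor was in.
find_define() {
    src=$1
    macro=$2
    shift 2
    # -E: stop after preprocessing. -dD: keep #define directives in the output.
    "${CXX:-g++}" -E -dD "$@" "$src" |
        awk -v macro="$macro" '
            # Line markers look like:  # 12 "path/to/header.h" 1
            /^# [0-9]+ "/ { file = $3 }
            # Report object-like (NAME) and function-like (NAME(...)) definitions.
            $1 == "#define" && ($2 == macro || index($2, macro "(") == 1) {
                print file ": " $0
            }
        '
}

# Usage (hypothetical paths; pass the same -I and -D flags that make printed):
#   find_define src/main.cpp int8_t -Ithird_party/include
```

Passing the same flags as the failing compile matters, because the include path and the predefined macros determine which headers get pulled in.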


rename(1)

I found myself needing to systematically rename a bunch of files, and previously the only way I knew to do it was with a loop, passing each filename into sed, and using the output as the destination filename. Tedious and error-prone!

However, it turns out that people have anticipated this use case, and there is a utility for this exact purpose, called “rename.” It allows you to provide a Perl-style regular expression whose input is the existing filename found by your search expression, and whose output is what that file should be renamed to. There’s even an especially useful option to take no action, but only to print out what the new filenames would be, allowing you to perform a dry run before you rename 100000 files.

Example:

rename -n 's/^/some.file.prefix./' *.jpg
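
If the Perl-style rename isn't available (it isn't part of POSIX, and some distributions ship a different rename with another syntax), the old loop-and-sed approach can at least borrow the dry-run idea. This is only a sketch: `prefix_files` is a made-up helper, and it assumes the files are in the current directory and that the prefix contains no characters special to sed's replacement text, such as `/` or `&`.

```shell
# prefix_files DRY_RUN PREFIX FILE...
# With DRY_RUN=1, only print what would be renamed, similar to rename -n.
prefix_files() {
    dry_run=$1
    prefix=$2
    shift 2
    for old in "$@"; do
        # Same substitution as the rename example: insert the prefix at the start.
        new=$(printf '%s\n' "$old" | sed "s/^/$prefix/")
        if [ "$dry_run" = 1 ]; then
            echo "would rename $old -> $new"
        else
            mv -- "$old" "$new"
        fi
    done
}

# Usage: preview first, then rename for real.
#   prefix_files 1 some.file.prefix. *.jpg
#   prefix_files 0 some.file.prefix. *.jpg
```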

Happy renaming!


It is extremely easy for computer scientists (well, and the rest of humanity) to get entrenched in their ways. You’ve learned and taken the time to become a master of a programming language, or tool, and it’s a serious time investment. As such, new tools generally have to be very compelling in order to get someone to switch. For example, I’m a recent svn-to-git convert, and am often met with horrified looks when I suggest others give it a shot. A visiting professor I spoke to yesterday said that it would be years before they stopped using svn.

Recently, I’ve begun using sshfs. It mounts a volume over ssh and appears as any normal folder on your computer. It behaves that way at the command line, in your text editor, or as far as VLC is concerned. For development, I had always maintained a local copy and rsync’d changes to the remote machine I was using. It worked well enough, and was often more convenient than using a tool like Cyberduck.

Aron recommended sshfs, and though it took me a while to try it out, I’m hooked. No more trying to remember if I’ve synchronized my code with the other machine I’m using, only editing and saving. It’s really easy when you have ssh keys set up as well. To mount your home directory on a remote machine:

$> mkdir datastore_mount
$> sshfs datastore: datastore_mount

And now, I’ve got access to remote files on datastore as if they lived in the directory datastore_mount. Unmounting is business as usual:

$> umount datastore_mount

Try it, use it, live it. You won’t regret it.


A professor of mine mentioned that he didn’t use bash but Z shell (zsh). I asked him about his choice and he said that it happened years ago when he asked a system administrator how to accomplish a task in bash and the guy replied that it was really easy in zsh. There have been worse reasons to switch, I suppose.

(Incidentally, the best way I’ve found to improve your command-line-fu is to work near system administrators. They have the most amazing bits of command line ninja magic you’ll ever see.)

I decided to give it a shot this evening, and after 30 minutes of use, I’m switching. The feature I’ve liked most thus far is history search: when you want to find a command you recently ran, you don’t need any of the ctrl-r nonsense. You simply begin the command as you remember it and press up. Old functionality mapped to the intuitive keystroke. In bash you can scroll through your command history by pressing the up arrow key, but to search, it’s a different story.

The feature that clinched it for me is better autocompletion. If you are not using tab to autocomplete, you are wasting keystrokes and making typos. Something I found frustrating in bash was that when I wanted to ssh into a computer with a long and difficult-to-type name, I had to search my history for it. In zsh, you can tab-complete the name. It remembers that stuff! Imagine that!

Of course, there’s a very real possibility I’ll switch back if I run into too many gotchas, but this many niceties this soon makes me think that the good will outweigh the bad.


The best way to learn what command-line tools you should be using is to hang out around system administrators. There are a couple in this office space, and they’re always willing to give advice about which utility you should use instead of another.

I’ve been using scp and rsync since I started using bash, and thought they were the bee’s knees. Of course, they certainly have their purpose, but when it comes to sending files over the tubes quickly, I’ve learned of a better one: nc / netcat.

It goes like this: you’ve got a big file to transfer, and the tubes aren’t really your bottleneck. You might not care about security in this instance, but just want to get it done quickly. You start the netcat daemon on the remote machine, it listens on a port, and then things that get sent to that port are output to a file on that box:

my-remote-machine $> netcat -l -p 1234 > ubuntu.iso

And then on the machine I’m transferring from:

my-local-machine $> netcat remote-ip 1234 < ubuntu-9.04-desktop-i386.iso

With scp for this same task, I was clocking about 28 MBps, but netcat posted 47 MBps pretty consistently. This is a neat little tool I will be using with some regularity.