I recently wrote a new post on the SEOmoz dev blog about our deployment with chef-solo and god on EC2.
It’s a lesson that has now been hammered home repeatedly in my head: never trust callbacks. Just don’t. Go ahead and execute them, but if you trust them to not throw exceptions or errors, you are in for a world of unhappiness.
For me, I first learned this lesson when making use of twisted, writing some convenience classes to help with some of the somewhat odd class structure they have. (Sidebar: twisted is an extremely powerful framework, but their naming schemes are not what they could be.) Twisted makes heavy use of a deferred model where callbacks are executed in separate threads, while mission-critical operations run in the main thread. My convenience classes exposed further callbacks that could be overridden in subclasses, but I made the critical mistake of not executing that code inside of a try/except block.
Twisted has learned this lesson. In fact, their deferred model makes it very hard to throw a real exception. If your callbacks fail, execution takes a different path — calling errback functions. In fact, twisted is so pessimistic about callbacks (rightly so) that you just can’t make enough exceptions to break out of errback functions. However, wrapped in my convenience classes were pieces of code that were mission critical, and my not catching exceptions in the callbacks I provided was causing me a world of hurt.
That whole experience was enough to make me learn my lesson. Then, a few days ago I encountered it again in a different library, in a different language, in a different project, where I was exposing callbacks for user interface code in JavaScript. The logical / functional chunk of code exposed events that the UI would be interested in, but was too trusting, leading to errors in callbacks skipping over critical parts of the code.
All in all, when exposing callbacks, never trust a callback to not throw an exception. Even if you wrote the callbacks it’s executing (as was the case with both of these instances, at least in the beginning). Callbacks are a courtesy — a chance for code to be notified of an event, but like many courtesies, they can be abused.
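To make that concrete, here is a minimal sketch in Python. The `EventSource` class and its method names are hypothetical, made up for illustration rather than taken from twisted or any other library. Each callback runs inside its own try/except, so one that raises gets logged and the critical work still happens:

```python
import logging

logger = logging.getLogger(__name__)


class EventSource(object):
    """Hypothetical class that exposes callbacks without trusting them."""

    def __init__(self):
        self.callbacks = []
        self.critical_runs = 0

    def on_event(self, callback):
        # Register a callback to be notified when fire() is called
        self.callbacks.append(callback)

    def fire(self, *args, **kwargs):
        failures = 0
        for callback in self.callbacks:
            try:
                callback(*args, **kwargs)
            except Exception:
                # Log the traceback and move on to the next callback
                failures += 1
                logger.exception('Callback %r failed', callback)
        # The mission-critical part runs no matter what the callbacks did
        self.critical_runs += 1
        return failures


def bad_callback(value):
    raise ValueError('callbacks are not to be trusted')


seen = []
source = EventSource()
source.on_event(bad_callback)
source.on_event(seen.append)
failed = source.fire(42)
print(failed, seen, source.critical_runs)
```

Catching `Exception` rather than using a bare `except` is a deliberate choice in this sketch: it lets things like `KeyboardInterrupt` still propagate.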
Python has a pretty useful policy: named arguments. When you call a function, you can explicitly say that such-and-such value is what you’re providing for a particular argument, and can even include them in any order:
def hello(first, last):
    print('Hello %s %s' % (first, last))

hello(last='Lecocq', first='Dan')
In fact, you can programmatically gain insight into functions with the inspect module. But suppose you want to be able to accept an arbitrary number of parameters, say for a printf equivalent. I ran into this when I wanted to read a module name from a configuration file, along with the arguments to instantiate a class from it. In that case, you’d get the module and class as strings and then a dictionary of the arguments to make an instance of it. Of course, Python always has a way. In this case, **kwargs.
This is actually dictionary unpacking, taking all the keys in a dictionary and mapping them to argument names. For instance, with the function above, I could say:
hello(**{'last':'Lecocq', 'first':'Dan'})
Of course, in that case it’s a little verbose, but if you’re getting a dictionary of arguments programmatically, then it’s invaluable. But wait, there’s more! Not only can you use the **dict operator to map a dictionary into parameters, but you can accept arbitrary parameters with it, too!
def kw(**kwargs):
    for key, value in kwargs.items():
        print('%s => %s' % (key, value))

kw(**{'hello': 'Howdy!', 'first': 'Dan'})
kw(hello='Howdy!', first='Dan')
Magic! No matter how you invoke the function, it has access to the parameters. You can even split the difference, making some parameters named and some parameters variable. For example, if you wanted to create an instance of a class that you passed a name in for, initialized with the arguments you give it:
def factory(module, cls, **kwargs):
    # The built-in __import__ does just what it sounds like
    m = __import__(module)
    # Now get the class in that module
    c = getattr(m, cls)
    # Now make an instance of it, given the args
    return c(**kwargs)

factory('datetime', 'datetime', year=2011, month=11, day=8)
factory('decimal', 'Decimal', value=7)
This is one place where Python’s flexibility is extremely useful.
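One caveat with the built-in: for a dotted name like 'logging.handlers', `__import__` returns the top-level package rather than the submodule. A sketch of the configuration-file use case that sidesteps this with `importlib.import_module` might look like the following. The `config` dictionary and its keys are hypothetical stand-ins for whatever your configuration parser produces:

```python
import importlib


def factory(module, cls, **kwargs):
    # import_module returns the named module itself, even for dotted
    # names, where the built-in __import__ returns the top-level package
    m = importlib.import_module(module)
    # Look up the class by name and instantiate it with the given args
    c = getattr(m, cls)
    return c(**kwargs)


# Hypothetical configuration, as it might look after parsing a file
config = {
    'module': 'logging.handlers',
    'class': 'MemoryHandler',
    'args': {'capacity': 10},
}

instance = factory(config['module'], config['class'], **config['args'])
print(type(instance).__name__, instance.capacity)
```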
I recently finished reading one of Kevin Mitnick’s books, Ghost in the Wires. Fantastic. I constantly found it amazing that someone had lived that life, hacking, evading capture, changing identities. Reads like an action movie at many points, and in fact, several movies have been made loosely (and one very loosely) based on his life. Mitnick often talks about how much the “myth of Mitnick” is inflated or distorted, especially in the media and particularly with the movies.
As it turns out, Mitnick lived briefly in Seattle, and with my interest piqued, I figured I might be able to track down his old apartment. He describes going home one day before realizing he was being followed, and in the course of the description he mentions a few street names and the part of town he lived in. And at the end of the book, there’s a photo of the apartment, slightly too grainy to read the name of the building. But clear enough to read the number. A little time with Google Maps and I found it! Being so close, I figured I’d drop by to take a picture:
Turns out, there’s a pretty handy package called psutil that allows you to not only gain insight into the currently-running process, but other processes, physical and virtual memory usage, and CPU usage. For example:
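import psutil

# Newer psutil releases replace these two calls with
# psutil.virtual_memory() and psutil.swap_memory()
psutil.phymem_usage().percent   # 31.2
psutil.virtmem_usage().percent  # 0.0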
Pretty handy tool if you’re doing any sort of monitoring!
I began work almost a month ago at a Seattle company, SEOmoz. Interesting projects, talented people, and a good place to be. Today I posted my first contribution to their Dev Blog talking about scripting the launching and deployment of EC2 instances with boto and fabric.
Yes, yes(1) is built-in to Mac and Linux (at least OS X Lion and Ubuntu 11.04). And, as you might guess, it repeatedly prints a string of your choice (‘y’ by default) followed by a newline to stdout. Its sole purpose in life is to automate agreeing to prompts. I encountered it recently in a script that was automating RAID array deployment on EC2 ephemeral disks:
# mdadm doesn't let you automate by default, so pipe in 'y'!
yes | mdadm ...
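If you are driving such a tool from Python instead of a shell script, the same trick is to write the answer to the child's stdin. Here is a rough sketch. The child process below is a hypothetical stand-in for a command that insists on confirmation, not mdadm itself:

```python
import subprocess
import sys

# Hypothetical stand-in for a command that waits for a 'y' on stdin
child_code = (
    "import sys\n"
    "answer = sys.stdin.readline().strip()\n"
    "print('confirmed' if answer == 'y' else 'aborted')\n"
)

proc = subprocess.Popen(
    [sys.executable, '-c', child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
# Answer the prompt the way `yes` would: a 'y' followed by a newline
output, _ = proc.communicate('y\n')
result = output.strip()
print(result)
```

Unlike yes, this sends a single answer, so it only suits commands that prompt once.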
I initially put off upgrading to Snow Leopard until almost a year after its release because I was worried about rebuilding my development environment. It’s amazing how many packages one accumulates over time without thinking about it, and when you have deadlines to meet it can be disastrous to risk your current working setup.
But rebuilding your development environment comes up in more situations than upgrading your OS: when you need to migrate to that new computer you got, or that work gave you, or when you help someone else get up and running with a project you’re thinking about releasing. Admittedly, it took me a little while to learn this lesson, but finally it’s drilled into my head: keep build notes!
A couple weeks ago I was trying to install an internal package whose docs hadn’t been updated in a very long time. After struggling and hitting countless snags, I finally got it up and running when I got an email that was along the lines of, “Oh, if you could write down what problems you ran into, that would be great.” Fortunately, I had made notes of what I had done in order to get it built, and I was able to whip off a reply with speed that surprised the recipient.
Even at a system-wide level, I try to make it a habit to record every package I install/build associated with development. It makes it extremely easy to get set up on the next system, even if the instructions have to be updated for a new environment. I call it a manifest and I manage it as a flat file, though I know there are package managers that can do a lot of heavy lifting for me. However, I find that no package manager is perfect and so even if I make use of one for certain libraries, it’s important to me to have everything documented in one place. At a minimum (and you probably don’t need more than this) keep the following:
- Package name and version – Maybe you needed readline 6.1 to get your project running, or you know that such-and-such version is buggy for your purposes.
- Why you installed it – I find that many libraries I install are used for a particular project, and so it’s useful to have the motivation for getting it.
- How you installed it – Whether it was macports or a typical configure, make and install, how did you build it? Did you need special flags to make it go? You will absolutely forget these, so why not write them down? Even just copy and paste from your history!
I can’t stress enough how much easier this has made my development life in a lot of ways, and how little a time investment it is.
Last week I had the (dis)pleasure of porting some code to Mac, and today it came time to merge with the original codebase. As helpful as it was to use macros for different code paths, we needed something in the makefile to optionally add flags when compiling on Mac.
// This is all well and good
#ifndef __APPLE__
// Do your Linux-y includes here
#else
// Do your Apple-y includes here
#endif
Apparently, there are a couple conventions for doing this. First, you can inject a configuration step (à la autoconf, for example) which would detect what platform you’re building on in a robust way and build a Makefile for you. Second, if you’re lazy or autoconf would be like hitting a fly with a hammer, you can use make’s conditionals:
# Ensure that this gets declared in time,
# and fill it with the result of `uname`
UNAME := $(shell uname)
# If the environment is Darwin...
ifeq ($(UNAME), Darwin)
CXXFLAGS = # Something Apple-y
else
CXXFLAGS = # Something Linux-y
endif
Simple enough!
