Saying no

Here’s a short list of things I’ve said no to in the last two weeks:

  • Joining Toastmasters
  • Joining a second developer book club
  • Joining a reading competition at the library
  • Taking a second class
  • Participating in an ongoing lean and agile forum

I’ve come to really enjoy doing things deliberately: focusing on a handful of things I really care about and enjoy.

By contrast, here are some things I haven’t said no to:

  • Spending time with my wife cooking and playing video games
  • Spending an extra 20 hours a week on my side project
  • Taking an extra 2 hours to practice some problem sets to make sure I really understand the material in the one class that I am taking

Time is finite. Your attention is finite. Spend them both deliberately.

Doing things well takes time

I’m not the world’s fastest programmer, nor am I the slowest, and much like any person who does creative work, there are times when things come easily, and times where it’s a slog. But regardless of the day-to-day ups and downs, I’ve come to appreciate a simple fact in recent weeks: doing things well takes time.

A little while ago, I started working on a computational biology library that I’m calling BCompute. The idea came as I was working my way pretty quickly through the challenges at Rosalind.info. If you look at the challenges there, you’ll see that most of them are pretty easy to solve in a script-y way that works for a narrow set of cases. (I.e. the one you’re working on.) String manipulations, analysis, and so forth are pretty straightforward in most high-level languages.

Somewhere around problem 5 or 6, I got to thinking that it would be fun to create a compbio library, exploring the domain and becoming a better developer along the way. I didn’t want to just build a collection of scripts; I wanted to build a real, performant library with concepts modeled at the proper level of abstraction, with a type-safe, unit-tested, composable domain model that could be used for more than just toy problems. So I started reworking those scripts into something real.

Now, about a month later, I have a library that is getting closer to being able to Do Stuff, and will mostly keep you from doing the wrong thing. Along the way, I’ve learned a lot, and had quite a bit of fun. But man does it take time to do things The Right Way–even when you understand the domain pretty well, which is an advantage that I certainly don’t have. (Though I’m getting there.)

  • Writing tests takes time
  • Squashing bugs takes time
  • Refactoring objects and interfaces takes time
  • Building composable objects takes time
  • Learning and then accounting for the biochemistry edge cases takes time
  • Applying a growing body of domain knowledge to your object model–which appears deficient in new and interesting ways the more you learn–takes time

After 100+ commits, 130+ unit tests, and more refactoring than I can even remember, the most salient thing I’ve learned is that it all takes time. (And there’s still a long way to go…)

“What would happen if I took a shot of mercury?”

Short answer: probably nothing.

Longer answer: Two guys I work with asked me yesterday what would happen if they each took a shot of mercury. Unfortunately I was in the middle of addressing a production issue, and couldn’t really answer, but it was a fun question, so here’s the answer…

Elemental mercury (quicksilver) isn’t absorbed very well by your GI tract. In fact, it appears that less than 0.01% of whatever amount you consume will be absorbed, assuming you have a healthy GI tract. Conversely, this means 99.99% will be excreted without it reaching your bloodstream. So a shot of liquid mercury is unlikely to do you any lasting harm, though I wouldn’t recommend it.

Much more dangerous is aerosolized mercury, which is readily absorbed in the lungs, where absorption rates reach 80%. Also dangerous is methylmercury, an organic compound. (This is the mercury that you’ll find in fish.)

(One of these days, I will import all of my old pharmacy blog posts into the archives…)

Creating an array of generics in Java

I was messing around with creating a generic Bag collection in Java that’d be backed by an array. It turns out that you can’t do this for a number of interesting reasons…

In Java (and C#), arrays are covariant. This means that if Apple is a subtype of Fruit, then Apple[] will also be a subtype of Fruit[]. Pretty straightforward. That means this will compile:
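A minimal sketch of that kind of code, assuming hypothetical Fruit, Apple, and Orange classes (the class and method names are illustrative only):

```java
final class CovariantArrays {
    // Hypothetical types, named after the Apple/Fruit example in the text.
    static class Fruit {}
    static class Apple extends Fruit {}
    static class Orange extends Fruit {}

    // Returns true if storing an Orange through a Fruit[] reference
    // to an Apple[] fails at runtime.
    static boolean storeFailsAtRuntime() {
        Fruit[] fruits = new Apple[1]; // compiles: Apple[] is a subtype of Fruit[]
        try {
            fruits[0] = new Orange();  // compiles too, but the array is really an Apple[]
            return false;
        } catch (ArrayStoreException e) {
            return true;               // the JVM rejects the store
        }
    }

    public static void main(String[] args) {
        System.out.println(storeFailsAtRuntime()); // prints "true"
    }
}
```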

If you’re like me, you didn’t think too hard about this, and assumed you could do the same with parameterized types, i.e. generics. Thankfully you can’t, because that code is unsafe. It will throw an ArrayStoreException at runtime, which we’d have to handle.

Wouldn’t it be great if we could guarantee type safety at compile time?

Generics are safer

Unlike arrays, generics are invariant, which means that Apple being a subtype of Fruit doesn’t matter: a List<Apple> is a different type from a List<Fruit>. The generic version of the code above is illegal:
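A sketch of the generic counterpart, again with hypothetical Fruit and Apple classes. The rejected assignment is left as a comment so the rest compiles; copying into a new list is shown as one assumed alternative:

```java
import java.util.ArrayList;
import java.util.List;

final class InvariantGenerics {
    // Hypothetical types, named after the Apple/Fruit example in the text.
    static class Fruit {}
    static class Apple extends Fruit {}

    static int countFruits() {
        List<Apple> apples = new ArrayList<>();
        apples.add(new Apple());

        // List<Fruit> fruits = apples; // does not compile: incompatible types

        // Copying the elements into a separate List<Fruit> is allowed,
        // because each individual Apple is a Fruit.
        List<Fruit> fruits = new ArrayList<>(apples);
        return fruits.size();
    }
}
```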

You can’t cast it, either:
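A sketch of the cast the compiler rejects, using the same hypothetical classes. The illegal line stays in a comment:

```java
import java.util.List;

final class NoCastBetweenParameterizations {
    // Hypothetical types, named after the Apple/Fruit example in the text.
    static class Fruit {}
    static class Apple extends Fruit {}

    static Fruit firstAsFruit(List<Apple> apples) {
        // List<Fruit> fruits = (List<Fruit>) apples; // does not compile: inconvertible types

        // A single element can still be widened, because Apple is a Fruit.
        return apples.get(0);
    }
}
```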

By making generics invariant, we guarantee safe behavior at compile time, which is a much cheaper place to catch errors. (This is one of the big reasons developers get excited about generics.)

So why are arrays and generics mutually exclusive?

In Java, generics have their types erased at compile time. This is called type erasure. Type erasure means a couple of things happen at compile time:

  • Generic types are boiled down to their raw types: you cannot have a Derp and a Derp<T> in the same package.
  • A method that has a parameterized type overload won’t compile: a class with methods popFirst(Derp<T> derp) and popFirst(Derp derp) won’t compile.
  • Runtime casts are inserted invisibly by the compiler to ensure runtime type safety. (This means there’s no performance benefit to generics in Java!)
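One way to see erasure at work is to compare the runtime classes of two differently parameterized lists. This is a small sketch, and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // Returns true if both lists share one runtime class,
    // i.e. the type arguments did not survive compilation.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```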

Java’s implementation of generic types is clumsy, and was done to maintain backward-compatibility in the bytecode between Java 5 and Java 4.

Other high-level languages (like C#) implement generics very differently, which means none of the three caveats above apply. Generics that are carried all the way through to the runtime do yield performance gains along with those type-safety guarantees.

To recap, in Java:

  • Arrays require their element type at runtime (that is how the ArrayStoreException check works)
  • Generics have their types erased at compile time

Therefore you cannot create arrays of parameterized types in Java.
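For the array-backed Bag that started this, a commonly used workaround is to allocate an Object[] and cast it. The sketch below assumes that approach; the Bag API shown (add, get, size) is hypothetical and not the original implementation:

```java
import java.util.Arrays;

final class Bag<T> {
    private T[] items;
    private int size;

    @SuppressWarnings("unchecked")
    Bag(int capacity) {
        // "new T[capacity]" does not compile (generic array creation).
        // Assumed workaround: allocate Object[] and cast. The unchecked cast
        // is only safe while the array never escapes this class.
        items = (T[]) new Object[Math.max(1, capacity)];
    }

    void add(T item) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // grow when full
        }
        items[size++] = item;
    }

    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return items[index];
    }

    int size() {
        return size;
    }
}
```

The compiler still enforces T at every call to add and get, so callers keep compile-time safety even though the backing array is an Object[] at runtime.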

Further reading

Understanding the word “semantics” in the context of programming

tl;dr- It’s usually safe to substitute the phrase behaviors and guarantees into a sentence where you see the word “semantics”–and the discussion is about programming.

Longer version: New programmers often come across the word semantics, and wonder what it means. Pretty much every explanation they will read points out the distinction between syntax (form) and semantics (meaning). This is easy to grasp, but not useful for understanding the word in the context of a sentence like: The stylistic choices should typically be driven by a desire to clearly communicate the semantics of the program fragment.

Go ahead and substitute the word “meaning” there. It isn’t much help unless you’re already an experienced developer.

So to that end, new programmers… if ever you come across this word, it’s generally safe to substitute the phrase behaviors and guarantees in its place. This may help you understand the semantic intent (ha!) of the writer a little more.

Recursion as sophisticated GOTO?

Suppose you write a simple program to play a number guessing game: “I’m thinking of a whole number between X and Y…” where the user attempts to guess the number in order to win. Failures that require feedback to the user come in three main flavors:

  1. The user guessed incorrectly
  2. The user’s guess was out of range
  3. The user did something silly, like enter a non-integer value

The do..while form

You can do this with a loop, of course. A do..while loop seems like a natural choice here. What I don’t like is the validation and user feedback in the while clause. You’d call out to a method that hides all of the validation logic, and returns true or false. This is OK, but that boolean method will have a side effect, i.e. feedback to the user.
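A sketch of what that might look like. The names (GuessLoop, keepGuessing, play) and the feedback messages are hypothetical, and input is read from a Scanner so the loop can be driven from any source:

```java
import java.util.Scanner;

final class GuessLoop {
    // Validates one guess. Side effect: prints feedback for the user.
    // Returns true when the game should continue.
    static boolean keepGuessing(String input, int secret, int low, int high) {
        int guess;
        try {
            guess = Integer.parseInt(input.trim());
        } catch (NumberFormatException e) {
            System.out.println("That is not a whole number.");
            return true;
        }
        if (guess < low || guess > high) {
            System.out.println("That guess is out of range.");
            return true;
        }
        if (guess != secret) {
            System.out.println("Nope, try again.");
            return true;
        }
        System.out.println("You got it!");
        return false;
    }

    // Plays until the secret is guessed; returns the number of attempts.
    static int play(Scanner in, int secret, int low, int high) {
        int attempts = 0;
        String line;
        do {
            System.out.printf("Guess a whole number between %d and %d: ", low, high);
            line = in.nextLine();
            attempts++;
        } while (keepGuessing(line, secret, low, high));
        return attempts;
    }
}
```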

The recursive form

An alternative implementation might use recursion:
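A sketch of the recursive shape, with the same hypothetical names and messages as assumptions. Each failure case gives its feedback and then calls the method again:

```java
import java.util.Scanner;

final class GuessRecursive {
    // Plays until the secret is guessed; returns the number of attempts.
    // Every retry is a new call, so each wrong guess adds a stack frame.
    static int play(Scanner in, int secret, int low, int high) {
        System.out.printf("Guess a whole number between %d and %d: ", low, high);
        String line = in.nextLine();

        int guess;
        try {
            guess = Integer.parseInt(line.trim());
        } catch (NumberFormatException e) {
            System.out.println("That is not a whole number.");
            return 1 + play(in, secret, low, high);
        }
        if (guess < low || guess > high) {
            System.out.println("That guess is out of range.");
            return 1 + play(in, secret, low, high);
        }
        if (guess != secret) {
            System.out.println("Nope, try again.");
            return 1 + play(in, secret, low, high);
        }
        System.out.println("You got it!");
        return 1;
    }
}
```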

I find this second way to be more natural and readable. But it also feels like the wrong thing to do, though I can’t articulate why. Maybe it’s because any time I’ve seen code that followed this kind of pattern, I’ve seen it written using loops.

Thoughts?

Memory access patterns in high-level languages

Like many developers who work in high-level languages, I don’t spend a lot of time thinking about memory access patterns. This is probably a good thing… for the most part, worrying about this is premature optimization. But there are times when it matters, and the compiler won’t magically “optimize” it away for you, even if you have optimizations turned on:

Code
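A sketch of the kind of comparison in question, assuming the classic case of walking a 2D array row by row versus column by column. The class name, method names, and grid size are all hypothetical, and the timings will vary by machine and JVM:

```java
final class TraversalOrder {
    // Visits elements in the order they are laid out within each row.
    static long sumRowMajor(int[][] grid) {
        long sum = 0;
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                sum += grid[r][c];
            }
        }
        return sum;
    }

    // Jumps to a different row on every access. Assumes a rectangular grid.
    static long sumColumnMajor(int[][] grid) {
        long sum = 0;
        int columns = grid.length == 0 ? 0 : grid[0].length;
        for (int c = 0; c < columns; c++) {
            for (int r = 0; r < grid.length; r++) {
                sum += grid[r][c];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 2048; // hypothetical size, large enough to exceed typical caches
        int[][] grid = new int[n][n];
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                grid[r][c] = r + c;
            }
        }

        long t0 = System.nanoTime();
        long rowSum = sumRowMajor(grid);
        long t1 = System.nanoTime();
        long colSum = sumColumnMajor(grid);
        long t2 = System.nanoTime();

        System.out.printf("row-major:    sum=%d, %d ms%n", rowSum, (t1 - t0) / 1_000_000);
        System.out.printf("column-major: sum=%d, %d ms%n", colSum, (t2 - t1) / 1_000_000);
    }
}
```

Both methods do the same arithmetic and return the same sum; only the order in which memory is touched differs.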

Results

I tried this out after seeing this discussion thread on Quora this morning.