sunday, 11 february 2007

posted at 10:53
tags:

"I must've put a decimal point in the wrong place or something. Shit, I always do that. I always mess up some mundane detail." -- Michael Bolton, Office Space.

And so it is with me. I consider myself a fairly good programmer, but I always make the most ridiculous mistakes, that usually cost me hours or days in debugging. Case in point: packet.handler creates a process to mimic the environment required for traditional filesystems. The structure of the startup code for these processes is the same as everywhere else in AROS - create the process and have it wait around until signalled by the main (creating process) to run.

For some reason though, whenever my newly created processes called WaitPort() the whole thing would segfault. I chased this around for over two days. Then, in desperation, I started writing a small program to try and capture just the relevant pieces of this setup for testing. In these cases I usually copy by hand rather than doing a real clipboard copypasta, so I make sure my brain sees all the code as I go.

As I was copying, I noticed something that clearly wasn't going to work, so I fixed it on the fly. A few seconds later, my brain kicked in. Sure enough, the same problem appeared there. Same fix, recompile, run. No crash!

The problem? CreateNewProc() returns a pointer to the newly created process. I store this in an (effectively) global chunk of memory. The new process was accessing this chunk of memory to get its process structure, but of course, it was doing this before CreateNewProc() returned in the main process. Invalid piece of memory, crash!

The solution is easy. Have the new process call FindTask() to get hold of its process structure, and all is well.

Avoiding this kind of thing is kiddie stuff for multithreaded programming. I've done this hundreds of times before. Its simple, and thus exactly the kind of thing I screw up.