monday, 27 july 2009

posted at 19:25

The last couple of months have been busy but I've managed to find bits of time here and there to hack on the new AROS hosted port. Last week I really got the guts of the task switching and interrupt code working the way I wanted, which is what I'm here to tell you about today.

Task switching in a typical multitasking system is very simple in concept. Imagine a computer running a single task. There's a big pile of instructions in memory somewhere, and the processor just runs them in sequence. It will keep doing that until something stops it. That something is the most important requirement to make preemptive multitasking work.

What usually happens (again in very simple terms) is that there's an extra bit of circuitry somewhere in the computer that works as a timer. Every now and again (typically tens or hundreds of times a second), it will prod the CPU. In response, the CPU will stop what it's doing and go and run a different bit of code somewhere else in memory. The "prod" is known as an interrupt (or Interrupt Request (IRQ)), and the bit of code that runs is the interrupt handler (or more formally, the Interrupt Service Routine (ISR)). It's the handler's job to arrange for a different task to run.

Something the CPU will do when responding to the interrupt is to save its complete state (known as the context) before it calls the handler. That is, somewhere in memory (typically on the stack) it will save a copy of all its registers, the stack pointer, the program counter and everything else it needs to continue running the program from where it was stopped. This is necessary as the handler will need to use those registers in order to do its work. Many CPUs provide a single instruction to restore the entire CPU state in one go.

To make task switching work, the interrupt handler will take a copy of the context and store it inside the OS task state, which usually contains lots of other info about the running task, such as memory it has allocated, files it has open, etc. Then, the handler chooses another task based on some criteria (this is the scheduler). Finally, it copies the saved context from the state of the task to wherever the CPU needs it, then tells the CPU to reload the context and leave the handler. The handler "returns" to running the newly selected task. This process continues ad infinitum and you get the illusion that your computer is doing lots of things at the same time.

The existing Unix-hosted version of AROS does fundamentally the same thing, but in a highly convoluted way. The main thing to note is that all tasks run inside a single Unix process, which then does some deep magic with Unix signals to make interrupts and task switches happen. The kind of magic employed is highly OS-specific, and although I don't know exactly why it was done the way it was, I can guess that it was one of:

  • The facilities for user-space task switching weren't available or were incomplete when it was first written (I know this was the case for Linux)
  • Originally AROS was much more tightly integrated with the Linux desktop (eg one AROS window per X11 window, etc)

Times have changed though, and so what I'm trying to do is make a new port that is designed to be much closer structurally to its native cousins. I'm realising this through a number of mechanisms provided by POSIX: threads, signals and the ucontext set of functions (though somewhat ironically these have been removed from the latest versions of POSIX and SUS).

What I do is this. I create a thread to mimic the function of the timer interrupt delivery circuit. It sits in a tight loop, waiting a little while then sending a signal to the "main" thread. This obviously mimics the interrupt that would exist on a real system, and causes the main thread to stop what it's doing and jump to a signal handler.

When a signal is delivered to a Unix process, the kernel saves the current process state (context) onto the stack and then calls a signal handler function. When the handler returns, the kernel reloads the state from the stack and continues from where it was. This sounds like almost exactly what we want, except Unix typically doesn't provide a portable way to get at the saved state on the stack. The existing hosted AROS implementation for Linux uses a bunch of Linux-specific knowledge to dig into the stack and get the data it needs, but that's obviously not portable. These days however, we have the ucontext functions which, while not without their quirks, are far more useful.

The prototypes look like this:

  • int getcontext(ucontext_t *ucp);
  • int setcontext(const ucontext_t *ucp);
  • void makecontext(ucontext_t *ucp, void (*func)(), int argc, ...);
  • int swapcontext(ucontext_t *oucp, ucontext_t *ucp);

For those who've seen setjmp() and longjmp() before, getcontext() and setcontext() will be quite familiar in function. getcontext() takes a copy of the current process state, including the CPU context, and drops it into the memory pointed to by ucp. setcontext() restores the process state and CPU context from whatever is saved in ucp, effectively causing a direct jump to the point just after the getcontext(). What this means is that you get the appearance of setcontext() never returning, whereas getcontext() can return multiple times. Interesting times indeed.

makecontext() takes an existing context and modifies it such that when setcontext() is called on it, it will jump to func with the arguments specified, running on the new stack. You actually need to do a bit of fiddling inside ucp before calling it, to set up an alternate stack for the context to run on and so forth. For the most part this call is not particularly useful except when setting up.

Finally, swapcontext() is an atomic context get-and-set. That is, it does this:

    getcontext(oucp);
    setcontext(ucp);

except that a later setcontext(oucp) will return to the point after the call to swapcontext().

Armed with this knowledge, we can now take a look at the (slightly simplified) implementation. The task switch "interrupt" handler is a two-stage process. The first part, which as far as the Unix kernel is concerned is the actual signal handler, looks like this:

ucontext_t irq_ctx;
char *irq_stack;

void irq_trampoline (int signo, siginfo_t *si, void *vctx) {
    getcontext(&irq_ctx);
    irq_ctx.uc_stack.ss_sp = (void *) irq_stack;
    irq_ctx.uc_stack.ss_size = SIGSTKSZ;
    irq_ctx.uc_stack.ss_flags = 0;
    makecontext(&irq_ctx, (void (*)()) irq_handler, 0);

    swapcontext((ucontext_t *) GetIntETask(SysBase->ThisTask)->iet_Context, &irq_ctx);
}

(irq_stack is initialised during startup as irq_stack = malloc(SIGSTKSZ))

So the signal from the timer thread arrives, and the current task gets interrupted and we arrive here. The getcontext() and makecontext() bit sets up a new context that, when called, will call the actual interrupt handler (ie the scheduler etc) and select a new task.

It's the call to swapcontext() that is most interesting. What this does is save the current context into the current task structure and switch to the interrupt handler proper. The handler calls into the scheduler to choose another task then calls setcontext() on its saved context to start it up. The subtlety is in the fact that when the saved context is later used to start the task up again, it will return to the point just after the call to swapcontext(), immediately drop off the end of the signal handler and head back to where it was.

You might wonder why the more obvious method of using getcontext() to save the context then calling the scheduler directly isn't used. The problem comes from the fact that when getcontext() "returns", the caller has no way of knowing if it was the initial call to save the context, or if it was the result of setcontext() being called. Without this knowledge, we're left with this kind of trickery, so that the only time we end up at the point after the context was saved is when the context is reloaded.

(This is the opposite of setjmp(), which returns zero from its initial call and non-zero after a call to longjmp(). It perhaps makes the code easier to read to just have a call and a test to determine what to do next, but it's slightly slower, and it would also result in the handler being run on the task stack, which means making the handler more complicated to make sure it rewinds correctly when the task is switched back. Or tricks can be played with sigaltstack(), which further complicates things.)

The actual implementation is naturally a little more complicated, mostly because it has to deal with so-called "system calls", which is what happens when an application triggers a task switch (eg by calling Wait()). To allow that, each interrupt signal carries a numeric id that allows the trampoline and handler to determine what type of interrupt was requested. Then, when Exec wants to force a task switch, it will trigger the interrupt requesting it, which will run the scheduler with the main task "stopped", as above, but with slightly different semantics. It doesn't add much code though, and the technique is identical.

There's still lots to be done to clean up the scheduler, which so far is a hack job of the hack job already present in the mingw32 port. The next thing to do is continue to work on the boot sequence, which is almost there but is just a tiny bit finicky at the moment (that's a technical term). Next time I think I'll write about the new host module setup which blows hostlib.resource out of the water (if you know what that is)!

monday, 29 june 2009

posted at 22:09

My current bus activity is AROS hacking. I've actually been doing at least an hour a day for the last couple of months, so I'm making plenty of progress, but I'm off on a long and exciting tangent so it all seems quite different to what I was doing before.

I started thinking about what it would take to make cairo a "first-class" graphics system, sitting directly on top of the graphics drivers, bypassing graphics.library completely. This isn't a crazy idea - a major part of graphics.library is providing drawing and font rendering primitives, similar conceptually to what cairo does (though cairo is of course far more advanced). My thought is that if we make cairo the graphics system at the bottom of the stack, apps can do all sorts of crazy compositing and whatever other eyecandy effects, and the whole desktop benefits. Initially it could operate alongside graphics.library, but it'd also probably be reasonable to implement graphics.library functions on top of cairo at some later time.

From there I started looking at the graphics driver API. What we have works well enough (despite the deficiencies that I've complained about in the past), but it's not a particularly good fit to the cairo backend API, and from what I understand, not a great match for a modern 2D hardware interface either. So the next thing I started thinking about was changing the graphics drivers to have the exact same interface as the cairo backend API. From there, a driver and/or the hardware could directly accelerate cairo drawing operations. The cairo software fallbacks are pretty heavily tested and optimised (including some tight assembly versions of things where necessary), so I'd expect that even a graphics card or whatever that doesn't offer a lot of function could still go faster than, say, the current SDL driver (which uses the graphics.hidd fallbacks for just about everything currently).

So now I'm looking at drivers. As you know, I work in hosted, so my two examples are the X11 and SDL drivers. Something I hate about the X11 driver is how closely tied it is to the underlying kernel implementation. I took some steps to deal with this when I wrote the SDL driver with hostlib.resource, but it's not perfect, and lately something has changed in the X11 driver to require it to be linked with the kernel once again. Besides that, the X11 driver is ancient, hailing from a time when AROS windows were X11 windows, and it retains a lot of that structure even though it's no longer the way the world works. Also, it relies on the X11 "backing store" feature, which is usually disabled and will shortly be removed from Xorg. In short, the thing needs a rewrite.

So yay, rewriting one, maybe two, graphics drivers. Down a level to figure out what's going on in the core, and sure enough, more work required there. In the last few years the structure of an AROS kernel has changed to be a minimal kernel.resource which implements the absolute minimum required to initialise the memory and task-switching hardware and hand control to exec.library. The loader (typically GRUB) can optionally get whatever modules (libraries, resources, devices, etc) into memory and make them available to exec when it starts. This is the basic idea behind the so-called "modular kernel", which has been implemented in the x86_64, Efika, SAM (both PPC), and more recently, mingw32 ports. The only ports that don't do this are the first two - Linux hosted and i386-pc.

The mingw32 port is particularly interesting. It's a hosted port to Windows, and in essence uses the OS threading system to implement a minimal virtual machine, all within kernel.resource. It has a small bootloader that loads an ELF kernel, making it so that stock AROS i386 code can be used even on Windows, which doesn't use ELF itself. The other thing it does is neatly split modules into host-side and AROS-side parts. The AROS parts are handled as normal modules, but in their initialisation they call into hostlib.resource (which is now contained within kernel.resource) to load and link the host-side part. These are standard shared libraries (ie DLLs) which can bring in any library dependencies they need, neatly avoiding the problem with the X11 and SDL drivers, namely that it's kinda painful to find the needed libraries at runtime. This way, you just find what you need at link time.

And so, after all this, I'm doing a new port of AROS to Linux, based on the structure used for the mingw32 port. I'm improving on it a bit though. There's still too much arch-specific code in exec.library (like thread context manipulation) which I'm hiding inside kernel.resource. I'm also adding a host.resource which will provide ways for modules to hook into the system main loop inside kernel.resource to do things like "virtual" hardware and the like (ie faking interrupts and such). The mingw32 port did this via special architecture-specific calls in kernel.resource, but I want to try to make kernel.resource have a standard interface across all ports, so they can all run an exec.library that is substantially the same.

So that's some kind of plan. I'm currently at the point where the kernel.resource boots and gets exec.library online. The next thing I need to do is reimplement my task switching and interrupt core which I never tested. If you feel like googling something, it turns out that ucontext_t is not particularly easy to copy or cache on Linux due to the Linux people messing up the way they store the floating point state. I need to rewrite it based on the wonderful context_demo.c example, which never requires an explicit context copy and should do much better. After that I should be able to hook DOS up and get something interesting happening.

I'll keep working and maybe let you know some more in another month or two :)

sunday, 28 june 2009

posted at 20:42

Yeah, it's been a while. I'm still here, and I've done heaps of stuff since last time, but I just haven't gotten around to writing about it yet. I'll get there.

What I'm here for tonight is to tell you about something new. I know there's people out there blogging about AROS. I'm subscribed to a few of them myself. I'm sure I haven't got all of them though. So I'm putting together a planet to list them all:

If you're trying to follow what's going on with AROS, it'll be good for you to subscribe to this planet, as you'll find out everything that's going on. If you're blogging about AROS, it'll be good for you to be on this planet, as you make sure that everyone is reading your stuff and you benefit from other people's popularity.

If you write about AROS, email me ( or ping me on IRC (fce2 on Let me know the location of your RSS or Atom feed, and I'll add you. It's cool if you have non-AROS stuff in there; this is about AROS people as well as AROS itself.

If this gets big and popular, I'll see what I can do to get a better URL. How does sound? :)

Oh, and I need to do something to pretty it up a bit. If you feel like doing something there, drop me a line.

sunday, 3 may 2009

posted at 22:49

Long ago I wrote an SDL driver for AROS hosted. Back then I wrote about it being slow because of deficiencies in the driver interface that require flushing the output to the screen for every pixel plotted by software fallbacks. Go and read that first.

I never did finish my implementation originally, but in the last week I've resurrected the branch and completed it. It's taken adding an UpdateRect method to the bitmap class and then modifying graphics.library to call it after every operation. If it's running a tight loop to paint a rectangle or something, it will call this once when it's finished to push its output.

To test, I removed all the "accelerated" methods in the SDL bitmap class, leaving only GetPixel and PutPixel. Back when I first wrote sdl.hidd this was all I had implemented, and it worked fine, but was slow enough that you could watch the individual pixels appear on the screen. With the UpdateRect stuff it's now very usable. It's not blindingly fast, but it's snappy enough to be comfortable.

And the best thing is that no changes are required to existing graphics drivers. For those, the call to UpdateRect will just use the baseclass version, which is a no-op. I've confirmed this is actually the case with the X11 driver, so yay.

I'm not sure what's next for my hacking. I'm really just studying graphics.library and graphics.hidd at the moment, trying to get my head around how it all fits together. Something big is coming, I'm just not sure what it looks like yet :)

tuesday, 28 april 2009

posted at 08:39

It would appear I'm back in the AROS game for a little while. I got a nice email asking for some help with fat.handler so I decided that I'd look into it. In the last 18 months a few things I care about have been broken, causing my particular configuration to fail to build, so I had to get into the code to fix them. While doing this I started to remember that I actually quite like hacking on AROS and miss it. That and my brain seems ready for a challenge again.

Of course this time around, I'd like to avoid the frustrations that contributed to me quitting last time. So this is my plan:

  • I will only work on things that interest me
  • I will not work for money
  • I will not take on significant commitments (ie "sure, I can take a look at that bug" is ok, but "sure, I'll write you a browser" is not)
  • I will not get involved in any political stuff like arguments about project governance, goals (backward compatibility) or anything else

The last point is key. There were a few times previously that I had to do things the wrong way just so that backwards compatibility would be maintained, a goal that I never agreed with. This time, I won't be arguing about it, I'll just be doing what I want to do. It's a light fork, if you like.

I've got a new repository set up over at "cake" is what I'm calling my mini-project for now. I'll be committing everything I do there, as well as keeping the AROS mainline there (manually updated as necessary). I will commit things to the AROS Subversion repository as appropriate, but when I do something that causes significant breakage then it will live here. In true open source fashion, anyone who wants my stuff can get it from me and build their own, or if demand gets high, maybe I'll provide some builds or something. We'll see.

So here we go, the brave new world. I'm great at changing my mind, so we'll see how long this lasts :)

friday, 23 may 2008

posted at 12:10

I'm in the process of putting a heap of code that's just sitting around on my laptop into git repositories. To make my life easier I've moved all my AROS stuff into a subdirectory. So if you're looking for one of the AROS repositories or you've cloned from me, you'll need to change paths. As usual, cgit lights the way.

monday, 28 april 2008

posted at 22:17

There's no easy or amusing way to say it, so I'm just going to say it. My involvement in AROS, including Traveller and the nightly builds, ends right now.

Over the last few weeks I've been doing a few different things. I played lots of Morrowind, started work on a couple of brand new projects, played lots of the new Advance Wars game that I got for my birthday, read the new Ben Elton book, and a few other things. I've enjoyed every part of it. I've been doing lots of different things, stretching my brain in different ways, and not been beholden to anyone. Since I'm happier, work has been much better, home has been much happier, everything just seems good.

The whole time though there's been a tiny nagging voice in the back of my head. That's the one that has been telling me that I need to get on with Traveller. Only a couple of months to go. I hate that voice. I've tried a number of times to get into it, but I've only added about twenty lines of code to the loader in that time.

The fact is that I'm just over it all. Every part of AROS that I was interested in I've done enough work on to learn as much as I'm interested in. I wrote a filesystem. I wrote a graphics driver. I ported some minor apps. I hacked on some libraries. It's the same for Traveller. I got it to a point where you could browse the web. Everything else is just a bonus - in these areas, there's not really much left to take my interest.

I've been over this before. This is a major part of my reason for planning to leave after Traveller. But I really started thinking about why it's so difficult for me to get motivated. The question I eventually got to was "would I be trying to finish this if there wasn't a nice prize in it for me?". And the only answer I had to that was "no".

That was a rather enlightening moment. I'm a little bit ashamed of myself actually, but not surprised. I've known since forever that money is not really a motivator for me, it never has been. I think I just got a bit dazzled by the possibilities; large amounts of spare cash don't come my way too often and there's at least one neat gadget that I've been hankering for.

So all in all, I have no compelling reason to continue. I realise I've made commitments, and I hate breaking them, but I've made other commitments in my work and personal lives, and I can't do them all, so I have to choose. Once I really looked at it, it seemed a fairly straightforward choice.

It shouldn't take long to remove myself. I've already managed to offload nightly build duties, as there were some issues and recent updates that I've been rather tardy in sorting out and so someone else offered to take the build on. I'll email TeamAROS shortly to let them know that I'm ditching the bounty. I guess I'll spend a little time during the week responding to email and that should about take care of it.

For anyone who wants to take on WebKit and make a browser, feel free to take the code I've already done. It's all available via my git repository under free licenses. Do contact me if you need help; while I'm not working on stuff and won't be paying a lot of attention to the current goings-on, I'm quite happy to offer support and advice on specific issues.

Finally, thanks to all the nice people in and around the AROS community. I've had a great time getting to know you and working with you. I've no doubt that we'll see each other around the internets from time to time, and I'll drop into #aros when I can too.

This blog isn't closing up shop, of course. Once I'm back from playing games and reading I'll likely be back writing about whatever I end up working on next. Current interests are DS hacking, binary decompilation and RPGs. By the time I write something they may not be interesting anymore in which case you'll get to read about something else :)

sunday, 9 march 2008

posted at 09:24

I started writing a long post about what I'm working on right now, but it's really quite disjointed because I realised I don't actually have a point to make. So here's the short version of what I've been doing this week.

To be really useful, WebKit needs to be a shared library. On AROS, we can't support this in the normal way because of issues with global variables. The solution involves large-scale changes to the program loader and execution code.

AROS executables are actually ELF relocatable objects rather than executable objects. This is done so we can relocate programs on the fly without needing an MMU. To implement ELF shared libraries properly though, we need the extra information provided by ELF executables as they contain (among other things) the dependency list.

What I'm doing is to make AROS executables be ELF shared objects, containing both the relocation information and the dependency list, as well as other stuff. This requires a new loader for this object type, but I'm taking the opportunity to merge the existing ELF loaders since there's a lot of overlap of functionality.

Once shared object "executables" are available, I can begin implementing the library side of things. These are essentially the same thing, except that they will be position-independent, and so the loader will have to deal with setting up the GOT and PLT. The tricky bit is arranging for each instance of the library to find its GOT. I'm still wrapping my head around that.

Once that's done, we'll be able to use Unix-style .so libraries in addition to our standard ones. Not long after, we'll have a properly-sharable WebKit.zcc, with pretty things like and so forth.

I'll post more as I have time, progress and proper brain to describe it.

sunday, 2 march 2008

posted at 12:21

I have a little treat for the adventurous today. [8.1M]

It's mostly unusable, but many many people have requested a demo. It's still quite difficult to build it from the source, so here it is.

This will crash your system. No support of any kind is offered, but feedback is welcome. Send some mail or nab me in IRC :)

monday, 25 february 2008

posted at 15:34
A week later:

The major new things compared to my last post are the addition of the page title (and progress bar), the URL entry bar, and scrollbars. The last one is the thing that's been killing me for the last week, and I'm happy to finally have it over and done with.

What you don't see is most of the detail of integrating WebKit with Zune so that it can request and control native UI widgets. At its core, WebKit is a layout engine. It takes a UI description (in the form of HTML, CSS, etc), creates a bunch of objects, positions them relative to each other and then draws them. Sometimes (eg for a HTML form) rather than handling an object internally, it instead asks the host UI system to handle creating and drawing the object instead. When it does this, however, it expects to have full control over where the object is placed.

Zune allows a custom class to provide a function that will be used to position the widgets rather than using its own internal algorithms. This I have written. All it does is loop over the list of native widgets, ask WebKit what their dimensions are, and then tell Zune how it should draw them. It's the easy bit in all of this.

A typical Zune object is rendered in three parts. Upon receiving a request to render an object, Zune first asks the object about its dimensions, and receives back the minimum and maximum possible sizes it can use, and its ideal size. The object's parent object sets an appropriate size within the ranges and positions it in relation to itself, and then asks the object to do the same for its children, if it has any (most simple widgets do not). Finally, once the object knows its position and everything else is done, it is asked to draw itself in that space. This rendering process happens based on some external trigger, such as the window being opened or resized.

The complication arises from the order that things are done in this process, and when the process is triggered. Once its size is determined, a Zune object is asked to layout its children, if it has any, via MUIM_Layout. Once done, MUIM_Show is called to tell the object it is about to be displayed. Finally MUIM_Draw is called and the object is drawn.

Let's think about what really needs to happen to render a page, and how Zune conspires against us. I'll start by describing the obvious implementation of this mess, which is what I had before this week. In the beginning, we have a pristine WebKit_View object, with no drawing in it and no child widgets. Let's assume though, that WebKit has already loaded a page internally, because the initial page load has a couple of extra twists and this description is already complicated enough.

At the moment the application window appears (or the view is added to the UI in some other way), the magic begins. The view is asked for its dimensions, which are typically "as much as you can give me". Next, the view is asked to lay itself out via MUIM_Layout. This is actually a private method, and not one we're supposed to override, so we let that go through to the view's superclass, Group. It gets its list of sub-widgets, finds it empty, and so does nothing.

Next, MUIM_Show is called on the view. This is the first time the view knows the exact dimensions it has been given by the window, and so we tell WebKit the new dimensions and ask it to layout the page based on this size. Once that's done, the window calls MUIM_Draw, which sets up a cairo context over the view area of the window and tells WebKit to draw into it.

The cake is a lie.

If WebKit, during its layout phase, determines that it needs native UI widgets (form elements, scrollbars, etc), it asks Zune to create them and add them to the view. Unfortunately, at this point the Zune object layout has already been done (we're in MUIM_Show, which runs after MUIM_Layout), so the new widgets have not been asked about their size, have not been placed on the page, etc. MUIM_Draw fires, the view asks WebKit to draw the page and then calls the supermethod to draw the widgets. These uninitialised widgets all get drawn with no dimensions at the top-left of the view. This is not what's wanted.

At this point some way of forcing the entire layout process to run again is necessary. This is harder than it should be. You can't just call MUIM_Layout, even if it weren't a private method, because the new widgets have not yet been queried for their sizings. There appears to be no standard way of forcing the layout process to run. In the end I've abused a feature of the Group class to do what I want. The usual way you'd add widgets to a group is to call MUIM_Group_InitChange on the group, followed by one or more calls to OM_ADDMEMBER or OM_REMMEMBER. Once done, a call to MUIM_Group_ExitChange "commits" the changes by making the whole window relayout and redraw from scratch. To force the layout to happen, I simply call InitChange followed by ExitChange with no widgets added in between.

(Coincidentally, I used to use these methods when adding the widgets to the group in the first place, but stopped because it was causing a redraw every time. Now I simply use OM_ADDMEMBER and OM_REMMEMBER and assume that the layout and draw will be done elsewhere, which is correct conceptually).

The one chink in this method is that ExitChange eventually causes all three stages of the render process to run - sizing, layout and draw. We're already inside the layout section, and so we don't want everything to run again. Specifically, we don't want this secondary render process to cause WebKit to do another layout, and we don't want it to draw either, as that will be handled by the original render process. Some flags in the view object to record and detect this reentrancy are all that's required. So the final process becomes:

  • Render process triggered
  • (internal) setup widget dimensions
  • (MUIM_Layout) widget layout (ignored)
  • (MUIM_Show) WebKit layout
  • (MUIM_Show) force second render process
    • (internal) setup widget dimensions
    • (MUIM_Layout) widget layout
    • (MUIM_Show) WebKit layout (ignored)
    • (MUIM_Show) force second render process (ignored)
    • (MUIM_Draw) draw everything (ignored)
  • (MUIM_Draw) draw everything

Do you see what we did there? We just bent the Zune render process to our will by turning it inside out :) There's a couple of other warts thrown into the mix to deal with some edge cases, but that's basically it. You can read the dirty details in webkit_view.cpp.
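The flag-based reentrancy guard can be sketched in plain C. This is a minimal, hypothetical model of the control flow above, not the real webkit_view.cpp code; all names are illustrative, and the counters exist only to make the flow visible.

```c
/* Hypothetical sketch of the reentrancy flag described above. */
struct view {
    int in_second_pass;  /* set while the forced second render runs */
    int webkit_layouts;  /* how many times WebKit laid the page out */
    int draws;           /* how many times the page was drawn */
};

static void do_layout(struct view *v);
static void do_show(struct view *v);
static void do_draw(struct view *v);
static void force_second_pass(struct view *v);

/* MUIM_Layout: ignored on the first pass, real work on the second */
static void do_layout(struct view *v)
{
    if (!v->in_second_pass)
        return;
    /* lay out the native widgets WebKit created */
}

/* MUIM_Show: WebKit layout and the forced pass run only once */
static void do_show(struct view *v)
{
    if (v->in_second_pass)
        return;
    v->webkit_layouts++;   /* ask WebKit to lay out the page */
    force_second_pass(v);  /* stands in for InitChange/ExitChange */
}

/* MUIM_Draw: the inner draw is suppressed, the outer one runs */
static void do_draw(struct view *v)
{
    if (v->in_second_pass)
        return;
    v->draws++;
}

/* The forced second pass runs the whole pipeline again under the flag */
static void force_second_pass(struct view *v)
{
    v->in_second_pass = 1;
    do_layout(v);
    do_show(v);
    do_draw(v);
    v->in_second_pass = 0;
}
```

Running the outer pipeline (layout, show, draw) against a fresh view produces exactly one WebKit layout and one draw, matching the nested list above.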

Now I have no idea if this is compatible with real MUI. MUIM_Layout is actually private in MUI but public in Zune, so I wouldn't be able to override it there; the override could probably be done well enough in the custom layout function though. I'm not overly concerned if it's not compatible; I'm not developing for MUI after all, but I am curious.

This all points to what I believe is a fairly major design flaw in MUI: the stages of the render process are too tightly coupled. There should be a direct way to force a single object to relayout itself from scratch without triggering a redraw. There should be a way to get dimensions recalculated. I suppose it's not unreasonable that these things can't be done directly, as it's probably not often that an attempt is made to bolt an entirely separate layout engine onto it. I suppose it is a testament to MUI's flexibility that I can twist it like this at all.

Next up is to get the scrollbars hooked up to the page. After that is the RenderTheme implementation, which provides all the other widgets necessary to view pages with forms. A little input handling after that, and then we'll have something usable on our hands!

monday, 18 february 2008

posted at 11:03

A couple of hours work on yesterday's effort, and we see this:

Had I known just how close I was, I probably wouldn't have even bothered posting yesterday.

The wonky text was because of a stupid assumption I made in cairo's font code, which I've now fixed. The text still looks crap, mostly because of issues with the renderer, but I've been pointed at TTEngine this morning which looks much more like what I want and would let me remove some of the hacks I've had to do in cairo. I'll be looking at this further this week.

There's still a hell of a lot to do, so don't get too excited. At least now I have a way to see whether or not my changes are actually doing something.

I'll be posting many more screenshots as work progresses, but I won't be blogging them all. Things are moving just too fast for that. If you want to follow the screenshots, watch my screenshots set on Flickr or just subscribe to its feed.

sunday, 17 february 2008

posted at 09:46
Current progress:

This is WebKit on AROS rendering a trivial page containing an H1, an H2, a DIV with CSS styles forcing it to 100x100 with a green background, and an IMG of a pirate, though that's not working yet.

The text alignment appears to be screwy because my code in cairo is not correctly calculating the baseline on tall glyphs. It works as expected from my cairo tests though, so I'll need to dig a lot deeper to figure this out. Likely I just missed some mundane detail; font metrics are actually quite difficult, and I'm not helped by the fact that the bullet interface doesn't provide a way to get the metrics for the font as a whole, meaning I have to generate them in a rather horrible way.
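The "rather horrible way" amounts to deriving whole-font metrics by aggregating per-glyph metrics. Here's a minimal sketch of that idea; the structures and field names are illustrative, not the real bullet or cairo types.

```c
/* Hypothetical per-glyph metrics as a glyph engine might report them. */
struct glyph_metrics { int ascent, descent, advance; };

/* Hypothetical whole-font metrics that cairo wants but bullet
 * doesn't provide, so they're synthesised from the glyphs. */
struct font_metrics { int ascent, descent, max_advance; };

static struct font_metrics
aggregate_metrics(const struct glyph_metrics *g, int n)
{
    struct font_metrics fm = { 0, 0, 0 };
    for (int i = 0; i < n; i++) {
        /* take the extremes over every glyph measured */
        if (g[i].ascent  > fm.ascent)      fm.ascent      = g[i].ascent;
        if (g[i].descent > fm.descent)     fm.descent     = g[i].descent;
        if (g[i].advance > fm.max_advance) fm.max_advance = g[i].advance;
    }
    return fm;
}
```

The obvious downside, and why it's horrible: it requires measuring a representative set of glyphs up front, and the result depends on which glyphs you pick.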

There's also an issue where if a line of text is wrapped (eg if I resize that window to be really narrow), only the last line is rendered. I still haven't looked into that yet. Oh and of course there's a bunch of internal stuff that really isn't correct but won't noticeably affect the outcome just yet.

All in all, not bad progress so far. It's only going to get more difficult as I really get into the details, I think. Not to mention the many many shortcomings in services provided by AROS, which are going to need to be addressed if this thing is to look nice and not be insanely slow. I'll write more about that lot later.

monday, 11 february 2008

posted at 09:07

AROS work has slowed down over the last week. There's been a lot of email to reply to (I won't mention the topic :P), and I've been doing some web work on the side, but I've still had a little time to work on cairo, which I'm now calling finished, at least for the moment.

The big thing I was trying to get going was the shared library stuff; ie having a shared cairo.library. I got this working, but programs are crashing because cairo has globals, a fact that I'd overlooked. It only has a couple, but they're rather important. Once again, I'm not willing to make the large changes required to remove the globals because I want to keep the changes to upstream to an absolute minimum. Once AROS has proper support for global data, then this code can be resurrected.

So cairo works, but is noticeably slow. That mostly comes from it doing all its rendering in software and then blitting the results into the AROS graphics subsystem. Working slowly is good enough for me at this stage.

Fonts work, with the following issues:

  • Non-scale transformations (rotate, shear, etc) don't work as the necessary API is not exposed via bullet. Scaling works, but only vertically - again, missing API. Basically the only cairo API that is of any use for glyph transformation is cairo_set_font_size().
  • The font tests don't pass. The first reason for this is that font sizing on AROS is not the same as on other systems. As far as I can tell the bullet interface to FreeType is recalculating the metrics to better match the traditional Amiga way of talking about metrics, with the downside that it makes the glyphs smaller than they should be. Additionally, there's no way to get the display device DPI under AROS, making it quite impossible to have FreeType adjust the scale appropriately.
  • The other reason the tests don't pass is that spacing between glyphs is wrong. A typical line of text rendered on Linux will have pretty much even spacing between each glyph. The same text rendered on AROS has uneven widths. I haven't been able to determine the cause of this yet.

The font problems shouldn't be an issue for WebKit as it does its own font work, though it will still hit the underlying font system, so it's likely the same issues will appear in other contexts. Again, I'll just do the best I can.

So this afternoon its back to WebKit! There's been many many changes there in the last month, so the first step will be to just get my stuff building again.

friday, 1 february 2008

posted at 22:28

As you know, I've been at LCA this week. There's a lot of cool stuff happening in the Linux world, and a few of those things really made me ache to grab the code and get hacking on them. But even more than the technology, the best thing about Linux is the community. Even when there's disagreement (and there's plenty) the feeling is wonderful, because everyone is working hard on the same thing: making computers awesome.

A shortlist of things I'd like to work on:

  • Martin Krafft's netconf network configuration framework. His design is elegant and this is something that Linux badly needs.
  • Rusty Russell's lguest hypervisor which is just beautiful in its simplicity. I've already done some real hacking on this in the tutorial and its very pleasant to work on. I had a chat with Rusty about adding support for gdb stubs (because I like that kind of thing) and it looks like it could be added quite easily. That sort of gratification is hard to come by. Plus I'm feeling happy because I won the prize in the tutorial for the most progress made (four targets reached in two hours). It's some kind of Brazilian liquor called Chico Mineiro that I'm looking forward to trying at the next gaming night.
  • cairo is still outstanding and from its requirements have come some major redesigns of the 2D graphics core in X and below. By the time the wizards are done with it cairo (and others) will be able to get better performance out of 2D graphics hardware than any other platform (Windows included). This stuff is harder to get into but is by no means impossible.
  • The GNOME crew have got some fascinating stuff coming down the pipe that I'd really enjoy working on. It's mostly integrating different types of application to better support social interactions (ie conveniently sharing your stuff), which is something I've always had an interest in.
  • I've been gifted an OLPC XO-1. In the immediate future I've decided to let Francesca at it and document her progress, as a kind of observation project. The thing about these machines is that they are purpose-built for sharing and working with others, and the interface breaks all the rules and thus gives heaps of scope for trying new things. Whether she gets sick of it and hands it back or I buy one for myself so that we can play with them together, there's lots I'd like to do with it.

So there, lots of stuff I could do that I'd thoroughly enjoy, that would produce real stuff that would be used on real computers by lots of real people, and that would keep this community buzz alive for me.

On the other side, there's AROS. Now I like AROS because it's technically interesting and there's lots of stuff to fix, but previously I didn't really have anything better to do. I still like AROS, but I've found myself this week doing a lot of soul-searching, trying to decide if AROS hacking is really the best use of my time. As I look at what's happening at LCA this week, it's increasingly apparent that AROS, when held up against just about everything else, is insignificant.

I don't have any delusions about AROS ever becoming a mainstream system, and that's fine, because it doesn't need to be to still be considered successful. In order to be successful, it needs a clear plan and goal moving forward (so we can actually measure our progress), and it needs a strong community of developers around it committed to that goal.

As it stands, we have none of that. The community, such as it is, is fractured, which is unsurprising since it's a part of the Amiga community and we all know just how much infighting there is and always has been. In terms of goals, there basically are none. There are those that would argue that "AmigaOS 3.1 compatibility" is the goal, which I'd answer by either saying we're already there, since most source from the 3.1 era will compile and work with no or only minor tweaks, or that the goal is irrelevant, since there's nothing from the 3.1 era you'd want anyway.

If we are to be a clone, then we're still a long way away - AROS can't even run on real Amiga computers! We're incompatible in a number of ways, but those ways are only important for binary compatibility, which we don't have. On the other hand, if you have the source, perfect compatibility is not really an issue as you can modify the application for the differences. But like I said above, there's nothing from the old days that's worth bothering with.

In the absence of real goals, I set my own personal goal for my work on AROS, which is to get it to a point where I could run it as the primary OS I use day-to-day on my laptop. That's a huge task, as my laptop is something close to an extension of my brain. AROS would need to at least be able to do the following to supplant Linux there:

  • Web browser
  • SSH client
  • Fully xterm-compliant console
  • Stable and fast filesystem
  • X server (for remote applications)
  • Perl
  • Proper support for my laptop - wireless, idling, suspend, etc
  • Some way to run certain Windows apps (like VMWare, qemu, etc)

It should be clear that there's more to it than just this list - a massive amount of work needs to happen under the hood to support all this.

As you can see, my aims are very forward looking, and make no provision for backward compatibility. This is causing some problems as I try to progress things. An example is my recent work on cairo. AROS graphics APIs are broken in the way they handle certain things related to alpha channels. Unfortunately this can't be changed without breaking backward compatibility. As such, I've implemented a particular fix in four different ways over the last two weeks. The first three introduced compatibility issues and I've had to remove them. I'm hopeful that the current one will not introduce any further issues, but I hoped that last time too. Even if it does stick, I still needed a pretty nasty and performance-degrading hack in cairo to finally get what I wanted.

Obviously, this is frustrating. Perhaps it wouldn't be so bad if everyone was at least trying to move forward, just breaking as little as possible in the process (something I agree with), but there is an entire camp that appears to want backward compatibility at the expense of all else.

If I haven't been clear yet, I don't think that this is a bad goal. I have no issue with people wanting things that are different to what I want. The problem that I have in this case is that I don't see that the two positions can ever be reconciled as they're fundamentally opposed.

So I'm frustrated anyway, and then I go to a conference and hear and see amazing things by focused and motivated hackers, and I get even more frustrated because I want what they have. I want to work with these people on code that matters with the confidence that we're all moving in the same direction. This is why I'm starting to wonder if AROS is such a great place for me to be.

I've had some discussions in #aros about this, and the idea of forking the project often comes up. I've considered this in the past, but I've so far resisted for a few reasons. From the practical side it's a pain because I'd have to set up the website, repository, etc and do admin stuff, write a lot of email, write a plan and other project governance stuff. Socially it always sucks to split a community. I'm starting to think that if I want AROS to move forward, I may not have much option.

The important thing that would have to happen before a fork is to very clearly determine what I want not-AROS to be. I think "modern" and "Amiga-like", or perhaps "Amiga-inspired" are probably the simplest ways to describe where I think things should go, but we have to specifically define those terms. "Modern" is pretty straightforward: the goal should be that if I put not-AROS on my computer, it will make efficient and effective use of my multiple processors, my piles of RAM, my wireless controller, my USB devices, etc. I should be able to use my computer to do any task that I can do currently in Linux or Windows. That of course requires a lot of applications to be written, but there should be nothing inherent in the system that prohibits those applications being made.

"Amiga-inspired" is a little more difficult to define. I've asked a few of the AROS crowd, and nobody seems to really be able to quantify it, which I find surprising since they're usually the advocates for it and came from Amiga in the old days. Perhaps it's one of those cases where it's difficult to define what you know because it's been obvious for so long.

I don't have an Amiga heritage, coming from Unix as I have, so perhaps I can do better. Since I have no issue with changing the internals, we should start by looking at the Amiga from a user perspective. The major thing is that the system is designed to be GUI-driven, and as such the primary interface is a light and functional GUI. Unix of course is the other way around, where the command line reigns supreme.

The next major thing is the fast booting time. An Amiga system was typically ready for use within seconds of starting. Interestingly, if you measure the boot time as being from the time when the bootloader first hands control to the system to the time when the primary interface can be used, Linux actually only takes a few seconds too. The standard Unix boot sequence generally readies all system services before giving control to the user, whereas the Amiga was more likely to load things it needed on-demand. This made sense given the small amounts of memory available to the system, but that does not mean it's not a good model even for a modern system (though more options exist given the available resources, like starting services in anticipation of their use).

Much of this is enabled by the extremely light microkernel architecture. There's so little structure that system processes actually run much closer to the metal than they would on other systems. I'm not sure how sustainable this would be as more features and system services are added, but neither have I had much chance to think about it in detail. I see no particular reason why it couldn't be kept light if it was always being considered at every stage of development.

So to summarise, not-AROS would:

  • Boot fast
  • Assume a GUI (but see below)
  • Not keep stuff around that isn't needed
  • Keep the microkernel vibe
  • Let you do what you want without getting in your way

A word about the GUI. I'm a command line junkie. I type fast but am really uncoordinated when it comes to using the mouse. So my personal requirement (and I get to have them if it's my project) is that everything you can do in the GUI you can do via the command line, and vice-versa. That requirement is fairly straightforward to achieve by separating function from form - guts in a library, with a UI of any type that calls it. Remotely controlling GUI applications is also something that Amiga has a history of, with ARexx ports and the like.

And so then we get to backward compatibility. The fact is, I don't care. My not-AROS would not be an Amiga clone. It would try to follow roughly those points above but would be happy to break the rules when they don't work. It would aggressively steal from other systems past and current, both in ideas and in code. Additionally, once implemented, I would not be afraid to gut the internals and redo it if it became clear that we did it wrong the first time.

So there are the high-level goals. They're deliberately nonspecific, which is what you want at that level. For the actual development cycle, I'd probably aim for regular releases (depending on available developers) each focusing on one or two specific areas. There'd be no nightly builds. You either get the source and build it yourself, or you wait for a release. I have ideas already about what I'd work on and change and in what order, but I'm not going to write about that here because the tasks are actually somewhat irrelevant.

From where I sit right now, AROS is in an untenable position. In my opinion it cannot get to where I think it could by continuing to be managed the way it is.

What will I do? For now, I'm committed to (and still enjoying) my work on WebKit and cairo. I will complete the Traveller bounty. At that time, I'll consider my options, which will be three:

  • Abandon AROS development altogether and go and work on Linux stuff, and enjoy myself, but always wonder what might have been.
  • Continue work on AROS and likely continue beating my head against the wall until I finally explode.
  • Fork AROS and see what happens, with the high likelihood that it will go nowhere and waste a lot of my time, and the guarantee that a good amount of my time will be spent managing the project rather than writing code.

What would be great would be if the AROS crowd managed to make a hard decision one way or the other before I have to decide properly. It won't happen, but it still would be very nice.

So that's it. That's about the sum total of my thinking this week. If you're going to add a comment, please make a good argument for or against what I've said. This is actually a serious post, and I'm not interested in hearing from the fanboys this time around. If you post "I agree!" or "don't ruin AROS for everyone!", expect to have your comment deleted. And if you are going to disagree, make sure you have a pretty solid argument to back up your position, because you'll be wasting your time if you don't - I've agonised over this stuff this week and I'm quite sure of my own position.

tuesday, 29 january 2008

posted at 10:12

I'm at LCA this week, and because I'm so well practiced at listening to people talk while doing something unrelated on the laptop (thanks dayjob), I've got a hell of a lot of code done, making up for the nothing I did over the weekend.

Yesterday I finally got text rendering happening via cairo:

There's not really a lot to say about it. The hardest part has been converting the glyph metrics that come back from the bullet glyph engine into cairo's glyph metrics, as they have a slightly different view of the world.

The code is still rather messy and incomplete. I still have to handle the font matrix, which will allow arbitrary scalings, rotations, etc. Smarter font selection is needed, as well as using the algorithmic emboldening/shearing stuff to provide fonts that don't exist. At least it's all downhill from here.

tuesday, 22 january 2008

posted at 13:29

Things got a little slow in the last week. I spent last week tweaking bits of graphics.library and graphics.hidd to force the alpha channel to be set when calling ReadPixelArray() on a surface that has no alpha (so it can be fed directly to a cairo surface with alpha). Each attempt worked, but also introduced subtle incompatibilities into the Cybergraphics API. I still think it's important to have (along with software alpha compositing, which is an entirely separate issue), but it can't be done comfortably via the current API, so for now I just post-process the returned pixel data and force the alpha channel on before handing it to cairo. I don't like it, but it will do, and it makes it possible to use any system bitmap as a source. So now you can use cairo to take a snapshot of the entire screen with this simple code:

    struct Screen *screen = LockPubScreen(NULL);
    cairo_surface_t *surface = cairo_aros_surface_create(&screen->RastPort, 0, 0, screen->Width, screen->Height);
    cairo_surface_write_to_png(surface, "snapshot.png");
    cairo_surface_destroy(surface);
    UnlockPubScreen(NULL, screen);
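The post-processing step mentioned above is straightforward. Here's a sketch of the idea, assuming 32-bit ARGB pixels with the alpha in the top byte, which come back from ReadPixelArray() with alpha 0 (fully transparent) when the source has no alpha channel; the function name is mine, not an AROS or cairo API.

```c
#include <stdint.h>
#include <stddef.h>

/* Force every pixel fully opaque before handing the buffer to cairo,
 * otherwise cairo sees a completely transparent image and draws nothing. */
static void force_alpha_opaque(uint32_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; i++)
        pixels[i] |= 0xff000000u;  /* set the alpha byte to 0xff */
}
```

It's an extra pass over every pixel, which is exactly the performance cost I'd rather avoid by fixing it in the graphics system itself.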

I've now turned my attention to the font backend. It's taken me a while to even begin to understand it, because I know basically nothing about font rendering, but I think I'm at least starting to see what's going on. I began an implementation based on the graphics.library functions for font rendering, but it really felt wrong, as the interface really doesn't seem to support much - very few facilities for controlling rendering options, limitation to ASCII, etc. It seemed that there must be something more powerful available, as it's clear from just loading up AROS that we support TrueType fonts and non-ASCII characters.

After a lot of digging, I found out about the existence of the bullet.library interface for outline fonts, and our implementation of it in freetype2.library. From there, to Google, where I discovered that there's next to no documentation out there for it. I did find reference to a document in the AmigaOS 3.1 development kit, and a quick ask-around in #aros gained me a copy of BulletLibrary, which I offer here for reference.

The interface is complicated, but appears to have most of the features I need to map to cairo font functions. I have no idea how it will go, and I imagine our implementation is deficient, but I will write some tests this afternoon and see what I can do with it, then start hooking it up to cairo.

friday, 18 january 2008

posted at 20:06

As far as cairo is concerned, its backend buffer get/set methods are only required to store and retrieve pixel data in the format requested by the cairo core. It does not have to do fancy conversions. It does not have to do alpha stuff. Presumably you'd want it to be convertible to the host graphics system, but cairo itself doesn't care about that.

wednesday, 16 january 2008

posted at 12:21

Cairo is working! So far I have RGB and ARGB surfaces working, and so still have alpha-only surfaces and fonts to do, but that is enough to make the majority of the test suite work. I actually had the basics working on Thursday, but the colours were all messed up, and it took five days to track down all the issues and fix them. I won't go into the process, because it's peppered with dead ends and misunderstandings, but here's what I've learnt:

  • CyberGraphics is a big-endian interface. That is to say, when you request ARGB, you will always get the same byte ordering on little and big-endian machines. This is different to cairo, where specifying ARGB will get you the ordering of the local machine. What this means is that on little-endian machines when converting from AROS bitmaps to cairo surfaces, I have to request BGRA format from ReadPixelArray() but then tell cairo its ARGB, and vice-versa.
  • When AROS converts from a bitmap with no alpha channel (eg RGB) to one with alpha (eg ARGB24), it will set the alpha in the target bitmap to 0 (fully-transparent). When feeding the target into cairo, which knows about alpha, it basically does nothing as it sees that all the pixels are fully transparent. I've already done a rather naive fix in AROS for one case, but there's still a case where the graphics library, realising that a conversion from a non-alpha format to a 32-bit with-alpha format is requested, rewrites the target format to be 32-bit no-alpha (eg 0RGB), thus leaving the alpha set to 0 again. I'm working on a more generic fix.
  • WritePixelArray() has no support for software alpha compositing. That is, when using it to blit a 32-bit with-alpha bitmap to another bitmap without alpha, the alpha component is ignored rather than computed in software. Ironically, alpha compositing code exists for WritePixelArrayAlpha(), so I'll also be looking at factoring this code out into a generic function and having both calls use it.
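The byte-order mapping in the first bullet can be made concrete. This is a hedged sketch of my own, not AROS or cairo code: cairo's CAIRO_FORMAT_ARGB32 packs each pixel into a native-endian 32-bit word, while CyberGraphics format names describe bytes in memory order, so the format to request from ReadPixelArray() depends on the host's endianness.

```c
#include <stdint.h>
#include <string.h>

/* Detect host byte order by looking at the first byte of a known word. */
static int host_is_little_endian(void)
{
    uint32_t one = 1;
    uint8_t first;
    memcpy(&first, &one, 1);
    return first == 1;
}

/* The memory byte order of a cairo ARGB32 pixel on this host, i.e. the
 * CyberGraphics-style format name to request from ReadPixelArray() /
 * WritePixelArray() so the buffer can be shared with cairo directly. */
static const char *cybergfx_format_for_cairo_argb32(void)
{
    return host_is_little_endian() ? "BGRA" : "ARGB";
}
```

On a little-endian machine a cairo ARGB32 pixel 0xAARRGGBB sits in memory as the bytes BB GG RR AA, which is exactly CyberGraphics' idea of BGRA; on big-endian the two agree on ARGB.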

Once I get this sorted, I have a very cute piece of eyecandy in the works to demonstrate to you all just how powerful cairo is, and just how easy it is to use. Hopefully I'll have something to show in a few days, then I'll get back onto the font support.
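On the compositing point from the last bullet: the per-pixel work a generic software compositing helper would do (shared by WritePixelArray() and WritePixelArrayAlpha()) is a standard source-over blend. This is a sketch under my own assumptions - non-premultiplied 8-bit ARGB, with +127)/255 as one common rounding choice - not the actual AROS code.

```c
#include <stdint.h>

/* Source-over blend of one non-premultiplied ARGB pixel onto an opaque
 * destination: out = src * a + dst * (1 - a), per channel. */
static uint32_t blend_source_over(uint32_t src, uint32_t dst)
{
    uint32_t a = src >> 24;
    uint32_t r = (((src >> 16) & 0xff) * a + ((dst >> 16) & 0xff) * (255 - a) + 127) / 255;
    uint32_t g = (((src >>  8) & 0xff) * a + ((dst >>  8) & 0xff) * (255 - a) + 127) / 255;
    uint32_t b = (((src      ) & 0xff) * a + ((dst      ) & 0xff) * (255 - a) + 127) / 255;
    return 0xff000000u | (r << 16) | (g << 8) | b;
}
```

A fully opaque source replaces the destination, a fully transparent one leaves it untouched, which is exactly the behaviour WritePixelArray() currently skips.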

monday, 14 january 2008

posted at 12:23

A year ago today I made my first commit to the AROS Subversion repository. It feels like I've been doing this forever, not only a year. I've been digging back through the history to see what I've been up to over the last year.

Here's a list of things I've done that I think are worth noting:

  • 21 January: tap.device network driver for Linux hosted AROS
  • 4 March: DOS packets and initial FAT filesystem support (minimal read-only)
  • 27 April: FAT write support
  • 7 May: Fast bitmap scaling, made Wanderer startup faster and made FFE usable in hosted and boosted it from 8 to 20FPS in native
  • 16 May: FAT notifications
  • 20 May: PA_CALL and PA_FASTCALL taskswitch and lock-free message ports for speed
  • 8 June: GetDeviceProc() and ErrorReport() rewrite and internal DOS refactoring
  • 17 June: Pipe() DOS function
  • 21 September: hostlib.resource for calling host libraries from inside AROS
  • 2 October: Converted X11 driver to use hostlib and moved it out of ROM
  • 3 October: SDL driver
  • November (and ongoing): C/POSIX library improvements
  • 17 November: Math library upgrade
  • 3 December: thread.library
  • 13 December: ELF loader support for large objects

There's also a pile of tweaks and fixes that don't feature in this list. According to git, I've made 269 commits to the core AROS repository, adding 23182 lines and removing 12741 lines.

In addition to this, I've got plenty of work-in-progress stuff that hasn't (or won't) hit the AROS repository:

And of course, the Traveller-related work:

2008 should be a bit more focused for me, as most of the first part of the year will be working on getting Traveller out the door, and then on a few big supporting things like SysV shared object support. I don't think it'll be any less interesting as a result :)

Thanks to everyone who has helped and guided me through the many many mistakes I've made, particularly the freaks in #aros. The major reason I'm still here and wanting to work is that it is fun, nothing more. Cheers lads :)

wednesday, 9 january 2008

posted at 09:17

With the help of WebKit developers I finally sorted out the crasher that plagued me over Christmas, and now I see WebKit making network requests, receiving data and calling into the graphics code to get it on screen. The next step is to begin implementing this graphics code.

As far as I can tell I need support for blitting objects (like images) to the screen, as well as drawing primitives: simple stuff like lines, circles and rectangles, and complicated things like Bézier curves and arbitrary paths. It needs to be able to apply a transformation matrix to both paths and images. It needs compositing support. It also needs to be able to operate on screens of arbitrary size and depth.

AROS (and the standard Amiga API) can't support this. Some of it exists, just not enough. graphics.library has basic drawing primitives but not advanced stuff like splines and such. Its primitives don't operate reliably on TrueColor screens, which is what pretty much everything is these days. CyberGraphics provides access to higher-depth modes, but only really for blitting. And we have no support for affine transforms, compositing, or other advanced features.
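For the curious, the affine transform support mentioned above is conceptually tiny: a 2x3 matrix applied to every point of a path (or every sample of an image). A sketch, with types and names of my own loosely following cairo's matrix convention:

```c
/* A 2x3 affine matrix: (x, y) -> (xx*x + xy*y + x0, yx*x + yy*y + y0).
 * Covers scale, rotation, shear and translation in one representation. */
struct matrix { double xx, yx, xy, yy, x0, y0; };

static void transform_point(const struct matrix *m, double *x, double *y)
{
    double nx = m->xx * *x + m->xy * *y + m->x0;
    double ny = m->yx * *x + m->yy * *y + m->y0;
    *x = nx;
    *y = ny;
}
```

The hard part is not the matrix, of course, but applying it efficiently to images (resampling, filtering) - which is exactly why porting a library that already does it well is attractive.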

To Commodore's credit, its pretty clear that they were moving in this direction. They had these concepts on the market in a time where they were barely even considered elsewhere. I'm quite sure that were they still around today we'd have these features available. Sadly, we don't, so we must find another way.

I've studied the problem in some depth, and I've decided to port the cairo graphics library to AROS. Their description sums it up well enough:

The cairo API provides operations similar to the drawing operators of PostScript and PDF. Operations in cairo include stroking and filling cubic Bézier splines, transforming and compositing translucent images, and antialiased text rendering. All drawing operations can be transformed by any affine transformation (scale, rotation, shear, etc.)

A port will be a good thing for everyone. WebKit already has a cairo graphics target, so I'd get my rendering for free. The library is extremely portable, with a great target abstraction. Indeed, I already have the thing compiling and the AROS backend stubbed.

More controversially, I think cairo could actually become the core 2D graphics system for AROS. graphics.library could be trivially implemented on top of it for compatibility, so there's nothing to worry about there. We'd implement a cairo backend that talks to a stripped-down version of our graphics HIDD layer (as much of their functionality would no longer be necessary). Once in place it would give easy support for eyecandy like real transparent windows or something like Exposé. Combine that with the plan to get 3D via Gallium, and AROS could become the shiniest thing out there.

My port will be a proper AROS-style shared library, cairo.library. Cairo's code is clean enough that I think I can do this without requiring the API to change and while still making it possible to contribute all the changes upstream without adversely affecting them.

Port repositories: cairo and pixman. These will be combined in the final library.

monday, 7 january 2008

posted at 15:41

Christmas and New Year festivities are over, and I enjoyed them thoroughly. I spent some awesome time with both sides of my family, played some cricket and soccer, played some Wii, ate way too much several times, and scored a nice pile of DVDs and t-shirts. In the long drives between various parties and dinners I've had a lot of time to ponder a WebKit problem, which I document here :)

WebCore has some functions that arrange for a simple timer to be implemented. It's very basic; there are three functions: one to set a function to call when the timer goes off, one to set an absolute time that the timer should go off, and one to disable the currently set timer. This simple interface is used by the higher-level Timer class, which can be instantiated multiple times. It handles coordinating the current timers and making sure the system timer is requested at the proper interval.

I did a first implementation of this using timer.device directly, but it really didn't feel right. The interface has no provisions for initialising or finalising the timer, so I hacked it such that the first call would open the timer device if it wasn't already open. I ignored the finalisation for the time being, and started looking at how to arrange triggering the timer.

We're back to the old problem that AROS basically does not have any provisions for signals/interrupts that preempt the running process in the process context (actually, task exceptions can, but they're too low-level for our purposes and don't work properly under AROS anyway). When timer.device fires, it pushes a message onto the IO request port, which either raises a signal (MP_SIGNAL port) or calls a function directly from the scheduler context (MP_SOFTINT port). There's also MP_CALL and MP_FASTCALL ports; these are the same as MP_SOFTINT for our purposes.

Having a soft interrupt that calls the timer callback doesn't work, as it would cause us to do large amounts of work inside the scheduler which is bad for system performance. Having a signal requires the main process to Wait() for that signal and then call the timer callback. The main loop is controlled by the application and by Zune, both things we have no control over.

I confirmed via #webkit that the timer callback is indeed supposed to be called from the UI main loop. Studying the MUI docs and the Zune code, it seems that it is possible to have the Zune main loop set up a timer and trigger the callback itself using MUIM_Application_AddInputHandler. This is perfect for our needs, as it removes any need for initialisation and finalisation in the shared timer code itself.

The only thing that has to be arranged then is for the shared code to get hold of the application object to set up the timer. The application object is created and controlled by the application, of course, but there is only ever supposed to be one of them per application, and I can't think of a good reason why there should ever be more than one. It's easy to get hold of this object from any Zune object inside the application, via the _app() macro, with the slight quirk that it's only available when the object is actually attached to the application object. We can detect that well enough though, and defer calls into WebKit until we're attached, so all that remains is to grab the application object, stow a pointer to it in a global variable, and then have the shared timer code use that variable.

This all took me a few hours to work out, and then I happily went off to do Christmas things. Over the next couple of days, the nagging seed of doubt that I had in the beginning grew into some kind of spooky piranha flower thing. This morning while hanging clothes out to dry I finally understood the issue. It's all to do with how global variables work, and it has much greater implications for this project than just getting hold of the Zune application object.

Let's think about what happens when you load a program into memory. Forgetting about the details of the loader doing relocations, setting up space for variables, etc, and the program startup making shared libraries available, effectively you just have the system allocating a chunk of memory, loading the program from disk into that memory, and then running the code within it. Space for global variables, and their initial values, is all held within that chunk of memory, and only the program code knows where they are and what they're for. Nothing else on the system can reasonably access them, so there's nothing to worry about.

A shared library is essentially the same as this, except that it is only ever loaded into memory once. When a second program requests it, the system checks if the library is already in memory, and if it is, arranges for the program to use it. This is where things can get complicated. The big chunk of memory contains some things that are sharable because they can be considered read-only - things like program code, const data, and so on. Regular global variables are not sharable, as you generally don't want changes made by one process to be seen by another.

In systems that have an MMU, the usual way this is dealt with is to make a copy of the global data somewhere else in memory, and then map it into the process address space at the appropriate location. That is, processes share the read-only parts of the shared library, but have their own copies of the writable areas. (In practice it's quite a bit more complicated, but this is the general idea.)

AROS, like AmigaOS before it, has all processes, libraries and everything else coexisting in the same memory space. Shared libraries pretty much don't use global data. There is no support for MMUs, so the kind of copying and remapping described above is impossible. If per-process data is required, then various techniques are employed explicitly by the shared library author - per-opener library bases, data access arbitration using semaphores, and so on. That works fine, because the author is fully aware of these limitations when he designs and implements the library.

It's worth noting that this problem is not unique to AROS; it affects every system where an MMU is not available. uClinux has had the same issue in the past and dealt with it in a couple of different ways.

Now let's look at what I'm trying to do. My goal is, and always has been, to make WebKit a shared library (actually a Zune custom class, though as far as the OS is concerned it's the same thing). WebKit and its dependencies all make use of global variables as necessary, and assume that their globals are isolated to a single process, which is a reasonable assumption given that basically every system WebKit currently runs on works this way. For AROS though, this is a huge problem.

The cheap way out is to just ignore the whole mess by producing a static libWebKit.a and requiring applications to link it statically. This is essentially what I'm doing now. It works well enough, but currently the (non-debug) library weighs in at a touch under 18MB, and that's with barely any AROS specifics implemented. For every WebKit-using application you have running, that's at least 18MB of duplicated code that you have to hold in memory. There are also all the usual issues with static linkage: greater disk usage, no ability to upgrade just the library and have all its users get the update, and so on.

The least favourable option would be to rewrite all the parts of WebKit and its dependencies that use global variables, and either find a way to remove them or otherwise move them into a per-process context. This is horrendously difficult to do and would pretty much remove any hope of contributing the code back to its upstream sources, which I consider an imperative for this project. So let's say no more about it.

The only other option is to add support to the OS to do the appropriate remapping stuff. This is no small undertaking either, but I think as time goes on it's a very good thing for us to have. I haven't investigated it in depth, but in addition to actually implementing the stuff in the loader, it's also necessary to make some changes to the way modules are held in memory and shared between users.

Currently a module can exist in memory and be used as-is by multiple users without too much effort. Because there's no global data, sharing a module is as simple as incrementing a use count, so that the module isn't purged from memory ahead of time.

When sharing an object with global data, in the absence of an MMU, it's necessary to allocate new global data for each opener and do its relocations each time. This requires keeping a record of the required relocations. There's also the issue of constructing the global offset table and the procedure linkage tables, and making sure the pointer to the GOT is carried around the application appropriately. Staf Verhaegen's current project on library bases and preserving the %ebx register will be useful here; of course, this will all have to integrate nicely with that work.

Then there's also the matter of detecting when to use all this new stuff over the standard loading and linking code. I think I can make that as simple as requiring all code to be shared in this way to be position-independent (i.e. compiled with -fPIC). Code compiled this way is incompatible with the standard load method anyway, and for this type of shared object it's far simpler to implement this whole mess if PIC is enabled. If it is, then detecting which type to use should be as simple as looking for the presence of the .got section in the object.
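As a sketch of that detection step, using the standard <elf.h> definitions (the function name and signature are mine; the real loader would work on its own internal structures):

```cpp
#include <elf.h>
#include <cstring>
#include <cstddef>

// Decide whether an object should go down the new PIC load path by looking
// for a .got section. 'shdrs' is the loaded section header table and
// 'shstrtab' the section name string table.
bool needsPicLoadPath(const Elf32_Shdr* shdrs, size_t count, const char* shstrtab)
{
    for (size_t i = 0; i < count; i++)
        if (std::strcmp(shstrtab + shdrs[i].sh_name, ".got") == 0)
            return true;
    return false;
}
```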

That's about as far as my thinking on the matter has come. The shared timer stuff that originally provoked all this is working happily, but if WebKit is ever to be a shared object on AROS, all this will need to be revisited. Because it's such a huge undertaking I'm going to leave it until after WebKit and Traveller are in some kind of usable state. At that time I'll look at handing off care of the web browser to someone else for a little while and work on this stuff instead.

tuesday, 1 january 2008

posted at 22:53

Hi. I have lots to tell you, but haven't had time to write it all down yet. But I wanted to share this, the very first web request ever done by WebKit on AROS:

GET / HTTP/1.1
Accept-Encoding: deflate, gzip
User-Agent: WebKit AROS
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

I'll post more details sometime in the next couple of days. Happy new year :)

wednesday, 19 december 2007

posted at 22:06

This week is insanely busy, as is typical of the week before Christmas, so I've had very little time to think about code in the last couple of days. I therefore opted for something that wouldn't steal too much of my brain, and began stubbing the Zune View class.

The interface will be typical Zune stuff. To get a web renderer into your app, you'll include a WebKitViewObject in your widget tree, and go from there.

The launcher is just a fairly standard Zune application setup. It will get a little more code before the end, mostly adding basic navigation buttons and location bar, but the basic structure won't change. This will serve as both a test program and an example of how to use WebKit in your own applications.

tuesday, 18 december 2007

posted at 11:17

Now that I've (apparently) fixed the loader, my mammoth WebKit test binary loads and runs, and so I've begun implementing the stub functions in earnest. To start, my method has been to run the program until it crashes, find out where the crash happened (usually a NULL pointer dereference), and then provide a basic implementation of the class that the offending pointer is supposed to be pointing to.

The current problem is a crash that occurs inside a regular method call, for no apparent reason. The offending method, in its entirety:

void DocumentLoader::setFrame(Frame* frame)
{
    if (m_frame == frame)
        return;
    ASSERT(frame && !m_frame);
    m_frame = frame;
    attachToFrame();
}

Good old printf() tracing shows that the crash occurs after m_frame = frame but before attachToFrame(). That is, that method is never called. This is highly unusual, and tedious to debug, because it means we have no choice but to drop down to assembly code, which I can muddle through well enough but can't really wrap my brain around.

Disassembling the last two lines of the method, we get this:

    mov    0x8(%ebp),%edx
    mov    0xc(%ebp),%eax
    mov    %eax,0xc(%edx)

    mov    0x8(%ebp),%eax
    mov    (%eax),%eax
    add    $0x8,%eax
    mov    (%eax),%eax
    sub    $0xc,%esp
    pushl  0x8(%ebp)
    call   *%eax
    add    $0x10,%esp

The pointer to the current object, this, is on the stack, 8 bytes in, as is the frame argument, 12 bytes in. So we see the value of this being fetched from the stack and stored in %edx, and then the same for frame, which is stored in %eax. Then the location 12 bytes into the object proper is computed (which is where m_frame is stored), and %eax (the location of the frame object) is stored in it. Thus, m_frame = frame.

The next chunk, predictably, is the call to attachToFrame(). The important thing about this method is that it's what C++ calls a virtual method. It wasn't until Friday that it was actually explained to me what that meant, and I found it hilarious. Consider:

    Object *o = new Object;
    o->method();

    o = new SubObject;
    o->method();

(where SubObject is a subclass of Object).

Now, if method() is a virtual function, this will do what you'd expect from most other OO languages: the first call will call Object::method(), the second SubObject::method(). If it's not virtual, then both calls will go to Object::method(), because the method is chosen based on the type of the pointer, not the type of the object itself.

I don't know if this was considered counterintuitive when it was first designed, but it's certainly not the way most OO languages work these days. Usually you have to be explicit when you want to call the superclass version.
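The difference is easy to demonstrate; a minimal sketch, where the methods return strings just to make the dispatch visible:

```cpp
#include <string>

struct Object {
    std::string plain()          { return "Object::plain"; }     // non-virtual
    virtual std::string method() { return "Object::method"; }    // virtual
    virtual ~Object() {}
};

struct SubObject : Object {
    std::string plain()  { return "SubObject::plain"; }
    std::string method() { return "SubObject::method"; }
};
```

Through an Object* that actually points at a SubObject, the non-virtual plain() resolves from the pointer's type to Object::plain, while the virtual method() resolves through the object's virtual table to SubObject::method.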

In any case, the code generated is different. In the simple non-virtual case, the call can be done via an absolute address, as the compiler can know exactly where the method() function is for the type. The virtual case is more complicated as the object itself needs to be interrogated to find out where its function is.

To do this, a table for each class that the object inherits from is placed inside the object, containing pointers to the functions that the object wants to use for its virtual methods. A virtual method call might then be rendered in C as something like:

    o->vtbl_Object->method(o);

That is, go through the table of implementations of methods defined in the Object class to find the method, and call it.

So, getting back to our disassembly. attachToFrame() is a virtual method. The code gets this from the stack, 8 bytes in, and puts it in %eax. Then it dereferences that to get the virtual table pointer, which lives at the start of the object. It then adds 8 to get the location of the attachToFrame() slot within the table, and dereferences that to get a pointer to the attachToFrame() function, which goes into %eax.

Then it does the usual function call setup, making room on the stack for the arguments and return address, and then calls the function at the location in %eax. It is here that the crash occurs, because %eax has 0 in it.

I was floored when I first saw this. I checked a number of times in different places, finally checking the constructor itself. And sure enough, the virtual table contains all zeroes. To me this smelt suspiciously like a relocation problem - if the ELF loader is not correctly doing the relocations for virtual tables, then they'll point to garbage memory, causing a crash.

I'm not entirely sure how this can be, and haven't figured it out yet. I need to check the place where the virtual table is normally initialised, but I don't know where that is! I can theorise by thinking about the structure of an object and the virtual table internally.

The first critical thing is that the virtual table inside the object is a pointer. That is, when the memory for the object is allocated, space is not allocated for the virtual table too. The pointer needs to be set to point to a valid virtual table. There are two ways this could be done: setting the pointer to some known static data that contains the data for this class, or allocating some more memory and copying the pointers from the same known static data.

The former seems the more likely to me. The extra allocation and copy seems unnecessary as the table for the object will not change during the lifetime of the object. There are separate tables for each class the object inherits from, so there's no need for a group of tables to be mixed into a single one.

So given that as a theory, we should be able to find some code somewhere around the constructor that sets up the virtual table pointer. It'll probably be the first thing after the memory allocation is done. This code might not exist in the binary itself though but may be part of a language support library (libgcc or similar). Regardless, the thing that will need to be there is the virtual table location.

I'm expecting to find that the location of the virtual table is not being relocated properly by the ELF loader. Basically, I trust GCC to produce correct code more than I trust our loader to do the right thing. The problem could also be in our linker, collect-aros, but it's so simple that I'm happy to rule it out initially.

Stuart, get back to work!

Update 3pm: Found it. I missed one section header table index conversion when I was updating the loader for large numbers of sections. Stupid, but it never hurts to exercise my brain on the really low level stuff.

thursday, 13 december 2007

posted at 22:04
  • mood: hobbitish

I just now have the extensions to the ELF loader implemented such that my gargantuan WebKit test binary loads. It took me a lot of reading and experimenting to figure out what was going on but I got it.

In my last post I talked about how files with large numbers of section headers store the count of headers elsewhere in the file. I'd taken care of that just fine. The other important thing that I missed is that every entry in the symbol table has a section header index that points to the section that the symbol is relative to. Of course this is a 16-bit field also, and has the same problem as the header count does.

The solution to this one is even more crazy. Basically there's an entire extra section in the file that is just an array of 32-bit values. If a symbol refers to a section with an index that is too large, you basically go fishing into that array to find the index instead. This of course means that I have to have that array loaded and available before I start doing symbol table work.
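The lookup amounts to something like this, using the standard <elf.h> names (the function shape is mine; the real loader code differs):

```cpp
#include <elf.h>
#include <cstdint>
#include <cstddef>

// Resolve the section index for symbol number 'symIndex'. If the 16-bit
// field holds the SHN_XINDEX escape value, the real 32-bit index comes
// from the parallel array (the extra section mentioned above) instead.
uint32_t symbolSectionIndex(const Elf32_Sym* sym, size_t symIndex,
                            const uint32_t* shndxTable)
{
    if (sym->st_shndx == SHN_XINDEX)
        return shndxTable[symIndex];
    return sym->st_shndx;
}
```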

Finally, something that confused me until I put together some nice macros to deal with it was that there's a "hole" in the range of possible section header index numbers. What used to be the top 256 values (0xff00 to 0xffff) are reserved as special control codes, markers and other such things. Now that the header number is fudged into 32 bits, we get the situation where the header at index 65279 (0xfeff) corresponds to section 65279, but the header at index 65280 actually corresponds to section 65536 (0x10000). So basically, anywhere that a section number is found in any of the ELF structures, it has to be massaged into a header array index number taking the hole into account. This caused no end of issues, particularly since my file has hundreds of effectively unused sections - it was hard to even see when it was going wrong!
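Under the scheme as described here, the massaging is small once it's isolated in one place; a sketch (the function and constant names are mine):

```cpp
#include <cstdint>

// Section numbers skip the reserved range 0xff00-0xffff, so every section
// from 0x10000 up sits 256 entries earlier in the header array.
uint32_t sectionToHeaderIndex(uint32_t section)
{
    const uint32_t loReserve = 0xff00;  // start of the reserved "hole"
    const uint32_t holeSize  = 0x100;   // 0xff00..0xffff inclusive
    return section < loReserve ? section : section - holeSize;
}
```

Wrapping this in a macro or helper and using it at every section-number site was what finally made the loader behave.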

So now ArosLauncher loads and runs and I get some debug output before it crashes:

(../../../WebCore/platform/aros/TemporaryLinkStubs.cpp:42 void WebCore::setFocusRingColorChangeFunction(void (*)()))
(../../../WebCore/platform/aros/SharedTimerAros.cpp:10 void WebCore::setSharedTimerFiredFunction(void (*)()))

Before I get back into WebKit though I need to clean up this code and commit it. I still need to merge the other two ELF loaders. As far as I can tell from a cursory glance, the elf64 version is basically the same but using 64-bit definitions, macros, etc as defined by the ELF specs. The other, elf_aros, I'm not entirely sure about, but it's certainly much simpler. It's possible it just hasn't been looked at for a long time (the changelog certainly appears to show that). I'll continue to try to figure out what it's for, but my feeling is that it can probably go, and elf64 and elf can be comfortably merged with a little restructuring of the code.

One thing that has become apparent is that our loader is incredibly slow and rather naive. As we start implementing more features (debug support, memory protection, shared objects) I don't think it's going to cope well with its current structure. And it's certainly got its work cut out for it - I've been reading Ulrich Drepper's paper "How To Write Shared Libraries", and it goes into a lot of detail about the kind of pain the dynamic linker has to go through to make things work. The glibc loader is something I'll have to study, I think.

wednesday, 12 december 2007

posted at 13:56
  • mood: elvish

I wrote a simple launcher for WebKit that creates a WebCore::Page, attaches it to a WebCore::Frame, then tries to load the Google homepage with it. Unsurprisingly, when I ran it it crashed, as most of my factory methods just return NULL. I fired up the debugger and figured out where the crash was coming from, and found it was in FrameLoaderClient::createDocumentLoader, one of my factory methods. Curiously, this function calls notImplemented(), and so should have printed something to the console. A little poking revealed that I had done a release build, not a debug build, so I recompiled with --debug.

The resulting binary was almost three times the size, up around 300MB, which makes sense because it's now carrying almost the entire source code for debugging as well. I had to start AROS with -m 512, to give it enough memory to actually be able to load the thing. I started AROS, opened a shell, started ArosLauncher, and then the amazing fireworks began.

On my debug console, I got a line of output:

[LoadSeg] Failed to load 'ArosLauncher'

That's a problem - LoadSeg() is the program loader/linker. More exciting, though, was line after line of pure binary appearing in my AROS shell. Do something like cat /bin/ls to see what I mean.

My first thought was that the awesome size of the binary was trampling something in memory, but a bit of poking around revealed the answer. When you type a command into the shell, it tries to load it as an executable file. If that fails, it checks if the file has the script flag enabled. If it does, it calls C:Execute with the file as an argument. Execute is the script runner, and it simply feeds the contents of the file into the shell's input buffer to be executed as though the commands were being typed.

Execute doesn't have any smarts to determine if what it's being passed is really a script; that would be a useful feature for it to have. The real issue though is that the ArosLauncher binary had the script flag set. I never set it, and it shouldn't have been.

Closer inspection revealed that the hosted filesystem driver, which maps Unix file permissions to AROS file permissions, was setting the script flag for every file without exception. That was perhaps a reasonable choice at the time it was written: as Unix does not have a script flag or anything similar, it wouldn't have been immediately obvious what to map it to, and it was never used in AROS anyway until recently (the shell gained support for testing for it and calling Execute a couple of weeks ago). Clearly though it's not right, so I had to do something. I modified the permissions mapping code in emul.handler to map the AROS script flag to the Unix "sticky" (t) permission bit. I also implemented FSA_SET_PROTECT at the same time, so now typing protect +s file in AROS achieves the same as chmod +t file in Unix, and vice-versa.
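The Unix side of that mapping is a couple of bit operations; a sketch (the function names are mine, S_ISVTX is the standard sticky bit):

```cpp
#include <sys/stat.h>

// AROS script flag <-> Unix sticky bit, as described above.
bool scriptFlagFromMode(mode_t mode)
{
    return (mode & S_ISVTX) != 0;
}

mode_t modeWithScriptFlag(mode_t mode, bool script)
{
    return script ? (mode | S_ISVTX)
                  : (mode & ~static_cast<mode_t>(S_ISVTX));
}
```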

So with that fix in hand, ArosLauncher was rerun and a far simpler error was returned:

ArosLauncher: file is not executable

So the next step was to dig into LoadSeg() and find out why it couldn't load the file.

A tiny bit of background: any program, library or other "executable" thing under AROS (and most Unix systems) is stored in a format called ELF. An ELF file is split into a number of "sections", each holding one kind of information - program code, data, symbol names, debugging info; there are lots of different types. It's up to the OS loader/linker to pull all these together into a runnable program.

So, with the ELF specs in hand I started stepping through the loader code, and quickly found the problem. When you compile something with debugging information, it adds many extra sections to the binary object, containing what amounts to the entire source code for the program, so the debugger can give you the proper context and so on. Because it includes all of WebKit, ICU, cURL, libxml and SQLite, it has a lot of sections. Somewhere in the order of 75000 in fact.

The field in the ELF header that stores the count of sections is a 16-bit field, which means it can count up to ~65000. Clearly there are too many sections in the file to fit. In this case, the number of headers is marked as 0, and the loader should try to load the first header. In there is the real count, in a 32-bit field that normally is used for something else (the header size) but is borrowed just for this special case.
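In code, the special case is small once you know it's there; a sketch using the standard <elf.h> structures (the real loader reads these from the file first):

```cpp
#include <elf.h>
#include <cstdint>

// If e_shnum overflowed, it is stored as 0 and the real count is borrowed
// from the sh_size field of section header 0.
uint32_t realSectionCount(const Elf32_Ehdr& ehdr, const Elf32_Shdr& shdr0)
{
    return ehdr.e_shnum != 0 ? ehdr.e_shnum
                             : static_cast<uint32_t>(shdr0.sh_size);
}
```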

So I implemented this, and it works - it finds the headers correctly and does the relocations as it should. It's still not at the point where it will run ArosLauncher. It would appear that there's a symbol type that the AROS loader doesn't know about and is interpreting as being invalid, rather than handling/ignoring it. I'm not sure what's appropriate yet; I'll take more of a look on my bus ride home today.

More todo items: There are three ELF loaders in AROS currently, elf, elf64 and elf_aros. elf is the main one that I'm working on, elf64 is a copy of it taken recently with support for 64-bit symbols, and elf_aros is an old one that I have no idea what it's for or where it came from. I have no desire to make my modifications in three files, particularly when I have no 64-bit system to test on, so I'm going to look at trying to merge these three files back together.

monday, 10 december 2007

posted at 14:46

Just a followup about the whole _GLOBAL_OFFSET_TABLE_ thing. Apparently this symbol is provided by the linker, so it makes sense that it doesn't work, since the AROS linker is actually a custom ld script known as collect-aros which doesn't handle position-independent code at all.

If we were to ever have real ELF-style shared libraries, this is one thing we'd need to implement. The other thing we'd need is a whole load of stuff in LoadSeg(), which is our "runtime linker".

Nothing to see here, just some notes for posterity.

sunday, 9 december 2007

posted at 11:27
  • mood: sucks
  • music: machinea supremacy - sidology 2

I'm a little stuck. Last night I wrote a trivial startup program to make sure linking and calling into WebKit was working correctly:

#include "config.h"
#include "Logging.h"

extern "C" {

void webkit_init(void) {
}

}

int main(void) {
    return 0;
}
It compiled fine, but the link failed:

There are undefined symbols in 'ArosLauncher':

All the bits of information I need to resolve this are scattered around (if they exist at all), but what I've learnt is this. WebKit is compiled with -fPIC, which produces position-independent code. This is what you want when producing an ELF shared library. Essentially what it does is set up an offset table to hold the locations of all the global symbols in the library, and causes the generated code to access those symbols through the table instead of going direct. Later, when the library starts, the runtime linker fills in this table with the correct locations of all the symbols. This allows the OS to place the library anywhere in memory it wants, rather than at the location the library was compiled for initially. This is all great stuff that doesn't make the slightest impact on AROS, as our shared libraries don't work this way. Well, they do conceptually, but that's a topic for another time.

I'm compiling all this code into a static library, but because it was compiled with -fPIC it has lots of references to _GLOBAL_OFFSET_TABLE_. Here's where I'm unsure of what's happening. Either GCC is not setting up the offset table because our patches to make it work on AROS don't enable it (reasonable, since we don't support ELF shared libraries), or it's just assumed that if you're linking statically you won't need the offset table and are expected to compile without -fPIC. I spent a lot of time last night believing the former, but after being completely unable to find anything in the GCC code that supports this, I'm really starting to lean towards the latter.

Which brings us to the next problem. Currently AROS WebKit is built using qmake, the build system for Qt. I chose this because it was the easiest way to get a cross-build running at a time when I had no real idea what I was doing. It would seem that it's currently set up to build a shared library, which I'm hacking around at the last stage to make it output a static library. I haven't found an obvious way to disable -fPIC yet.

This highlights the next issue. qmake is not going to cut it going forward. Actually, none of the existing WebKit build systems are really suited to cross-building - it's all hacks so far. Before long it's going to need a real build system. I'd like to use plain GNU make so that there won't be an issue with compiling the mess on AROS, but there's still going to have to be some stuff copied from the AROS build system to support setting up the Zune stubs, for example. That suggests just using mmake directly, except that I have my reservations about its long-term suitability for anything. The build system is not something I want to debate here; I've said my piece about it elsewhere and I'm deliberately not discussing it until I have time to do my own experiments.

So here I am at a bunch of apparent dead ends. I'm going to spend a little more time right now trying to bend qmake to my will, but this whole mess is rapidly getting out of hand. I believe a sigh is the appropriate action at this point.


Update 12:53: Figured out how to turn -fPIC off, and I now get why it wasn't working. I now see logging output on the console, awesome! A better build system is still required.

saturday, 8 december 2007

posted at 18:25

Today marks a major milestone for the WebKit port. It compiles!

-rw-r--r-- 1 rob rob 24782208 2007-12-08 18:25 libWebKit.a

It doesn't do anything yet, but it compiles. I have 298 stub methods across 41 AROS-specific files. Each one calls the notImplemented() macro, which simply prints the name of the method that was called.
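A sketch of the shape of such a macro (WebKit's real notImplemented() differs in detail; the format here just matches the debug lines quoted elsewhere in this log, and __PRETTY_FUNCTION__ is a GCC extension):

```cpp
#include <cstdio>
#include <string>

// Build the "(file:line function)" message.
inline std::string notImplementedMessage(const char* file, int line,
                                         const char* function)
{
    char buf[512];
    std::snprintf(buf, sizeof buf, "(%s:%d %s)", file, line, function);
    return buf;
}

// The macro captures the call site automatically.
#define notImplemented() \
    std::fprintf(stderr, "%s\n", \
        notImplementedMessage(__FILE__, __LINE__, __PRETTY_FUNCTION__).c_str())
```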

The plan of attack from here is to write a tiny main() that tries to create a frame object and hand control to it. That should yield several million lines of output from notImplemented(). I'll then implement those methods, one at a time, until I get something on screen.

Once I get a decent way into that process I should start to gain some understanding of how WebKit is actually bolted together. Once I have that I can start to think about the design of the Zune API.

The real fun starts now. I'm looking forward to writing some actual code rather than just stubbing functions :)

saturday, 1 december 2007

posted at 09:04

This week I've been working on another WebCore dependency, though a little different to the previous ones. To work well, it seems that WebCore needs a threading system. AROS doesn't have one. I think WebCore can work with just its stubs, but I don't want to. I want this done properly.

I started looking at how I might implement threads, and it seemed that the interface was general enough that it could be useful as a shared library for other things besides WebCore. And so thread.library was born.

It's almost ready. Threads work, mutexes work, conditions work. The only thing I'm still dealing with is what to do when the main task exits while threads are still running. There's a bunch of bad things that can happen, which I don't have time to go into right now, but the best thing I can do is simply detach the threads and allow them to continue running. See here for more details, though the description is out of date - the code is now doing option 1, and the ThreadBase issues have been dealt with. The last thing to take care of is a small memory allocation issue that is causing a crash, but once that's done I'll check it in to AROS Subversion for anyone to use.

Update 2007-12-03: Code is now in AROS SVN, and will be available for use in tonight's nightly and SDK builds. Be sure to read the README before starting.

sunday, 25 november 2007

posted at 20:50

Late Friday I reached a minor milestone when I got the platform-independent part of WebCore fully compiling and linking. Next up, the tricky bit: the platform-dependent stuff, otherwise known as the actual port.

I spent a couple of hours staring at various WebCore classes trying to make sense of them, and eventually I started to get a feel for the structure, though I'm a long way off really understanding it. Basically, WebCore has classes for common GUI elements, like fonts, menus, and so on. To do a port, you have to reimplement these classes to wrap the same functionality in whatever graphics system you happen to be targeting. It was around this point I realised that I know basically nothing about the AROS GUI toolkit, known as Zune.

I had a look around for examples and documentation, and I started to see what was going on, but a lot of the code is a mess and it's hard to build a clear picture in my head of what's happening. The only option left to me is to write a small application using some of the Zune features I'll need, to get an idea of what makes it tick.

I thought about it a bit on Saturday, and today spent a couple of hours implementing this little app that I call fonty:

It's a font preview program. You give it a font name and point size, and it'll render some text in that font. We already have stuff like it, so it's not particularly useful, but so far I've learnt about the basic structure of a Zune application, how to make a Zune custom widget class (I have a separate FontPreview custom class), and how the Amiga font system works. It'll soon have a context menu that allows selecting different styles and changing the text. Again, not really great in terms of usability, but it lets me see how everything works. And kinda fun to write too :)

wednesday, 21 november 2007

posted at 06:35

Michal writes about his continuing pain with the Amiga LONG/ULONG types on 64-bit AROS. Some guidelines for types:

  • If you're writing new code, just use the normal C types, and if you need types of a specific width, look to C99 uint32_t, etc. On AROS, LONG is always 32 bits, even on 64 bit systems. The C type long, however, can be 32 or 64 bits. Don't assume they mean the same thing.
  • Don't use ULONG, BYTE, IPTR, etc except when calling a system API that uses them, and then take care to make sure your type conversion is spot on.
  • The possible exception to this is BOOL, but only ever assign TRUE or FALSE to it, and never explicitly test its value; that is, use if (flag), not if (flag == TRUE).
  • Don't store pointers in non-pointer types. If you want a generic pointer, use void *. If you need to convert between an integral type and a pointer, use intptr_t/uintptr_t.
  • Don't do clever bit things with bit fields, like Michal describes for FHF_WRITE. Just say what you mean.
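A tiny (hypothetical) example in the spirit of these guidelines, using only C99 fixed-width types and proper pointer conversions - no LONG, ULONG or IPTR in sight:

```c
#include <stdint.h>
#include <stddef.h>

/* uintptr_t is the right type for integer/pointer conversions */
static int is_aligned(void *p, size_t boundary)
{
    uintptr_t addr = (uintptr_t) p;   /* not a LONG, not an IPTR */
    return (addr % boundary) == 0;
}

/* uint32_t is 32 bits everywhere; plain long might not be */
static uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t) r << 24) | ((uint32_t) g << 16)
         | ((uint32_t) b << 8) | a;
}
```

The names here are made up for illustration; the point is just that the widths are explicit and the conversions say what they mean.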

This community service announcement brought to you by the variables i, tmp and foo, in the hopes that it helps the general health and wellbeing of people like Michal who have to decipher your bad code and mine by themselves, years after it was written :)

tuesday, 20 november 2007

posted at 13:45

I finally finished my dependency porting stuff, with libxml2 coming to life late last night. I haven't tested it properly yet, as its test suite requires glob(), which we of course don't have. I'll look at integrating a version of it soon so that I can run the tests. For the moment I'm totally over dependency porting, and eager to get onto WebKit proper.

Before bed I wrote the first line of AROS-specific code in WebCore. Ready? Here it is:

typedef struct BitMap *DragImageRef;

I have no idea what it does yet, but it was enough to get the relevant file (WebCore/platform/DragImage.h) to compile, and that's all I care about right now.

The build is going well. So far I'm just stubbing platform-specific classes to get the thing to compile. Once it's compiled, I'll start implementing those stubs.

One thing that was missing that would have been difficult to work around inside WebCore itself was the reentrant time function localtime_r(). A bit of hacking on the bus this morning and AROS now has this function, along with its friends ctime_r(), asctime_r() and gmtime_r(). Phew.
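The reentrant variants all follow the same contract: the caller supplies the result buffer, so no shared static storage is involved in the interface. A minimal sketch (my_localtime_r is a hypothetical name; a real implementation computes the broken-down time directly rather than going through localtime()):

```c
#include <time.h>
#include <string.h>
#include <stddef.h>

/* sketch only: copying localtime()'s result shows the contract, but
 * it is only reentrant from the caller's side -- localtime() itself
 * still uses a shared static buffer internally */
struct tm *my_localtime_r(const time_t *timep, struct tm *result)
{
    struct tm *tmp = localtime(timep);
    if (tmp == NULL)
        return NULL;
    memcpy(result, tmp, sizeof(struct tm));
    return result;
}
```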

Tonight's work is adding stubs for PlatformMenuItemDescription, whatever that is :)

sunday, 18 november 2007

posted at 21:16

Today I finished porting cURL, a library for getting things from the internets (or actually, anything with a URL). It's probably the dirtiest port I've done so far, partly because the configure script is a mess (it knows enough to know that I'm cross-compiling, but then doesn't know enough about cross-compiling to do anything other than get in my way), and partly because of the bsdsocket.library madness (which, if you've been in #aros at all in the last couple of days, you'll have heard my opinions on).

Obligatory screenshot:

Code at /git/aros/curl.git.

In other news, I committed my mlib patches this morning after a little testing and tweaking by Markus Weiss to get them working on PPC. I'm quite proud of it - it was a big, unknown thing and it came off without a hitch.

So now, armed with the 20071118 nightly (available in just a few short hours), it is (theoretically) possible to build all of the Traveller stuff done so far. If only you all had some build instructions .. :P

friday, 16 november 2007

posted at 21:53

Quick one before bed: a port of OpenSSL, which is needed for cURL, which is needed for WebCore.

It was actually a pretty easy port to make. OpenSSL is ported to so many platforms already that it was pretty much just a case of copying stuff from similar platforms. Amusingly, the platform most similar to AROS as far as OpenSSL is concerned is Netware :)

Code available at /git/aros/openssl.git.

thursday, 15 november 2007

posted at 14:09

The results are in. The browser will be called "Traveller" (that's British spelling, with two ells). I had already thought of this as a potential name before asking for ideas, and when a couple of people suggested it too I knew it was good.

The reasons I like it are threefold:

  • It's a good companion for Wanderer.
  • It carries on the tradition of giving browsers a name related to exploring the unknown: Navigator, Explorer, Konqueror, Safari, etc.
  • It references an in-joke among the members of my team at work, so it's just a little bit personal too.

So thanks everyone for your input. I enjoyed hearing all your ideas :)

Relatedly, Paul J. Beel asked me a bunch of questions about the project and has just posted my answers over at the AROS Show. That should pretty much cover what exactly it is I'm doing and what you can expect.

I'm looking for someone who is savvy with graphics to produce some art for the browser - icons, throbber, about screen, etc. I have ideas, but need someone who knows how to produce art and animations to give me a hand. Contact me via email ( or grab me on IRC (fce2 on

Now that all the excitement and administrivia is out of the way, time to do some actual hacking.

wednesday, 14 november 2007

posted at 16:02
  • mood: coy

Just a few quick updates.

First, thanks all for your name suggestions. I hated some of them, I loved some of them, and I've finally decided on the name. It's one that I had thought of beforehand, but a couple of people suggested it here too. I'm not revealing it yet though; Paul J. Beel of The AROS Show has sent me some questions for an interview and I've promised that I'll reveal the name there. I'll post it here shortly after, but you all read his stuff anyway, so you won't need it :)

I've started porting the WebCore dependencies. First up is the easy one, SQLite, which I finished porting this morning. It's a horrendous port, with no file locking and hacked-up path handling, but it was the cheapest and fastest I could do, and it will suffice for what I want. I don't want to get bogged down on tangents; WebCore itself is going to take enough time and brain to do without being distracted.

I'm now publishing my work as I go. The repositories for WebKit and its dependencies will appear at Feel free to clone from them and do whatever you want with the code. I'll post some build instructions soon; it's quite hairy. I've also put my AROS repository up, which is where I'll publish stuff that hasn't made it to AROS SVN yet (usually because it's unfinished and/or broken).

That's all for now. Heading home :)

sunday, 11 november 2007

posted at 21:01

Ever wanted to name a web browser? Here's your chance. I need a name now so that I have a way to refer to the whole project, rather than "WebKit port" (accurate until I start work on the chrome) or "browser bounty" (duh).

I have a couple of ideas, but feel free to post a comment with names of your own devising. I'll choose the one I like the most, or if they all suck, I'll choose one of my own. It's not a democracy, you know ;)

Update: Name has been chosen. Thanks all for your suggestions :)

saturday, 10 november 2007

posted at 10:07

It's been a big week of AROS coding, with a milestone reached last night: JavaScriptCore, the JavaScript engine inside WebKit, is now compiling and running on AROS. I'm now satisfied that a full port of WebKit to AROS is feasible, and so I've taken on the bounty to produce a browser.

My process for building WebKit has been simple. I made minor changes to their build system to use AROS cross-compilers, and then let it build until it breaks. Then I go in, figure out what died, and fix it. Often this is easy, requiring only some platform defines and such. Sometimes it's been a little harder, which is where posix_memalign() came from. The really fun thing happened at the start of the week, when the build failed because a couple of math library functions were missing.

Our math library (known as mlib or arosm, depending on where you look, though every other platform calls it libm, go figure) was originally taken from a math library written at Sun way back in 1993, and released for free. We got our copy from FreeBSD in 1999, and it was updated again in 2003. It's missing a lot of stuff though, notably things from C99.

I had a look through the FreeBSD code and found the functions I needed, but on noticing just how much stuff was missing I decided it might be better to do a full refresh of libm. As is usual when I start on something, it rapidly got out of hand.

I had to make a few changes to our core headers to provide all the necessary defines and types and such to make it work. The new code also has an amount of architecture-specific code for using the FPU. Fortunately FreeBSD supports all of the architectures that we have active ports for (i386, x86_64 and ppc), so it was just a matter of getting the right code into place.

In any case, lots of tweaking and merging has been going on, such that I now have about 20,000 lines of changes spread over 21 patches. I haven't committed them yet as I'm waiting on some build macros from Staf to allow me to build the architecture-specific files into the library correctly. My hacked version seems to work well, and passes a couple of tests from Fred Tydeman's C99 FPCE test suite. I'll run all the tests soon, but I expect them to pass without issue.

Once the patches compile cleanly, I'll try to get some other AROS devs to review them, as they're big and I'm scared. Once it's all deemed good, they'll go in, and we'll be doing fancy math forever. Hurrah!

Anyway, after shoring up the holes in AROS, it was back to JSCore. The code is exceptionally well written, and easy to port. Apart from adding #ifdefs here and there, the only actual code I had to write was stuff to help the garbage collector find the stack base, and that's two lines in kjs/collector.cpp:

    /* tc_SPReg holds this task's stack pointer, saved by Exec, which
     * gives the collector a base to scan from */
    struct Task *task = FindTask(NULL);
    return task->tc_SPReg;

The JavaScript engine test program testkjs runs properly. The only issue is that the garbage collector is not fully cleaning up all the objects at script exit, which I think may be a memory management issue. I haven't fully tracked it down, but the folks in #webkit (particularly bdash) have been very helpful and I'm expecting to have it sorted out soon.

So that's my progress so far. My plan for the browser is to implement it in two stages. The first is the port of WebKit proper, which means porting JavaScriptCore and WebCore, writing a trivial launcher application, porting the libraries they depend on, and otherwise fixing things in AROS. Once that's done, the second stage begins, which involves integrating WebKit into AROS proper. I haven't thought this through fully yet, but I expect at this point that I'll be writing a Zune widget to allow applications to embed WebKit, and from there writing a Zune application to be the browser proper.

I'll be making my git repositories available shortly, so the brave can track my progress. And you'd better believe that only the brave need apply - you need to be willing to track AROS and WebKit SVN repositories and regularly recompile AROS, gcc and WebKit. Oh, and there's a 20-step build process for ICU as well, one of the WebKit prerequisites. It's early though; all this will be made easier once I'm finished, so other people can hack on this too.

saturday, 3 november 2007

posted at 12:40

I just finished implementing posix_memalign(). It will help with JavaScriptCore porting, as its allocator/garbage collector wants to do lots of memory tricks, including unusual alignments. I'll write more about my WebKit porting progress later.

I love doing pointer arithmetic. It spices up C so that I feel like I'm writing Perl one-liners:

int posix_memalign (int **memptr, size_t alignment, size_t size) {
    UBYTE *mem = NULL, *orig;

    /* check the alignment is valid */
    if (alignment % sizeof(void *) != 0 || !powerof2(alignment))
        return EINVAL;

    /* allocate enough space to satisfy the alignment and save some info */
    mem = AllocPooled(__startup_mempool, size + alignment + AROS_ALIGN(sizeof(size_t)) + AROS_ALIGN(sizeof(void *)));
    if (mem == NULL)
        return ENOMEM;

    /* store the size for free(). it will add sizeof(size_t) itself */
    *((size_t *) mem) = size + alignment + AROS_ALIGN(sizeof(void *));
    mem += AROS_ALIGN(sizeof(size_t));

    /* if it's already aligned correctly, then we just use it as-is */
    if (((IPTR) mem & (alignment-1)) == 0) {
        *memptr = mem;
        return 0;
    }

    orig = mem;

    /* move forward to an even alignment boundary */
    mem = (UBYTE *) (((IPTR) mem + alignment - 1) & -alignment);

    /* store a magic number in the place that free() will look for the
     * allocation size, so it can handle this specially */
    ((size_t *) mem)[-1] = MEMALIGN_MAGIC;

    /* then store the original pointer before it, for free() to find */
    ((void **) &(((size_t *) mem)[-1]))[-1] = orig;

    *memptr = mem;
    return 0;
}
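Calling it follows the standard POSIX contract: the alignment must be a power of two and a multiple of sizeof(void *), the returned pointer sits on the requested boundary, and it is released with plain free(). A quick check (hypothetical helper name, using the standard interface):

```c
#include <stdlib.h>
#include <stdint.h>
#include <stddef.h>

/* allocate size bytes aligned to `alignment`, or NULL on failure */
static void *aligned_alloc_checked(size_t alignment, size_t size)
{
    void *p = NULL;
    if (posix_memalign(&p, alignment, size) != 0)
        return NULL;
    return p;  /* ((uintptr_t) p % alignment) is guaranteed to be 0 */
}
```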

wednesday, 31 october 2007

posted at 14:10

I've been working on a few small things over the last week or two, trying to tie up some loose ends.

First, I fixed the long-standing file notification bugs that, among other things, have caused the preferences apps to not work correctly. Back in May I was doing lots of work on DOS, and I fixed the file notification calls so that they followed the same semantics as they did in AmigaOS. I made a mistake though: I didn't fully implement volume name expansion, such that if you requested a notification on a file in the root of a filesystem (eg RAM:foo), it would be taken as-is rather than having the volume name expanded (eg Ram Disk:foo). This caused ram.handler to set up the notification on a different file from the one all other DOS calls (which did expand the volume name properly) would use. As a result, no notifications were ever sent. This didn't come up for fat.handler, which I was also working on at the time, as it does its own name expansion internally. This is all fixed in Subversion revision 27105, and in nightlies 2007-10-28 and later.

Next, I got AROS compiling under Ubuntu. Recently GCC has included a nice feature called "stack-smashing protection". When enabled, guard values ("walls") are placed on the stack when a function is called, and a check is made that the walls are intact before the function returns. If they're not, an OS-provided function is called to take action, which usually involves killing the offending process. Stack smashing is a common source of security flaws (the classic buffer overflow, for example), so this is a good thing.
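Conceptually the check looks like this (a hand-rolled illustration only; the real canary and check are emitted by the compiler, and the guard value here is arbitrary):

```c
#include <string.h>
#include <stdint.h>
#include <stddef.h>

#define CANARY 0xDEADBEEFu   /* arbitrary guard value for illustration */

/* a guard value sits next to the buffer, and is checked after the
 * buffer has been used; returns 0 if the "wall" was damaged */
static int copy_with_guard(const char *src, size_t n)
{
    struct {
        char     buf[8];
        uint32_t canary;     /* the wall after the buffer */
    } frame;

    frame.canary = CANARY;
    if (n > sizeof frame)    /* keep the demo itself well-defined */
        return -1;
    memcpy(&frame, src, n);  /* n > 8 overruns buf into the canary */

    return frame.canary == CANARY;
}
```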

AROS doesn't have support for the feature, so the compile will fail if this option is enabled. GCC on Ubuntu enables it by default, so the build would always fail there. I've checked in configure changes that detect whether the compiler supports the option and disable it. It took me three tries, as there are some complications - the option doesn't exist in GCC 3.x, and we also use the host compiler to fake a cross-compiler, so the option has to be disabled via the specs file as well. To get the kinks out I installed Ubuntu into a virtual machine and messed with the config files and rebuilt over and over until the entire tree built. This is available in r27116, nightlies 2007-10-30 and later.

Then, yesterday, I stubbed a few missing clib functions needed by ICU (the major prerequisite for WebKit). Of course they don't do anything, but at least ICU can now link correctly. I suppose at some point I'm going to need to fully implement them, but I really don't want to muck about with character conversion functions just now. Functions mbtowc(), wctomb(), mbstowcs() and wcstombs() are now available in r27120, nightlies 2007-10-30 and later.

And finally, last night I added definitions to our headers to define sig_atomic_t. We don't have working POSIX signals yet, so it's kinda pointless, but with the definitions in place libstdc++3 (and thus g++) can now be compiled without needing to hack the AROS headers half way through the build process. Available in r27121, nightlies 2007-10-30 and later.
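For context, sig_atomic_t is the one integer type C guarantees can be read and written atomically with respect to signal handlers, which is why libstdc++ expects the headers to define it. The classic usage pattern (signal_roundtrip is just a demo wrapper):

```c
#include <signal.h>

/* the one portable thing a handler should do: set a flag of type
 * volatile sig_atomic_t and get out */
static volatile sig_atomic_t got_signal = 0;

static void handler(int sig)
{
    (void) sig;
    got_signal = 1;
}

/* install the handler, send ourselves a signal, report the flag */
static int signal_roundtrip(void)
{
    signal(SIGUSR1, handler);
    raise(SIGUSR1);
    return (int) got_signal;
}
```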

So that's my brain cleared of the odd jobs, and I can now concentrate properly on WebKit. The next step, which I probably won't really get a chance to start on until Friday night, will be to get JavaScriptCore compiling. Whee!

sunday, 28 october 2007

posted at 19:20

Ahh software licensing, a topic I try really hard to avoid. mausle raised a concern with my previous post about ripping code from the Linux kernel for use with AROS:

... you can't just rip code under GPL license out of the linux kernel tree and link it with AROS code. I know there might be a grey zone with modules already loaded by grub.

It's a comment that needs addressing in more depth than a simple reply will allow, so here we go: my position on GPL code in AROS.

Obvious disclaimer: I'm not a lawyer, and have no access to one. I can make some guesses based on my own reading of licenses and precedent in other software projects. Of course, if the FSF or whoever want to tell me I'm wrong, I'm happy to listen.

First, let's deal with an obvious case. Distribution of GPL'd source code alongside other source code with incompatible licenses is no problem. The GPL only has issues with otherly-licensed software when it comes to linking with them and distributing the result.

And now the gritty bit. The way the GPL places requirements on other software is entirely based on the mechanism by which the pieces of software are combined and interact. The GPL itself is vague on this, but the FSF have a FAQ item in which they acknowledge this grey area and provide their perspective on it:

What constitutes combining two parts into one program? This is a legal question, which ultimately judges will decide. We believe that a proper criterion depends both on the mechanism of communication (exec, pipes, rpc, function calls within a shared address space, etc.) and the semantics of the communication (what kinds of information are interchanged).

If the modules are included in the same executable file, they are definitely combined in one program. If modules are designed to run linked together in a shared address space, that almost surely means combining them into one program.

By contrast, pipes, sockets and command-line arguments are communication mechanisms normally used between two separate programs. So when they are used for communication, the modules normally are separate programs. But if the semantics of the communication are intimate enough, exchanging complex internal data structures, that too could be a basis to consider the two parts as combined into a larger program.

In many ways this description is tied to Unix-like systems, which makes sense, as that is the context in which the GPL was originally developed. It's reasonable that these guidelines are not in the license itself, but since these aspects have never been tested in court, all we have to work with is what the original authors were thinking.

In another FAQ item on plugins, we get more relevant detail:

It depends on how the program invokes its plug-ins. If the program uses fork and exec to invoke plug-ins, then the plug-ins are separate programs, so the license for the main program makes no requirements for them.

If the program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single program, which must be treated as an extension of both the main program and the plug-ins. This means the plug-ins must be released under the GPL or a GPL-compatible free software license, and that the terms of the GPL must be followed when those plug-ins are distributed.

If the program dynamically links plug-ins, but the communication between them is limited to invoking the `main' function of the plug-in with some options and waiting for it to return, that is a borderline case.

So now we need to look at how software running on AROS is combined.

Let's break these descriptions down from an AROS point of view:

  • AROS modules are either executable programs, communicating via message ports, or libraries of code which are accessed via a vector table.
  • Dynamic linking usually means loading two or more modules into memory and then updating pointers within them to arrange for them to directly call into and access each other. AROS modules do not do this; library calls are instead made via function pointers held in a vector table attached to the library, which is fetched via a call to OpenLibrary().
  • If modules (usually executable programs) communicate via message ports, then those message ports are accessed from known locations, and data is sent in a format defined in the OS headers.
  • For all practical purposes, AROS modules exist within the same address space - we have no memory protection. Messages passed between ports are merely pointers to some area of memory.
  • Packet-style filesystems are executable programs, invoked via the equivalent of fork-and-exec.
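The vector-table mechanism is easy to sketch in plain C (hypothetical names; the real thing goes through OpenLibrary() and per-platform stubs). The point is that the caller never links against the library's code, it just calls through function pointers fetched at run time:

```c
#include <stddef.h>

/* stand-in for an AROS library base: a table of function pointers
 * that callers fetch at run time, rather than symbols resolved by a
 * dynamic linker */
struct Library {
    int (*add)(int, int);
    int (*mul)(int, int);
};

static int add_impl(int a, int b) { return a + b; }
static int mul_impl(int a, int b) { return a * b; }

/* stand-in for OpenLibrary("math.library", 0) */
static struct Library *open_math_library(void)
{
    static struct Library base = { add_impl, mul_impl };
    return &base;
}
```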

From this it seems to me that AROS modules that exist standalone (ie loaded from disk) aren't really linked with the OS core itself, and thus can't be considered as part of the same program.

There's a small spanner in the works though when it comes to filesystems: how the filesystem driver is loaded if no filesystem is up and running yet. I think these cases can be argued around with only one grey area. Let's use a hypothetical GPL ext2 driver as an example.

In the simplest case (license-wise), the AROS boot partition is AFS or something else that is included in the boot image, and you want to have a separate ext2 partition mounted. The driver would simply be loaded from the AFS partition (say DEVS:ext.handler) and the filesystem mounted that way.

If you wanted to boot from an ext2 partition, then we have two options: including the driver in the boot image, or including it in a separate module set to be loaded by GRUB.

Background, for the uninitiated: GRUB has the ability to load one or more files from any filesystem that it has a driver for, and place them in memory somewhere for the kernel to find when it starts. AROS on x86_64 uses this already, with the "core" boot image only including Exec, DOS and a few other absolute necessities, and the rest coming from the module set.

These modules are loaded from disk and aren't included in the main boot image. The kernel has to go to a little extra effort to find and use them, so they clearly aren't a dynamic link. I'd argue that there's no grey area: the modules are standalone programs.

As for including the filesystem in the core boot image, that I'll concede as a grey area. However, one could argue that since there's still no linking between the modules included (the kernel searches through memory for the ROMTag structures at the start of each module), the boot image is actually some kind of archive format (perhaps a self-extracting archive) and therefore still not linked. I think I'd lose that argument, and I won't contest it, but I still find it to be an interesting angle. The point is moot however: as time goes on we'll find x86, hosted and every other platform port of AROS moving to the x86_64 model, with a minimal loader and startup image and the rest of the modules (including filesystems) pulled in from disk, whether that be by GRUB or some other mechanism.

So hopefully I've made some kind of argument there. No doubt plenty of people will disagree with me. Good - this is the time to sort this out before I start writing code :P

thursday, 25 october 2007

posted at 10:42

I've been thinking and poking at AROS a lot this week, so I have heaps to write about but haven't had time. I'll try to find more time today to get it all down. Here's the first installment.

Recently AROS has obtained a new installer, but its still hamstrung by the fact that we don't really have a proper filesystem to install AROS onto. We currently have three filesystem options:

  • AFS: This is our current default filesystem. It's an implementation of the Amiga Fast File System. Our implementation is buggy and frequently trashes the disk. Even if it did work well, it would still suck, as it just wasn't designed for the massive disks we have these days.
  • SFS: The so-called Smart File System. We have a port of the open-source release that was made a few years ago. It works well enough, and actually performs and scales nicely if you can get it set up. It still has a few bugs, isn't really maintained, and most significantly, GRUB has no support for it, so we can't currently boot from it, meaning that a minimal AFS boot partition is needed, with all that that entails.
  • FAT: The newcomer :) Although no one has done it yet, it should be possible to boot from it. It's stable, and GRUB has support for it. Its two failings are that it has no support for Amiga-specific file attributes (eg script and pure) and it doesn't scale well to massive disks.

So, what to do? We can fix the bugs and stabilise the filesystems. We could implement support for SFS in GRUB. We could add extended attributes to FAT, either by making our own incompatible extensions or by using "magic" files. Ultimately though, we're left maintaining our own filesystems. As disks become larger and larger and new innovations in filesystem design appear, we're going to be left behind. We're not filesystem designers. We don't have enough people to commit resources to it. The best thing we can do in this situation is to steal something :)

I propose taking ext2/ext3(/ext4). Literally taking it - ripping it from the Linux kernel and porting it. Here's why:

  • It's maintained by a bunch of people who actually know something about filesystems.
  • It has some advanced features (like journalling, online resizing, etc) and is getting more all the time (see ext4).
  • Its superblock and file structures have space for OS-specific data, so we have a place in the filesystem proper to store Amiga-specific attributes without having to fudge it.
  • It also has support for arbitrary extended attributes and forks, so file comments and other large metadata will have a home.
  • The tools for creating, manipulating and validating filesystems already exist in the form of e2fsprogs, and should port with a minimum of fuss.
  • Read/write support is available on the three major operating systems (Linux, Windows and OS X). Of course they won't know about OS-specific data, but if the defaults are done right it actually won't make a difference for most files.
  • GRUB knows about it, and thus can boot from it.

I've done enough research to know that this can work. I have no immediate plans to implement it, but it'll be something I look at eventually. Of course, if you want to work on it I'm quite happy to help out.

thursday, 11 october 2007

posted at 14:56

Compile stuff is on hold for a little while as I wait for answers to my questions:

If you want to play with this stuff, I'm making my git repository available here:

As the name suggests, the goal is to get a standalone AROS SDK up and running. This will be a long-term project.

If you do try to use this, you'll need to edit $prefix/i386-pc-aros/sys-include/signal.h and declare sig_atomic_t. I have notes from Iain Templeton (our headers guy) and I'll be trying to do this properly sometime soon. For now just uncommenting the existing declaration is enough to make GCC compile.

I don't recommend shipping actual binaries from this yet though. I can't guarantee anything other than brokenness at this point.

For now I've installed 4.1.2 into /usr/local and will now start trying to compile JavascriptCore (a WebKit component). Focused time will be limited for the next week at least because my wife and new baby are coming home in a couple of hours :)

Update 10:00pm: I just got 4.2.2 to build and compile things correctly. The --enable-version-specific-runtime-libs switch to configure takes care of it. I don't know what changed in 4.2, and don't much care at this stage.

sunday, 7 october 2007

posted at 10:54
  • mood: compiling

Quick one. Short story: I want to port WebKit to AROS. It needs GCC 4 to build. It uses C++. AROS doesn't have any of that available.

I've just finished refreshing our GCC patches to get g++ and libstdc++3 cross compilers from GCC 4.0.3:

rob@plastic:~/code/aros/gcc$ ls usr/bin/
collect-aros            i386-pc-aros-gcc-4.0.3  i386-pc-aros-nm
i386-pc-aros-addr2line  i386-pc-aros-gccbug     i386-pc-aros-objcopy
i386-pc-aros-ar         i386-pc-aros-gcj        i386-pc-aros-objdump
i386-pc-aros-as         i386-pc-aros-gcjh       i386-pc-aros-ranlib
i386-pc-aros-c++        i386-pc-aros-gcov       i386-pc-aros-readelf
i386-pc-aros-c++filt    i386-pc-aros-gjnih      i386-pc-aros-size
i386-pc-aros-cpp        i386-pc-aros-grepjar    i386-pc-aros-strings
i386-pc-aros-fastjar    i386-pc-aros-jcf-dump   i386-pc-aros-strip
i386-pc-aros-g++        i386-pc-aros-jv-scan
i386-pc-aros-gcc        i386-pc-aros-ld

I've only compiled basic "hello world" programs for C and C++ so far, but everything seems to be working properly. I'm not totally happy with the setup yet - in particular you have to explicitly set up paths to help it find collect-aros and libstdc++, so I'll need to fix that. Also, it's currently only GCC 4.0.3. I'll soon be patching GCC 4.1.2 and GCC 4.2.1, then do some test releases.

And yes, that is gcj you can see up there, though we'll have to port GNU Classpath before it can be useful :)

thursday, 4 october 2007

posted at 23:18

I committed sdl.hidd today. It's not finished by my own standards, but I'm pretty much out of time, and frustrated by the fact that to make it go any faster I have to keep copying code from the bitmap baseclass. Better to commit it now, let people play with it, and work on fixing the driver interface instead.

I can't remember if I wrote about it previously, so here's the summary. The bitmap baseclass implements its methods by repeatedly calling GetPixel() and PutPixel() in the driver proper. This is reasonable; that way a graphics driver only needs to implement those two methods to get something on screen.

The problem is that in many setups, it's necessary to do a second "flush" operation to actually get any changes to appear on the screen. I don't know if this is a problem with real hardware, but it at least has to be done with X11 and SDL. This sucks - with X11 (and for me, SDL, since I develop with it on top of X11) that means a full request/response into the X server. This makes large operations like area filling very slow, as every pixel gets flushed individually.

The way around this, obviously, is to override some of the higher-level operations to do their work and only flush once. This sucks though if the underlying hardware/library does not actually support the operation natively. At this point, you're left with two options - copy code from the baseclass implementation but have it operate directly on the framebuffer, with one final flush at the end, or don't implement it at all and just take the slow fallback.

If there were only a couple of functions I wouldn't mind copying code from the baseclass, but pretty much every method gets passed something called the graphics context. This contains all sorts of information about how the driver should perform the operation: should pixels be opaque or transparent, should lines be solid, dashed, etc, should area fills be solid or patterned, and so on.

x11gfx.hidd can go fast because X11 too has a concept of a graphics context with largely the same semantics (in fact I suspect the concept was copied, given that X11 was the first graphics system AROS used), so X11 can accelerate nearly all graphics operations it receives. (Notably, one that it can't handle directly is BitMapScale(), which was horribly slow before I fixed it.)

Alas for SDL, which has no idea about graphics contexts; indeed, it (by design) has no drawing primitives at all. Libraries like SDL_gfx exist to help with this, but they don't do enough to be useful.

I don't want to implement my own drawing primitives and context stuff, because the baseclass implementation is perfectly good. It's just hamstrung by the fact that plotting millions of individual pixels takes a lot of time, because of the flushes. So I began to look for a way around this.

SDL has the right idea here. Operations you perform on a surface don't appear on screen immediately; it takes a call to SDL_UpdateRect() to make stuff appear. It seems reasonable to add something similar to the bitmap interface. The baseclass implementations would do their plotting and call the flush method when they're done. The baseclass implementation of this method would simply be a no-op, and existing drivers would not implement it, so they'd continue to work as normal. For something like SDL, it wouldn't do any flushes in PutPixel(), but would save all its updates for the flush method (which I've called UpdateRect() also, because it seems to make as much sense as anything else).

The only problem with this is that if you really really wanted to put a pixel (ie somewhere up in graphics.library), then you have to do a 1x1 rectangle flush. I don't consider that a problem really - if you're doing single-pixel plots up at higher levels you're almost certainly doing it wrong.

Finally, every method should be implemented this way. That is, if a driver implements the UpdateRect() method, it should not do flushes anywhere else.

I've already started work on this - I have UpdateRect() stubbed and (no-op) implemented in graphics.hidd in one of my private trees. Next I have to modify all the baseclass fallback methods to call it, then modify graphics.library to call it also, and finally, change sdl.hidd to take advantage of it. Once that's done I should be able to delete a large amount of code from sdl.hidd.

The only thing to realise is that these fallbacks will still be slower than native operations, but the overhead will be the normal method call overhead for each GetPixel()/PutPixel() call, not the flush. That's a good thing.

tuesday, 2 october 2007

posted at 19:32
  • mood: circular

Not much to report over the last few days, but I have been hard at work. sdl.hidd is functionally complete, the code just needs a little cleaning and commenting before I'm ready to release it. What I have been working on since Friday is to get the existing source tree ready to accept it. The major piece of that has been to convert x11gfx.hidd to run off the back of hostlib.resource.

This turned out to be far more difficult than I had thought it would be. x11gfx.hidd requires four libraries: libX11 for core functions, libXext and libc for shared memory extensions (optional), and libXxf86vidmode for fullscreen support (again, optional). So hostlib.resource gets quite a workout.

The real difficulty has come from bringing X11 headers into the same files as AROS headers. For example, extensions/xf86vmode.h defines BOOL and BYTE, which are also defined by AROS' exec/types.h, though incompatibly. It has been quite an effort to get all the pieces working together happily, but I seem to have got there.

The next thing to update was the way the driver was built. Previously it existed in the hosted "ROM", whereas now it must be compiled standalone. Because of the use of hostlib, there's now no need to link against the X libs, and in fact they're not even required, so configure needed some serious work to make this happen. That file was quite poorly structured, but I've at least cleaned up part of it. There's lots left that can be done to make it good.

Once all this was done I had to test to make sure that in changing the build setup I hadn't broken the other architectures. I did a successful pc-i386 build with the updates in place, then installed FreeBSD into a virtual machine to test its port. It didn't build past MetaMake, so I tried pristine sources and found the same thing. It seems the FreeBSD port is broken independently of my changes. I've done my best to make sure that my changes at least won't contribute to the breakage if anyone ever tries to bring it up to scratch.

Here's the diff according to Git. There's been a couple of small tweaks since I dropped the code, but nothing significant. If you are able to build hosted, please give it a try and let me know how you get on. These changes are complicated and although they work well here it's likely that I've screwed something up.

I do find it amusing that in order to commit sdl.hidd - which I hope will one day soon obsolete x11gfx.hidd - I've had to learn the X11 stuff inside and out to make it possible for the two to work together. And for even more irony, remember how my original motivation for this was to get a "real" mouse cursor for FFE, but I couldn't figure out how to make it work in x11gfx.hidd? Well, I now understand the X11 stuff well enough that I could implement it if I wanted to. Figures.

Next up is the sdl.hidd commit, which I think will happen on Thursday at my current rate. I'm happy that I'm going to hit my target, which was the end of this week :)

wednesday, 26 september 2007

posted at 22:23
  • mood: blitblitblit

On the way home today I started poking at the bitmap class, looking for fallback methods that were calling PutImage() or PutPixel() multiple times, triggering multiple flushes and slowing things down. I found that most images are drawn by a function called PutAlphaImage() which, as the name suggests, blits a rectangle to the bitmap taking the alpha channel on the source data into account. The fallback method does its usual line-at-a-time operation, applying the alpha itself. As usual, it works, but it's slow.

The one thing I didn't want to do was copy the superclass code wholesale, just changing the PutImage() and GetImage() calls to direct pixelbuffer access. It would work, but that kind of code duplication is really bothering me (and I'm considering a more permanent fix for that problem, which I'll write about later). So I started to read through the SDL documentation to find out what it could do with blits and alpha things.

The way I ended up implementing it was to create an SDL surface out of the source pixelbuffer passed to PutAlphaImage(), using SDL_CreateRGBSurfaceFrom(). This function is pretty simple - you pass a pointer to the raw memory data, the dimensions of the data, and the RGB and alpha masks, and you get a surface back. For PutAlphaImage(), the masks are fixed, so they can be hardcoded. Once the surface is obtained, it can be blitted to the target surface using SDL_BlitSurface(), and then discarded. Creating a surface from existing data is an exceptionally lightweight operation, as the original data is used - no copying is done. Freeing the surface leaves the original data intact, so really it's just allocating and destroying an SDL_Surface.

By letting SDL do the copy, you get all the benefits of the SDL video engine, which at its core is a hand-optimised blitter, with special SSE2/Altivec/etc versions that get used where appropriate. Basically, it's faster than any code I'm ever going to write, and it shows - icons and decorations, the two big users of PutAlphaImage(), now appear instantly.

So I committed that and went looking for more speedups. I noticed that windows were drawing a little more slowly than I liked. When a window appears, the window outline is drawn first, then the theme pixmaps, scrollbars, etc blitted over the top. The outline draws a line at a time (which you can see with debugging on), the pixmaps go fast due to the above changes. I traced this code and, as expected, found multiple calls to PutImage(), but this time they were coming from .. PutImage() itself.

This threw me for a moment until I looked at my PutImage() implementation. Currently it does what most of the other drivers do: it checks the pixel format of the data to be blitted, and handles it directly if it's Native (same format as the target surface) or Native32 (same format as the target surface, but with every pixel in 32 bits, so they need to be "compressed" as they're copied to surfaces with lower depths). Anything else gets referred to the line-at-a-time superclass method, which will do the appropriate format conversion. This is what was happening in this case.

My great revelation was that SDL does pixel format conversion natively when blitting, and it's almost certainly going to be faster at it than graphics.hidd, even without the line-at-a-time overhead. All I have to do is supply the appropriate masks for the pixel formats, which are easily obtained from the bitmap object's PixFmt attribute.

Time to stop waffling and write some code :)

wednesday, 26 september 2007

posted at 15:55
  • mood: sdl

I have no motivation to blog at the moment, but I'm getting lots of code done. Latest screeny:

I'm opening windows and typing things. In other words, mouse and keyboard are working.

There's a bit left to do - implementing a few more bitmap methods to speed up the graphics (fancy things with gradients and alphas, like Zune, are still sluggish) and fixing up mode selection stuff, then doing a bit of code cleanup and reshuffling, adding some comments, etc - then its done! I expect to be committing it next week sometime. First I have to fix up the build system to support it and move x11gfx.hidd to use hostlib, so you can take one without the other, and build for SDL without requiring X on your system.

sunday, 23 september 2007

posted at 10:41
  • mood: content

In keeping with my recent theme of showing a picture with no actual content, here's my results from five minutes ago:

Yep, that's a fully-working SDL video display. It's a little slow as I haven't implemented all the fast blitting methods yet; they will come soon though. Seeing this in all its glory makes me happy.

I'll try to write later about some of the details of this code. I just wanted to throw something up quickly as I'm out for the rest of the day. If you're really interested, I'm posting screenshots every step of the way, more often than I blog about them. Feel free to follow along.

saturday, 22 september 2007

posted at 01:00
  • mood: zzz

With a few minutes to kill after my shower and before sleep, I hardcoded the masks and shift values to get this:

Obviously still has bugs, but now I can see what I'm doing. Goodnight!

friday, 21 september 2007

posted at 23:59
  • mood: green

So after a couple of days of wondering why every pixel I got handed by graphics.library was either 0x0 or 0x1 (ie black or so close to black that it might as well be black), I looked at my pixel format setup code and on a whim, removed the stuff where I tried to compensate for SDL and AROS colour component masks being different and used them as-is. This is what happened:

Which I guess means that green is somewhere in the middle and thus coincidentally has the same mask, but red and blue are toast, and my fixup code was totally wrong. Some weeks I'm completely moronic.

wednesday, 19 september 2007

posted at 09:16
  • mood: satisfied
As seen on a morning bus trip:

Pixel format conversion isn't implemented yet, so every pixel is white. The weird thing in the top corner is the mouse pointer, and the other bits are disk icons with their names underneath.

monday, 17 september 2007

posted at 22:27
  • mood: wtf

Read this comic. Go on, I'll wait.

There's something the comic doesn't tell you. The process that makes photosynthesis happen is exactly the same process as that used to design and implement the graphics hidd interface.

SDL work continues. I have it at the point where it's creating both on- and off-screen bitmaps, though there's a third type ("non-displayable") that I haven't done yet. I can see calls being made to my bitmap PutPixel() methods, so I know that something semi-correct is happening. As yet though, drawing isn't implemented, and AROS still crashes on boot because I haven't written everything yet.

The development process for this has pretty much been:

  • Stub some functions/methods.
  • Run AROS in the debugger.
  • When it crashes, find out where, and go and look up the corresponding source file.
  • Break my head against the poorly structured and mostly uncommented code within.
  • Look at the existing hidds to figure out how to implement it, but give up because all the native hidds are based on vga.hidd, and the only non-native hidd, x11gfx.hidd, has piles of cruft left over from the days before layers.library, when every AROS window had its own X11 window.
  • Get a vague sense of what's going on.
  • Implement enough of the function to stop the crash happening, even though the code itself is probably incorrect.
  • Rinse, repeat.

So things are moving at a glacial pace, but at least they're moving, which is something.

The hidd interface works well enough, but is really weird in some places. For example, when graphics.library wants to create a new bitmap (which is the lowest-level structure in the graphics system), it calls Gfx::NewBitMap(). Confusingly, this method doesn't create and return a bitmap, but rather returns a reference to a bitmap class that can handle the type of bitmap the caller requests (displayable, hardware, etc). The caller then instantiates that class to get its bitmap. This is rather peculiar from an OO standpoint.

Oh, I've just had an epiphany about the bitmap classes. All the existing hidds implement an "on-screen" and an "off-screen" bitmap, which are basically the same but with slightly different initialisation. Most of the common functions are in a single bitmap_common.c file which is #include'd into the class source (a horrible idea no matter where you see it or what the justification).

The on-screen bitmap constructors typically make the bitmap appear on the screen as well, which has really been confusing me as there's also a Gfx::Show() method that gets passed a bitmap to display. This wasn't making sense. What if two on-screen bitmaps were created? Would their constructors cause them both to be displayed? What if an off-screen bitmap is passed to Show()? What if steak knives were actually made of cheese?

Anyway, it's just now clicked. The distinction between on-screen and off-screen bitmaps is entirely internal to the hidd. One of these types is chosen based on the values of the Displayable and Framebuffer bitmap attributes. For SDL though, they're pretty much all the same; I don't need a separate class for each. So all that's needed is to create a bunch of bitmaps, and when Gfx::Show() is called just arrange for that one to be shown.

That last point is slightly more complex. Under SDL, you have a single on-screen surface, and then as many off-screen surfaces as you like, which you can blit to the screen. You can't choose to display an off-screen surface on a whim; you have to blit it. So what this means is that when Gfx::Show() is called on a bitmap, I have to make sure that the current surface matches the bitmap's resolution, and recreate it if not. Then we make a note inside the bitmap object that it is currently on-screen.

When something makes a change to a bitmap, this flag is checked. If the on-screen bitmap is being written to, then the update must be followed with a blit to the current surface. I haven't tested this yet, but I think the idea is sound.

I estimate another four hours of code before I have some graphics displaying. One day, when all this is finished, I'd like to write a "how to write a graphics driver" doc, and/or fix the damn interface. Yeah, right.

wednesday, 12 september 2007

posted at 18:45
Another quickie: `hostlib.hidd` in action:

What this is is a test program that uses hostlib.hidd to load the SDL library, get function pointers for SDL_Init(), SDL_SetVideoMode() and SDL_Quit(), and then call them to open a window (that big empty one at bottom-right).

The debug output you can see in the bottom corner is just method calls and their arguments so I can follow the flow:

[hostlib] HostLib::New
[hostlib] HostLib::Open:
[hostlib] HostLib::GetPointer: handle=0x0826eaf0, symbol=SDL_Init
[hostlib] HostLib::GetPointer: handle=0x0826eaf0, symbol=SDL_SetVideoMode
[hostlib] HostLib::GetPointer: handle=0x0826eaf0, symbol=SDL_Quit
[hostlib] HostLib::Close: handle=0x0826eaf0
[hostlib] HostLib::Dispose

There's probably a million things you can do with this code. I'll be using it to write the SDL hidd. I wonder what others will come up with.

monday, 10 september 2007

posted at 09:33
  • mood: hacker

So after the compiler shenanigans of last week I finally managed to write some actual code on Friday. I started with just calls to SDL_Init() and SDL_Quit(), but the compile blew up in my face. The problem came from the fact that I was linking with -lSDL, which would have been fine except that AROS has its own libSDL for SDL apps running inside of AROS. The linker found that first, which is entirely not what was wanted, though even if it had found the right one I guess we'd be looking at namespace clashes for anyone who wanted to run a SDL app inside AROS.

After a bit of thought, it seemed to me that the only way out was to not link to the system libSDL at all but instead load it at runtime using dlopen() and friends. This can work but isn't without its problems, as loading a library is not the same as linking.

When you write code, you call lots of functions that exist somewhere other than in your .c file. When you compile your .c file, it leaves placeholders for all the functions in the resultant .o file. Linking is the process of pulling in a pile of objects (.o), object archives (.a, also known as static libraries) and shared libraries (.so) and updating all the placeholders to point to the right bits of code.

When you link with a shared library, the link process replaces the function placeholders with stubs that refer to a library file that exists on disk somewhere. When you run the program, a program called the runtime linker (known as ld.so on Linux) looks through it, finds all the stubs, loads all the needed libraries and then fills in all the pieces to make a fully working program.

The idea is simple. By not having to carry a full copy of every required library with every program, program binaries are smaller and so use less disk space. Additionally, it's possible for the runtime linker to keep only a single copy of a shared library in memory and point all programs to it, so you save memory when there's lots of programs running. The downside to the whole mess is the increased complexity in linking, the runtime linker needing to find all the pieces (/etc/, LD_LIBRARY_PATH and ld's -rpath option), the fact that programs can't be as easily copied around because they have libraries that they need, etc. You don't notice this most of the time because we have smart tools to take care of all this stuff.

So back to AROS. dlopen() is not a linker. It merely opens a shared library and allows you to get at pointers inside it. You can obtain a pointer to a function, and then use that pointer to call the function inside the library. So this is possible:

    void *handle = dlopen("", RTLD_NOW | RTLD_LOCAL);
    void *fn = dlsym(handle, "SDL_Init");

The problem here is that the library does not contain prototypes, so we have no idea how to pass arguments to the function. We could build the stack by hand (assuming we knew the arguments), but then you don't get the benefit of the compiler doing type and prototype checking.

The normal home for prototypes is in the header files that come with the library. The problem here is that they define functions as real "first-class" functions. If we used them, it would cause the compiler to leave a placeholder for the function which would never get resolved, because we never link -lSDL. That's a build failure. Obviously though, we need the headers as they have all the prototype information, as well as other things we'll need like structure definitions.

Another problem we have is that we're going to need many, many functions from this library. libSDL has almost 200 functions. While we won't need all of them we can expect to need a fair few, so we need prototypes and calls to dlsym() for each one.

All this really has to be bruteforced. The method is to create a giant struct with space to store many, many pointers, and then, for each wanted function, call dlsym() and populate the struct. Function pointers can be declared with the same name as a first-class function (as they're not in the same namespace) and with a prototype. An example is SDL_SetVideoMode(), which has the prototype:

    SDL_Surface * SDL_SetVideoMode (int width, int height, int bpp, Uint32 flags);

We can create storage for a function pointer with the same prototype like so:

    SDL_Surface * (*SDL_SetVideoMode) (int width, int height, int bpp, Uint32 flags);

Once we have a struct with all the function pointers declared and initialised, then we'd call a function in it like so:

    struct sdl_funcs *funcs = <allocate and initialise>;
    funcs->SDL_SetVideoMode(640, 480, 16, 0);

The "allocate and initialise" portion of that is a loop that runs through all the function names (stored in a big array), calls dlsym() on each and stows the returned pointer in the struct.

All this is heaps of setup, but it works very well. To help with the setup, I've written a script called soruntime. It takes a shared library and one or more header files as input. It scans the library (using nm) and extracts the names of all the functions that the library provides, then expands the headers (using cpp -E) looking for prototypes for those functions. Once it finds them, it outputs a header file with the library struct (ie all the prototypes), and a code file that has functions to setup and teardown a library.

I'm currently integrating this into my source tree for the SDL HIDD. It could (and probably will) be extended to the X11 HIDD as well, which will provide some uniformity and make it so that if we ever do get an X server ported to AROS, there will be no clashes.

Another thought. With a HIDD that provides facilities for an AROS program/driver to ask the host to load and provide access to a shared library, the graphics HIDDs would not have to be compiled into the kernel anymore and instead could just be standard pieces "inside" AROS. If the UnixIO HIDD was extended to provide better file access features, the other HIDDs (parallel, serial, and the emul filesystem handler) could be modified to use it and thus also be moved into AROS-space. This gives a tight kernel with basically no dependencies. I've started stubbing a hostlib.hidd which will expose dlopen() and friends to AROS for just this purpose.

saturday, 8 september 2007

posted at 13:34
  • mood: stats
Allow me to reveal part of my motivation for getting a public AROS repository available:

Ohloh is some sort of social networking site for open-source projects and contributors. It works by analysing the complete source history of as many open-source projects as it can get hold of, then building links between people and the different projects they've contributed to.

It's fascinating looking at the graphs that it generates (try the Contributors tab), particularly for a large project. The commit history graph is funky (as seen on my contributions to AROS and jabberd2).

I'm not really a fan of these kinds of sites, but this one has me intrigued. I hope more projects I'm familiar with start to get mentions here.

friday, 7 september 2007

posted at 23:22
  • mood: cautiously optimistic

At the start of the week I began writing a SDL HIDD for AROS. Currently it does nothing, just prints a debug message when its init is called to show that it's compiled and running. This was working on Sunday night.

On Monday I started modifying the build system to support disabling the X11 HIDD in favour of the SDL one. My plan is that you'll be able to compile in one or more HIDDs for hosted, and select between them using a command line switch. No more X11 dependency if you don't want it (and if you were doing, say, a Windows or OS X port, you wouldn't want it), and no more hidd.prefs, which is retarded. I finished the build stuff on Tuesday.

Once I'd confirmed it was working properly, I then recompiled with the X11 HIDD to ensure I hadn't broken it. Something strange happened: the kernel booted, but then startup failed with an "illegal instruction".

I figured I'd made some silly mistake (as we know, the depths of AROS contain much deep magic and many dragons), so I gradually backed out my changes, one at a time, testing as I went. No change. I tweaked and experimented over Wednesday and Thursday, with no luck. Finally, in desperation, I pulled a fresh untouched tree from Subversion and built it. It crashed.

The nightlies were working properly, so that pointed to a problem with my build environment. After some discussion with aros-dev and some poking, I finally found out today that GCC 4.2.1 is producing broken code. GCC 4.1.3, which is what the nightlies are compiled with and what I've now reverted to, works properly. I don't know if it's an actual GCC bug or if AROS code is actually wrong but used to work because of some edge case. AROS has some pretty spooky macros which could very well be at fault.

For the moment I'm happy to sit on the older GCC. I've lost days on this, and I'm just glad it's over. I'm looking forward to getting some code written now.

Something I have been able to do while waiting for endless builds to complete is to read Git docs. I really like the look of it and am eager to give it a try for AROS development. I've decided I will develop the SDL HIDD using Git, so I should get the chance to see it in action. I'm not yet sure how to commit from Git back to SVN, but I'm sure I'll figure it out soon enough.

I've also made a public Git repository of the AROS sources available. These are available here: They're updated from AROS Subversion every hour on the hour. Feel free to clone and pull from these repositories; I have bandwidth to burn.

This weekend should be a good one for code, I hope.

monday, 3 september 2007

posted at 15:43
  • mood: shiny

Had a great Father's day weekend. Saturday I went out and bought my AVR and a 74HC573 for the memory latch. I have a couple of 8K RAMs that I picked up on eBay last year and some "ladder" LEDs and other interesting lights, so I should now have everything I need to start experimenting. I'm short a power supply though: it'll be a race to see whether I hack up an old plugpack or drive over to Rosanna to pick up my bench supply from my mate's place.

Sunday I awoke to Francesca awkwardly trying to climb into bed holding her Father's Day loot. I helped her up and she helped me unwrap a copy of Settlers DS (a port of Settlers II). It's got some pretty lousy reviews, and I can see why - the interface is clunky, the gameplay is sluggish and there's obvious bugs. It's still Settlers though, which was a game I was addicted to back in the day, so I'm happy. The girl also gave me a nice picture book about a Daddy bear and his kid bear and all the things they do together, and we had a great time reading it together. I do like being a Dad :)

Today I found this presentation about Git, which I've been hearing lots about but decided was too much of a leap away from Subversion for my brain to handle. At the time I opted for SVK instead, and I love it, but lately I've found it's starting to run out of steam, which seems to be traceable back to its Subversion roots. The presentation was fascinating and enough to convince me that Git is worth my effort, so right now I have git-svn running to pull in the AROS repository. It won't be done before I go home so it'll probably be tomorrow before I can really experiment with it properly. I hope it's as good as everyone claims.

saturday, 1 september 2007

posted at 09:14
  • mood: grumpy

So I spent my bus rides and my evening yesterday writing a whole new mouse input setup for FFE. It works properly, in that pressing the right mouse button stops the system receiving mouse events. As is typical for AROS, there's a problem. Admittedly it's specific to the hosted environment, but that's where I live, so it's frustrating.

The input system for AROS (and AmigaOS) is layered. At the bottom is the hardware - keyboard, mouse, game ports. Above that is input.device, which hits the hardware (actually it has drivers to do this, but let's just say it does). Higher layers register with input.device and arrange to be called when events come in. Each thing registers with a priority, and when an event happens, the thing with the highest priority gets informed of it first, then the next, until they've all been called. The higher levels can modify the received events so the lower ones don't see them.

It's worth noting that the Commodities Exchange registers with priority 56, with Intuition at 50. console.device is at 0, so it picks up the dregs. CX appearing before Intuition is how it is able to intercept mouse clicks and do fancy things.

On native, Intuition is responsible for managing the mouse pointer, moving it, etc, so if you stop it from receiving mouse events (eg by doing what I do with FFE with a priority 100 handler), it doesn't move. On hosted though, it's a different story. X11 controls the mouse there, and the hosted environment fakes mouse hardware stuff to tell input.handler what's happening. The mouse continues to move though - there's no way to stop it.

This is incredibly frustrating, of course. AROS should do what other emulation things with their own mouse (eg VMWare) do - capture mouse input from X, releasing it only when some magic key combination is pressed.

To make this happen means hacking on the X11 HIDD, which is some of the worst code ever. So as usual, implementing some tiny feature (making my spaceship fly properly) means learning and rewriting some major OS subsystem. And people wonder why AROS is hard for devs to get into? All I want to do is write my app. I don't want to have to fix every damn mistake in the OS to get there.

Yes, this is something of a rant. I've been here before though - remember that whole DOS trip I went on a few months back?

So FFE is backburnered for a little while, and I'm considering writing a new HIDD based on SDL (I stubbed one a while ago). If I did, it would be clean and pure and incredibly well documented so that graphics HIDDs aren't deep magic any more.

thursday, 30 august 2007

posted at 23:37
  • mood: pointy

So my interest for this week (I'm fickle so you can guarantee that next week I'll be doing something else) is porting games to AROS. My focus therefore has been on two things: getting an up-to-date C++ compiler working, and hacking on the JJFFE port.

A C++ compiler is needed if I'm going to port Battle for Wesnoth, but also if I'm going to have a shot at getting a WebKit port happening (yes, I haven't forgotten). The last C++ compiler we seem to have available is GCC 3.3.1, which is quite old now. I'm attempting a port of the latest release of GCC, which is 4.2.1. The C compiler seems to be fine; just some minor changes to the already-extant 4.0.0 patch we have. I'm having some trouble regenerating the autoconf files though - it seems that GCC is very finicky about its build environment, and I'm not meeting its high standards yet. I will keep at it - it's a good side project because it mostly consists of making a couple of small tweaks and then waiting half an hour while the build runs and fails. If I succeed, then I'll be trying to keep the patches updated as each new version of GCC comes out (assuming I can't convince the developers to accept the patches into mainline).

Meanwhile, JJFFE has been getting some love. So far I've cleaned up the existing code and properly merged Kalamatee's window scaling code and the other changes he made, which were great work but left the source in a real mess. Things are now looking much nicer, so it's time for some features.

In FFE you orient the ship by moving the mouse while holding down the right mouse button. In the original version and the SDL and Win32 ports, holding the right button causes the mouse pointer to disappear and be locked in place. Currently in the AROS port this doesn't happen - a right-hold still moves the ship around, but the mouse pointer moves too, and if you move it out of the window then all the movement stops.

I've been scouring docs, newsgroups and forums for a way of disabling the mouse pointer and have come up with nothing. This evening I figured out a way that I think will work. The idea is simple - when the right button is clicked, open input.device and eat up all the incoming mouse events before Intuition can get hold of them. I'll have to process raw mouse events myself, but that shouldn't be too hard. When the right button is released, I remove my fingers from the pie and Intuition continues as normal. A couple of calls to ClearPointer() and SetPointer() should provide the vanishing pointer. I'll have a crack at implementing this tomorrow.

It's really nice to be writing proper code again and nutting out tricky stuff. I missed it.

monday, 27 august 2007

posted at 14:10
  • mood: distracted

Another week of not much. The weather is glorious at the moment; yesterday I spent a couple of hours outside mowing the grass, which is a pretty huge undertaking. It looks fantastic and has got me motivated to tidy the rest of the garden and finally get the garage sorted out, which I'll probably try to do a bit in the evenings this week, particularly if it stays warm(er) after the sun goes down.

I've finished reading the AVR book, and have most of the design for the graphics driver done in my head. I'm pretty much settled on the ATmega162 for starting out, as it should have everything I need - too much program memory, plenty of internal RAM, a JTAG mode and enough pins to hook up an external SRAM. Jaycar have them for $20, so I have a cheap supply without having to do crappy mail-order stuff. I still have to sit down and write down the whole design to produce a parts list, but once thats done I can go shopping. The plan is to do that on Saturday morning, taking the girl to Ringwood on the train. She's been begging for a train ride for a while now, so that should take care of both.

Gub recently offered our services to MOPS Australia (she is the coordinator of a group at our church) to bring their website out of 1996. To this end I've installed Joomla! and am trying to learn a bit about what makes it tick. PHP is horrible, and the community is weird, but it looks like I'm not going to have to write too much code, which is good - this really has to be off the ground before the baby arrives.

I got my AROS tree up to date and building this morning, as I really need to write some code again soon - my brain is ready for it, I think. At the moment I'm just fiddling, getting a feel for the code again and seeing if there's anything that I really feel like playing with. I'm not committing to anything yet, nor am I soliciting suggestions, gentle reader ;)

Back to work. We're on a tight deadline. Don't ask.

thursday, 5 july 2007

posted at 15:56

Fast update, just about to head out of the office for another day.

Work has been psychotically busy, leaving me with very little brain space at the end of the day, so I haven't done much code in the last few days - it's just too hard to think. What I have been doing is updating my entry and the System entry over at the CSDb. Finally, you can get some insight into where I came from. There's still a bit of stuff to upload and cross-reference, but it's much more complete than it was a week ago.

I'm also idly pondering this demo that we're wanting to make. I've started experimenting with the assembler portion of cc65, because it seems to have pretty much everything that I might want. I'm not yet sure how to get it to link things using a custom memory map rather than it just assuming it has the run of the place, but I think I know where I'm going wrong. I'll have another crack at it on the bus, using the vector part I wrote TEN YEARS AGO this month. Good grief.

After that I guess I'll poke at something AROS again. I am unfocused at the moment, but I'm fortunate that AROS is too - there's plenty of places where I can fiddle for an hour, fix or add something tiny, and it still counts as progress.

monday, 2 july 2007

posted at 10:40

I've been back working on fat.handler this weekend. I had to look at the code for something and actually found it kind of interesting, a feeling I thought was long gone.

First thing was to add 64-bit support, so that it can handle partitions larger than 4GB. This was pretty easy: new code in the cache probes the underlying device to see if it supports 64-bit extensions, and then later, if a request comes in for data that is over the 4GB boundary, a 64-bit read or write operation is used rather than the standard one (or an error is returned, if the probe didn't find any 64-bit extensions). There are three commonly-used 64-bit extensions in the Amiga world - TD64, New-style TD64, and DirectSCSI. The first two are supported, but DirectSCSI shouldn't be hard to add.

I haven't done any testing yet. It's basically impossible to test in hosted, as fdsk.device doesn't have 64-bit support, and adding it would mean that DOS would need 64-bit support too (since it's a loopback-type device). ata.device for native has support, but that means needing a large FAT partition installed on a real box, or in VMWare, and to do that I pretty much need to install an OS that uses it. So far I've tried FreeDOS, which crashed, and DR-DOS, which created the partition but couldn't write the partition table for some reason. The next thing to try is Windows 98SE/ME/2000, all of which could use large FAT partitions. The code should be available in tonight's nightly build, so if you want to test it before I get a chance, let me know how it goes.

This morning I started implementing write-back caching. The concept here is pretty simple - when the handler asks the cache to write some data, the cache reports success immediately but just marks the data as "to be written". Then at regular intervals (eg five seconds) it writes all of these "dirty" blocks out to disk in one go. This makes things feel faster for the user, and has the potential to reduce disk activity (== less wear and lower power consumption), at the risk of losing data in the event of a power failure or loss of the device (like pulling the disk out). Typically removable media uses write-through caching (ie write immediately), while fixed disks use write-back.

Since this requires a separate task that sits and waits and flushes the dirty blocks when called, the cache needs locking. Locking will also be needed in the future if a filesystem wanted to be multi-threaded (and the cache is actually in a cache.library, available to all). I've partially implemented the locking - so far there is locking around cache operations, but not block operations.

I hate that there's no way (in most locking schemes, not just AROS) to promote a read lock to a write lock. Usually you have to drop the original lock before taking the write lock, which means there's a moment where you're not holding any lock and someone can come and steal it out from under you. I have a workaround for POSIX threads that I'm using in production code, but it requires condition variables which we don't currently have for AROS semaphores. I think for the cache it won't be a problem, but I'm thinking carefully about it because deadlocks are just too easy.

tuesday, 26 june 2007

posted at 23:34

Status update.

I have a mostly-working test.library that implements a TAP producer. So far it does TAP output, has basic success/fail counters and a couple of other bits. The idea is that it will support the minimum set of generic primitives needed to build just about any kind of test on top of, with some smarter stuff (ok(), is(), etc) built as macros on top of that. See Perl's Test::More and my own Tests for C for an idea of where I want to take this.

Nothing much else is happening code-wise, mostly because I'm in the middle of a hardcore development project at work (a virus scanner for Lotus Domino) which is on a tight schedule and isn't leaving me much brain space for other code. I did bootstrap an SDL HIDD for AROS; it doesn't work yet but it's something I'd like to take another look at when I get a chance.

saturday, 23 june 2007

posted at 01:50

It's nearly 2am, and I'm sleepy. Just a quick one to give you an idea of where this is going:

extern TESTFUNC(init);
extern TESTFUNC(files);
extern TESTFUNC(cleanup);

static TestFunc tests[] = {
    /* the three test functions declared above */
};

int main(int argc, char **argv) {
    struct TestSuite *ts;
    struct TagItem tags[] = {
        { TS_Plan,        (IPTR) 3     },
        { TS_Functions,   (IPTR) tests },
        { TAG_DONE,       0            }
    };

    ts = CreateTestSuite(tags);

    return 0;
}
This is a harness bootstrap. The details are squirreled away in test.library. Note how simple the harness is - it can be generated from your test sources.

Bedtime. More tomorrow, perhaps.

friday, 22 june 2007

posted at 21:09

In typical fashion, I got bored with the pipe handler. It'll still have to be done, of course, but a couple of things about it became non-trivial, so I had to start thinking and designing, and that's not interesting or fast, so I couldn't be bothered anymore.

What did interest me is a couple of bugs that appeared in the queue: #1740715 and #1740717. Both were described very well and so were easy to reproduce. The first was clearly related to work I'd been doing, as it involved ErrorReport() requesters. The other I just had a hunch about.

Both have now been fixed and committed, which was surprisingly satisfying. Part of my dayjob is triaging and working on calls from users, but I'll admit I try to avoid it and let underlings read it (and I know at least one of them reads this, and I really appreciate your efforts!). These were fun to work on and fix though, and looking through the queue I see a few others that could also be taken care of with ease. I've asked for access to the bug tracking system so I can close calls. They might make a nice diversion when I'm busy and distracted like I have been this week.

Tonight I'm doing a few more odd jobs. I'm poking at Staf's ABI v1 branch, which I've offered to help out with on the DOS and BPTR/BSTR conversion. I'm messing with a test harness which I hope to use to make a test suite for DOS. I'm poking at GCC 4.2 and trying to get an up to date C++ compiler/runtime so I can fiddle with WebKit, various games, etc. And I'm sure there's other stuff I'll poke at before the evening is out. Drop into #aros on and ask me how it's going :)

monday, 18 june 2007

posted at 21:06

Today I checked in my pipe code. It consists of the Pipe() system call, the FSA_PIPE and ACTION_PIPE definitions, the implementation in pipefs.handler, and the changes to the shell and the C library to use it. It's nice to have it finished off, but then I got on the bus on the way home and realised I didn't really know what to work on next. I've sort of forgotten what I was doing before I started on this stuff, but I've also seen a lot of bad code while implementing it, and it bothers me to leave that alone.

So, I've decided to reimplement pipefs.handler. I can justify it because it will have to be reworked for packets eventually, and it doesn't actually implement the features and interface of the AmigaOS Queue-Handler, which provided the same facilities. It'll also be an excuse to clean up its horrendous code.

The 3.9 SDK has a pretty good description of what the interface is like. Basically, you open some PIPE: object and write stuff to it. The data you write gets buffered in the handler. When something else opens the object and reads from it, it gets everything that was written into the buffer. If the buffer is empty when the last user closes it, the object is destroyed. Otherwise, it hangs around in memory until someone else opens it and reads from it (or you reboot).

Pipes are named and system-wide. PIPE: alone without a "name" (or "channel" as the SDK likes to call it) is still named, and will still do the same thing as a pipe where the name is specified. The name can have CON:-style options to specify buffer size, which of course we can extend in the future.

I should be able to copy a fair chunk of code from fat.handler. The number of packets that need implementing is minimal: FINDINPUT, FINDOUTPUT, FINDUPDATE, READ, WRITE, END, IS_FILESYSTEM and of course PIPE. packet.handler will also need a translation for ACTION_PIPE. In theory this seems simple - I hope it turns out that way.

sunday, 17 june 2007

posted at 21:34

A pipe created with Pipe() has two handles. Therefore, the initial use count should be two, not one. Obvious stuff, really.

saturday, 16 june 2007

posted at 15:48

I'm having a wonderful time delving into lots of bits of AROS that I haven't seen before. What started as simply wanting to make IN: work has led me into the depths of the shell and beyond.

As discussed, I've implemented a new system call:

LONG Pipe(CONST_STRPTR name, BPTR *reader, BPTR *writer);

It returns two handles attached to the same "thing", denoted by name. I expect that this will usually be a simple volume specifier with no details (eg PIPE:), but it could potentially have console-like options attached to it (say PIPE:4096 to set the buffer size). It could conceivably also take a full path for use with named pipes. Whatever the case, it's mostly there to allow flexibility for the future.

Underneath, I've defined a new IOFS action and a new packet:

#define FSA_PIPE 45
struct IFS_PIPE {
    STRPTR       io_FileName;
    struct Unit *io_Writer;
};

#define ACTION_PIPE 1800
LONG                            /* dp_Res2 */
ACTION_PIPE(BPTR lock,          /* dp_Arg1 */
            STRPTR path,        /* dp_Arg2 */
            BPTR reader,        /* dp_Arg3 */
            BPTR writer);       /* dp_Arg4 */

Of course, only the IOFS version is used so far, and I haven't implemented the translation in packet.handler yet. I'll probably wait until it's needed - it's only about eight lines.

I've modified pipefs.handler to handle this new action, and it's working well. You call it with just PIPEFS: as the name, and it creates a single pipe, opens two handles on it, and returns them. When both handles are closed, the pipe disappears.

Next, I taught the shell about it. Its internal Pipe() function which Open()'d PIPEFS:__UNNAMED__ and did some duplication and mode-changing magic is now gone, replaced by a call to the new system one. To test, I wrote a tiny program called minicat that acts like Unix's cat - opens the named file and puts it onto standard output, or if you don't specify a file, reads from standard in:

int main(int argc, char **argv) {
    BPTR in, out;
    char buf[256];
    LONG len;

    if (argc > 1) {
        if ((in = Open(argv[1], MODE_OLDFILE)) == NULL) {
            Fault(IoErr(), "minicat", buf, 255);
            fprintf(stderr, "%s\n", buf);
            return 1;
        }
    } else
        in = Input();

    out = Output();

    while ((len = Read(in, buf, 256)) > 0)
        Write(out, buf, len);

    if (argc > 1)
        Close(in);

    return 0;
}

Running minicat somefile.txt | minicat does exactly what you'd expect. minicat somefile.txt | minicat IN: does the same thing. This confirms it - pipes are working, as are the standard streams. Hurrah!

Something I did notice when watching the pipefs.handler debug output is that when using a shell pipe, the shell actually seems to be closing a side of the pipe that it already closed. I haven't looked into it in depth, but it seems that it closes both halves of the pipe when one of the commands completes, yet it spawns the second command with PRF_CLOSE* flags so it tries to close the pipe again on shutdown. It can be plainly seen in the pipefs output as the usage count of the pipe drops to -1. Of course, at that point the pipe doesn't even exist any more, and its memory has been freed. I can only assume that it's the lack of memory protection that has allowed this to go unnoticed for so long. I'll dig down into that a little this afternoon.

And the point of all this hacking, if you remember, was to make it so that Type something.txt | More IN: would work. Well, after all this, it doesn't. From what I can tell, it never could have worked, because of the incorrect way More allocates its internal buffers. It tries to allocate enough memory to hold the entire file, but if the file isn't a "real" file (ie it's a console), then it just allocates a 64KB buffer instead:

    if (IsFileSystem(filename)) {
        Seek(fh, 0, OFFSET_END);
        new_filelen = Seek(fh, 0, OFFSET_BEGINNING);
    } else
        new_filelen = 0x10000;

The problem here is that PIPEFS: is a filesystem (it has directories, named files, etc), so IsFileSystem() returns 1, but its files aren't seekable. Thus, new_filelen becomes -1, which causes an error further down, and the program aborts.

The right way to do this is to test whether the handle is seekable. If it is, then More should read the file in chunks and Seek around as the user moves through it. If it's not, then the only option is to read it from start to finish, so More should maintain its own in-memory buffer, growing it as necessary. That is more than a trivial change though, and I'll need to study the More code in depth before deciding whether it's something I want to work on right now.

friday, 15 june 2007

posted at 20:54

Last week I got a new workstation at the office. It's a lovely piece of kit - 2GHz Core 2 Duo, 4GB RAM, dual 19" LCDs, etc, etc. I'm very happy with it - it's fast and shiny, and it's always nice to start with a fresh OS install. Around the same time, the box that does the AROS nightly builds died. Staf Verhaegen, our buildmaster, had expressed an interest in getting out of the nightly build game, and with a grunty box that sits idle for sixteen hours of the day and is hanging on the back of one of the fastest networks in Australia, it seemed that I was in the perfect position to help out.

Last week I got the nightly build process up and running, and I've been doing a full build every night. It's been working very nicely, so from tonight it will start uploading builds. Builds start at 22:00 my time (currently +10, so 12:00 UTC) and usually take 80 minutes to run. So you should all be able to get your fix soon :)

wednesday, 13 june 2007

posted at 08:51

Named pipes are like normal files in that they can be created and deleted. The difference is that when you read from it, nothing happens until something else writes to it. Then the reader gets a copy of whatever the writer wrote. Usually there can be multiple readers and writers.

We need to create an "unnamed" pipe. This is different from a named pipe in the following ways:

  • When the last reader/writer closes it, it disappears.
  • Because it has no name, there's no way to open it again once it's created.

AROS implements it by opening PIPEFS:__UNNAMED__ for read, then using the "duplicate with new mode" misfeature of Open() on this handle to get the write half. Internally, pipefs.handler recognises __UNNAMED__ and sets a flag to tell it to automatically destroy the pipe when the last opener drops off.

This is currently not working because I changed Open("", mode) to be implemented as (sans error checking):

    lock = LockFromFH(pr->pr_CurrentDir);
    newfile = OpenFromLock(lock);

As you can see, it entirely ignores the new mode. The way it used to work before I broke it was to dig the device and unit out of the "lock" (actually a filehandle on AROS), then call FSA_OPEN_FILE with the new mode. I could simply revert to this behaviour, but longer term this won't work because locks and filehandles won't be equivalent any more, which means the handle can't be assigned to pr_CurrentDir, and so Open("", mode) won't know what it's supposed to be duplicating.

OpenFromLock() (or its 1.3 Open("") counterpart mentioned previously) also can't be used because not all filesystems use locks - there's no guarantee that a lock can be obtained from a filehandle. Of course we could just make sure that the pipe handler does use locks, but that places a fairly big restriction on its internals, making it harder to be replaced.

I've done some research and it seems that there's no standard interface for unnamed pipes (where they've even been available). The usual way seems to be to generate a unique filename (based on a timestamp) and use that. It works well enough, but it does require that the pipe be deleted afterwards. There was also something called ConMan that had an ACTION_DOUBLE packet that would return a pair of handles (like the POSIX pipe() system call). I really like that approach, but would prefer to not have to extend the API.

On the other hand, I can't see a way to do it without extending the API. For the pipe to be truly unnamed, you need to be able to return two handles from the same call (like ACTION_DOUBLE). It's not a terrible approach.

Do we really need unnamed pipes? The only place it's currently used is in the shell (to implement '|'), so could the shell just have it built in? Of course it could, but the implementation would be almost as complex as a handler anyway, and it makes sense to have the function available to other things, like POSIX emulation (pipe(), popen(), etc). Obviously they'd be a good thing to have.

I can think of other ways to do it where the handler could infer the requirement for a private, auto-closing pipe (like two open calls on the same name immediately after each other, one for read and one for write, followed by some other call), but that kind of thing is too easy for a programmer to get wrong, and doesn't read well. I think a direct call is what we want.

I'm going to look at two approaches: An AROS-specific Pipe() call in DOS (LONG Pipe(BPTR *reader, BPTR *writer)) and if that can't work (eg not enough space in the table), a new "mode" like ACTION_DOUBLE. I'll start experimenting tomorrow, most likely.

tuesday, 12 june 2007

posted at 15:38

I started writing an entry about Open() and its edge cases this morning, but it made very little sense, which I suppose is fitting. I got to the end of the day just now, and read back over it, and decided that it was dumb. My brain is mushy anyway after the madness of today, so I'm not going to go into it much.

In short, Open() works like you'd expect (turn a name into a filehandle) except when you call it with an empty name, which makes it open a file using a lock in pr_CurrentDir - unless you're on AROS, where handles and locks are the same thing, and then it duplicates a handle. Since the mode flags are taken into account, it's not a pure duplication, yet it's called on the original object, so the semantics are slightly different to just opening the file again with its original name. This subtlety is why pipes are currently broken.

Further, FSA_OPEN (the IOFS action underneath Open()) with a filename of "/" actually means "open the parent directory", ie ACTION_PARENT under the packet system. Our console handler didn't know this, causing More IN: to crash the system.

I'll shortly decide on the One True Way, and fix everything. Technically it's an API break, but this is a non-trivial corner case and it needs to be fixed - it's not something that will break most (or any) existing programs.

sunday, 10 june 2007

posted at 20:46

Pavel went on holiday this weekend but didn't want to hold me up, so he graciously offered for us to switch places - I check in my GetDeviceProc() and ErrorReport() patches, and he'll take them on holiday with him and update his code to match my changes. This was done, so the current nightly build should have the new code in it. Type Dir FOO: into the shell to see!

The next DOS work for me is to sort out the whole mess with STDIN: etc. I think I've figured out a way to deal with it. GetDeviceProc() will be updated to return a valid struct DevProc with a new flag set in dvp_Flags, DVPF_STDIO. It should be possible to work out a valid value for dvp_DevProc from the handle - only a maybe under IOFS (due to the device/unit problem), but definitely with packets. Most of the time things won't care, but should a calling function need to know the difference then it can. I've already started on this.

I'm also due for a break from DOS, so I'm thinking I might see if I can track down the huge memory leak in Wanderer at the moment. I've got to be careful though - I have an idea for a memory tracker and debugger, but it could take quite a while to design and implement. I can't afford to get too distracted from DOS. I just need to stick to the goal - fix the leak, don't spend more than a few days on it.

thursday, 7 june 2007

posted at 22:32

As mentioned previously, AROS DOS has some special magical device names that don't correspond to any underlying device - IN:, OUT:, etc. Because they're AROS-specific, I get to choose how they're implemented, but I also need to make sure that the right thing happens.

Before my recent changes, Open() and Lock() knew about them explicitly, and other calls like IsFileSystem() did their own simplified DOSList traversal and so knew how to handle not finding them there.

My recent changes have made these names only known to Open(), much like CONSOLE: and NIL:. Everything else uses GetDeviceProc() which by design only knows about what is in the DOSList. And with my ErrorReport() changes, we now get this:

This happens because of the way More accesses the file. First it calls Open(), which is fine. Then it calls IsFileSystem() to decide if it should do its own buffering or not (which is actually not the right way to test this, but that's not relevant here). Later, it calls Lock() on the name, then NameFromLock() to generate the window title. The calls to IsFileSystem() and Lock() both yield requesters because both those functions use GetDeviceProc() underneath, and those "devices" aren't in the DOSList, so it requests them.

My first thought was that these cases should be like CONSOLE:, and not work. But then I thought a little more and realised this was no good. CONSOLE: always points to a console (unless you've really screwed something up, but then your software is broken anyway), so you can assume it's always interactive, never a filesystem, not lockable and nameless - simple. Standard I/O can be redirected, however. After you Open("OUT:", MODE_OLDFILE) you don't know if the handle you got back is directed at the console, a file, a pipe, or whatever else. So it is entirely reasonable to do IsFileSystem("OUT:"), etc. A solution is needed.

To fix this requires either teaching GetDeviceProc() about these names or adding tests to every function that takes a name to deal with them. The former sucks because we can't really build a struct DevProc for these names without being able to return its DOSList entry in dvp_DevNode, which might not exist if the filehandle is pointed at CONSOLE: or NIL:. The latter sucks because we need special-case code everywhere - more clutter, harder to read, harder to maintain, etc.

There's a third option: making a new handler and having it handle these names. I do like this idea, but I'm not sure it's practical. I'd need to add entries to the DOSList for each name (so six in total), but I'm not sure of the best way to approach that. Adding DLT_VOLUME entries is out because then Wanderer would display them. DLT_DEVICE entries could be fine, but then we'd be violating the "rule" of one DOS device to one handler task. Unless we made a separate task for each, but then there'd be six barely-used tasks kicking around in the task list.

There are also complications in the fact that these names need to provide access to the in/out/error streams of the calling process, so they can't really run in a separate task, as we need to extract the handles from the process context. That's not so bad - a PA_FASTCALL port could take care of that.

The whole thing really is fraught with danger, but there doesn't seem to be an easy way out. I haven't thought about it much though, so I'll ponder it some more and see if there's a clean way to add the special-case checking to the requisite functions. And I'll probably need to add "fix More" to my list too, because I need more to do :P

thursday, 7 june 2007

posted at 20:34

Today I rewrote ErrorReport(). The previous implementation didn't handle most of the errors it was supposed to (which aren't actually many), and was in need of the same kind of general cleanups as everything else has needed.

Following that, I started adding error loops into a few DOS functions so that they'd bring up requesters at the proper time. Obligatory screenshot:

So far GetDeviceProc() will ask you to insert volumes and Read(), Write() and Seek() will report nicely if the operation couldn't succeed for some reason (low-level error, disk removed, etc). I've done a few other functions as well and will gradually implement this throughout DOS.

monday, 4 june 2007

posted at 10:25

Just found another problem with IOFS. There's no really good way to determine if two files are on the same filesystem, which you need to know to safely rename files and create hard links.

Under packets, every mounted filesystem has its own handler task, and so has its own message port. You just get the port pointers for the source and target files and compare.

With IOFS, two filesystems of the same type will have the same device pointer, even if they're different filesystems. Internally the device will usually have a separate task for each, but there's no way for DOS to get at that information. If you have handles for the files, their unit pointers will be different, of course, but these are context for the file, not the filesystem, so they'll differ even on the same filesystem.

The only way I can think to do it is to call NameFromLock() on each file (actually locking the passed names first if that's all that's available), then compare the volume names in the returned strings. NameFromLock() causes many calls into the filesystem, which would make these operations hopelessly inefficient. Admittedly renaming and/or linking is not something you do often, but that doesn't change the fact that this interface is broken.

friday, 1 june 2007

posted at 17:16

Quick notes, as I'm nearly home. Open() handles a few "magical" filenames: NIL:, PROGDIR:, CONSOLE: and *. The last two are identical, and should return a handle on the console that the program is running from, regardless of whether or not standard input or output have been redirected. The Guru Book calls it the error channel, but of course it should be valid for input too.

Under AmigaOS, this was (probably) implemented by calling GetConsoleTask() (which grabs a struct MsgPort * from pr_ConsoleTask in the process context) and sending an appropriate ACTION_FIND* packet to it to get a handle. Every console has a separate task, so a single pointer is all that's required.

Under AROS, each console has a separate task, but because IOFS handlers are actually devices and so have a single global entry point, you need two pointers - one for the device pointer to console.handler, and the other for the unit pointer that represents the individual console task. Obviously two pointers can't be stored in pr_ConsoleTask, so AROS introduces a new handle pr_CES that complements the input and output handles pr_CIS and pr_COS, and two new functions Error() and SelectError() to complement Input()/SelectInput() and Output()/SelectOutput().

This arrangement works well enough but still sucks - any time you have to add a new field to a struct it sucks. Of course, this is no different to the myriad other places where this has been done in DOS to support IOFS over packets. There are a few broken bits though: opening CONSOLE: or * for input will always use pr_CIS (ie standard input), regardless of whether or not it's been redirected. Opening for output will always use pr_COS if you open CONSOLE: (same issue), but curiously will use pr_CES for *, falling back to pr_COS if it's undefined.
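For my own reference, the current selection rules as I read them boil down to something like this. The struct and function are invented stand-ins; only the field names match the real process fields:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the AROS process console fields described above. */
typedef struct {
    void *pr_CIS;   /* standard input handle */
    void *pr_COS;   /* standard output handle */
    void *pr_CES;   /* error handle, may be NULL */
} Proc;

enum { MODE_INPUT, MODE_OUTPUT };

/* Which handle Open("CONSOLE:") / Open("*") currently resolves to.
 * Illustrative only - this is my reading of the (broken) behaviour,
 * not real AROS code. */
static void *console_handle(Proc *pr, const char *name, int mode) {
    if (mode == MODE_INPUT)
        return pr->pr_CIS;              /* always standard input */
    if (name[0] == '*' && pr->pr_CES != NULL)
        return pr->pr_CES;              /* "*" prefers the error handle */
    return pr->pr_COS;                  /* CONSOLE:, or the fallback */
}
```

The correct behaviour would of course ignore all three fields and go straight to the process's console, redirections notwithstanding.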

There are also some AROS-specific magical names: STDIN:, STDOUT: and STDERR:, and their short forms IN:, OUT: and ERR:. As far as I can see they're only used by clib to provide Unix-style /dev/stdxxx compatibility. It's redundant though - we have Input() and Output() for exactly this purpose.

Further, Lock() also knows about CONSOLE: and * (but not PROGDIR: or NIL:) and about STD*:, when it shouldn't - the Guru Book says these names are only magical for Open(), no one else (except GetDeviceProc(), which knows about PROGDIR:).

Oh, and AROS has a real nil.handler to support NIL: (a bitbucket handle), rather than just swallowing data internally.

That's all. My intention is to fix all this, though I don't know what order it'll happen in. I'm more just noting it in passing while I work on removing DoName().

thursday, 31 may 2007

posted at 22:26

In our last episode I was implementing FileLocks into dos.library. Well, it's done now. It took a couple of rewrites (if you can call mass search-and-replace operations a "rewrite") as Pavel kept pointing out problems with my implementation (and rightly so), but it's done. Trouble is, I can't check it in. It turns out the aliasing of lock functions (like UnLock()) to their filehandle counterparts (like Close()) was actually getting hard-coded into binaries by the compiler. Adding the lock functions back to support locks is fine, but all existing programs are hardcoded to call Close() when they should call UnLock(). That's fine if your locks and handles are the same, but as soon as they change, Close() suddenly finds itself being handed a lock. With no way to tell the difference, it pokes in places it shouldn't, causing a spectacular meltdown. And this only affects, oh, every program ever built for AROS that accesses files. All of them, in other words.

Staf Verhaegen is working on a pile of changes that will break the ABI and API, with a view to marking the ABI "stable" when he's done. I'll include these changes as part of that so that it's only one round of user pain, but it does mean that I have to hold off on releasing it, which also means that I can't really do much work on integrating packets. I can, of course, but I really don't like holding on to uncommitted changes for too long - they tend to be painful to merge.

What I can do while I'm waiting though is start cleaning and restructuring our DOS to make supporting both packets and IOFS a breeze. Things like adding appropriate abstractions and such. I've started on this, with the first object of my affections being GetDeviceProc().

This function is the one that I consider to be the heart of DOS. It's quite simple: you give it a full file path, and it returns a structure that contains a handle on the filesystem device (or task message port for packets) that has the file on it, and a base lock that the file path is relative to. The real magic is that it automatically handles assigns, resolving late- and non-binding assigns on the fly. As if that wasn't enough, it also has the somewhat minor task of getting new filesystems online on demand.

That's a succinct description of how it's supposed to work. Reality rarely matches though. In AROS, it does assign handling all right, but doesn't load filesystem handlers. It also crashes if you ask for a path without a volume specifier (ie something relative to the current dir) or use PROGDIR: (a magic name for the directory the program was started from). To cap it all off, the code is quite difficult to read, which makes it hard to fix.

Furthermore, there's an internal function called DoName(), which pretty much every handler call goes through, that does basically the same job but is much smarter. This duplication is completely redundant, particularly when the more advanced (and correct) functionality isn't accessible to the user. So I set myself an initial goal: fix GetDeviceProc(), then update all of DOS to use it and get rid of DoName() completely.

I've just now finished the implementation, and as far as I can tell it's working very well. It seems to produce the same results as the old version, but also handles the relative dir stuff without crashing. It expands and resolves assigns as expected. It also has placeholders for calling out to a function that can get the filesystem online if it's not already. This code is coming - Pavel Fedin has been working hard on some DOS changes of his own to fix up our mount process. These should be appearing in SVN soon, and once it's available I'll merge my code and we'll be well on our way :)

Lots to do while I'm waiting though. The next thing is to start moving functions away from DoName(). I'll start small - Open() should be nice and simple.


wednesday, 23 may 2007

posted at 15:48

As noted, I've started hacking on DOS. The first thing on my list is to make it use struct FileLock correctly.

AmigaOS has two types for referring to some kind of on-disk object - struct FileLock, which can reference any type of object (file or directory), and struct FileHandle, which is only used for files but contains extra information such as a buffer and current position, allowing I/O. Internally it contains a lock to the underlying file as well. For the most part, a filesystem handler only operates on locks, leaving handles to dos.library. (There are a couple of minor exceptions where handles are manipulated by the handler, but they're not really of any consequence, so I won't go into any more detail.)

When AROS was given its own filesystem API, it did away with locks as well, using handles for everything. The main functions of the lock - providing a pointer to the handler, a pointer to the underlying file context, and the current access mode - were all added to struct FileHandle, reusing undefined DOS-private fields (fh_Func1, fh_Func2 and fh_Func3). Since the pointers accepted and returned by DOS functions are opaque BPTRs, it's not actually an issue for most programs, and so life has continued happily for the past ten-odd years.

Where this system falls down is with the DOS packet functions SendPkt() and WaitPkt() (and indirectly, AbortPkt(), DoPkt() and ReplyPkt()). The problem is simple: under AmigaOS these functions don't deal with locks or handles, but with the message port the filesystem handler uses to receive packets. That port is usually obtained by using BADDR() on a BPTR returned by Lock() to get a struct FileLock, and then taking the port from its fl_Task.

This used to be completely impossible, as until my recent work with packet.handler struct FileLock didn't even exist on AROS, so your code wouldn't even compile. Now it does, but if you try to fish fl_Task out of a "lock" you end up with some random stuff that patently isn't a port, and so sending to it just won't work. Of course, AROS filesystems don't take packets and don't use ports anyway, which is why SendPkt() and ReplyPkt() try to do packet conversion (which doesn't really work), but some programs also like to send their own packets. If anything tries to send to the "port" obtained from the filehandle, it's likely the system will crash (that position in the struct is held by fh_Buf, which is the I/O buffer).

Another issue: even if locks are used, AROS filesystems don't use ports, so even with struct FileLock done properly, fl_Task wouldn't hold anything useful unless it were populated with a port owned by packet.handler that could do packet->IOFS conversion.

The goal to remove IOFS has come from a few things: it's not adding any real value, only a handful of minor devices use it directly (CDVDFS and SFS are both packet-based handlers with IOFS wrappers that could be easily removed), and source compatibility isn't there. Replacing it, however, is a big job, so we're taking an incremental approach. The initial goal is to support both IOFS and packets natively inside DOS. The first step is to bring struct FileLock back to life, which I started on yesterday and is nearly done.

To do that, these structures have been updated such that struct FileHandle no longer holds the details needed to reference the filesystem, but instead contains a pointer to a struct FileLock which does have this information. The lock is held in fh_Arg1, as was always the way under AmigaOS.

With normal packet handlers, the lock contains two fields to reference the file: fl_Task which is a message port for the handler, and fl_Key which is some random data set by the handler that it can use to find the file on disk. IOFS handlers had a similar pair of fields held in the filehandle - fh_Device which is a pointer to the Exec device of the handler, and fh_Unit which is the opaque data. Pavel Fedin, in a stroke of genius that now seems completely obvious, suggested simply storing the IOFS device and unit into fl_Task and fl_Key as a fast and cheap way of bringing FileLock back.

This is fine if you only have one type of handler, but we have two, and so need to be able to tell the difference. Pavel came to the rescue here too - give struct FileLock an extra field, fl_Device. Put the device pointer there and the unit pointer in fl_Key, and use fl_Task as a flag to determine the type: when it's NULL, it's an IOFS handler and fl_Device is valid; when it's non-NULL, it's a packet handler and fl_Device has no meaning (and in fact shouldn't even be accessed, as locks are then entirely allocated by the handler, which may have something different there (like struct ExtFileLock in fat.handler) or nothing at all if the handler is using original AmigaOS sizings).
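In pseudo-struct form, the scheme looks something like this. These are simplified stand-ins, not the real AROS definitions (the real fields are BPTRs and there are more of them):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of Pavel's scheme: fl_Task doubles as the type flag.
 * Field names match the text; the types are simplified stand-ins. */
struct FileLock {
    void *fl_Task;      /* packet handler port, or NULL for IOFS */
    void *fl_Key;       /* handler's file key (packet) or unit (IOFS) */
    void *fl_Device;    /* IOFS device pointer; meaningless for packets */
};

/* NULL fl_Task means IOFS: fl_Device and fl_Key (unit) are valid.
 * Non-NULL means a packet handler owns the lock entirely. */
static int lock_is_iofs(const struct FileLock *fl) {
    return fl->fl_Task == NULL;
}
```

The nice property is that old packet-style code that only ever touches fl_Task and fl_Key keeps working unmodified for packet handlers.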

So far I've set up these definitions and reworked the DOS internals (and a few other bits of code around the place that were using IOFS directly) to match. It's mostly a case of renaming FileHandle to FileLock, bouncing through fh_Arg1 to get the device and unit pointers, and of course allocating and deallocating lock structures in the right places. There's a small build issue to figure out (waiting for a reply from aros-dev), but once that's sorted AROS should at least start, and then I can begin tracking down the several million edge cases that will probably arise from this.

If you want to see the code, ask me - not checking it in yet because I have no desire to break the tree.

monday, 21 may 2007

posted at 10:10

I implemented SET_FILE_SIZE late last week. I don't want to talk about it; read the code if you need the details. It's a nightmare and I had to rewrite it three times before it was right. It shouldn't have been difficult, as it's just an exercise in linked-list management, but as usual it took me a little while to realise this.

Work on FAT is really winding down now, so I'm starting to move into hacking on DOS. The eventual goal for me is to remove IOFS and use packets natively, and to fix up all the boot sequence and volume management stuff. I did the first bit this morning.

The only real stumbling block for using packets over IOFS is the additional overhead of using messages over Exec IO calls. An IO call (via DoIO()) simply calls a device's BeginIO vector - no mess, no fuss. On the other hand, sending a message (via PutMsg()) disables task switches and interrupts, adds the message to the port's message list, then triggers the interrupt or signal. Later, when the thing listening on the port (ie the filesystem) receives the message, it calls GetMsg() to get the message, which does another round of disabling task switches and interrupts. This overhead was deemed unacceptable by advocates of IOFS.

It is alleviated slightly by an undocumented port type. A disassembly of the Amiga 1.2 exec.library reveals that when the port type == 3, no signalling or interrupt is done; instead a handler function is called directly. I've implemented this as PA_CALL. It's good for compatibility, but still not quite what we want to replace IOFS, as it still disables task switches while it adds the message to the list.

I had a brief discussion with Staf Verhaegen a couple of weeks ago, and we came up with a solution - a new port type that doesn't disable task switches but simply calls a handler function (like PA_CALL) with the message as an argument. This makes it equivalent to DoIO(). You really need to know what you're doing to use it (in particular you don't get your messages from WaitPort() and GetMsg() any longer), but it allows filesystems to be called without any additional overhead (assuming they've been written to support this) and doesn't require any changes in DOS or applications - they just call PutMsg() (or SendPkt()) as normal. I implemented this this morning as PA_FASTCALL.
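The difference between the port types is easiest to see in a sketch. All the names here are invented stand-ins, and the real PutMsg() obviously does the full Disable()/queue/signal dance for the other types rather than nothing:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the port dispatch. Real Exec ports carry a message
 * list, signal bits and so on; this keeps only what matters here. */
enum { PA_SIGNAL, PA_SOFTINT, PA_CALL, PA_FASTCALL };

struct Message { int payload; };

struct MsgPort {
    int type;
    void (*handler)(struct MsgPort *, struct Message *);
};

static int calls;

static void fs_handler(struct MsgPort *port, struct Message *msg) {
    (void)port;
    calls += msg->payload;      /* pretend to process the packet */
}

static void put_msg(struct MsgPort *port, struct Message *msg) {
    if (port->type == PA_FASTCALL) {
        /* no Disable(), no queueing, no signal - just call in,
         * exactly as DoIO() calls BeginIO */
        port->handler(port, msg);
        return;
    }
    /* other types: queue under Disable()/Enable() and then signal,
     * softint or call the handler - elided in this sketch */
}
```

The caller still just says PutMsg(); whether the message is queued or the filesystem runs synchronously on the caller's context is entirely the port's business.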

I wrote a test program, tests/timeport, that sends one million messages to each of the different port types and times how long they take (including a standard PA_SIGNAL reply). The timings are for comparison purposes only, but it's still revealing:

8.System:> tests/timeport
testing with 1000000 messages
PA_SIGNAL: 15.10s
PA_SOFTINT: 7.220s
PA_CALL: 3.940s

Now to commit. Hopefully there won't be too much fallout :P

thursday, 17 may 2007

posted at 10:25

After implementing the global lock stuff last week, I spent a couple of hours on the weekend making renames work. It's a naive implementation, which as we know contributes to fragmentation, but it works and was trivial to implement, which is all I care about for now.

Since then I've been building support for notifications. DOS has a pair of functions, StartNotify() and EndNotify() that allow an application to receive a message when a file changes. The application passes a structure to StartNotify() that contains, among other things, the name of the file and a port (or signal) to notify when something happens to it. The most interesting thing about it is that the file is specified by name, not by lock or anything like that. Additionally, the file doesn't have to exist at the time StartNotify() is called.

struct NotifyRequest, which gets passed to StartNotify(), has two filename fields in it. The idea is that the caller sets up nr_Name, the name of a file relative to the current directory, and DOS then builds nr_FullName to contain the full path and volume of the wanted file (expanding any assigns) for the handler to use. nr_FullName is off-limits to the application, and nr_Name is off-limits to the handler. Looking through our code, I found that DOS wasn't setting up nr_FullName at all. We only have two filesystems that support notification, SFS and ram_handler. SFS, being ported from AmigaOS, did the right thing and tried to use nr_FullName, so its notifications didn't work. ram_handler read nr_Name and built nr_FullName itself - incorrectly, but consistently - so its notifications worked.

The first thing I did was reimplement StartNotify() and EndNotify() to do the right thing. This involved making calls to GetDeviceProc() and NameFromLock(), which apparently is the standard procedure in AmigaOS for building a full path. It isn't used anywhere in AROS however, with the work instead being performed by the IOFS code (DoName()). That will change when packets finally replace IOFS inside DOS, so it was good for me to learn.

Once that was done, ram_handler got changed to do the right thing and just use nr_FullName as it should. That worked, and SFS notifies magically came to life too. The stage was set for notifications in FAT.

I set up a new list in the superblock structure to hold notification requests. Each list entry holds a pointer to the struct NotifyRequest that was passed in, and a pointer to the global lock for the file (or NULL if the file isn't currently locked). When a global lock is created, we traverse this list looking for entries with no global lock. If nr_FullName matches the name of the file being locked, a link is created.

This matching process is interesting. Inside fat.handler, files are referenced by two numbers - the cluster that holds the directory that references them, and their entry within that cluster. Converting a path to a cluster/entry pair is pretty straightforward - you break up the path, start at the root dir and look for each piece recursively (the GetDirEntryByPath() function does this). Going from a pair to a path is much more difficult - you start in the pair's directory, get the parent dir, search that dir for the subdir to get its name, then go up and do it again until you can assemble a full path from all the name pieces.

Because of this complexity, when we want to see if a name matches a pair it actually works out faster to convert the name to a pair (using GetDirEntryByPath()) and then simply compare with the wanted pair. It's a shame there's no good way to make it efficient, but fortunately it doesn't have to happen too often.
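A sketch of the comparison, with a toy lookup() standing in for GetDirEntryByPath() and the real on-disk directory tree - only the comparison strategy is the point here:

```c
#include <assert.h>
#include <string.h>

/* Files in fat.handler are identified by a (cluster, entry) pair. */
struct DirPair { unsigned cluster, entry; };

/* Stand-in for GetDirEntryByPath(): a real lookup walks the
 * directory tree piece by piece. Toy table for illustration. */
static int lookup(const char *path, struct DirPair *out) {
    if (strcmp(path, "FAT:docs/readme.txt") == 0) {
        out->cluster = 12;
        out->entry = 3;
        return 0;
    }
    return -1;                          /* not found */
}

/* Cheaper than reconstructing a path from a pair: convert the
 * name to a pair and compare the two pairs directly. */
static int name_matches_pair(const char *path, struct DirPair want) {
    struct DirPair got;
    if (lookup(path, &got) != 0)
        return 0;
    return got.cluster == want.cluster && got.entry == want.entry;
}
```

One directory walk per notify request, instead of one parent-chase per directory level to rebuild a path.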

A notification can be sent by cluster/entry pair or by global lock. The global lock case is easy: we just traverse the notify list and if an entry's lock pointer matches ours, we send the notification. For the pair case, we traverse the list and compare against the cluster/entry in the lock, or if there is no lock, expand the name and compare with that. Both types are needed - when opening, closing or writing a file, there is a lock available (because the application is working with the file). When renaming a file, for example, no locking occurs, and all we have at that time are cluster/entry pairs.

That pretty much sums it up. The actual implementation was quite simple, again suggesting that the internal APIs are spot on :)

Today I got my laptop back, so I got to code on the bus again, which is very nice. I implemented code to actually check the state of the READ_ONLY flag before allowing anything that might write. I still need to have it check the disk write-protect stuff and make C:Lock work, but now we're getting down into the minutiae of this thing. Nearly done :)

saturday, 12 may 2007

posted at 23:14

I got a new mobile phone this week. My old phone (an O2 XDA II mini, also known as an HTC Magician) has been steadily degrading over the last few months, and got to the point where neither the internal speaker nor the headset would produce any sound, of course making it impossible to receive calls. The PDA aspects of the phone I loved, but I still need it to be a phone.

I've now sworn off anything from HTC, as I expect something that costs over $1000 to last longer than 18 months. Looking around a little, I settled on a Nokia N80, which arrived on Thursday. It's a sweet little piece of kit, and it's really nice to carry a phone around again. I got a 1GB miniSD card as well, as I'd gotten used to a similar capacity on the XDA for my music, which I really need for the bus trips.

I'm now in possession of three ARM devices: the N80 (ARM9E/ARMv5TE), the XDA (XScale/ARMv5TE), and my Nintendo DS (dual-processor, ARM7TDMI/ARMv4T and ARM9E/ARMv5TE). All three can have additional software installed; the XDA works well (just no sound) and isn't being used, and the DS has some awesome homebrew options (Daniel is showing off some fun stuff in the office). I'm running out of excuses not to port AROS to ARM.

The other cool thing about this phone is that it has built-in 802.11 wireless and a sexy browser based on WebKit. It's been enough to get me interested again in porting WebKit to AROS, which obviously is the major piece required to get a viable browser there.

Oh the time required. Horrors!

wednesday, 9 may 2007

posted at 11:26

Last night I finished refactoring the lock code and checked it in. I'm kinda surprised that it's still working. Here's the story.

The original code pretty much didn't track the locks it handed out. It put them in the list of locks held in the volume's DosList entry, and removed them when the locks were freed, but it never looked at them. It shouldn't have been doing that anyway - that lock list is only for when the volume is taken offline (eg the disk was ejected) while locks are still open. In that case, any outstanding locks are added to the DosList. Later, if the volume comes online again, the handler detaches the locks and takes control of them. This is the mechanism by which the Amiga in days of old could request that you "please insert volume Work: in any drive".

This list was being used incorrectly, so it had to change - I have a feeling it's responsible for a bug on native where you insert a new disk and both the old and the new volumes appear. A real list of all locks is needed by the handler, for a number of things:

  • If an exclusive lock on a file is requested, the handler needs to know if the file is already locked.
  • Certain file attributes (like its length or location on disk) are the same no matter how many locks are open on the file. If one of those attributes changes, they all need to be updated. This doesn't seem like it should be a problem, as there should only be one (exclusive) lock on a file for these attributes to change; however, traditionally "read-only" locks (as created by FINDINPUT) can actually be written to, and renaming a file (which due to long filenames may require its directory entries to be moved) should be able to happen even when a file is open.
  • Obviously, locks need to be available so they can be attached to the DosList as necessary.

I did consider just having a straight list of locks, but that would have meant a search through all the locks every time some shared attribute needed to change. So instead locks now have two parts: a shared part, which contains the file's name and protection bits, the location of its first data cluster, and the location of its directory entry, and a per-instance part, which has the current seek position and the IO handle. Put another way, the shared part has stuff about the file itself, while the per-instance part has stuff about access to the file.

The shared parts (which I call "global" locks) are held in a linked list attached to the superblock structure. Each global lock has a list of per-instance locks (just called filelocks) attached to it. Each one of those has a pointer to its global lock. The system just passes filelocks around as normal (and out to DOS and back), but goes to the global lock when it needs some file data.
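In rough C terms the layout is something like this. The field names are invented and the real code uses proper Exec list nodes rather than bare next pointers:

```c
#include <assert.h>
#include <stddef.h>

struct FileLockInst;

/* Shared ("global") lock: one per open file, on the superblock's list. */
struct GlobalLock {
    unsigned first_cluster;     /* shared: where the file lives */
    unsigned size;              /* shared: seen by every opener */
    struct FileLockInst *insts; /* per-instance locks on this file */
    struct GlobalLock *next;    /* superblock's global lock list */
};

/* Per-instance lock ("filelock"): one per Lock()/Open() of the file. */
struct FileLockInst {
    struct GlobalLock *gl;      /* back-pointer to the shared state */
    unsigned pos;               /* private seek position */
    struct FileLockInst *next;
};

/* A shared change (e.g. SET_FILE_SIZE) now touches exactly one
 * place; every filelock sees it through its gl pointer. */
static void set_size(struct GlobalLock *gl, unsigned size) {
    gl->size = size;
}
```

An exclusive-lock request is then just a scan of the superblock's global lock list: if the file already has a global lock, refuse.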

Now that this is in place, all the above things can be implemented. Exclusive locks are already done - when an attempt is made to obtain a lock, the global lock list is checked. If the caller requested exclusive access and a global lock was found, the operation fails. Renaming should now be trivial - if the new name won't fit into the existing directory entries, then the existing ones are blanked, new ones created, and the entry attributes in the global lock updated and seen by all filelocks.

The DosList stuff is on hold. I'll get there, it's just a few steps down on my list. The next thing I want to do after renaming is done is to implement notifications. File change notifications are done by filename, not lock, and there can be more than one per file, so I need a place to store them even if the file isn't open. This is now trivial - the notifications get stored in the global lock (which will be created if a notification is requested and the file isn't open).

So now I've written this, and it still makes sense. Weird, it felt so hard at the time.

saturday, 5 may 2007

posted at 14:23

Since fat.handler reached a pretty significant milestone I felt like I needed a break before getting into the excitement that is refactoring all the lock code (sigh), so I've just finished picking off an item on my todo list - faster bitmap scaling.

Regular readers may remember that last month I found my old port of JJFFE. I released the code and Kalamatee added code that allowed the tiny window to be resized. This worked nicely on native, but on hosted things went from a pleasant smooth-scrolling full-speed affair to a glacial one frame every few seconds - completely unusable.

At the time I did some digging and found what I believed to be the cause. The bitmap scaling code in graphics.hidd did its work by copying into the underlying hardware (or X11) driver a pixel at a time. For most of the hardware drivers, this merely poked values into the hardware framebuffer, and so worked quite quickly. The BitMap::DrawPixel method in x11gfx.hidd is incredibly slow over repeated calls though, having to lock, then plot, then flush the image, then unlock. This was happening for every one of the thousands of pixels in the image, and FFE was trying to do it every frame. Naturally, this is suboptimal.

So, I spent some time yesterday and today nutting out a fix. There may be a problem with the speed of DrawPixel on X11, but it made more sense to me to try and reduce the number of calls that the basic BitMap::BitMapScale method was making into the underlying hardware driver. The solution I decided on was to scale the image in memory and then push the new image to the hardware in one hit.

It took me ages as I know virtually nothing about the AROS graphics system, but I managed to get something working. It uses the same naive scaling algorithm as before, but now it calls BitMap::GetImage on the source bitmap to get a raw byte array, creates a second raw byte array, copies and scales the image into it, then calls BitMap::PutImage on the destination to write it out. As far as I can tell, it's working perfectly.
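The core of the idea, in miniature: scale the raw byte array in memory with nearest-neighbour sampling, then hand the whole result to the driver in a single PutImage-style call. This sketch assumes one byte per pixel; the real code has to deal with the bitmap's actual pixel format via GetImage/PutImage:

```c
#include <assert.h>
#include <stdlib.h>

/* Nearest-neighbour scale of a sw x sh byte image to dw x dh.
 * Returns a freshly malloc'd buffer (caller frees), or NULL on
 * allocation failure. One byte per pixel for simplicity. */
static unsigned char *scale_naive(const unsigned char *src,
                                  int sw, int sh, int dw, int dh) {
    unsigned char *dst = malloc((size_t)dw * dh);
    if (dst == NULL)
        return NULL;
    for (int y = 0; y < dh; y++) {
        int sy = y * sh / dh;           /* nearest source row */
        for (int x = 0; x < dw; x++) {
            int sx = x * sw / dw;       /* nearest source column */
            dst[y * dw + x] = src[sy * sw + sx];
        }
    }
    return dst;     /* one PutImage with this, instead of dw*dh DrawPixels */
}
```

The scaling quality is exactly as naive as before; the win is purely in collapsing thousands of per-pixel driver calls into one bulk transfer.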

I won't commit the patch yet because I'm quite unsure of myself and I want to get a couple of people (particularly on native) to poke at it and make sure it's sane. Nonetheless I'm still quite proud of it. Unfortunately the scaling it produces looks quite ugly, particularly on FFE, so if I get into hacking on FFE some more I'll probably start looking into smarter scaling algorithms.

Update 8/5/07: This patch is now in the nightlies. Kal reports that Wanderer with a scaled background now starts instantly rather than with a brief pause, and FFE has gone from 8 to 20 frames per second on his native build. Awesome!

thursday, 3 may 2007

posted at 21:49

Support for creating and writing files was pretty much finished on the weekend, but my power test of copying the entire system to a FAT partition would cause things to explode in myriad ways, though not until a few hundred files had been copied. I've spent this week tracking down and fixing these problems:

  • There were some problems in my linked-list management code inside the cache which would occasionally cause an infinite loop.
  • A different bug in the same code would sometimes cause a segfault as a NULL pointer was dereferenced.
  • Following that, I noticed that something was leaking cache blocks. I implemented a function that dumps the internal cache state, including block use counts, after every FAT operation. The usage wasn't coming down properly when creating directory entries (ie making a directory or new file). Turns out I wasn't releasing a directory handle when finished.
  • Next, writing file contents was leaking blocks too. Again, I wasn't releasing a directory handle when updating the file size in the directory.
  • Finally, I skimmed through and found a number of other places in error handling code where directory handles weren't being released.

So, I'm pleased to announce that file writing is stable. I haven't seen any corruption in my testing, but obviously take care with real data (like your Windows 98 partition). As I noted previously, there are still a few little things left to implement before write is "done", but for the most part you won't care ;)

sunday, 29 april 2007

posted at 00:56

I'm currently checking in some fixes that I believe will stabilise file writing. There were a few places where I'd got the math wrong when generating directory entries, which would sometimes result in weird corrupted filenames (depending on what was on your disk before I incorrectly wrote directories onto it).

Assuming it's right, things are really starting to wind down on this code. Here's a vague todo list, for my reference and yours :P

  • Implement RENAME_OBJECT and the SET_* commands. SET_FILE_SIZE is the only interesting one, because it means allocating or freeing clusters to match the size. None of them is difficult.
  • Implement file notifications. This is a mechanism where an application requests to be notified when a file or directory changes. Conceptually it's not difficult - just a list of listeners attached to every lock (and shared between duplicate locks). I think it would get a little hairy if I did it with the lock management functions in the state they're in, but I haven't started my cleanup of lock.c yet, so I'll make sure I factor it into my planning.
  • Implement write-back caching.
  • Cleanup the DOS-side code (mostly main.c and disk.c)
  • Break out the rest of packet.c into ops.c
  • Make sure locks are being tracked properly. This should resolve the disk change issue on native too (where two volumes appear when you change the disk).
  • Fix the crash in native when you try to use the handler directly with DF0: (though technically this is an issue with the way DOS and packet.handler interact).

After that it's time for the next thing. Volume manager, ext2 or something else - I'm not sure yet.

saturday, 28 april 2007

posted at 22:09

As of yesterday you can now create files. It's still not perfect - you can break it by copying large numbers of files in one go (eg Copy ALL C: FAT:), which will give you some broken filenames. I'm working on it.

Fairly unmotivated to write right now, but I wanted to give an update. More later.

tuesday, 24 april 2007

posted at 14:27

As I've been refactoring fat.handler I've noticed that it's gradually changed from its original spaghetti into a fairly layered setup - the I/O cache, the FAT, the directory code, the file read/write primitives, then the high-level operations code ("create a directory", "delete a file", etc), with the OS interfacing (packet handling, volume management, etc) at the top. This wasn't intentional, but it tells me I'm probably thinking about it the right way.

The OS interfacing code should actually be identical for all filesystems, which begs the question - why does every filesystem have to implement it? They shouldn't have to, so just as I intend to separate out the cache into a library, I also intend to build a library to sit between the OS and the filesystem. You could argue that this is redundant, since DOS already provides a filesystem interface. I don't intend to change that though - I'm not going to replace packets, but instead provide some generic handling code that will work for most of what you want to do. If it's not suitable, then don't use it - handle the packets yourself.

I think my operations code is the beginning of the model for this. Essentially, the packet handler will accept packets, decode the arguments (ie convert BCPL pointers/strings), ensure that they're sane (eg make sure locks belong to us), then call a function in the filesystem for the requested operation.

I figure the initialisation interface would be something like:

    fs = CreateHandler(FS_Read,      (IPTR) FATOpRead,
                       FS_Write,     (IPTR) FATOpWrite,
                       FS_CreateDir, (IPTR) FATOpCreateDir,
                       ...,
                       TAG_DONE);

(plus other stuff for setting options or whatever).

Any unspecified operations will result in a "not implemented" error being returned to the caller.

Further, the library would do plenty of checking and munging of arguments so you can always be sure of what you're getting. Locks will always be guaranteed to belong to the filesystem. BCPL strings and pointers would be converted to their C equivalents. Deep subdirectory ops would be fixed up so that every function wouldn't have to know how to parse and munge paths. If I had all this stuff, fat.handler would be vastly simpler than it is now, and when I eventually implement ext2, I wouldn't have to copy/paste anything but just implement the specifics of that filesystem.
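
A minimal sketch of how that dispatch layer might look, in plain C rather than real AROS code - the names here (FSHandler, fs_dispatch, FS_OP_*, ERROR_NOT_IMPLEMENTED) are all hypothetical, not existing API:

```c
#include <stddef.h>

/* Hypothetical operation indices and error code - not real AROS API. */
enum { FS_OP_READ, FS_OP_WRITE, FS_OP_CREATEDIR, FS_OP_COUNT };
#define ERROR_NOT_IMPLEMENTED (-1)

typedef int (*FSOpFunc)(void *priv);

typedef struct {
    FSOpFunc ops[FS_OP_COUNT];  /* one slot per operation */
    void *priv;                 /* filesystem-private state */
} FSHandler;

/* Called by the generic layer once packet arguments have been decoded
 * and sanity-checked. Unregistered operations fail cleanly. */
int fs_dispatch(FSHandler *fs, int op)
{
    if (op < 0 || op >= FS_OP_COUNT || fs->ops[op] == NULL)
        return ERROR_NOT_IMPLEMENTED;
    return fs->ops[op](fs->priv);
}

/* Example operation a filesystem might register. */
static int demo_read(void *priv)
{
    (void)priv;
    return 0;  /* success */
}
```

The point is that fat.handler would only fill in the slots it supports, and the generic layer answers everything else on its behalf.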

Another option for the interface might be to create a "no-op" filesystem base class and have filesystems subclass it. It's conceptually the same as the above but perhaps the interface is better. I haven't really looked at oop.library so I don't know yet what I'll do with it, but I will experiment further.

friday, 20 april 2007

posted at 12:54

I implemented DELETE_OBJECT today. As the name suggests, it deletes things - it's the power behind the C:Delete command. I'm actually quite pleased with how straightforward it was to put together - it suggests that I have the internal API for updating directories and writing things right. The process is to find the entry, delete it, delete any associated long name entries, and then free all the clusters the file was using. In practice it's slightly more complicated - the file can't be in use, if it's a directory it must be empty, etc - but it's mostly quite pleasant.

One big philosophical change I made today was to make it so the current directory (.) and parent directory (..) entries are never ever exposed to DOS. It was just confusing things - DOS has its own understanding of how to move to a parent directory (/), and thought that moving to .. was moving to a subdirectory. It meant having special checks everywhere to make sure you didn't try to actually operate on these entries (eg try to delete them). In the end it makes sense if they don't exist, so now the internal function TryLockObj(), which looks for files by name and locks them, will always return ERROR_OBJECT_NOT_FOUND for one of these files. Similarly, GetNextDirEntry() which is used when enumerating all the files in a directory will skip over the dot entries. The only place now where the .. entry is used is in the internal GetParentDir() function, and it finds the entry manually.

Removing input checking code while making things less confusing for the user is not something you get to do often, so I'm pretty happy with the change :)

thursday, 19 april 2007

posted at 10:17

I put in a marathon day of code yesterday - perhaps six hours by the end - and finally got directory creations working. The process is actually quite complicated as you have to handle all the weirdness that makes FAT so wonderful. First you have to take the name, and figure out how many directory entries are needed (FAT stores its long file names across multiple entries). Then you search the current dir for a gap of that many entries (gaps happen when files are deleted) or move the end-of-directory marker to make room at the end.

Having found space, you then generate the short name, comprised of what FAT calls the "basis name" and the "numeric tail". You've probably seen this if you've used disks created in Windows on older systems like DOS - a file called "mars attacks.html" gets converted to "MARSAT~1.HTM". The conversion process is non-trivial. After storing the short name, you then cut the long name up and store it across multiple directory entries.
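
To give a feel for it, here's a rough sketch of the basis-name generation with a hard-coded "~1" numeric tail. The real algorithm in the FAT spec also does character substitution, collision counting on the tail and a few other cases this deliberately ignores:

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Rough sketch only: uppercase, keep alphanumerics, truncate the base
 * to six characters, append "~1", keep a three-character extension. */
void make_short_name(const char *name, char out[13])
{
    const char *dot = strrchr(name, '.');
    char base[7] = "", ext[4] = "";
    size_t bi = 0, ei = 0;
    const char *p;

    for (p = name; *p != '\0' && p != dot && bi < 6; p++)
        if (isalnum((unsigned char)*p))
            base[bi++] = (char)toupper((unsigned char)*p);

    if (dot != NULL)
        for (p = dot + 1; *p != '\0' && ei < 3; p++)
            if (isalnum((unsigned char)*p))
                ext[ei++] = (char)toupper((unsigned char)*p);

    if (ei > 0)
        sprintf(out, "%s~1.%s", base, ext);
    else
        sprintf(out, "%s~1", base);
}
```

Run on "mars attacks.html" this produces "MARSAT~1.HTM", matching the example above.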

At this point the name exists, and will turn up in a directory listing, but the job isn't done yet. Next we have to allocate space on the disk to store the directory contents, and put three entries within it - the "dot" (.) entry, pointing at the new directory (ie pointing to itself), the "dotdot" (..) entry, pointing to its parent, and the "end of directory" marker. Once this is done, we report success back to DOS and the calling application.

My code isn't perfect yet. Most significantly, it doesn't do all its error checking, and it's possible for the filesystem to get into an inconsistent state if a low-level error occurred (like a hardware error). It also hasn't been well tested - it's undoubtedly trashing my filesystems in every interesting way. But it appears to work, and that's the most important thing. Creating directories is also the hardest bit of write support - the rest shouldn't take long to implement!

tuesday, 17 april 2007

posted at 09:14

I'm about to go out, but here's a couple of screenshots to demonstrate the progress of the last couple of days.

First, writing works in the most minimal sense. FAT volumes can now be renamed:

It's not much, but it proves that the underlying write infrastructure (ie WriteFileChunk() and cache_mark_block_dirty()) is at least slightly correct. I'm hoping to be writing files before the end of the week.

The other is that you can now mount your FAT volumes under native. It wasn't working because I made some bad assumptions about where the boot block lived, but once I straightened that out it did just fine:

It takes a little bit of messing to set up. If you want to try it, grab the latest nightly (or you may have to wait until tomorrow; I'm not sure if it made it into last night's build) and edit DEVS:DOSDrivers/FAT0. You'll need to get correct values for LowCyl and HighCyl from HDToolbox (or DOS or Unix fdisk or whatever). After that it should just be Mount FAT0:. Kalamatee is planning to add smarts to the installer to take care of detecting your partitions and writing the mountfiles when you install AROS.

Taking Francesca to the park and the creek now. I should be around in #aros in a few hours.

sunday, 15 april 2007

posted at 12:29

I'm on my break! Today is just the second day, and still the weekend, so I haven't really done much of anything yet, but it'll happen - all this time is making me giggle :)

I did massive amounts of cleaning on fat.handler during the week. The cache is being used for all I/O now. It no longer loads the entire FAT into memory, which should save quite a bit of memory on large filesystems. That forced me to get into the weirdness of FAT12. As the name suggests, each entry in the FAT is 12 bits wide. So that no bits are wasted, two entries are stored in three bytes. This requires a little bit of math to extract the entry you want, which gets even more complicated if the two bytes needed are split across different disk blocks. The original code never had to deal with this because it had the entire FAT in memory in one long chunk - there were no cache blocks. I'm pretty sure I've got it right - it's reading things correctly at least.
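
The extraction itself is easy enough to show in isolation. Assuming the FAT is addressable as one flat byte array (which is exactly the assumption the cache takes away - in the handler the two bytes may live in different cache blocks), it looks something like:

```c
#include <stdint.h>

/* Read FAT12 entry n from a flat FAT image: entries are 12 bits, so
 * entry n starts at byte offset n * 1.5. Read two bytes, then take
 * either the low or the high 12 bits depending on whether n is odd. */
uint16_t fat12_entry(const uint8_t *fat, uint32_t n)
{
    uint32_t offset = n + n / 2;
    uint16_t raw = (uint16_t)(fat[offset] | (fat[offset + 1] << 8));
    return (n & 1) ? (uint16_t)(raw >> 4) : (uint16_t)(raw & 0x0fff);
}
```

So the bytes 12 A3 45 pack the two entries 0x312 and 0x45A.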

So now I've started adding the necessary bits needed for write support. The cache has new functions for marking blocks dirty, though I still have to implement the actual writing stuff. It will have the two standard cache policies available - "write through" where blocks are written immediately when they're marked dirty, and "write back" where some job fires up every now and again and writes out any that have been marked dirty since last time. Writethrough is easier to implement and safer anyway, so I'll just do that for the moment.

I have new (untested) code for writing bytes to a file, which is nearly a straight copy/paste of the read code - I'll have to do something about that. The only major difference is that when it reaches the end of the file, it allocates and links another cluster rather than returning "not found". My algorithm for finding an empty cluster is completely stupid at the moment - it just searches the FAT from the start, every time. Eventually it will start looking around based on where it found a free cluster last time it looked. I'll also be allocating multiple clusters at a time under the assumption that we're nowhere near the end of the file yet. This reduces fragmentation. Obviously if they aren't all used before the file is closed, any leftovers get marked as free again. All this is for later though.
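
The smarter search would be roughly this (a sketch over a hypothetical in-memory FAT rather than the real cache-backed accessors): remember where the last free cluster turned up and resume from there, wrapping around:

```c
#include <stdint.h>

#define FAT_FREE 0  /* a zero entry marks a free cluster */

static uint32_t next_free_hint = 2;  /* clusters 0 and 1 are reserved */

/* Search for a free cluster starting from the hint, wrapping around.
 * Returns the cluster number, or -1 if the disk is full. */
int32_t find_free_cluster(const uint16_t *fat, uint32_t nclusters)
{
    uint32_t c = next_free_hint;
    uint32_t i;

    for (i = 2; i < nclusters; i++) {
        if (fat[c] == FAT_FREE) {
            next_free_hint = c;
            return (int32_t)c;
        }
        c++;
        if (c >= nclusters)
            c = 2;  /* wrap back past the reserved entries */
    }
    return -1;
}
```

The naive version is the same loop with the hint pinned at 2, which is why it rescans the whole FAT every time.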

Once we're able to write files I'll start looking at formatting fresh FAT filesystems. There's a pile of options that can be provided when creating a filesystem, but the existing ACTION_FORMAT command is woefully inadequate (being designed for OFS/FFS). I had an idea to allow things to query the handler for information about extended commands and options it supports. The handler could then return a ReadArgs format string to the calling process detailing the arguments it can take for commands (format, check, etc), allowing them to tailor their interface for each filesystem without them having to know the specifics of each themselves. This is something I'll look at in a bit more detail when the time is right.

Time to go put the daughter to bed, then I'll be back into it.

wednesday, 4 april 2007

posted at 07:55

This morning I checked in rewritten fat.handler directory code that uses the new cache. The code is now much cleaner, and I hope more readable. At the very least it's well commented and I can read it, and so can work on it. In terms of features, nothing has changed from before (except that you'll now get masses of debug information). It's an important first step though.

The next step is to rip out struct Extent and replace it with naive calls to the cache. The original point of extents (and libDeviceIO) was to make it easy to request multiple disk blocks in one I/O request, which enables the disk driver and/or the disk itself to optimise the request. This ability was removed when I switched over to using the block cache. I will be bringing it back later as it's basically a required feature, but it makes the code more complicated. For now I just want a nice, clean, naive implementation.

Following that, struct ExtFileLock and the locking code will get an overhaul, which will probably lead to the packet handling code getting some work as well. All this is good. Once the entire filesystem is in a clean and stable state I'll begin work on write support. While not simple, it will be vastly easier with the new code than it was before, as I'm trying to design things with writing in mind.

In other news, I was digging through old hard drives last week and found my port of JJFFE to AROS that I did a couple of years ago during my previous dive into AROS. I released the code not expecting to work on it again, but then Kalamatee did some great work to make the window resizable (though the changes make the whole thing unusable in hosted; might be a problem in AROS' X11 graphics code). That was enough to make it interesting to me again, so I'll probably keep it as a side project for when I need to clear my head. I'll set up a repository for it somewhere shortly.

Holidays are coming. I'm off on a camping trip this weekend (the Easter long weekend), and then taking two weeks off work from the 16th, and going away on the 21st and 22nd for my anniversary and birthday. By the end of the month I should be well rested and have a clear head. Of course work is still going to be insane, that's just the deal this year, but at least I'll have had a chance to reset. I'll be writing lots more code during the long break, I expect :)

thursday, 29 march 2007

posted at 20:57

Work is absolutely insane at the moment so I come home very tired, which means I'm only getting an hour or two a day to work on AROS. Despite the glacial pace I'm pleased with the progress I'm making.

I've completed the buffer cache (though the write portion is currently untested) and rewritten fat.handler's internal FS_GetBlock() function to use it. It's working just fine insofar as reading files is working. It's fairly mundane as there's only one user of it, but as time goes on it'll get more heavily used. I still have hopes for it eventually becoming a system-wide service (eg a cache.resource).

I'm currently pulling apart the rest of the code to remove all traces of its own caching, instead making it utterly dependent on the buffer cache. First on the block is the directory code. The DirCache and Extent code is being removed and the API is being changed to be very much like the UNIX opendir() interface. You get a handle on a directory, and then call other functions to get individual entries, iterate over it, etc. This is much cleaner than what the handler had before, and the lessons learned here will serve well when cleaning up the other parts of the handler. At the end this will effectively have been a rewrite, but well worth it in my opinion.

I'm back to reading up on filesystems as well. I think the next thing I'll tackle once FAT is done (so still a way off) is ext2. It really is straightforward, and I'll be able to borrow most of the FAT code anyway if I do it right.

I really do hope to have something to demonstrate soon. I get next Friday through to Tuesday off for Easter, then back to work for three days, then two weeks off for the Christmas vacation I never had. That should give me plenty of time to cut code, even allowing for the camping trip and the anniversary :P

sunday, 25 march 2007

posted at 14:52

I finished implementing the read side of the block cache yesterday, but I haven't even tried to compile it yet. I really don't like it. The logic seems to make sense, but it just feels wrong. I've learnt to trust that feeling. So I did the unthinkable instead. I actually did a little research into block/buffer caches to see what has come before.

Turns out I was on the right track initially. Tanenbaum's book described a scheme very similar to what I had originally devised on my own. Basically, we allocate a pile of block structures, which hold the raw block data plus flags and other metadata. All the blocks are stored in linked lists hooked up to a hashtable, where the key is the bottom N bits of the block number. There's also a doubly-linked list of recently used blocks.

When a block is asked for, the hash bucket index is extracted from the block number, and the corresponding block list is searched for the block. If it's found, a use count is incremented and the block returned. If not, it has to be loaded from disk. A block gets recycled from the recently-used list, and if the data in there is dirty, it gets written out first. Then we read the data from disk and mark the block as used.

Earlier this seemed unworkable to me, but it seems this is fairly standard, at least as a first cut. The important bit is the low-level process running through the block list and regularly pruning and writing things to disk. Fortunately the whole thing is much easier to understand, so hopefully it won't take me long to have it written, and the code will be readable this time. Initially I'll implement single-block I/O, and later extend it to do multiple blocks where possible. This is also reasonably well established in OS theory - the concept is called "precaching".
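
As a toy version of the lookup path (eviction, LRU maintenance and dirty write-out left out, and read_from_disk() standing in for the real device I/O - all names illustrative):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HASH_BITS  5
#define HASH_SIZE  (1 << HASH_BITS)
#define BLOCK_SIZE 512

typedef struct Block {
    struct Block *next;        /* hash bucket chain */
    uint32_t num;              /* block number on disk */
    int use_count;             /* outstanding references */
    uint8_t data[BLOCK_SIZE];
} Block;

static Block *buckets[HASH_SIZE];

/* Stand-in for the real device read: fill the block with its number. */
static void read_from_disk(uint32_t num, uint8_t *data)
{
    memset(data, (int)(num & 0xff), BLOCK_SIZE);
}

Block *cache_get_block(uint32_t num)
{
    uint32_t h = num & (HASH_SIZE - 1);  /* bottom N bits of the number */
    Block *b;

    for (b = buckets[h]; b != NULL; b = b->next) {
        if (b->num == num) {             /* cache hit */
            b->use_count++;
            return b;
        }
    }

    /* Cache miss: the real code recycles a block from the LRU list
     * (writing it out first if dirty); here we just allocate. */
    b = calloc(1, sizeof(*b));
    if (b == NULL)
        return NULL;
    b->num = num;
    b->use_count = 1;
    read_from_disk(num, b->data);
    b->next = buckets[h];
    buckets[h] = b;
    return b;
}
```

Asking for the same block twice hands back the same structure with its use count bumped, which is the "single copy of a block in memory" property I was after.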

Anyway, that's all. Back to my relaxing weekend - rebuilding my garage PC and playing Mariokart :)

wednesday, 21 march 2007

posted at 22:32

I'm currently on a tangent within fat.handler. I started to rip out the caching, but because I hoped to add it back in later in some form I started making everything that needed disk data call a single function that gets the block from disk. Once I had that though, it really just seemed easier to actually implement a basic cache right now, one that could be expanded into something suitable for use by all filesystems later.

My basic requirement was that at all times the cache knows exactly what the contents of a block are and whether it needs to be written out or not. For this reason, I've decided to go with a model where there is only ever a single copy of a block in memory, each with a reference count. That way if one part of the code modifies a block, other parts of the code will see those changes rather than holding an out-of-date copy. And at all times, the cache can know what's going on.

The cache will maintain a list of dirty blocks and write them out at regular intervals based on some (configurable) criteria. Basically, it'll do all the hard stuff. The filesystem should just be able to say at any time "I need block X" and the cache will sort it out.

To do this I need an efficient data structure to store the blocks. My first thought was a kind of hashtable without the hashing bit - just modulus/bitmask the block number. We threw it around the office over lunch and did the maths, and it turned out that the overhead would be huge. B-trees (specifically, B+trees) looked to be the way forward, so I spent quite a bit of time trying to implement one.

I used to be a CS major, but for some reason I just can't work with proper algorithms and data structures, only wacky hacks. I still haven't been able to make my B-tree work, but thinking about it further I realised that a flat array and a binary search will actually do just as good a job in this case. B-trees really shine when the nodes are stored somewhere that is slow to get at (on disk). When it's all in memory, their advantages are much reduced.

Again, my brain conspires against me. It took me about three hours to implement a basic binary search over an array. I'm sorely disappointed in myself - this stuff is supposed to be child's play. At least it works. The basic stuff is in, with reference counting and all the rest. The array is currently eight bytes per entry - four for the key int, four for the data pointer. That may go up to twelve if I end up needing a 64-bit type for the key, but the overhead is still minimal. The entries get allocated in chunks (which will probably be configurable), and the array grows (and probably shrinks) as necessary.
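
The lookup side of that flat array is just the textbook binary search (names hypothetical; the real entries also carry reference counts and so on):

```c
#include <stdint.h>
#include <stddef.h>

/* An eight-byte entry on a 32-bit system: four bytes for the key,
 * four for the data pointer. */
typedef struct {
    uint32_t key;   /* block number */
    void *data;     /* pointer to the cached block */
} Entry;

/* Binary search over an array kept sorted by key.
 * Returns the index of the entry, or -1 if not present. */
int entry_find(const Entry *arr, int count, uint32_t key)
{
    int lo = 0, hi = count - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (arr[mid].key == key)
            return mid;
        if (arr[mid].key < key)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}
```

Insertion is the same search followed by a memmove() to open a gap, which is where the chunked allocation comes in.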

Tomorrow I'll start on actually loading the blocks from disk. After that I should be able to start refactoring the handler code properly.

monday, 19 march 2007

posted at 09:37

Life is completely mad at the moment. I've still had a little bit of time to write code, but work is full on and after hours is craziness too, so I haven't had a lot of time to blog. Let's see if we can get up to speed.

I'm working to finish off fat.handler. That pretty much means write support. In the last week I've implemented the remaining "read" bits of the handler. File protection bits are mapped as best as is possible, file timestamps are converted and the free space on the volume is reported properly.

In addition to this I've been reading as much as I've been able to find about the filesystem format. Between that work, the official Microsoft documentation and the Wikipedia entry, I think I have a pretty good idea of how things are supposed to work. It's actually a nice filesystem to be learning on - nothing too fancy, but enough clever bits to make it interesting.

One thing I'm not understanding too well is the existing structure of fat.handler. I think I follow it, but it really seems to be a quick-and-dirty job for read access. It caches lots of stuff, which is fine - caching is a good thing - but the structures that are used for caching don't really seem suited to writing changes back to disk. The code is also quite spaghetti'd, and isn't well documented at all, which makes for a pretty unpleasant experience.

One of my plans for filesystems is a generic block-caching layer, which will make it so that filesystems won't have to cache blocks themselves at all. So, my intention is to remove all caching from fat.handler and hit the disk every time. This will make things vastly slower, at least initially, but it will let me understand the code enough to implement write support. I can then add caching back later once I have everything clean and documented (and working!).

I think I'll write something soon about those future plans. Basically it'll be a busy year, but my plan is to come out at the end with AROS having rock-solid disk/file stuff. That would make me happy :)

tuesday, 13 march 2007

posted at 14:56

A couple of screenies before I dash home. I added separate DOS types for the three FAT types this morning, and updated C:Info and DiskInfo to cope, so we get this:

I'm continuing to cleanup the code in preparation for adding write support. Time for the bus and more code!

monday, 12 march 2007

posted at 21:02

Got back to work today after a few days off. No bus trip means limited dedicated coding time, so I've mostly been writing email for the last week. It's nice to write some code again :)

Also got some movement on the subjects of those emails. On aros-dev I've been trying to get some consensus on the whole packet API issue and what to do with the IOFS system. Opinions ranged from doing nothing to removing it, but no one had really been willing to step up and throw their support one way or the other. I had a good chat with Fabio Alemagna in #aros last night, and we finally managed to establish some direction. Basically, we're in agreement that the DOS API needs to return proper FileLock structures where necessary (ie according to the AmigaOS autodocs), and the IOFS layer does nothing for us. We ran out of time and so didn't quite reach a set of actions or anything like that, but I'm feeling much better about making significant changes to DOS and whatever else should I want to.

Meanwhile, over on teamaros I've been trying to get the bounty description clarified, as per my last post. Damocles came to the rescue, posting a simple "I think Rob is done, lets test and then hand over the cash" which I really appreciated. All going well I should get paid next week sometime :)

The pot has gone up in the last few days too: it's now at US$440. I was a little bemused by the fact that people would continue to throw money at something that was already in progress, but my wife pointed out that it's very much a cultural thing: Australians will generally take as much as they can for as little as they can get away with, whereas in many other places in the world the value of something is often considered independently of other factors like accessibility. Personally, I'm just humbled by the incredible support I've received while working on this project: bounty donations, blog comments, emails, and so forth. I really do appreciate the support, guys, and find it very motivating. Thank you!

Of course I'm not leaving things here. Packets or not, we still need working filesystems. I'm currently reading the FAT spec and will shortly be working on implementing write support. I'm actually quite pleased that such detailed specs are available - Microsoft have a bad rap in this area, but this doc certainly doesn't lack for anything.

Following that, I'll be looking into other filesystems, and possibly revisiting my FUSE work. And since the EFIKA bounty has now been assigned, I find myself with fewer distractions and newly refocused on getting decent filesystem support in AROS. I hope it lasts :)

friday, 9 march 2007

posted at 18:42
From: Robert Norris
To: TeamAROS
Subject: Changing the DOS Packets bounty

Hi Team,

I've reached a block on the DOS Packets bounty that I can't work my way
around, so I need this group to help me figure out what to do. Here I
outline the issue and suggest a resolution.

I've taken the existing bounty description to mean full support for DOS
packets on both the device/filesystem side (ie allowing us to compile
and use existing packet-based filesystems from AOS/MOS) and the API side
(ie DOS calls like DoPkt()).

The former is largely complete via packet.handler, as demonstrated by
the availability of fat.handler. The latter I believe to be impossible
to complete without either the removal or a significant redesign of the
AROS-specific IOFS system.

If I'm reading the bounty correctly, then I can't complete it. If the
powers that be (ie the designers and/or advocates) decide that IOFS
should be kept, then the packet API can't be made to work. If they do
decide that it should be redesigned or removed, then it's a huge
amount of work that I can't complete before the bounty deadline.

On the other hand, if the bounty doesn't include the API, then I'm done,
but I don't feel like that's really fair to the people who've thrown money
at this. The expectation was that with DOS packets available, Marek's
filesystems would be ported shortly after. Since that won't happen, I'd
feel weird about taking the cash if people haven't at least got
something close to what they paid for.

So, I propose modifying the bounty so that it's clear that it doesn't
include the API, but includes a completed fat.handler that supports writes.
This way people who put up cash at least get something tangible at the
end of all this.

Of course something still needs to be done about IOFS, and I'll be
pursuing it further myself, but I think it needs to be outside the scope
of this bounty.

So to summarise, I'd like the bounty to read as follows:

 - ability to compile and use existing packet-based filesystems
 - a working port of FATFileSystem, extended to support writes
 - a porting guide to assist developers porting filesystems

Deadline would remain the same: 31 April 2007.

What do you all think?


wednesday, 7 march 2007

posted at 22:33

The response to the packet stuff has been great, mostly because people can now read their FAT disks. It's exciting!

I have been tweaking a few things since the big release. There was one tedious problem that was preventing NameFromLock() from working correctly, which meant that anything more than a naive file copy wasn't doing the right thing. It turns out that, in order to get a handle on the directory a file is in, NameFromLock() calls FSA_OPEN with a base handle pointing to the file rather than to a directory as it should. This maps closely enough to ACTION_PARENT, so I've added code to detect that case and do the right thing. It's a hack, and really should be fixed in DOS, but it'll do for now, and makes this possible:

After a bit of discussion on the mailing list, I've also implemented an autodetection method and so got rid of that stupid "type = packet" thing. It's pretty straightforward - DOS tries to open the handler as a device, and when that fails, packet.handler has a turn at it. It's naive and has a little overhead (loading the binary twice), but it works nicely.

The next thing to do with packets is to implement the API. I thought this was going to be the easy bit, but it's a nightmare. I'm not going to go into too much detail here - join the mailing list if you want the gory bits. Essentially, if you take the required changes to their logical end, it means getting rid of the IOFS system entirely. Now I have no problem at all with this - as I've said before, IOFS really offers nothing compelling over packets - but I think getting rid of it is going to take quite a bit of political wrangling that I'm really not interested in.

It's tricky. The way I read the bounty requirements, I really need to get the API side working, but that really can't happen easily, and I'm worried I won't have time for it. But I'm still waiting to see the outcome from the mailing list; hopefully someone will make a decision. I may have to force the issue - that should be entertaining :P

monday, 5 march 2007

posted at 13:29

Hmm, haven't posted for a little while.

Last night I dropped all of the packet code in AROS SVN. That includes packet.handler, fat.handler and the various core changes needed to make it all work. A few people had been asking to test, and I want some feedback, so it seemed like the right time. I've already had a few questions and feedback which I'm working through, but generally the vibe is good :)

I heard back from Marek about getting updated FATFileSystem code. He sent me a patch with some bugfixes for the version I have, but unfortunately for us he isn't able to give us more recent versions (ie write support) as he now has a commercial arrangement for his code. It did give me a little closure though - I now know the direction to take, and I'm not waiting anymore. I hadn't realised that it was such a burden, but it was - Saturday was a much happier day because things weren't up in the air anymore.

I started trying to port another filesystem (SFS), and found it's still pretty painful, because most handlers want to directly manipulate the BCPL strings they expect to receive. So, to make it easier I've made packet.handler convert C strings to BCPL strings before passing them on to the packet handler, regardless of whether the AROS core is actually using them. It introduces a small overhead but is probably worth it to make things easier to port.

Here's my TODO list for packets. It may be extended later, as I find things, but this will do for now:

  • Implement remaining IOFS->packet conversions
  • Fully test write/modify commands. This will require a filesystem that supports them, so I'll need to port another handler, probably SFS. This will also be useful as a "second implementation" to confirm that packet.handler is suitably generic.
  • Write a porting guide
  • Investigate/implement partition auto-detection
  • Complete and cleanup packet->IOFS conversion in dos.library *Pkt() functions

The hard stuff is out of the way; it's mostly mopping up now.

wednesday, 28 february 2007

posted at 14:25

It seems every week there's a new discussion about how to bring a proper browser to AROS. I've seen talk about how to structure the bounty (if we have one), which codebase to use, whether the source should be closed or open, and so on. Here I give my take on the whole thing, and offer a proposal for how to proceed.

In my opinion the requirements specified by the previous bounties have been seriously mis-stated. The current bounty to port MOS KHTML has a whole $10 attached to it. The bounty for AWeb Lite is better, but it's not a browser that is going to meet most of the requirements people have. There's no question that a browser is important, and I think that plenty of people would be willing to drop some cash on it if they knew that they'd get something modern and usable out of it.

The main thing to consider is what the users need. Users don't care how their browser works or what software lies under it. All they care about is that they can visit the websites they need and do whatever they can do elsewhere. To that end, I wouldn't mandate any particular codebase, but instead require that the browser support some set of technologies (CSS, JavaScript, XMLHttpRequest, etc) and/or some set of sites (AROS-Exec, Slashdot, GMail, etc).

There's other factors, of course. Ideally such a browser would be in two parts: a library that can be embedded in other applications, and a UI that uses that library. It would be wonderful if the UI was Amiga/AROS specific, meaning that it uses Zune, datatypes, and generally fits the rest of the system.

I also feel pretty strongly about it being open source. I've seen proposals that Sputnik, which is partially closed, be ported, but there are two problems with that. The first is that by offering a bounty for it we're pretty much limiting the bounty to the original developer - no other developer can ever hope to take on the task because the source isn't available.

The other problem is about maintenance. A closed-source browser means that we're beholden to a single developer/vendor for updates. The web is a fast-moving place. To keep our browser up to date and thus still useful for anything a user might want to do, it will need to be updated. It will need patching, particularly for security bugs. It will need new features. Without the source available, we can't have a team of people contributing, so things won't move as quickly as they could, and should the developer decide to abandon the project, then we're screwed.

(There's a certain irony in the fact that a project that exists to provide an open alternative to a closed system, one abandoned by its vendor for a number of years, would openly embrace the potential for the same fate for a fairly fundamental piece of technology.)

It's worth noting that, if done correctly, this bounty could actually end up getting quite large. Not limiting the potential developer base and code base means that outside developers could take this project on.

So, my proposal is this. Close the existing bounties for web browsers, and start a new one. Put the money from the previous bounties into it. The bounty requirements will be to produce a browser that has a Zune-based UI, can usefully access a number of common sites (listed, including "hard" things like GMail), is reasonably standards conformant (with links to pages that can test this conformance), and has a clean separation between the engine and the UI.

Note that the browser UI does not need many features. I'd settle for straight browsing - no bookmarks, no sidebars, and so forth. The rendering engine is where most of the complexity is, but there are a few excellent rendering engines available (such as WebKit, KHTML and Gecko), so most of the work has already been done. Features can be added as part of normal development outside of the bounty.

Of course, there's no requirement that one of these engines be used, as long as the result works.

Because the bounty is deliberately light on technical details, I'd recommend that anyone applying be asked to show how they intend to meet the requirements. That would mostly be showing what engine they intend to port.

And if any of the TeamAROS crew are reading and seriously think this is a good idea, I'd be very happy and willing to act in some sort of sponsor role for this bounty, working with whoever ends up taking it on to make sure the requirements are going to be met and giving them whatever assistance they need. I'm happy to do any and all legwork on this, actually, because I consider it to be of great importance.

On the other hand, we can just carry on as usual, and eventually I'll port WebKit and that will be the end of it. I'm already idly working on this. So idle, in fact, that I'm not actually working on it, but I do poke at the makefiles every now and again.

wednesday, 28 february 2007

posted at 13:13

I mentioned in a comment that FUSE-based filesystems could be fairly easy to port since they're not in the kernel and therefore shouldn't have tight dependencies on the kernel VM subsystem. I've had a chance today to investigate this a little further.

FUSE-based filesystems run as standalone processes. In their setup they pass a single object to the FUSE library that contains pointers to handler functions for each filesystem operation. They then call a mainloop function and let the library dispatch filesystem calls from the kernel appropriately.

To get this on AROS, we'd just need to implement the FUSE library interface. Like the packet stuff, this would be done via a special fuse.handler that would handle conversion from IOFS requests to FUSE calls and back again. It'd probably be a little more complex than packet.handler as FUSE is designed for POSIX operations and semantics, so there'd likely be multiple FUSE calls for certain operations. I don't think that would have to be a huge problem though.

The Type option that I added to the Mountlist would then get the handlers online, eg:

FileSystem = ntfs.handler
Type = FUSE

Many different filesystems are available using FUSE. Most of them aren't particularly useful other than in very specific application domains, but a fuse.handler would immediately give us support for NTFS and ZFS. Those alone are reason enough to do it, so I'll start looking into it.

Packets are basically done anyway. There's a few issues to sort out and a heap of testing, but without packet handlers available I can't do much. I'll probably mostly backburner packets for a little while (maybe a week or two) while I chase Marek, and work on fuse.handler. It'll help the packet work anyway, as it'll give a second implementation of the filesystem proxy model. This should be fun :)

tuesday, 27 february 2007

posted at 12:59

Now that the filesystem actually works, things have got a little boring. There's still plenty of work to be done, but not much I can test without getting updated FATFileSystem code. I have been thinking about porting some other filesystem like FS1541 or amiga-smbfs, but that's boring too, so it's a little difficult to get motivated.

But work must go on. I've implemented (but not tested) a pile more iofs->packet conversions, to the point where there's only a few left. I've also added support for packet systems to expansion.library. Adding FAT to the Mountlist will be as simple as adding an entry like this:

FileSystem = fat.handler
Type = packet

The old StackSize and Priority options for setting up the filesystem task are also back in action. This arrangement sucks a little - I really would have liked the handler type to be auto-detected - but our Mountlist options are different enough from the original that things wouldn't carry over properly anyway. So this will do until either enough people complain or someone proposes something better.

I'm also starting to poke around in the dos.library *Pkt() functions to figure out what's needed to have it just quietly pass packets through to packet-based filesystems. That should be fairly straightforward, but mostly theoretical since nothing on AROS currently uses these interfaces directly (mostly because they don't work). I'll probably write a small program to do some basic file operations using packets and make sure they get handled properly, and that will be it. I also need to finish support for packet->iofs conversion; AROS has this already but it's not finished.

I just found PickPacket on Aminet, which looks like it might be useful for testing. I'll have a go at porting it on the way home today.

friday, 23 february 2007

posted at 23:57

Just got back from bowling and Daytona. Had a couple of drinks in quick succession so I'm a little buzzed. I'll write this and then go to bed, it's 1am.

Today I finally found and fixed the last crashing bug - some pointer arithmetic had gone awry, resulting in the stack being trampled. I'm not entirely sure what the problem was, but a slight tweak fixed it up.

With that gone, I now have stable support for traversing the directory hierarchy and reading and examining files. The filesystem can be browsed via Wanderer, which seems to work fine. Currently using multiview to display files isn't working (tested with both a PNG image and an AmigaGuide doc), and I'm not sure why, but I believe it to be a bug in FATFileSystem itself.

I've contacted Marek to try to get an updated version, but haven't heard back. If anyone reading knows of another way to get hold of him, could you please prod him and see if he got my mail? Perhaps his email address has changed or something, these things happen.

While I'm waiting I'll be working on getting the mounting stuff (dos.library, expansion.library and C:Mount) to know how to set up packet handlers. I'm thinking some kind of simple "proxy" Mountlist option, though I also need to implement AmigaOS options like StackSize and Pri. Shouldn't be hard, just needs a little thought. I'd really love it if the system could just auto-detect the handler type, but I don't see how to make that happen without requiring modifications to the packet handler itself. That rules out binary compatibility with old filesystems, which I want to avoid. So for now, users will just have to set it up in the Mountlist.

tuesday, 20 february 2007

posted at 14:56

We're in the middle of a heat wave here in Melbourne, as is normal for this time of year. Both Saturday and Sunday were up around 38-40 degrees (celsius), so I couldn't find much motivation for anything other than dozing on the couch and complaining.

That said, things continue to move. It seems that my typical code style is to write about five lines, then chase a crash for a few hours/days until finally finding a poor assumption somewhere deep in dos.library. All of DOS assumes (probably reasonably) that the AROS-specific fields in struct DosList are filled out, so when FATFileSystem decided to add a volume to the system (something filesystems that handle removable media can do), it resulted in a corrupted DOS list and a crash on the next DOS operation. A little detection code in AddDosEntry() was all that was needed.

I think I've basically finished the port of FATFileSystem. It had been assuming both that it was running on a big-endian machine and that BCPL strings really were BCPL strings, rather than the normal C strings that Linux-hosted AROS uses. I rewrote parts of the code to take care of this when running on AROS, and it's good now. One more problem removed.

Finally, I've got the basic framework for converting IO requests into packets, and converting their results back again. I'm rather proud of the setup, actually. On receiving an IO request, a new packet is created and the request stashed in dp_Arg7, which is rarely (never?) used. The IO request type and parameters are converted and stored in the packet, which is then pushed to the handler on its process message port. Rather than wait for the reply, the request handler now returns, resetting IOF_QUICK to inform the caller that it will have to wait for a response.

A PA_SOFTINT port gets set as the reply port for the packet, which results in a call to another function within packet.handler that takes the result packet, extracts the original IO request from dp_Arg7, populates it with the results from the packet, and replies to the request message so that the caller can receive it.

All of this means that calls to packet handlers are truly asynchronous if the caller wishes them to be, and also means that we only need two context switches for a packet round-trip. This setup makes it exactly like the traditional AmigaOS environment for packet handlers, and means that packet-based filesystems shouldn't perform any worse on AROS than they do on other systems.

So far I've only implemented a few filesystem requests:


These are enough to do this:

The "filesystem action type unknown" is in response to an attempt to perform FSA_EXAMINE, which I haven't implemented yet. That should happen within the next couple of hours on my bus trip home.

wednesday, 14 february 2007

posted at 21:44
[packet] in init
[packet] in open
[packet] devicename 'fdsk.device' unit 0 dosname 'PKT'
[packet] couldn't load fat.phandler
[packet] couldn't load L:fat.phandler
[packet] loaded DEVS:fat.phandler
[packet] starting handler process
[packet] in packet_startup
[packet] calling handler
[fat] starting up
[packet] started, process structure is 0xb7616c48
[packet] sending startup packet

FATFS: opening libraries.
        FS task: b7616c48, port b7616ca8, version: 0.1debug [AROS] (Feb 14 2007)
        Device successfully opened
        Disk change interrupt handler installed
        Initiated device: "PKT"
Returning packet: ffffffff 0
Handler init finished.

[packet] handler fat.phandler for mount PKT now online

Got disk change request
        Disk has been inserted
        Reading FAT boot block.
        Reading sector 0
        DoIO returned 0

        Boot sector:
        SectorSize = 2
        SectorSize Bits = 1
        SectorsPerCluster = 4
        ClusterSize = 8
        ClusterSize Bits = 3
        Cluster Sectors Bits = 2
        First FAT Sector = 256
        FAT Size = 5120
        Total Sectors = 256
        RootDir Sectors = 32
        Data Sectors = -10448
        Clusters Count = 1073739212
        First RootDir Sector = 10496
        First Data Sector = 10528
        Invalid FAT Boot Sector

That there is the output of the moment of truth, where you know that you're on the right track and everything is going to work out OK. I had the same kind of magical moment when working on tap.device, where the foundation is in place and the rest is just adding features. It's extremely satisfying.

For the uninitiated, this is the debug output from FATFileSystem as it mounts a ten-megabyte image created under Linux with mkfs.vfat and made available to AROS via fdsk.device. It seems to be correctly reading the image, which means my replacement block code is correct, and the handler is happy doing its own thing.

This comes on the end of over two days of completely depressing debugging work. I've been deep inside LoadSeg(), I've disassembled the handler code, I've looked desperately for any kind of unusual AROS-ness that might be causing gdb to spit up some truly outrageous output, such as the same function appearing multiple times in the stack trace.

The problem was eventually found in AllocDosObject(). This function, among other things, allocates struct StandardPacket objects, which are a kind of wrapper around the normal DOSPacket structure, providing an Exec message as well as the packet itself. The thing it doesn't do is link the packet and the message, so any attempt to access one via the other resulted in a null-pointer dereference and a crash.

I have no idea why gdb was handling it so badly, but even stepping the code before the crash produced the wrong values. Checking the value for a certain global pointer yielded the "base" value of that symbol in the executable itself before relocation, which is what led me down into LoadSeg(). In the end, printing the value showed that gdb was quite wrong, and had been leading me off the scent.

I was really excited when I finally got this all sorted out and got the thing to run. So excited that I very nearly cheered and punched the air in the business meeting I was in at the time. I was bored, and coding excites me :)

Next step is to implement IOFS-to-packet translation in packet.handler. Soon I should be reading actual files :)

monday, 12 february 2007

posted at 12:53

No huge progress, but a few small things to report.

I've got FATFileSystem building the way I want it. Turns out the whole main() thing was totally unnecessary - it's enough to build with usestartup=no uselibs="rom". That vastly simplifies things. Now the loader is simply LoadSeg() followed by AROS_UFC0(). I've also removed all the references to DeviceIO and replaced it with a naive block getter/setter. It's untested, and probably performs woefully, but it should work. All the other bits that had been commented out because of missing things in the AROS headers are now re-enabled, so the driver itself should be ready to go.

Christoph did a little sleuthing for me, and managed to get a new email address for Marcin Kurek. I resent my request for information, and promptly got a reply! He sent headers, and is looking around for source code, which may be lost as he says it's quite an old piece of code. My suspicions are confirmed though - it's a block caching layer with support for various device backends - standard trackdisk, NewStyleDevice, TrackDisk64 and SCSIDirect. I don't know what most of these are (though I can guess), but they're out of scope for this project. I'll worry about this stuff further when the time comes for writing porting instructions.

I've started re-adding things to the DOS headers to support packets. The first thing I did was put dol_Task back into struct DosList. Upon recompile, AROS segfaults before it starts. Some tracing revealed that struct DeviceNode and struct DevInfo need to have the same layout as DosList, as they need to be happily converted to and from via casting. That's completely braindead, in my opinion, but such is life. Wholesale adding all the missing stuff in one swoop caused no crashes (yet), so I'm guessing that's enough for now.

The next step, which I've just started, is to add two conversion functions to dos.library, IOFSToPacket() and PacketToIOFS(). They really should be internal-only helper functions, except that they'll need to be accessed by packet.handler, so they'll just be documented as AROS-specific and recommendations made to simply use DoIO() or DoPkt() as appropriate. All this may change, as I'm starting to see signs of my current architecture fraying a little at the edges. Not enough that I can put my finger on it exactly, but the warning signs are there. Fortunately most of the code I've written so far will be required no matter what, so I'm not too concerned just yet.

And for the curious, I'm now storing my code in Subversion. It's not everything yet - I'm also making changes to DOS and its headers. Remember that it's all extremely fluid, but any feedback is quite welcome.

sunday, 11 february 2007

posted at 10:53

"I must've put a decimal point in the wrong place or something. Shit, I always do that. I always mess up some mundane detail." -- Michael Bolton, Office Space.

And so it is with me. I consider myself a fairly good programmer, but I always make the most ridiculous mistakes that usually cost me hours or days in debugging. Case in point: packet.handler creates a process to mimic the environment required for traditional filesystems. The structure of the startup code for these processes is the same as everywhere else in AROS - create the process and have it wait around until signalled by the main (creating) process to run.

For some reason though, whenever my newly created processes called WaitPort() the whole thing would segfault. I chased this around for over two days. Then, in desperation, I started writing a small program to try and capture just the relevant pieces of this setup for testing. In these cases I usually copy by hand rather than doing a real clipboard copypasta, so I make sure my brain sees all the code as I go.

As I was copying, I noticed something that clearly wasn't going to work, so I fixed it on the fly. A few seconds later, my brain kicked in. Sure enough, the same problem appeared there. Same fix, recompile, run. No crash!

The problem? CreateNewProc() returns a pointer to the newly created process. I store this in an (effectively) global chunk of memory. The new process was accessing this chunk of memory to get its process structure, but of course, it was doing this before CreateNewProc() returned in the main process. Invalid piece of memory, crash!

The solution is easy. Have the new process call FindTask() to get hold of its process structure, and all is well.

Avoiding this kind of thing is kiddie stuff for multithreaded programming. I've done this hundreds of times before. It's simple, and thus exactly the kind of thing I screw up.

thursday, 8 february 2007

posted at 22:22

packet.handler is coming along, but of course I hit yet another obstacle today. The loading mechanism I described previously is fine, except for a fatal flaw: OpenLibrary() loads shared libraries, meaning that I'll only get one instance of the packet handler ever. That would be fine, except that FATFileSystem (and probably every other) assumes it gets a unique instance per mount - it has global variables.

There are three ways around this that I can think of:

  • Implement my own loader that is basically a copypasta of the existing OpenLibrary() implementation. Duplicating that much code is an awful idea.
  • Hack up OpenLibrary() (or more specifically the LDDemon) to know about packet handlers and treat them specially. I'd feel really nervous about that - packet handlers are hardly "first class" objects like libraries and devices are.
  • Turn the packet handlers back into real processes, rather than libraries. That is, give them a main(), and call them from RunCommand().

I'm taking the last option. It sucks a little more for porters, as they have to add more code, though at least it's minimal (and again, easily described). It's also a little weird in that it will make the handlers runnable from Workbench/CLI, though they won't do much. However, I'm going to recommend that the handler main() do a little bit of detection and, if it thinks it's being run by a user, bail out. I believe a main() like the following should suffice:

void main(void) {
    if (((struct Process *) FindTask(NULL))->pr_TaskNum != 0) {
        printf("this is a filesystem handler, and can't be run directly\n");
        return;
    }

    startup();
}

I've run out of things to write, I suppose because I really haven't made much progress since this morning. Its getting late too, so I'm going to go to bed.

thursday, 8 february 2007

posted at 08:11

I spent a couple of days beating my head against the AROS dark magic that holds everything together. I got FATFileSystem building, but on trying to call into it with my loader, I'd get a segfault every time I tried to make any kind of function call.

In desperation I stripped back the library to just a single one-line function that printed some (rather unsavoury) text and exited. That worked. The whole thing only fell over when the rest of the files were linked. The confusing part was that they weren't being used - in theory, they should just be random data along for the ride.

A brief rant in #aros yesterday got an answer from Michal (who else?). Apparently AROS has some lovely magic that automatically makes sure your program has a valid declaration and value for SysBase, which is sort of like the global context for the operating system - most system calls (like AllocMem()) actually take SysBase as an extra argument, though this is #define'd away from the user. Its a nice scheme that works well, unless, as was the case here, you have explicitly declared SysBase in your program. In that case, AROS assumes you know what you're doing, and you're expected to set it to the correct value yourself.

I've now surrounded the declaration in a #ifdef __AROS__ conditional, and it's loading fine. I don't mind that this feature is there - it makes sense and is useful - but once again, lack of documentation hurts me.

On the topic of documentation, in the last few days I've managed to procure soft copies of both the ROM Kernel Manuals (thanks Hogga!) and the Guru Book (thanks Michal), though the latter is in poor shape being a scan/OCR of the book. It's serviceable though, and makes for interesting reading. I'm hoping to find time to convert all of these to HTML soon, which should make them much more useful.

Back to the code, I've started implementing the loader code into packet.handler. Once that's done, it's on to the first of many tricky bits - re-adding things to the DOS headers that were removed (or at least commented) when AROS switched away from packets. Things like struct FileLock, dp_Port, and other excitement. Those will be the first core changes. Yikes!

Oh, and I haven't heard back about deviceio.library yet. I'll have to start trying a little harder. UtilityBase might be a good place to look.

tuesday, 6 february 2007

posted at 09:26

I'm making good progress, and am quietly confident about success. I can see the next few steps I need to take, which always motivates me, and usually means I'm on the right track.

Yesterday I started porting FATFileSystem to AROS. Of course it won't work, given that packets aren't available, but I'm looking to have something in place to write the emulation around. The port has not gone without issue - it seems to depend on a deviceio.library, but Google knows nothing about it. I have managed to track down the author, one Marcin 'Morgoth' Kurek, so I emailed him last night asking for more information. My goal in all this is to make porting filesystems as simple as possible, so I want to make sure AROS has everything a packet filesystem might need. Marek Szyprowski's filesystems use this library, and my understanding is that they will be the first filesystems that are ported, so making it available can only be a good thing. For now, however, I've just commented out the code, and a bunch of other stuff too, mostly related to missing DOS fields and structures in AROS. My goal is compiling, not working.

As far as I'm able to tell, traditional packet handlers are basically normal programs except that they use a function called startup() instead of main(). I can only assume this means there's a specific loader somewhere in dos.library to load and run them. I had planned to write a loader of my own in packet.handler to do this, but it proved to be more difficult to get the thing to compile than I'd anticipated. As I was investigating, I came up with a better idea - make the handler a library, with the startup function as a single vector entry point. This is easy to realise - all that's necessary is to add AROS build files (mmakefile.src, lib.conf, a header to define the LIBBASE structure) and then change the entry point from this:

void startup(void)

to this:

AROS_LH0(void, startup, LIBBASETYPEPTR, LIBBASE, 5, FSHandler)

It's a small requirement, but one that is easily described in a porting doc, so I'm happy with it.

Then, in packet.handler, the "loader" becomes something as simple as:

    handler = OpenLibrary("DEVS:fs.phandler", 0);
    AROS_LVO_CALL0(void, struct Library *, handler, 5, );

It'll be a little more complex, as it will have to set up a process and such, but that's the basic idea. Work will start on the loader on the way home this afternoon, now that FATFileSystem is compiling :)

sunday, 4 february 2007

posted at 22:52

I don't feel like I've got much to write, since I've spent the weekend just reading code and getting more and more confused, but Tom_Kun (of AROSAmp fame) told me to just write about the confusion and bemoan the lack of documentation. Sounds at least as interesting as what I usually write about, so I accept his challenge :)

I've given a lot of thought to how to make packets happen. Going back to pencil-and-paper design, I came up with a block diagram that had IOFileSys and packets operating "side-by-side", so I started digging into the code.

I seem to have a weird sixth sense that fires when I'm coding something wrong. I usually can't point to exactly where it's wrong, but I've learnt to trust that sense. In this case, it fired, and I could work out why. The new system needs to allow IOFileSys commands sent via DoIO() to still reach a packet-based handler if appropriate. This means DoIO() accepting the command, translating to a packet, then calling DoPkt().

The problem? DoIO() is in exec.library, while DoPkt() is in dos.library. Thus, Exec gains a dependency on DOS. That's wrong.

This forced me to look deeper, so I went into the AROS port of AmiCDROM, the CD-ROM handler. Both AmiCDROM and SFS were ported by adding an IOFileSys-to-packet translation layer to the handler itself. This model seems reasonable, so I've changed tack. I'm going to try and build a "generic" packet.handler that can load and wrap packet-based handlers.

The model is pretty straightforward, and I plan to copy and clean up code from AmiCDROM/SFS to get it running. But now I have to deal with the problem of getting the handler online. I figure it's something loadable, like a device or library, so I've dug deep to find how to do this. As far as I can tell, I want a combination of LoadSeg() and CreateNewProc() with the NP_Seglist tag. Mount and the Mountlist also need some extending so that you can specify the packet handler as well as the real handler (unless some sort of auto-detection can be done), but I think that's the way forward, at least as a first implementation.

The hardest part of all this is that I have barely any examples of how the packet layer is supposed to work. I've learnt heaps, but I'll really have no way of knowing if it's right until someone tries to port a filesystem to it. I hate working with so many unknowns.

There's a massive documentation void in AROS - I can muddle through the code, but there's not a lot of commentary, and what is there is often vague or unhelpful. I'm going to turn this around at least in my corner - this project will have good comments and full higher-level documentation that explains how the whole thing hangs together.

Hoping to write a little code tomorrow, haha.

friday, 2 february 2007

posted at 22:53

As you probably expected, I've finally applied for the DOS packets bounty. While it hasn't technically been accepted yet (since the Team AROS mailing list is having some issues), I'm working on the assumption that it will be accepted, and starting work accordingly.

Deliverables are as follows:

  • Major updates to DOS such that it can accept either packets or IOFileSys commands and either pass them through to the filesystem if it is of the same type, or convert to the other type first. Similarly, the responses will be passed or converted as necessary.
  • A console based tool that can issue both packet and IOFileSys commands to DOS. Mixing both command types should work seamlessly. This will be my primary testing tool, and so is the first piece I'll be working on (already in progress). I expect that I'll have to extend it throughout the project.
  • A working port of Marek Szyprowski's FATFileSystem. This is the one that Michal sent me last week, that I have permission to release under the APL (and thus include in the AROS source tree). It's packet-based, so the aim is to require minimal actual porting work. It's read-only, and perhaps I'll add write support at some stage, but that's for another project, and isn't included here.

I won't be implementing every packet listed in dos/dosextens.h, but will find a balance somewhere between implementing every packet I can find documentation for and doing just enough to get FATFileSystem running. It's more important that the foundations are in place than that every obscure feature is implemented.

The target date I've set myself is 30 April - three months from now. At my current work rate it feels conservative, but I do have a tendency to assume things are easier than they turn out to be, so hopefully it's about right. Of course I'll keep blogging with my progress.

In other news, earlier today I picked up a 120GB Seagate Momentus hard drive and tonight got it running in my laptop. Thanks to my mad Linux skills, no reinstall required. I got a 2.5"-3.5" adapter, hooked the new drive to my Windows desktop machine and booted up a Linux Live CD. Dropped the laptop to single-user and remounted the drive read-only, and then, with the help of a crossover cable (since my only hub is 10 megabit), did:

# cat /dev/hda | ssh -e none -c blowfish root@ "cat > /dev/hda"

After a couple of hours the entire drive image had copied, so a brief jaunt into Parted resulted in a much larger version of my standard filesystem. And I left a spare 10GB on the end in case I want to do some gaming and/or try some kind of "alternate" operating system ;)

wednesday, 31 january 2007

posted at 18:57

I've been interviewed by Paul J. Beel's "The AROS show". Just more of my usual ranting, I'm afraid, but it was fun to be asked - Thanks Paul!

Quick status update: I have the PuTTY core compiling, and nearly have the plink frontend ready. I still have to write the other AROS specifics, including the network layer. I'll write more soon - tonight I'm playing cards with some friends :)

monday, 29 january 2007

posted at 08:50

Sheesh, you step out for a couple of days and people start hassling you to write (thanks Christoph ;)

Anyway, I've sort of backburnered filesystems for a little while. The work is still interesting, but it's hard to hit a moving target, which is what this is until all this DOS packets stuff is resolved. I'm still waiting on an email from Michal with his assessment of the situation, so I want to wait for that before planning my next move.

In the meantime, I've started looking into a port of PuTTY. So far I've got the core building, to the point where the link fails because all of the platform-specific functions aren't there. All that's left to do now is implement them.

I'm starting with plink, which is roughly equivalent to the UNIX ssh - does the protocol, but no real terminal smarts, leaving that to the calling console. Writing (or borrowing) a full-blown terminal emulation will be required, but for instant gratification I want to see a remote login first.

One thing that even the command-line tools need is a way to accept a password without displaying it. The normal AROS console.device doesn't allow this, so I've implemented an extension to the FSA_CONSOLE_MODE IO command that allows echoing to be switched on and off. My original plan was to extend the DOS SetMode() function to make the echo toggle (easily) available to user programs. It currently only recognises 0 and 1 as valid inputs, so by using a higher-numbered bit, we could just use that call. I asked for feedback on this idea, and Fabio Alemagna responded positively, but pointed out that a PuTTY port could possibly form the basis for a new console.device that has a full terminal emulation in it (ala xterm).

I think this is a great idea, as the standard console seems quite limited. In an interesting twist, if we had a really great console, then the need for PuTTY is reduced somewhat, as something like OpenSSH could do the trick. On the other hand, the PuTTY code is much cleaner, so I'd be inclined to use it, but not port the putty tool itself (though a GUI session manager is still possible).

If we had a better console.device, then we'd also need an API to drive it - something like termios on UNIX, but with a more pleasant interface. So until I have an idea of what to do here, I'm not going to extend SetMode(), because I don't want to make a new interface that will become legacy if a new console interface appears. So for the moment, plink will do an IOFileSys call for FSA_CONSOLE_MODE directly. It's a little more unwieldy, but I think it's the right first step.
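For reference, here's what that echo toggle looks like on the Unix side with termios - the same capability the FSA_CONSOLE_MODE extension adds on AROS. This is purely an illustrative sketch (it uses a pseudo-terminal so it doesn't need a real console), not AROS code:

```python
import os
import pty
import termios

def set_echo(fd, enabled):
    """Turn terminal echo on or off for the given tty file descriptor."""
    attrs = termios.tcgetattr(fd)
    if enabled:
        attrs[3] |= termios.ECHO    # attrs[3] is the local-mode flags (lflag)
    else:
        attrs[3] &= ~termios.ECHO
    termios.tcsetattr(fd, termios.TCSANOW, attrs)

def echo_enabled(fd):
    return bool(termios.tcgetattr(fd)[3] & termios.ECHO)

# Demonstrate on a pseudo-terminal rather than whatever stdin happens to be
master, slave = pty.openpty()
set_echo(slave, False)   # what you'd do before reading a password
set_echo(slave, True)    # and restore afterwards
os.close(master)
os.close(slave)
```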

Oh, and by the way, a couple of people have already mentioned KingCON. Sounds great, but without source, I'm not interested (and if there is source, a quick Google doesn't find it). Remember that AROS is also an educational experience for me - porting software is far less satisfying than writing it myself.

thursday, 25 january 2007

posted at 21:29

Michal just sent me a read-only FAT32 driver, written by one Marek Szyprowski for MorphOS. It's DOS-packet based. So now I have to choose - do I continue writing my almost-from-scratch filesystem for the learning experience? Do I take this one and convert it to the AROS filesystem API, and then add the write functions? Or do I implement the DOS packet interface (and then add the write functions)?

My questions about the packet interface stand, and I've just sent a long email to Michal seeking guidance. Meanwhile, I'll fiddle with something else. Right now I'm working out why resolve crashes under hosted. Hardly glorious work, but worth doing :)

thursday, 25 january 2007

posted at 09:12

Michal Schulz informed me of the existence of fdsk.device, which is basically a loopback device - a way to mount filesystem images.

The interface is a little unwieldy, but it's quite usable. I've put vdisk.device into mothballs for a while, though I did have it very close to working. I might bring it back to life some time in the future, or at least extend fdsk.device with some of the ideas I have. It's not a high priority for me right now, as the point of this whole exercise was to build a filesystem.

I've started looking through the FreeBSD msdosfs code, to try to get a feel for it. Amazingly I'd forgotten just how bad POSIX code can be - the Amiga interfaces really are pleasant to read and use. Anyway, I've pretty much decided that trying to get the raw BSD files to compile and be usable is going to take at least as much effort as writing the filesystem from scratch and cut-and-pasting the useful bits, so I've settled on the latter. In theory it'll produce cleaner code, possibly at the expense of some stability. I'm not bothered - I learnt the hard way that readable code beats just about everything. You can fix bugs and stabilise things later, but if you can't read it you don't stand a chance.

All this means I lost some time last night and this morning, so I've only stubbed the startup code for the handler, but I'm hoping that in a few days I'll be able to read a floppy image.

wednesday, 24 january 2007

posted at 08:31

My brain isn't in the right place for SDL hacking right now. It seems pretty doable, but I'm bored. I keep thinking about filesystems, so that's where I'm going to focus my efforts for the moment.

AROS is going to need an implementation of FAT16/FAT32, if only to use USB keys when USB support appears. So my intent is to port msdosfs from FreeBSD.
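The format itself is at least approachable: everything hangs off the BPB (BIOS Parameter Block) in the boot sector, so a driver's first job is just pulling a few little-endian fields out of sector zero. A sketch (the field offsets are from the published FAT specification; the function and key names are my own):

```python
import struct

def parse_bpb(boot_sector):
    """Extract basic geometry from a FAT boot sector (first 512 bytes of the volume)."""
    if len(boot_sector) < 512 or boot_sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid boot sector")
    bytes_per_sector, = struct.unpack_from("<H", boot_sector, 11)   # BPB_BytsPerSec
    sectors_per_cluster = boot_sector[13]                           # BPB_SecPerClus
    reserved_sectors, = struct.unpack_from("<H", boot_sector, 14)   # BPB_RsvdSecCnt
    num_fats = boot_sector[16]                                      # BPB_NumFATs
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "reserved_sectors": reserved_sectors,
        "num_fats": num_fats,
    }
```

From those four numbers you can already locate the FATs and the data area, which is most of what a read-only driver needs to get going.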

Regardless of whether DOS packets or IO win the day as the filesystem interface of choice, something I am going to need is some sort of virtual disk device so I can use a real filesystem under hosted. The idea is the same as what's used in virtualisation software everywhere - you have a big opaque chunk of disk that the virtual machine treats as a real piece of hardware.

This one doesn't look too hard to implement. All devices are roughly the same, so most of what I learned from tap.device should apply here. The initial goal is to mount a CD-ROM image using cdrom.handler - only read magic required :)
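The guts of a device like that really are small - a backing file plus block-sized seeks. A toy sketch of the concept (this is not the fdsk.device or vdisk.device API, just the shape of the thing):

```python
class FileBackedDisk:
    """Loopback-style block device: a plain file treated as an array of
    fixed-size blocks, the way a virtual machine treats a disk image."""

    def __init__(self, path, block_size=512):
        self.block_size = block_size
        self.f = open(path, "r+b")

    def read_block(self, n):
        self.f.seek(n * self.block_size)
        data = self.f.read(self.block_size)
        return data.ljust(self.block_size, b"\x00")  # pad short reads at end of file

    def write_block(self, n, data):
        if len(data) != self.block_size:
            raise ValueError("must write whole blocks")
        self.f.seek(n * self.block_size)
        self.f.write(data)
        self.f.flush()

    def close(self):
        self.f.close()
```

Everything a filesystem handler asks for ultimately reduces to those two block operations, which is why a CD-ROM image is a good first target: read-only means half the surface area.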

tuesday, 23 january 2007

posted at 21:36

tap.device is pretty much finished. There are a few little things that need to be added, but my motivation is gone on it now - I'll just add bits as they're requested. It's in the nightly builds, so it's pretty much just a matter of waiting for feedback now. So I'm now turning my attentions towards my next project.

I've been looking at the bounty list, and the two most easily attainable for me are probably "SDL graphics and input drivers" and "DOS packets" (the EFIKA port is also interesting, and quite lucrative, but I'd have to buy an EFIKA board first, and I'm cheap).

To do DOS packets, however, requires a pile of technical knowledge that doesn't seem to exist outside of books long out of print. I've asked a few questions about it on the AROS list, but haven't had any real reply yet. Even if I was convinced of the utility of this stuff (which I'm really not yet), I wouldn't be able to do it anyway.

So, as I wait, I'm looking into doing an SDL backend. Just playing at this stage, but I expect I'll have an idea in a day or two of whether or not I can do it, and if I can, how long it will take. All going well, I'll apply for the bounty and hopefully make a little pocket money :)

monday, 22 january 2007

posted at 10:27

Finished off the stats tracking code this morning, which gets the thing into a usable state, so it's time for a release! Checked into AROS SVN as workbench-devs-tap-unix. It'll build by default, so it'll be in the next nightly build, after which time I expect to get a bit more feedback.

Made a release announcement over at AROS-Exec, and made a little screenshot too, running the MSS Snug web server. A screenshot of a network driver in action is fairly pointless, but it was asked for, so who am I to question it? :P

There are still a few things to do before I can leave this project and move onto other things. The biggest one is removing the requirement to run AROS as root. I'm currently digging around in the QEMU code, which appears to use TUN/TAP without running as root (or even setuid root). If that's no good, I've got a couple of ideas, so I expect something to happen soon.

saturday, 20 january 2007

posted at 13:55

Someone in #aros was saying that AROS has no goals and no direction, and that he couldn't support it based on that. I didn't comment at the time, but I've been thinking about it a bit and I've decided I agree with him, except that I think it's a good thing.

Leaving games aside for a moment, how often do you actually have fun just using your computer? Windows and Unix systems are about work, not play (this includes Linux). Every new application is aiming big, trying to be "professional" and "enterprise-grade". And often they do a good job of it, but at the cost of having no soul.

The Amiga, on the other hand, is for play. That's not to say it's not possible to do serious work with it, but look at its history. It arrived at a time when computers were for home, not for work. They were for hobbyists, not professionals. You'd sit down and experiment, see what you could make the computer do, and with a bit of ingenuity, you could do quite a bit. You used it to create, rather than process.

The best example of this? Paint programs. Windows ships with MSPaint, a cute little freehand drawing tool. It doesn't do much, but pretty much everyone has used it at least once just to play - drawing a house, a boat, or just random graffiti.

As far as I know, neither Windows nor Unix has a "serious" freehand drawing program. If they do, they're not well known. That type of program is derided as a toy, while tools like Photoshop (GIMP) and Illustrator (Inkscape) control the field - both designed for serious processing work.

The Amiga, on the other hand, is well known for programs like Deluxe Paint and TVPaint, and more lately, Lunapaint. In terms of features and complexity, these applications are "serious" - they're not toys. But, they're aimed at artists - people producing digital art just for the sake of it.

The point I'm trying to make is the thing that distinguishes the Amiga (and thus AROS) from other systems (save perhaps the Mac, though I'm not familiar enough with that system to comment) is that Amiga is for the artists, the musicians, the inventors, the creative folk, where others are for the white-collar workers, the processors, the business types.

Now don't get me wrong, these are important jobs, and someone has to do them. But I think the creative types have lost out as computers have hit the mainstream and become merely tools used to get a job done, rather than an end in themselves. I think I've felt this for a while, though I couldn't have articulated it until now.

And that's where AROS comes in. AROS can provide a way for computers to be fun and interesting and sexy again. So in a way no goals are required, because the very act of building the system is the point - if AROS was ever considered finished, then we've either lost our way or it isn't needed anymore.

None of this means AROS has to be a toy. If I had to set a goal, it would be to build an operating system that can take advantage of every piece of hardware in my computer and every last cycle of computing power to make me want to just play with my computer. I'd say it's already well on its way.

In the last week I've enjoyed working on the AROS code more than any other code I've worked on in the last four years, since just before I did the first jabberd stable release. It lets me stretch, try things out without worrying about doing it "wrong". It rewards me when I get it right but leads and teaches me when I get it wrong. The codebase, like the system it implements, is optimised for fun.

To anyone looking to make AROS into a "serious" operating system, while I wouldn't discourage you, I would say tread carefully. Don't remove the soul from the system in your efforts to make it like the "big boys". We need a fun and creative system like AROS. What we don't need is another Windows or Unix clone - they're quite good at doing that on their own.

saturday, 20 january 2007

posted at 10:48

Does this mean anything to you?

rob@plastic:~$ ping -c5
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=255 time=13.0 ms
64 bytes from icmp_seq=2 ttl=255 time=12.9 ms
64 bytes from icmp_seq=3 ttl=255 time=12.9 ms
64 bytes from icmp_seq=4 ttl=255 time=12.8 ms
64 bytes from icmp_seq=5 ttl=255 time=12.9 ms

--- ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3999ms
rtt min/avg/max/mdev = 12.824/12.948/13.074/0.147 ms


I was right, and the problem wasn't my fault. I got an email this morning from one Georg Steger (who has a finger in every pie) who read my last blog entry, and based on my rather vague description was heroic enough to track down a bug in UnixIO, and sent me a patch. Applied, and it's a ping frenzy - I've sent ~10000 without issue. Thanks Georg, you're a legend.

While others were fixing bugs for me, I got to spend some time refactoring large chunks of code and adding various error checks and other stuff. It's now at the point where I think I've got a pretty solid and clean codebase to build all the other needed pieces on - stats tracking, broadcast/multicast, and so forth. The hard stuff is done; it should be pretty plain sailing from here!

friday, 19 january 2007

posted at 09:19

I'm unbearably close to having this working. I found the problem that I described yesterday. I was using soft interrupts to have UnixIO signal event readiness. As far as I can tell, the interrupt handler is called more-or-less directly by whoever triggered the interrupt, meaning that my handler was running in the UnixIO select loop task. My handler calls back into UnixIO after the last write is done to disable write events. I'm not exactly sure, but I think I understand - UnixIO isn't reentrant, and so the lists got corrupted.

So that was a bit annoying, as it seemed so close, but I had to backpedal. The answer is to switch from soft interrupts to signals, but using signals requires that I have a task to signal, so I reworked the code to have TAP IO handled by a separate task. It sits and waits for UnixIO to poke it, and then reads or writes as appropriate. It took me ages to get it going, mostly because I spent two hours tracking down a stupid crasher that resulted from my own inability to read.
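Stripped of the AROS specifics, the shape of the fix is the standard one: the notifier only pokes, and a dedicated task does all the actual reading and writing from its own context, so nothing reentrant ever runs inside the notifier's loop. Roughly, in Python (purely illustrative, none of this is AROS API):

```python
import queue
import threading

class IOTask:
    """Dedicated worker: other contexts just enqueue a poke; all real work
    happens here, in the worker's own context."""

    def __init__(self, handle_event):
        self.events = queue.Queue()
        self.handle_event = handle_event
        self.results = []
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def signal(self, event):
        # Safe to call from any context (the "interrupt handler" side):
        # it only queues, it never touches shared I/O state directly.
        self.events.put(event)

    def _run(self):
        while True:
            event = self.events.get()
            if event is None:    # shutdown sentinel
                break
            self.results.append(self.handle_event(event))

    def stop(self):
        self.events.put(None)
        self.thread.join()
```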

This morning I finally got it working (for some value of working). It successfully responded to three packets, the initial ARP request and two ICMP ECHOs (ie pings), before hanging. The last thing I had a chance to check before getting to work was where the hang is. It's stuck somewhere in timer.device, called from AROSTCP, looping over a list that points back into itself. I'm not quite sure yet how to track this one down. I figure it'll be me not setting some value properly in one of the AROSTCP requests, or more likely, not locking something properly before changing it. Horrible horrible problems in both cases. I'm not sure how I'm going to find it, but I'm sure I'll think of something.

I just wish it would work. It's kinda demoralising - it's been almost working for three days now, but just not quite. A few people have asked about it, so there's instant glory as soon as I'm done, which I want now :P

It certainly is a wonderful way of learning the system though. I know this for certain - UnixIO is a great concept, but the code is horribly crusty and disgusting. It needs quite a bit of work, which I might do sometime (ie added to TODO list).

thursday, 18 january 2007

posted at 07:46

Work continues apace. Yesterday tap.device received and decoded its first packets, and I happily watched as AROSTCP issued write commands in response. This morning on the bus, the first packet was responded to. Some output:

[tap] [0] got a packet
[tap] [0] packet dump (42 bytes):
0x000000  ff ff ff ff ff ff be 21  7e 9b ce 97 08 06 00 01  
0x000010  08 00 06 04 00 01 be 21  7e 9b ce 97 c0 a8 1e 01  
0x000020  00 00 00 00 00 00 c0 a8  1e 02                    
[tap] [0] source address: be:21:7e:9b:ce:97
[tap] [0] dest address: ff:ff:ff:ff:ff:ff
[tap] [0] packet type: 0x0806
[tap] [0] broadcast packet
[tap] [0] found a request that wants this packet, sending it
[tap] [0] packet copied successfully
[tap] in begin_io
[tap] CMD_READ
[tap] [0] queued read request
[tap] in begin_io
[tap] [0] queued write request
[tap] [0] waiting for write events
[tap] [0] ready to write
[tap] [0] buffer has 28 bytes
[tap] [0] packet dump (42 bytes):
0x000000  be 21 7e 9b ce 97 2e 2e  22 89 d7 0a 08 06 00 01  
0x000010  08 00 06 04 00 02 2e 2e  22 89 d7 0a c0 a8 1e 02  
0x000020  be 21 7e 9b ce 97 c0 a8  1e 01                    
[tap] [0] source address: 2e:2e:22:89:d7:0a
[tap] [0] dest address: be:21:7e:9b:ce:97
[tap] [0] packet type: 0x0806
[tap] [0] wrote 42 bytes

That's the debug output from the driver as it receives an ARP who-has broadcast from Linux, and sends a reply. tcpdump was kind enough to show it:

08:02:42.663596 arp who-has tell
08:02:42.675941 arp reply is-at 2e:2e:22:89:d7:0a (oui Unknown)
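The decode the driver is doing there is the standard 14-byte Ethernet header split: six bytes of destination MAC, six of source, then a big-endian ethertype. Re-parsing the start of the first dump above in Python:

```python
import struct

def parse_ethernet_header(frame):
    """Split out the Ethernet header: dest MAC, source MAC, ethertype."""
    if len(frame) < 14:
        raise ValueError("runt frame")
    fmt = lambda mac: ":".join("%02x" % b for b in mac)
    ethertype, = struct.unpack_from(">H", frame, 12)
    return fmt(frame[0:6]), fmt(frame[6:12]), ethertype

# First 14 bytes of the ARP who-has dump above
header = bytes.fromhex("ffffffffffffbe217e9bce970806")
dest, src, ethertype = parse_ethernet_header(header)
# dest is the all-ones broadcast address, src is be:21:7e:9b:ce:97, and
# ethertype 0x0806 is ARP - matching the driver's own decode
```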

So we're extremely close. I've got a bug at the moment that is only really allowing one packet to be sent by AROS before it gets stuck somewhere deep in the kernel and consumes all of my CPU. It's to do with trying to disable write events when I've got no more packets to send - there's no point having UnixIO wake me up every second to tell me I can write if I have nothing to write. It seems to be causing some kind of interesting race condition inside the kernel's signal stuff. I'm not sure yet if it's a bug or a limitation of UnixIO, but I'm sure it's possible to fix, so my next step is to print unixio_class.c and study it for a while.

tuesday, 16 january 2007

posted at 07:44

Made some excellent progress yesterday. Turns out that only code built into the kernel can access the host OS, so I have to make use of a HIDD of some kind. But then I found the UnixIO HIDD. Essentially it exposes Unix file access to AROS applications. Since all I do is file operations on /dev/net/tun, it will work nicely.

Late last night I got tap.device as far as detecting that packets were being sent. I thought I'd add a simple packet dumper before bed, because it's only an extra couple of lines of code - read then print. And then something truly horrible happened. Turns out that the UnixIO API doesn't have a method for reading data from a file. It has one for writing, but not for reading.

This is truly bizarre. I can only guess that it hasn't been required thus far. I went to bed rather annoyed, and this morning poked around for an alternative - I wondered if maybe the data was being sent along with the "ready to read" event. Sadly, no dice. So on the bus trip this morning I implemented a ReadFile method, which is working very nicely. Once again, I'm impressed at how intuitive the code is - in under an hour I'd learnt what I needed and got it working.

I'll write some tests for it today (mostly just extending test/unixio) and check it in tonight. I'm not sure what the etiquette is for changing something rather core to the whole system. I haven't broken anything, so I think I'll just check it in, tell #aros, and then deal with any fallout (though that seems unlikely). They gave me commit bits, I intend to use them :P

saturday, 13 january 2007

posted at 12:40

So I got a stub driver done, and this morning instructed AROSTCP to use it, but it failed in startup - couldn't find tap.hidd. Sure enough, the file wasn't in my install, so I tried to build it manually, but that failed too. Considering that I can't find anything anywhere (within AROS or via Google) that uses it, I suspect it's just another victim of bitrot. So my job just got more exciting - now I have to resurrect that too.

I think I'm not going to bother though, and instead have tap.device talk to the Linux TUN/TAP layer directly. It's designed to mimic a network card, so why would you want anything other than a network driver talking to it? And if there's only going to be one thing talking to it, why not integrate them and get rid of a pile of complexity?

I can't help but wonder what it was for in the first place. Once I get some Subversion access I'll look through the history and see where it came from.

friday, 12 january 2007

posted at 20:38

Merry year, etc. I didn't write a great deal of code over the break, opting to trounce Dark Samus instead. But now I'm back at work, which means a few spare laptop hours each day, mostly while on the bus. I've already forgotten everything I was working on, so of course, its time for something new.

In days of old, I was an Amiga fanboy. I would've liked to be more, but I had nfi what I was doing, and couldn't afford the necessary s3kre7 t3ch to do the really awesome stuff. I did spend many long hours in AMOS Pro, but I think the real heroes would tell you that doesn't count.

Amiga is mostly dead and gone now, but I still have fond memories. At various times in the last few years I've stumbled onto AROS. A few clowns thought it'd be great to reimplement AmigaOS from scratch, and who am I to argue? I remember when I first saw it, it was complete pants, but I tried it again in 2005 and found it to be quite impressive. I started playing with the code then, even porting JJFFE (sadly, that code is lost). As usual, I got sidetracked later (I think on some interesting MUD code), and forgot about it.

Anyway, I rediscovered it again a couple of days ago and grabbed the code. It didn't compile, which turned out to be the build system wrongly assuming that a particular header file is always in the same place. On Debian, it's not. But in the course of finding and fixing the problem, I got to look at the code again, and was reminded of just how interesting the system is. So AROS has been selected as my next plaything.

I've joined the lists and posted my build patch, and last night dropped into #aros. Talking about possible projects, someone suggested a decent telnet/SSH client would be useful, immediately making me think to port PuTTY. First of course, I need network.

As far as I can tell, AROS has working network/TCP support, but only if running in native mode (ie running direct on the hardware). I'm not particularly interested in that mode of operation, preferring the Linux hosted mode - saves dicking around with partitions and rebooting and whatever. Unfortunately there doesn't seem to be a network driver for hosted mode, so I decided my first project was to make one.

I know from past fiddling that the TUN/TAP drivers are the way to realise this. Basically it creates a network interface that is attached to a process rather than a physical piece of hardware. With the right setup it sends and receives raw ethernet packets. So my thought was to learn it, then work out how to integrate it into AROS.

I wrote taptest.c to get a feel for how things should work. Armed with that, I set about building an AROS hardware driver.
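taptest.c boils down to three steps: open /dev/net/tun, issue the TUNSETIFF ioctl naming the interface and flags, then read() and write() raw frames on the descriptor. The same dance in Python, for flavour (the constants are from the Linux kernel's if_tun.h/if.h headers; actually attaching needs root, so only the ifreq packing runs unprivileged):

```python
import struct

# Constants from the Linux kernel headers (if_tun.h, if.h)
TUNSETIFF = 0x400454CA
IFF_TAP = 0x0002        # ethernet-level device (vs IFF_TUN for IP-level)
IFF_NO_PI = 0x1000      # don't prepend packet-info bytes to each frame
IFNAMSIZ = 16

def make_ifreq(name, flags=IFF_TAP | IFF_NO_PI):
    """Pack a struct ifreq for TUNSETIFF: interface name plus a flags word."""
    if len(name) >= IFNAMSIZ:
        raise ValueError("interface name too long")
    return struct.pack("%dsH" % IFNAMSIZ, name.encode(), flags)

def open_tap(name="tap0"):
    """Attach to a TAP interface; needs root and the tun module loaded."""
    import fcntl
    fd = open("/dev/net/tun", "r+b", buffering=0)
    fcntl.ioctl(fd, TUNSETIFF, make_ifreq(name))
    return fd  # read()/write() on this fd now move raw ethernet frames
```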

Since AROS is trying to be portable, its device drivers don't talk directly to hardware. Instead, there exist Hardware-Independent Device Drivers (HIDDs), which essentially abstract the underlying physical hardware. Then, the device drivers (eg graphics.device) talk to the HIDD to do the work. It makes sense, but the importance was completely lost on me as I charged ahead, copying one of the network card drivers and gutting it to do my bidding.

After a little while I started thinking about how to make an AROS binary use facilities provided by the host when it hasn't the faintest clue that there exists something called a "host system". Then it dawned on me that I'd need a HIDD that talks to the Linux TUN/TAP driver, and then a network driver that talks to it and implements the SANA-II interface.

So now I had two things to write. Digging around, it turns out there's already a tap.hidd, but a corresponding tap.device is nowhere to be seen. I can't imagine what the HIDD could possibly be for on its own, and nothing in the source tree seems to reference it, but I'm not above stealing it and using it.

So there, I've said basically nothing, but at least you now know. The one thing I am finding is that this project is fun to hack on, something I've really been missing in the last little while. And there's heaps to be done still, so I shouldn't be lost for ideas. Hopefully I can hang around here a little longer than most other things.