friday, 22 march 2013

posted at 17:50

Over the last couple of weeks I've been working on integrating remote filestores into FastMail. We've had our own online file storage facility for years (long before it was cool), and you've always been able to attach a file in your store to an email and save an email attachment to your store. We've been extending that to allow you to use a "cloud" file storage service in exactly the same way.

Our file storage facility is fairly simple in concept, and operates around the traditional files-and-folders model that we've used forever. For the first external service to integrate with, we chose Dropbox, mostly because it's by far the most widely used, but also because it uses the same model, so it was very easy to create an abstraction and slot Dropbox in behind it. For FastMail subscribers, you can try it out right now on beta. Once we've finished polishing and testing it we'll release it to production. It should only be a couple of weeks away, but don't quote me on that!

When I developed our internal remote filestore abstraction I designed it with the idea that it would be fairly simple to integrate other remote filestores as well. Today I spent a good amount of time working on an integration for Google Drive. I'm mostly doing this to satisfy myself that I have a good abstraction in place, but of course Google is no small fish and I think that it would be wonderful if we could make this available to users as well.

This has not been a simple undertaking. To understand why, let's talk about the architecture of our client a bit.

One of the features of our internal API (the AJAX interface our client uses) is that it is completely stateless. This is deliberate, as it makes it very easy to scale our backend servers. Obviously state can be held (there is a very nice database available all the time), but the API itself has no real concept of state, which makes it tough to know what data to store and when to expire it. So whenever we build anything, we start by assuming it will be stateless.

Our attachment system is very simple. There is a file picker that requests metadata for a given path (a standard /foo/bar/baz construction). It gets back the name, full path, timestamp and type for the requested folder and its immediate children. When the user selects a folder, a new metadata request comes in for that folder. The server doesn't care what's gone on before; it just turns paths into lists of metadata. Later, to actually attach a file, we call a different method with the path of the wanted file, and the file data comes back. Like I said, very simple.
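The shape of that stateless call can be sketched like this. The real FastMail API is internal, so every name here is hypothetical and the backend is a toy in-memory tree; the point is that the request carries the full path, so the server remembers nothing between calls.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    type: str = "folder"
    timestamp: int = 0
    children: list = field(default_factory=list)

def get_metadata(root, path):
    """Turn a /foo/bar path into metadata for that folder and its
    immediate children -- no state carried between calls."""
    node = root
    for part in filter(None, path.split("/")):
        node = next(c for c in node.children if c.name == part)
    return {
        "name": node.name,
        "path": path,
        "timestamp": node.timestamp,
        "type": node.type,
        "children": [{"name": c.name, "timestamp": c.timestamp, "type": c.type}
                     for c in node.children],
    }
```

Attaching a file is then just a second, equally stateless method that takes a path and returns file data.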

So back to Drive. The major reason it's an utter pain in the backside to integrate is that the API itself has no concept of folders or paths. Anyone using the Drive web interface will know that it has folders. This is actually something of a lie. A Drive is just a giant pool of files with various properties that you can query on. A file can have "parent" and "child" pointers to other files, which allows a loose hierarchy of files to be constructed. A folder is simply a zero-size file with a special type (application/vnd.google-apps.folder) and appropriate parent and child pointers.

Every file has a unique and opaque ID, unrelated to the file's name. These IDs are what's used in the parent and child pointers. There's no way for us to construct the ID of a file from a path. To find the metadata for our file, we have to follow parent and child pointers around.

So let's say we want the metadata for the folder /foo/bar/baz and the files inside it. We start off by getting the metadata for the root "folder", helpfully called root (gotta start somewhere). Along with all the info about that root folder we get back its ID. Let's say its ID is 'root123456' (it won't be; it's opaque and apparently random, but this will do for our purposes).

Now we have to find foo. We request the file list, with some search filters (normally all on one line, presented here with newlines for readability):

'root123456' in parents and
title = 'foo' and
mimeType = 'application/vnd.google-apps.folder'

Gotcha 1: deleted ("trashed") and hidden files are returned by default. We don't want those. So actually the filter is:

'root123456' in parents and
title = 'foo' and
mimeType = 'application/vnd.google-apps.folder' and
trashed = false and
hidden = false

Gotcha 2: this query goes into the q= parameter of a GET request, but it needs to be form-encoded rather than given the standard URI 'percent' encoding.

(Neither of these gotchas is documented. Good luck.)

Assuming it exists, we'll get back a "list" containing one item. I'm not actually sure whether two items with the same name and type can exist. Probably, so for now I return "not found" if I don't get exactly one result. That's an implementation detail though, and it might change.

So now we have our foo metadata, we can get its ID and then repeat the process for bar, and so on.
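The drill-down loop can be sketched like so. Here `search` stands in for one files.list HTTP call and is entirely hypothetical; only the shape of the query strings reflects the real API.

```python
FOLDER = "application/vnd.google-apps.folder"

def resolve_path(search, path):
    """Resolve /foo/bar/baz to a Drive file ID, one request per component.
    `search` is a hypothetical wrapper: it takes a q= filter string and
    returns the parsed list of matching files."""
    parent_id = "root"
    for name in filter(None, path.split("/")):
        results = search(
            f"'{parent_id}' in parents and "
            f"title = '{name}' and "
            f"mimeType = '{FOLDER}' and "
            f"trashed = false and hidden = false"
        )
        if len(results) != 1:   # missing, or ambiguous duplicates: not found
            return None
        parent_id = results[0]["id"]
    return parent_id
```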

Each of these requests is a separate HTTP call. They're stateless, so various performance tricks can be used (keepalives, etc). My servers are on good networks so it's not that slow, but it's still a lot of round-trips.

Once we've drilled down that far, we do a final request for a file list with the same filter, this time leaving off the title and mimeType terms (we want everything):

'baz123456' in parents and
trashed = false and
hidden = false

Gotcha 3: this will return Google documents, spreadsheets, presentations and the like. These are identifiable by MIME type, and are not downloadable (because they're internal application magic). Their metadata has various URLs for converting to eg Word documents, but those aren't really appropriate for our use, so we'd like to filter them out. Unfortunately that means excluding a specific set of MIME types in the filter:

'baz123456' in parents and
mimeType != 'application/vnd.google-apps.document' and
mimeType != 'application/vnd.google-apps.spreadsheet' and
mimeType != '...' and
trashed = false and
hidden = false

That sucks, because you have to encode the full list of exclusions right there in the query, and update it whenever Google adds a new type of document. Instead I've opted to drop anything with a zero size, but there's no size search term available, so I've got to pull the lot and then filter.
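That client-side filter is tiny. A sketch, assuming `items` is the parsed files.list response and using the Drive v2 `fileSize` field (which is absent for Google-native docs, so anything without a positive size gets dropped):

```python
def downloadable(items):
    """Keep only real, downloadable files: Google-native docs carry no
    usable fileSize in their metadata, so they fall out here."""
    return [f for f in items if int(f.get("fileSize", 0)) > 0]
```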

Anyway, we now have the metadata for the requested path and all its children, so we can return it to the caller. It takes N+2 HTTP requests to get all the data we need, where N is the number of "inner" path components. That's hardly ideal, but it works well enough, is probably fast enough for most cases (ie folder hierarchies aren't likely to be very deep) and isn't even a lot of code.

So next up is search. Our file picker has a "search within folder" option, which looks for a given file name (or name fragment) within a folder and its subfolders. The subfolders bit is a significant problem for us here. Finding matching files within a single folder is pretty easy - it's just a repeat of the above, but the last query gets an additional title contains filter.

Deep search is far more difficult. The obvious approach (and the one I started implementing) is to drill down to the given path, then do a search for files matching the title or for subfolders, and then loop through those subfolders, repeating as we go. The slightly more refined version is to drill through the folders, collecting their IDs, then construct a single filter for the files, of the form:

title = 'bananas' and (
    'root123456' in parents or
    'foo123456' in parents or
    '...' in parents
)
Gotcha 4: You can group terms with parentheses. This is not documented.

The trouble here is that this is potentially unbounded. We don't know how deep the hierarchy goes, or how many branches it has. It wouldn't be a problem if each request were negligible (as it often is with a local filesystem with metadata hot in a memory cache), but here it's hugely expensive in a deep hierarchy. As noted above, the regular metadata lookup suffers from this too, but to a lesser degree, as it only ever goes down one branch of the tree.

This is where I'd got to when I left the office today. The approach I'm likely to take is to request a list of all folders (only), assemble an in-memory hierarchy, drill down to the folder I want, collect all the IDs and then perform the query. It then becomes only two requests, though potentially with a lot of metadata returned on the full folder list.
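That two-request plan might look something like this. `folders` is assumed to be the parsed folder-only listing (each entry carrying its `id` and `parents`), and the query builder mirrors the grouped filter shown earlier:

```python
from collections import defaultdict

def descendant_ids(folders, root_id):
    """Build the child map in memory, then walk it to collect root_id
    plus the ID of every folder beneath it."""
    children = defaultdict(list)
    for f in folders:
        for p in f.get("parents", []):
            children[p].append(f["id"])
    ids, stack = [], [root_id]
    while stack:
        fid = stack.pop()
        ids.append(fid)
        stack.extend(children[fid])
    return ids

def search_query(title, ids):
    """One grouped filter covering the whole subtree (gotcha 4:
    parentheses work, even though they're undocumented)."""
    parents = " or\n    ".join(f"'{i}' in parents" for i in ids)
    return f"title = '{title}' and (\n    {parents}\n)"
```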

And from there I guess the metadata lookup becomes the same thing really.

And I suppose if I were in the mood to cache things, I could cache the folder list by its etag and do a freshness test instead of the full lookup.
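The etag idea, sketched with a hypothetical `fetch` wrapper around the folder listing call: send the cached etag as If-None-Match, and only rebuild the cache when the server reports fresh data.

```python
def cached_folder_list(fetch, cache):
    """`fetch(if_none_match=...)` is a hypothetical wrapper around the
    folder-only listing call: it returns (etag, folders), with folders
    None when the server answers 304 Not Modified."""
    etag, folders = fetch(if_none_match=cache.get("etag"))
    if folders is None:          # cache is still fresh, skip the big payload
        return cache["folders"]
    cache["etag"], cache["folders"] = etag, folders
    return folders
```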

But mostly I'm at the point of "why?". I recognise that at Google search is king, and explicit collections like folders are always implemented in terms of search (see also tags in Gmail). But folder/path-based filesystems are the way most things work; we've been doing it that way forever. That's not to say we shouldn't be willing to change and try new things, but surely it's not hard to see that an application might want to take a traditional path-based approach to accessing its files?

I'm doubly annoyed because Google is supposed to be far ahead of the pack in anything to do with search, yet I can't construct a query that does the equivalent of a subfolder search. Why can't the API know anything about paths, even in a light way? It's clearly not verboten, because parent and child pointers exist, which means a hierarchy is a valid thing. Why is there no method, or even a pseudo-search term, that does things with paths? Wouldn't it be lovely to use a query like:

path = '/foo/bar/baz'

to get a file explicitly? Or even cooler, to do a subfolder search:

path startswith '/foo/bar/baz' and title = 'bananas'

Instead though I'm left to get a list of all the files and do all the work myself. And that's dumb.

I'll finish this off, and I'll do whatever I have to do to make a great experience for our users, because that's who I'm serving here. It would just be nice to not have to jump through hoops to do it.

wednesday, 6 october 2010

posted at 12:01

Just a reminder that I'm speaking at the Melbourne chapter of the Google Technology Users Group tonight about the details of Monash's move to Google Apps. If you're in Melbourne and without plans you should come along!

friday, 1 october 2010

posted at 09:44

Well, almost. I'm speaking at the Melbourne chapter of the Google Technology Users Group next week about the details of Monash's move to Google Apps. I don't have a lot of speaking experience so I'm a bit nervous, but it's all coming together nicely and I think it's going to be a lot of fun. If you're in the area, do come along and support me and learn some stuff too :)

friday, 6 august 2010

posted at 19:23

Disclaimer: I work for Monash University, but I don't speak for them. Anything I write here is my own opinion and shouldn't be relied upon for anything.

So Google Wave has been put out to pasture. That makes it a good time for me to write a bit about what I've been working on in the last few months and what I think about the whole thing.

For those that don't know, I work for Monash University as a web developer and Google Apps specialist. We've spent the last ten months rolling out the Google Apps suite to our staff and students. We completed our rollout to some ~150K students last month, and so far have about 10% of our staff across. A big part of the reason we've been able to move that fast is that for the most part, people are extremely excited about Google technologies and how they might use them for education. That excitement goes to the highest levels of the University (one of our Deputy Vice-Chancellors was the first production user to go) and has seen Google Apps being included in our recently-announced strategy for online and technology-assisted learning and teaching, the Virtual Learning Environment.

The interest from our users in Google extends beyond the Apps suite of products to pretty much every product that Google offers, and perhaps none more so than Wave. Through the eEducation centre Monash has already been doing a lot of research into how teachers and students can teach and learn from each other (instead of the traditional top-down lecture style) and how technology can assist with that. Groups sharing and building information together is really what Wave excels at, so it wasn't long before we started seriously considering whether or not Wave was something we could deploy to all of our users.

There were three main issues that needed to be addressed before this could happen:

  • Wave doesn't have enough features to allow a lecturer or tutor to control access and guide the conversation flow.
  • The sharing model opens some potential legal issues surrounding exposure of confidential information, particularly to third-party robots.
  • The Wave UI does not meet the stringent accessibility requirements that University services must meet.

Over the last few months we've been working with the Google Wave team to address these issues.

The first is simply a case of the Wave team writing more code. It's well known that they have been thinking about and working on access control. Plans exist for limiting access to a wave to a set group of users, allowing robots to better manage the participants in a wave, locking the conversation root down so that users can only reply, and so on. In many ways it's the easiest thing to fix, and given the commitment from the Wave team to talk to us and help with what we needed, we were never particularly concerned about this stuff.

I won't comment much on the legal side of things, mostly because I don't understand most of it. I do know that it's a serious issue (eg Victorian privacy law is perhaps the strictest in the world), but it's something our solicitors have been working on, and it probably would have come out ok in the end, if for no other reason than that if it hadn't, people would just have used the public Wave service with no protection at all. Users are notoriously bad at looking after themselves :)

The accessibility issues are where my interest in Wave came from so I'll spend a little time there.

I'll be the first to admit that I don't really get accessibility. I am in the happy position of having my whole body working as designed, and to my knowledge all my close friends and family are the same, so I really have very little exposure to the needs of those who are perhaps not so fortunate. What I do understand though is that it's critically important that information be available to everyone equally, and that achieving that is far more complicated than the old tired lines of "add alt attributes to your images" and "don't use Javascript". So I'm very happy to follow the lead of those who do know what they're talking about.

Not far away from me in my building we have a wonderfully competent team of usability and accessibility experts. They were asked to do an accessibility review of the Wave client and perhaps not surprisingly, it failed hard. Most of it comes from the difficulty of expressing to assistive technologies (eg screen readers) that something in a page has changed, particularly with proper context. The Wave client builds a complex UI on the fly (eg as the wave is incrementally loaded) and of course has realtime updates. At a more basic level though the static parts of the interface are constructed without using semantically-correct markup. A user agent (eg a screen reader) that scans the page looking for interesting things like links pretty much comes up with nothing.

The accessibility team presented their findings to some people from the Wave team, and the response, from where I sat, appeared to be equal parts surprise and dismay. They were receptive to the issues raised, though. I travelled to Sydney shortly afterwards for DevFest and had the opportunity to chat to some of the team, and they had all seen or heard of the report, so it would appear it was taken seriously.

For me though, I could see that this had the potential to be a real showstopper for our deployment, and I didn't want that, as I could see the potential for Wave to be a game-changer. Since at the time I knew very little about accessibility, I started work on answering a more technical but somewhat related question: "can Wave work without Javascript?". The Wave team had just released a data access API, so I set to work trying to build a client using it. That work grew into the (still unfinished) ripple, which more or less answers the question in the affirmative. This type of client doesn't solve the accessibility issues, but it's definitely a step in the right direction.

The part of ripple that I'm most proud of is the renderer. Rendering a wave well is actually quite a complicated prospect. Styles and links are represented as ranges over the static textual content. It's possible for these ranges to overlap in complex ways that make it difficult to produce semantically-correct HTML. It took three rewrites to get there, and there are still a couple of little nits I would have addressed sometime if this code had a future, but I mostly got there and I was happy with it :)
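To illustrate the problem (this is my own sketch, not ripple's actual renderer): if styles are (start, end, tag) ranges over flat text, cutting the text at every range boundary and re-opening the covering tags per segment yields well-formed HTML even when ranges cross.

```python
def render(text, ranges):
    """ranges: list of (start, end, tag), possibly overlapping. Tags are
    opened and closed within each segment, so they always nest properly."""
    bounds = sorted({0, len(text)} | {b for s, e, _ in ranges for b in (s, e)})
    out = []
    for seg_start, seg_end in zip(bounds, bounds[1:]):
        # every tag whose range fully covers this segment applies to it
        tags = [t for s, e, t in ranges if s <= seg_start and seg_end <= e]
        out.append("".join(f"<{t}>" for t in tags)
                   + text[seg_start:seg_end]
                   + "".join(f"</{t}>" for t in reversed(tags)))
    return "".join(out)
```

So bold over characters 0-3 and italic over 2-5 of "hello" never produces `<b>he<i>l</b>lo</i>`; the crossing is resolved by splitting at the boundaries instead.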

Anyway, these problems were being addressed, a few areas around the university started doing research and small pilots using Wave, and it all seemed to be only a matter of time. I started work on a robot/data API client library for Perl for two reasons: one being that ripple really needed its server comms abstracted properly, and two being that we're a Perl shop and would soon want to host our own robots and integrate them properly into our environment.

This was a great opportunity for me to learn Moose, and my suspicions have been confirmed - Moose is awesome and I'll use it for pretty much everything I do with Perl from now on. A few weeks later we get to Wednesday night, and I've got things to the point where you can have a nice conversation with a Perl robot. And then I got up on Thursday morning and heard that Wave was going away and all my code had just been obsoleted.

I was shocked initially, but I was surprised that I didn't feel angry or sad or anything. I can hardly call the time I spent on it a waste, as I learned so much (Moose, tricky HTML, accessibility, operational transforms) and met some incredibly smart and awesome people at Monash, at Google, and elsewhere. I think for the most part though I was ok with it because it's probably the right decision.

We (as in Wave users and developers everywhere) have been playing with Wave for over a year, and we still don't know what it is and what it's for. Unless you have the ability to build your own extensions, it doesn't really do much for you. The interface is painful, the concepts within don't match anything else we're used to, and despite various mechanisms for exposing internal data, you're still pretty much confined to the Wave client if you want to get anything useful done.

The technical issues would have been addressed with time. We would have gotten enough functionality to write a full-blown replacement client. It would have gotten much easier to expose not only data but the structure of data in other applications. But if you take that to its conclusion, Wave becomes a backing store for whatever frontend applications you build on top of it.

But what of the interface? By having lots of different ways to structure and manipulate data, Wave tries to let you focus on the task at hand rather than the structure of the data. Traditional applications (web-based or desktop) are tailored to their own specific data models, so we have separate apps for email, calendars, spreadsheets, etc. Wave wanted to pull all that information together so you could work on all the pieces of your puzzle in the same space. You start a document, then realise you need some help, so you bring in friends and talk about the doc as you build it together. You need to process some data, so you drag in a spreadsheet gadget. You embed images or videos or whatever else you need to add to the discussion. Robots can help out by mining external databases and manipulating in-wave data to present even richer information, and even allow feedback via forms. It's all a nice idea, but how do you represent the different kinds of data and structure effectively? Wave tried, and we tried, but I'm not convinced anyone really had a clear idea of how to build an interface that makes sense.

It might not have been an interface issue. It might be that people want separate, loosely-integrated applications, one for each of the different types of data they want to manipulate. I don't think that's the case, but I think a clearer migration path from those other applications would have helped a lot. People first came to Wave wanting to do their email in it. What if from the outset they could have easily pulled mail into Wave, and there had been a "mail mode" that allowed manipulating Wave data in a way they were familiar with? What about doing similar things for other data types? I don't know how much difference that sort of thing would have made, but something, anything to answer the "what do I do with this?" question that everyone had at the start couldn't have hurt.

Wave's failure may also just be a problem of timing and circumstance. The Wave team regularly acknowledged that they were surprised by the response. The message was supposed to be "we made something different, what do you think?". Unfortunately it was painted in the tech media as an "email killer", which of course it wasn't, but of course that's going to get everyone interested. Being such an early preview, Wave was naturally buggy and slow and couldn't accommodate the load caused by the droves of users who wanted to play. So you got swarms of people banging down the door to see what all the fuss was about, and the few who got in found that it wasn't what they'd been led to believe, and none of their friends could get in, so they couldn't try it for what it was. Naturally they disappeared, disappointed, and even later, when the bugs were fixed and the system was stable, the first impression stuck and those users couldn't be lured back. And although there was a bit of a second wind a couple of months ago after I/O 2010, the same "what do I do now?" question came up.

From what I've seen of Google in the past, they're willing to take risks if they see a likely or even possible positive outcome. But looking at Wave, how much future did it really have? We loved it, and we saw that it could do things better than existing services (though with some effort), but was it really going to displace them for the casual user? Was it going to make any serious money for Google? Was it ever even going to break even? (Remember that it takes plenty of infrastructure and manpower to develop and maintain things like this.)

Based on all of this, you can totally understand an executive saying "guys, I see what you're trying to do, and thanks for trying, but the numbers just don't add up". It's not like it's been a complete waste - there's some awesome technology that's already finding its way into other Google applications (eg Docs now has live typing just like Wave).

So is Wave dead? The product is, but as a concept it lives on. We're fortunate that Google and others have given us plenty of docs and code, and their pledge to open-source everything remains. Then there are the third-party protocol implementations that already exist, both open-source (eg PyGoWave, Ruby on Sails) and commercial (eg Novell Pulse, SAP StreamWork). It would take some work, but any one of us could build and deploy another Wave. The question is, would you want to? I think it's more likely that we'll see people incorporating bits of the technology and concepts into new products. And maybe, just maybe, in a few years' time some of the work Wave pioneered will be commonplace, and people will be amazed, and we'll be those old curmudgeons saying "eh, Wave did that years ago".

So for Monash, we'll continue working on our existing plans. We've mostly been looking at Wave as a delivery platform for what we wanted to do. Not having it available means we'll have to look elsewhere for the technology we need (whether that's buying or building), but our direction won't change.

And for me? I won't continue work on the ripple and Google::Wave::Robot code, but they'll live on in GitHub should anyone want to rip them off for anything. My next project is building an OpenSocial container in Perl, with a view to integrating it into the Monash portal (which is where my "Web Developer" duties lie); hopefully I'll write something about it! I will however be hanging around Wave until the bitter end, and I'd like to do something with operational transforms in the future, as they look really cool and interesting. See, it's not dead, really!

And to any of the Wave team reading this: thanks, guys. You've kept my interest and my enthusiasm alive, you've put up with my incessant questioning and harassment, and you've contributed more good ideas and happiness to me and my colleagues than you're probably aware of. To the few of you I've met and worked with already, I really hope this isn't the end and that we get to work together again in the future. I'll probably stalk you for a while to see where you end up because, frankly, people are far more interesting than technology and you've all proven yourselves. Cheers :)