friday, 14 may 2010

posted at 09:28

Another little program from my toolbox. This one I'm quite proud of. It's a tiny little file transfer tool I call otfile (that is, one-time file).

The idea is this. Quite often I need to send a file to someone on the work network. These can vary from small data files or images to multiple gigabytes of raw data or confidential documents. Our network is fast and the network and servers themselves are considered secure so I don't have to worry about eavesdropping, but there's a real problem with the transport mechanisms - they all suck.

I can put the file in an email, but there are transmission and storage size restrictions. It's also fiddly - create message, attach file, send.

I can put the file on a web server, but the only ones I have ready access to are publicly accessible, so I have to set up access control. If it's a large file then I have to think about disk space on the server (usually an issue) and then I have to wait while the file copies before sending the recipient a link. Oh, and I have to test that link myself because invariably I've screwed up file permissions or something else.

Probably the closest to what I want is file transfer via IM, but for various reasons that's currently blocked at the network level. I could probably get that block changed but it'd mean a bunch of negotiations for something that isn't actually related to my job. It's not worth my time.

So I wrote otfile. You run it with a single file as an argument, and it creates a web server on your machine with a randomised url for the file. You paste the url into an instant messaging session (I'm chatting with my team all day long). They click it, the file downloads directly from the source, and then, crucially, the script exits and the web server goes away. That url, while open for anyone to connect to, contains a random UUID, so it's near impossible to guess, and it only works once. That's secure enough for me.

The major thing I think this is missing right now is the ability to do multiple files at once. It's not that big of an issue because it's pretty easy to run multiple instances - just a shell loop. If I went for multiple files I'd have to decide if I want to make it produce multiple urls (a pain to paste, and then someone has to click on them all), produce a directory listing (what are the semantics? when do the files disappear? when does the server shut down?) or build some kind of archive on the fly (cute, but is that painful for the receiver?). I'll probably just dodge it until I use it like that enough to be able to ask the receiver what they would have expected.
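
For what it's worth, that loop could be sketched like this, in Perl to match the script rather than in shell. It's only a minimal sketch and not part of otfile: serve_each is a hypothetical helper name, and it assumes otfile is installed somewhere on PATH. It runs the instances one after another, since each one blocks until its file has been downloaded.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of otfile): run a command once per file,
# one after another, and return how many files were handled.
sub serve_each {
    my ($cmd, @files) = @_;
    for my $file (@files) {
        # List-form system() skips the shell, so odd filenames are safe.
        system(@$cmd, $file) == 0
            or die "E: '$cmd->[0]' failed for '$file' (status $?)\n";
    }
    return scalar @files;
}

# Assumption: otfile is on PATH. With no arguments this does nothing.
serve_each(['otfile'], @ARGV) if @ARGV;
```

Saved under some name of your choosing and run with a handful of files, this should print one url at a time, moving on to the next file once the previous one has been fetched. Running the instances in parallel instead would presumably need a way to give each one its own port, since the script never picks one.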

#!/usr/bin/env perl

use 5.010;

use warnings;
use strict;

use autodie;

use File::MMagic;
use File::stat;
use UUID::Tiny;
use Sys::HostIP;
use URI::Escape;
use Term::ProgressBar;

use base qw(HTTP::Server::Simple);

my @preferered_interfaces = qw(eth0 wlan0);

say "usage: otfile <file>" and exit 1 if @ARGV != 1;

my ($file) = @ARGV;

# Check up front that the file is readable; autodie makes open die if not.
open my $fh, "<", $file; close $fh;

my $mm = File::MMagic->new;
my $type = $mm->checktype_filename($file);

my $size = (stat $file)->size;

my ($fileonly) = $file =~ m{/?([^/]+)$};

my $uuid = create_UUID_as_string(UUID_V4);

print "I: serving '$file' as '$fileonly', size $size, type $type\n";

my $server = __PACKAGE__->new;

my $interfaces = Sys::HostIP->interfaces;
my ($ip) = grep { defined } (@{$interfaces}{@preferered_interfaces}, Sys::HostIP->ip);

my $port = $server->port;
my $path = "/$uuid/".uri_escape($fileonly);
my $url = "http://$ip:$port$path";

print "I: url is: $url\n";

$server->run;

my $error;

sub setup {
    my ($self, %args) = @_;

    print STDERR "I: request from $args{peername}\n";

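    # $error is file-scoped, so clear it for each request; otherwise, if
    # requests are handled in this same process, one bad request would
    # make every later request fail too.
    $error = undef;
    if ($args{path} ne $path) {
        $error = "403 Forbidden";
        print STDERR "E: invalid request for $args{path}\n";
    }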
}

sub handler {
    my ($self) = @_;

    if ($error) {
        # HTTP header lines are terminated by CRLF
        print "HTTP/1.0 $error\r\n";
        print "Pragma: no-cache\r\n";
        print "\r\n";
        return;
    }

    open my $fh, "<", $file;

    # HTTP header lines are terminated by CRLF
    print "HTTP/1.0 200 OK\r\n";
    print "Pragma: no-cache\r\n";
    print "Content-type: $type\r\n";
    print "Content-length: $size\r\n";
    print "Content-disposition: inline; filename=\"$fileonly\"\r\n";
    print "\r\n";

    my $p = Term::ProgressBar->new({
        name => $fileonly,
        count => $size,
        ETA => "linear",
    });
    $p->minor(0);

    my $total = 0;
    while (my $len = sysread $fh, my $buf, 4096) {
        print $buf;
        $total += $len;
        $p->update($total);
    }

    $p->update($size);

    close $fh;

    # One-time transfer: exiting here takes the web server down with it.
    exit;
}

# Override the startup banner so only our own messages are printed.
sub print_banner {}

I really need to set up a repository for things like this. Not hard to do of course, I'm just not sure if I should have one repository per tool, even if it's just a single file, or all these unrelated things in one repository. I'll probably just do the latter; it's way easier to manage.