newspaint

Documenting Problems That Were Difficult To Find The Answer To

Category Archives: Perl

Using Perl to Make Signed Requests to Public Transport Victoria Timetable API

So you want to create an application to access the Public Transport Victoria (PTV) Timetable API.

You’ve followed the instructions and e-mailed a request for a developer ID and an access key; and you’ve received both in an e-mail that contained text similar to the following:

Thank you for your interest in the PTV Timetable API.

Your email address has now been registered and your user Id and API key are
below.

User Id: 2912345

API Key: 4cc12345-ff11-2222-a00a-dd1297cd04aa

Now you want to create a signed request using Perl to access the API.

The following function will take a URL and return a signed URL that can be used to access that URL:

#!/usr/bin/perl -w

use Digest::HMAC;
use Digest::SHA;

use strict;

my $devid = "2912345";
my $apikey = "4cc12345-ff11-2222-a00a-dd1297cd04aa";

sub sign_url {
  my ( $url ) = @_;

  # add ?devid=$devid or &devid=$devid to URL
  $url .= ( index($url, "?") < $[ ) ? "?" : "&";
  $url .= "devid=$devid";

  # strip out base URL from part used in signing
  my $request = $url;
  $request =~ s{^(([a-z]+)://+)?[^/]+}{};

  # calculate signature using API key and URL without base
  #   e.g. sign over a string like "/v3/routes?devid=2912345"
  my $signature = Digest::HMAC::hmac_hex(
      $request,
      $apikey,
      \&Digest::SHA::sha1
  );

  $url .= "&signature=" . $signature;
  return $url;
}

print sign_url( "http://timetableapi.ptv.vic.gov.au/v3/routes" );

This should output the following:

$ perl -w signtest.pl
http://timetableapi.ptv.vic.gov.au/v3/routes?devid=2912345&signature=b7ee928f05499a0016746daef5013dba35224d8e

The example is using an invalid signature and devid, so this example URL will not actually return a page; you will have to provide the access key and devid you received in your e-mail.

Note: it is absolutely essential that you do not change the capitalisation of the access key in this script. While it looks like a hex string it is actually treated as case-sensitive text.

The Perl Hash Assignment Gotcha

Question: what do you get if you run the following Perl script?

use warnings;

my %hash => (
    'name' => 'Jimmy',
);

Answer: you get the message:

Useless use of a constant (name) in void context at hashtest.pl line 3.
Useless use of a constant (Jimmy) in void context at hashtest.pl line 3.

So what went wrong? The answer is simple enough but very easy to miss. You probably wanted to assign to the hash variable – but used the comma operator (=>) instead.

To fix merely use an equals sign only after declaring the hash variable:

use warnings;

my %hash = (
    'name' => 'Jimmy',
);

Filtering EasyList For Hosts-File Style Adblock

If you have an Android (rooted CyanogenMod) phone you may be wondering how to block advertising from your web browser.

You may have heard of the hosts file trick that makes the phone assume certain (advertising) domains are hosted at the localhost (127.0.0.1) – which is the phone itself and not running its own web server and thus will instantly respond to any such requests with a “cannot connect” type response – quickly and simply filtering out advertisement domains.

We can make use of Adblock filtering lists such as EasyList by looking for any whole-domain rules, extracting the domain name, and putting into the hosts file.

The script is as follows:

#!/usr/bin/perl -w

use strict;

my %hosts = ();
while ( <> ) {
    if ( $_ =~ m/^\|\|([a-z][a-z0-9-_.]+\.([a-z]{2,3}))\^\s*$/ ) {
        $hosts{$1} = 1;
    }
}

foreach my $host ( sort keys %hosts ) {
    print( "127.0.0.1\t$host\n" );
}

We can use this by downloading the EasyList (as, say, easylist.txt). Then:

user@host:~> perl filter-easylist-to-hosts.pl easylist.txt >easylist.hosts

As of 2014-08-18 there were 5029 such hosts in the list. You can add them to a hosts.new file. This file should look something like (all domains except localhost are fictionalised in this example):

127.0.0.1	localhost

127.0.0.1	an.advertising.net
127.0.0.1	be.advertised.to
127.0.0.1	more.commercialism.com
...
127.0.0.1	zzz.adverts.biz

You must ensure that your hosts file always has a localhost entry in it.

Next, connect your phone via USB and shell to it. The steps are:

  • push the new hosts file to the SD card (temporary location)
  • shell to the phone, become super user
  • find the /system mount path, remount /system as rw (read-write)
  • copy the old /etc/hosts file (backup)
  • copy the new /sdcard/hosts.new file to /etc/hosts
  • remount /system as ro (read-only)
user@host:~> adb push hosts.new /sdcard/
1572 KB/s (236687 bytes in 0.147s)

user@host:~> adb shell
shell@p880:/ $ su
root@p880:/ # mount |grep system
/dev/block/platform/sdhci-tegra.3/by-name/APP \
  /system \
  ext4 \
  ro,seclabel,noatime,nodiratime,user_xattr,acl,barrier=0,data=ordered,noauto_da_alloc \
  0 0
root@p880:/ # mount -o rw,remount \
  /dev/block/platform/sdhci-tegra.3/by-name/APP \
  /system
root@p880:/ # cd /etc
root@p880:/etc # cp hosts hosts.old
root@p880:/etc # cp /sdcard/hosts.new ./hosts
root@p880:/etc # mount -o ro,remount \
  /dev/block/platform/sdhci-tegra.3/by-name/APP \
  /system
root@p880:/etc # exit
shell@p880:/ $ exit
user@host:~>

Iterating With OpalORB

If you’re using the Perl CORBA interface, OpalORB, and you want to iterate through a list then you have to use iterator objects and process the list some elements at a time.

Here is a method I’ve used for iterating in Perl:

# list all XML files in this folder
my $numPerGrab = 25;
my ($objectItems, $objectItemIterator); # scalars to hold iteration objects
$folder->list_with_patterns(
    +[ "*.xml" ], # pattern_seq
    $numPerGrab, # number per grab
    \$objectItems,
    \$objectItemIterator
);

my $next_n_result = 1;
while ( 1 ) {
    # process this chunk of returned items
    foreach ( @{$objectItems} ) {
        printf( "  - file name \"%s\"\n", $_->name() );
    }

    last if ( ! $next_n_result ); # no more
    last if ( $objectItemIterator->_get_orb ); # iterator not valid

    # get next set of items
    $next_n_result = $objectItemIterator->next_n( $numPerGrab, \$objectItems );
}

# should probably destroy the iterator when done with it
# - but don't know if checking _repo_id is the right way to do it
if ( $objectItemIterator->{_repo_id} ) {
    $objectItemIterator->destroy();
}

The equivalent iteration done in Java would look like:

ObjectItemsHolder objectItems = new ObjectItemsHolder();
ObjectItemIteratorHolder iterator = new ObjectItemIteratorHolder();

int numPerGrab = 25;

folder.list_with_patterns(
    new String[] { "*.xml" }, /* pattern_seq - the sequence of the patterns (ORed) */
    (int)numPerGrab, /* how_many - how many objects will be returned (0 for all) */
    objectItems, /*  items - the sequence of the subitems requested */
    iterator /* iterator - an iterator to retrieve the remaining subitems */
);

ObjectItemIterator recordSet = iterator.value;
bool next_n_result = true;
while ( true ) {
    for ( ObjectItem item: objectItems.value ) {
        System.out.println( "  - " + item.info.name );
    }

    if ( ! next_n_result )
        break;

    if ( recordSet == null )
        break;

    next_n_result = recordSet.next_n( (int)numPerGrab, objectItems );
} while ( 1 );

It is worth noting that when iterating it may be faster to get smaller groups of items than larger. Initially I was trying to pull 500 filenames at a time but I found my script actually performed much faster pulling 25 filenames at a time. A little counter-intuitive but worth testing depending on your circumstances.

There are probably better patterns to use for iteration and by all means leave me a comment if you can demonstrate a better one. But this should get you going if you have no idea where to start.

How To Get Object’s IOR With OpalORB

If you’re using the Perl CORBA interface, OpalORB, and you want to know how to get a text string representing the IOR of an object you can do the following (where session is the CORBA object I want the IOR of):

my $sior = $session->_get_orb->object_to_string( $session );

This is possible because almost all objects compiled from IDLs in Perl will inherit from the CORBA::Object class which has the _get_orb() method – and this returns a CORBA::ORB class instance which has the object_to_string() method.

A way of doing this in Java is equivalently:

String ior = session.toString();

A CORBA Consumer In Perl

Lately I’ve discovered the OpalORB open source implementation of the Object Management Group (OMG) CORBA standard.

Given a set of interface definition language (IDL) files they can be compiled and the corresponding Perl classes can be used to communicate with a CORBA server.

There isn’t too much example code for this on the Internet – perhaps because the documentation for OpalORB is rather light – and perhaps because CORBA isn’t widely used these days save for some specialist applications in private industry.

Installing

Download the OpalORB tarball and extract.

You probably will need to edit Makefile and remove the spaces after the equal signs and all trailing space and comments after the variable definitions for COPY, MKDIR, DESTINATION, and TARGET.

Then, as root, run make install.

Copy your custom IDL files into /usr/local/opalORB/idl/myidl.

Compile your IDLs:

cd /usr/local/opalORB/idl/myidl
for i in *.idl; do echo ===$i===; ../idl.pl --client $i; done

Change all references to Object to MyCompany::Object:

find . -type f -name '*.pm' \
  -exec perl -i.bak \
  -pe 's{(?<=[^:A-Za-z0-9.])(Object)(?=[^:A-Za-z0-9.])}{MyCompany::$1}g' {} \;

Running Your Perl Scripts

You’ll probably want to run your Perl scripts in such as way so as they’ll find your OpalORB and custom classes. To do this prepend your Perl executable with the PERL5LIB environment variable, e.g.:

PERL5LIB=/usr/local/opalORB:/usr/local/opalORB/Naming:/usr/lodcal/opalORB/idl/myidl \
  perl script

Examples

First of all let’s solve some simple problems in Perl.

Connecting to a CORBA Server and Getting a Repository/Channel/etc

You may have a IOR reference which looks something like IOR:010000002600000049444c… (for 3 or 4 lines of hexadecimal digits). If that’s the case here is how you can get your repository or channel or whatever…

use CORBA;

# compiled from my own IDL files
use MyCompany::Repository;

sub get_repository_from_ior {
    my ( $ior ) = @_;

    my $orb = CORBA::ORB_init( +[ "MyORBInstance" ] );

    my $repositoryObject = $orb->string_to_object( $ior );
    my $repository = MyCompany::Repository::_narrow( $repositoryObject );
    return( $orb, $repository );
}

Or alternatively if you have a corbaloc string you could use the following function:

use CORBA;
use CosNaming::NamingContextExt;

# compiled from my own IDL files
use MyCompany::Repository;

sub get_repository_from_iiop_and_path {
    my ( $corbaloc, $path ) = @_;

    my $orb = CORBA::ORB_init( +[ "-ORBInitRef", "MyServiceID=$corbaloc" ] );

    my $namingContextObject = $orb->resolve_initial_references('MyServiceID');
    my $namingContext = CosNaming::NamingContextExt::_narrow( $namingContextObject );

    my $name = $namingContext->to_name( $path );
    my $repositoryObject = $namingContext->resolve( $name );
    my $repository = MyCompany::Repository::_narrow( $repositoryObject );
    return( $orb, $repository );
}

Exception Handling

CORBA servers frequently express inability to perform an action through exceptions. I recommend explicitly including the Error CPAN module with debugging turned on so that you can produce a stacktrace pinpointing the line in your code that resulted in the triggered exception.

use Error qw(:try);
$Error::Debug = 1;

use strict;

my $orb = undef;

try {
    ( $orb, $repository ) = get_repository_from_iiop_and_path(
        "corbaloc:iiop:10.15.16.17:3900/NameService",
        "MyCompany/Repositories/MyDB"
    );

    my $version = $repository->get_version();
    my $version_string = $version->version_string;
    print( "My database version: $version_string\n" );
}
catch Error::Simple with {
    my $e = shift;
    print( STDERR "File " . $e->file . ", line " . $e->line . ": $e" );
    warn( $e->stacktrace );
}; # semi-colon MUST be here

# clean up before quitting
$orb->shutdown() if ( $orb );

POA Server

You might have need to receive events from a CORBA server. To this end you will have to run your own infinite-running thread that registers a class to receive messages.

It is VERY important that any dependent classes used in decoding received messages must be included in the script with the “use” pragma otherwise POA may generate exceptions about being unable to find the new method in the sequence or struct objects. To this end I recommend editing the CORBA/TypeCode.pm module and adding the following bold line in the _id_to_name() function during debugging so you can become aware of which classes you need to include in your main script:

return $$id_name{$id} if (defined $$id_name{$id});
warn( "Unknown id \"$id\"" );
###############################################################################

package ConsumerImpl;

# parent classes
use MyCompany::Consumer;
use PortableServer::ServantBase;

push( @ISA, 'MyCompany::Consumer', 'PortableServer::ServantBase' );

sub new {
    my $class = shift;
    my $self = $class->SUPER::new();
    return( $self );
}

sub msg_push {
    my $self = shift;

    my ( $event ) = @_;

    printf( "%s: %s\n", $event->name, join( ", ", @{$event->items} ) );
}

###############################################################################

package main;

# setting up naming context and channel
use CosNaming::NamingContextExt;
use MyCompany::Channel;

# POA and dependent classes (must be loaded to be recognised)
use PortableServer::POA;
use MyCompany::ext_event;
use MyCompany::ObjectInfo;

# pragmas
use Error qw(:try);
$Error::Debug = 1;

use strict;

sub get_channel_from_iiop_and_path {
    my ( $corbaloc, $path ) = @_;

    my $orb = CORBA::ORB_init( +[ "-ORBInitRef", "MyServiceID=$corbaloc" ] );

    my $namingContextObject = $orb->resolve_initial_references('MyServiceID');
    my $namingContext = CosNaming::NamingContextExt::_narrow( $namingContextObject );

    my $name = $namingContext->to_name( $path );
    my $object = $namingContext->resolve( $name );

    my $channel = MyCompany::Channel::_narrow( $channelObject );
    return( $orb, $channel );
}

my $orb = undef;

try {
    my $channel;

    ($orb, $channel) = get_channel_from_iiop_and_path(
        "corbaloc:iiop:10.15.16.17:3900/NameService",
        "MyCompany/Events/EventsService"
    );

    my $consumerImpl = ConsumerImpl->new();
    my $rootPOA = $orb->resolve_initial_references( "RootPOA" );
    my $poaManager = $rootPOA->the_POAManager();
    my $consumerId = $rootPOA->activate_object( $consumerImpl );
    $poaManager->activate();

    my $consumerRef = $rootPOA->id_to_reference( $consumerId );
    my $consumer = MyCompany::Consumer::_narrow( $consumerRef );

    $orb->run(); # never returns, but consumer will now receive messages
}
catch Error::Simple with {
    my $e = shift;
    print( STDERR "File " . $e->file . ", line " . $e->line . ": $e" );
    warn( $e->stacktrace );
};

# clean up before quitting
$orb->shutdown() if ( $orb );

Installing Perl Padre IDE Into Ubuntu 12.04 Precise

Running cpan Padre was giving me all sorts of troubles.

I had to follow the advice in this blog post when I got the error message Undefined subroutine &ExtUtils::ParseXS::errors – which was to apt-get install libwx-perl and edit /usr/local/lib/perl5/5.12.4/ExtUtils/xsubpp and change the errors() call to report_error_count() (as specified in this forum post).

Padre kept failing to install when it couldn’t pass tests depending on Debug::Client. To install that I had to install some dependent libraries:

apt-get install libncurses5-dev
apt-get install libreadline-dev
cpan Term::ReadLine::Gnu
cpan Debug::Client
cpan Padre

Basic Authentication in Perl by Inheriting from LWP::UserAgent or WWW::Mechanize

We can easily provide HTTP basic authentication in the LWP::UserAgent and WWW::Mechanize modules by subclassing them and overriding the get_basic_credentials() method. We also provide a set_credentials() function for providing the username and password.

Note that this is a very simple override – because the get_basic_credentials() routine may discriminate between different Authentication Realms as well as URLs. The provided function in this article merely assumes you want the username/password to be provided for every webpage accessed.

package MyAgent;

use vars qw(@ISA);
@ISA = qw(LWP::UserAgent);

my ( $username, $password ) = ( undef, undef );

# method to set username and password for authentication
sub set_credentials {
  my $self = shift;
  ( $username, $password ) = @_;
}

# this routine gets called when your browser
#   would otherwise be asked to provide a
#   username and password
sub get_basic_credentials {
  my ($self, $realm, $uri) = @_;
  print( STDERR "  - providing auth to realm \"$realm\"\n" );
  return( $username, $password );
}
package MyMech;

use vars qw(@ISA);
@ISA = qw(WWW::Mechanize);

my ( $username, $password ) = ( undef, undef );

# method to set username and password for authentication
sub set_credentials {
  ( $username, $password ) = @_;
}

# this routine gets called when your browser
#   would otherwise be asked to provide a
#   username and password
sub get_basic_credentials {
  my ($self, $realm, $uri) = @_;
  print( STDERR "  - providing auth to realm \"$realm\"\n" );
  return( $username, $password );
}

Example Usage

You can use your new class where you would have otherwise used LWP::UserAgent (or WWW::Mechanize), e.g.:

package main;

my $mech = MyMech->new();
$mech->set_credentials( "administrator", "incharge" );
$mech->get( $url );
print( $mech->status . "\n" );
print( $mech->success );

Summarising Local Changes Against A CVS Repository

Say you have a large project checked out from CVS with many subdirectories. And you want to know what files you have changed that need checking in.

You could use the cvs status command but the output can get unwieldy on the screen. The following script presents a coloured text summary.

#!/usr/bin/perl -w # -*-CPerl-*-

use constant DEBUG => 0;

use strict;

my $cvs = "cvs";

# using ANSI?
my $ansicolor = 1;
eval "use Term::ANSIColor;";
if ( $@ ) {
    $ansicolor = undef;
}

# can we detect column width?
my $columns = undef;
if ( ! $columns ) {
    eval {
        my ( $fin, $line );
        if ( open( $fin, "resize |" ) ) {
            while ( defined( $line = <$fin> ) ) {
                if ( $line =~ m/COLUMNS=(\d+)/ ) {
                    $columns = $1;
                    last;
                }
            }
        }
        close( $fin );
    };
    if ( DEBUG() && $@ ) {
        my $err = $@; warn( $err );
    }
}

if ( ! $columns ) {
    eval {
        my ( $fin, $line );
        if ( open( $fin, "stty -a |" ) ) {
	    while ( defined( $line = <$fin> ) ) {
	        if ( $line =~ m/columns\s+(\d+)/ ) {
	            $columns = $1;
		    last;
	        }
            }
	}
	close( $fin );
    };
    if ( DEBUG() && $@ ) {
        my $err = $@; warn( $err );
    }
}
#$columns-- if ( $columns && ( $columns > 1 ) );
if ( DEBUG() ) {
    print( "Using columns of \"$columns\"\n" ) if ( $columns );
    print( "Could not determine column width\n" ) if ( ! $columns );
}

# autoflush on
select( (select(STDERR), $|=1)[0] );
select( (select(STDOUT), $|=1)[0] );

# start the cvs status command
my $arg = join( " ", map { "\"$_\"" } @ARGV );

if ( ! open( FIN, "$cvs status $arg 2>&1 |" ) ) {
    die( "Could not start cvs: $!" );
}

# cycle through each line of text returned by cvs status
my $lastdir = undef;
my $lastlength = 0;
while ( defined( my $line = <FIN> ) ) {
    if ( $line =~ m/^(cvs \S+: )(Examining )(.*)/s ) {
        my $dispdir;

        my $dir = $3;
        if ( ( $columns ) && ( $columns <= length( $1 . $2 . $3 ) ) ) {
            $dir = substr( $3, 0, ( $columns - length( $1 . $2 ) ) - 1 );
        }

        if ( $ansicolor ) {
            eval '$lastdir = $1 .color("yellow") .$2 .$3 .color("reset");';
            eval '$dispdir = $1 .color("yellow") .$2 .$dir .color("reset");';
        } else {
            $lastdir = $1 . $2 . $3;
            $dispdir = $1 . $2 . $dir;
        }

        my $length12dir = length( $1 . $2 . $dir );

        $lastdir =~ s/[\r\n]+//gs;
        $dispdir =~ s/[\r\n]+//gs;

        print( ( ' ' x $lastlength ) . "\r" );
        $lastlength = $length12dir;
        print( $dispdir . "\r" );
    }

    if ( $line =~ m/^(File: )(.*?)(\s+)(Status: )(.*)/s ) {
        my $fileline = undef;
        if ( $ansicolor ) {
            eval '$fileline = $1 . color("cyan") . $2 . color("reset") . $3 . $4 . color("magenta") . $5 . color("reset")';
        } else {
            $fileline = $1 . $2. $3. $4. $5;
        }

        if ( $line !~ m/Status: Up-to-date/ ) {
            if ( $lastdir ) {
                print( ( ' ' x length($lastdir) ) . "\r" );
                print( $lastdir . "\n" );
                $lastdir = undef;
            }

            print( $fileline );
        }
    }
}

# clean up screen
if ( $lastdir ) {
    print( ' ' x length($lastdir) . "\r" );
}

# close cvs
close( FIN );

It first attempts to discover the terminal width so that directory names don’t spill off the edge when displaying what folder is currently being analysed.

Example output would be:

host@server:/tmp/cvs# cvschange.pl
cvs status: Examining mytest
File: verifymsg         Status: Locally Modified
cvs status: Examining nagiosconf
File: nsca.pl           Status: Locally Modified

This script was written in 2006.

Quickly Comparing Working Copies With CVS, SVN, and Git Repositories

A frequent problem I had was wanting to run a graphical (UI) diff between my working copy of a file and the latest (or any particular) revision in the source repository.

Of course one can run the common commands cvs diff (and similar for other repository types) but this gives a textual diff which can be less helpful when a source file is long.

So I present here the source code for the Perl scripts I use before every check-in I make.

Note that I have used tkdiff as the graphical diff utility in these scripts. I find this diff tool very lightweight and it is free. I also have variants of these utilities that use Beyond Compare which is a heavier but very powerful diff utility and one which I thoroughly recommend purchasing by any serious developer.

CVS

#!/usr/bin/perl -w
# -*-CPerl-*-

use Getopt::Long;

use strict;

# configure your utilities here
# ... in particular your favourite graphical diff application
my $diff = "tkdiff";
my $cvs = "cvs";
my $cp = "cp";
my $patch = "patch";

my %tempfile = ();
my $result;

$tempfile{patch} = "temp.patch.$$.temp";

sub patch_copy( $ $ $ $ );
sub clean_tempfiles( $ );

# process options
my @revisions = ();
Getopt::Long::Configure("bundling");
GetOptions( "revision|rev|r=s" => \@revisions );

my $cmpfile = $ARGV[0];
if ( ! $cmpfile )
{
    print( STDERR <<EOF );
cvs-tkdiff - compare a file against another version in a CVS repository

Usage:
  cvs-tkdiff [-r <revision>] [-r <revision] <filename>

Description:
  If no revision is specified then <filename> is compared against the
HEAD in the repository.

  If a single revision is provided (e.g. "cvs-tkdiff -r1.1 a.txt")
then the file ("a.txt") in the local working directory will be compared
against the version in the specified commit.

  If two revisions are specified (e.g. "cvs-tkdiff -r1.1 -r1.3 a.txt")
then the files from the two specified commits will be compared to each
other.
EOF
    exit( 1 );
}

my @filediff = (); # will contain the two files to diff against

if ( @revisions == 2 ) {
    # compare revisions mode
    my ( $rev_a, $rev_b ) = @revisions;
    $tempfile{rev_a} = "temp.file.$$.ver.$rev_a.temp";
    $tempfile{rev_b} = "temp.file.$$.ver.$rev_b.temp";

    eval {
        patch_copy( $cmpfile, $tempfile{rev_a},
                    $rev_a, $tempfile{patch} );
    };
    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    eval {
        patch_copy( $cmpfile, $tempfile{rev_b},
                    $rev_b, $tempfile{patch} );
    };
    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    @filediff = ( $tempfile{rev_a}, $tempfile{rev_b} );
} else {
    # compare against CVS version mode

    # does $cmpfile exist?
    if ( ! -r $cmpfile ) {
        print( STDERR "File \"$cmpfile\" is not readable" );
        exit( 1 );
    }

    # get current repository version of file
    my $rev_b = undef;
    if ( @revisions == 1 ) {
        $rev_b = $revisions[0];
    } else {
        my $cvsstatus = `$cvs status "$cmpfile"`;
        my $repositoryversion = undef;
        if ( $cvsstatus =~ m/Repository\srevision:\s+([a-zA-Z_0-9.-]+)/i ) {
            $rev_b = $1;
        } else {
            die( "Could not obtain current repository version" );
        }
    }

    $tempfile{rev_b} = "temp.file.$$.ver.$rev_b.temp";
    eval {
        patch_copy( $cmpfile, $tempfile{rev_b},
                    $rev_b, $tempfile{patch} );
    };

    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    @filediff = ( $tempfile{rev_b}, $cmpfile );
}

# run diff
system( $diff, @filediff );

# clean up
clean_tempfiles( \%tempfile );

exit( 0 );

###############################################################################
# subroutines
#

sub clean_tempfiles( $ ) {
    my ( $tempfileref ) = @_;

    foreach my $key ( keys %{$tempfileref} ) {
        unlink( $tempfileref->{$key} ) if ( -f $tempfileref->{$key} );
    }
}

sub patch_copy( $ $ $ $ ) {
    my ( $fileorig, $filedest, $revision, $patchfile ) = @_;

    # copy original file to destination
    my $result = `$cp "$fileorig" "$filedest"`;
    die ( "Could not copy original file" ) if ( ! -r $filedest );

    # create patch file
    $result = `$cvs diff -r $revision -u "$cmpfile" >"$patchfile"`;
    if ( ! -r $patchfile ) {
        die( "Could not create patch file revision $revision from cvs" );
    }

    $result = `$patch -R -p0 $filedest $patchfile`;
}

SVN

#!/usr/bin/perl -w
# -*-CPerl-*-

use File::Copy;
use Getopt::Long;

use strict;

# configure your utilities here
# ... in particular your favourite graphical diff application
my $diff = "tkdiff";
my $svn = "svn";
my $patch = "patch";

my %tempfile = ();
my $result;

$tempfile{patch} = "temp.patch.$$.temp";

sub normalise_eol( $ $ );
sub patch_copy( $ $ $ $ );
sub clean_tempfiles( $ );

# process options
my @revisions = ();
Getopt::Long::Configure("bundling");
GetOptions( "revision|rev|r=s" => \@revisions );

my $cmpfile = $ARGV[0];
if ( ! $cmpfile )
{
    print( STDERR <<EOF );
svn-tkdiff - compare a file against another version in a SVN repository

Usage:
  svn-tkdiff [-r <revision>] [-r <revision] <filename>

Description:
  If no revision is specified then <filename> is compared against the
HEAD in the repository.

  If a single revision is provided (e.g. "svn-tkdiff -r3266 a.txt")
then the file ("a.txt") in the local working directory will be compared
against the version in the specified commit.

  If two revisions are specified (e.g. "svn-tkdiff -r3266 -r5230 a.txt")
then the files from the two specified commits will be compared to each
other.
EOF
    exit( 1 );
}

my @filediff = (); # will contain the two files to diff against

if ( @revisions == 2 ) {
    # compare revisions mode
    my ( $rev_a, $rev_b ) = @revisions;
    $tempfile{rev_a} = "temp.file.$$.ver.$rev_a.temp";
    $tempfile{rev_b} = "temp.file.$$.ver.$rev_b.temp";

    eval {
        patch_copy( $cmpfile, $tempfile{rev_a},
                    $rev_a, $tempfile{patch} );
    };
    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    eval {
        patch_copy( $cmpfile, $tempfile{rev_b},
                    $rev_b, $tempfile{patch} );
    };
    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    @filediff = ( $tempfile{rev_a}, $tempfile{rev_b} );
} else {
    # compare against SVN version mode

    # does $cmpfile exist?
    if ( ! -r $cmpfile ) {
        print( STDERR "File \"$cmpfile\" is not readable" );
        exit( 1 );
    }

    # get current repository version of file
    my $rev_b = "HEAD";
    if ( @revisions == 1 ) {
        $rev_b = $revisions[0];
    } else {
        my $svnstatus = `$svn status -v "$cmpfile"`;
        if ( $svnstatus =~ m/^.{7}\s*(\d+)\s*(\d+)/ ) {
            $rev_b = $2;
        } else {
            die( "Could not obtain current repository version" );
        }
    }

    $tempfile{rev_b} = "temp.file.$$.ver.$rev_b.temp";
    eval {
        patch_copy( $cmpfile, $tempfile{rev_b},
                    $rev_b, $tempfile{patch} );
    };

    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    @filediff = ( $tempfile{rev_b}, $cmpfile );
}

# run diff
system( $diff, @filediff );

# clean up
clean_tempfiles( \%tempfile );

exit( 0 );

###############################################################################
# subroutines
#

sub clean_tempfiles( $ ) {
    my ( $tempfileref ) = @_;

    foreach my $key ( keys %{$tempfileref} ) {
        unlink( $tempfileref->{$key} ) if ( -f $tempfileref->{$key} );
    }
}

sub patch_copy( $ $ $ $ ) {
    my ( $fileorig, $filedest, $revision, $patchfile ) = @_;

    # copy original file to destination
    my $result = File::Copy::copy( "$fileorig", "$filedest" );
    die ( "Could not copy original file" ) if ( ! -r $filedest );

    # create patch file
    $result = `$svn diff -r $revision -x -u "$cmpfile" >"$patchfile"`;
    if ( ! -r $patchfile ) {
        die( "Could not create patch file revision $revision from svn" );
    }

    # make line endings common
    normalise_eol( $filedest, "\n" );
    normalise_eol( $patchfile, "\n" );

    $result = `$patch -R -p0 $filedest -i $patchfile`;
}

sub normalise_eol( $ $ ) {
    my ( $filename, $eol ) = @_;

    my $fin = undef;
    if ( ! open( $fin, "<$filename" ) ) {
        die( "Failed to open file \"$filename\" for reading: $!" );
    }

    my $fout = undef;
    if ( ! open( $fout, ">$filename.$$.tmp" ) ) {
        die( "Failed to open file \"$filename.$$.tmp\" for writing: $!" );
    }

    my $line;
    while ( defined( $line = <$fin> ) ) {
        $line =~ s/(\r\n|\n\r|\n|\r)/$eol/sg;
        print( $fout $line );
    }
    close( $fout );
    close( $fin );

    rename( "$filename.$$.tmp", $filename );
}

Git

#!/usr/bin/perl -w
# -*-CPerl-*-

use Getopt::Long;

use strict;

# configure your utilities here
# ... in particular your favourite graphical diff application
my $diff = "tkdiff";
my $git = "GIT_PAGER=cat git";
my $cp = "cp";
my $patch = "patch";

my %tempfile = ();
my $result;

$tempfile{patch} = "temp.patch.$$.temp";

sub patch_copy( $ $ $ $ );
sub clean_tempfiles( $ );

# process options
my @revisions = ();
Getopt::Long::Configure("bundling");
GetOptions( "revision|rev|r=s" => \@revisions );

my $cmpfile = $ARGV[0];
if ( ! $cmpfile )
{
    print( STDERR <<EOF );
git-tkdiff - compare a file against another version in a git repository

Usage:
  git-tkdiff [-r <revision>] [-r <revision] <filename>

Description:
  If no revision is specified then <filename> is compared against the
HEAD in the local repository.

  If a single revision is provided (e.g. "git-tkdiff -r0f1bef a.txt")
then the file ("a.txt") in the local working directory will be compared
against the version in the specified commit.

  If two revisions are specified (e.g. "git-tkdiff -r0f1b -rba88 a.txt")
then the files from the two specified commits will be compared to each
other.

  If you want to compare your local copy against the most current version
in the remote repository then specify "origin/master" as the revision, e.g.
    git-tkdiff -rorigin/master a.txt
EOF
    exit( 1 );
}

my @filediff = (); # will contain the two files to diff against

sub safe_fname {
    my $orig = $_[0];
    $orig =~ s/[^a-zA-Z0-9_-]/_/g;
    return( $orig );
}

if ( @revisions == 2 ) {
    # compare revisions mode
    my ( $rev_a, $rev_b ) = @revisions;
    my ( $srev_a, $srev_b ) = map { safe_fname($_) } @revisions;

    $tempfile{rev_a} = "temp.file.$$.ver.$srev_a.temp";
    $tempfile{rev_b} = "temp.file.$$.ver.$srev_b.temp";

    eval {
        patch_copy( $cmpfile, $tempfile{rev_a},
                    $rev_a, $tempfile{patch} );
    };
    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    eval {
        patch_copy( $cmpfile, $tempfile{rev_b},
                    $rev_b, $tempfile{patch} );
    };
    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    @filediff = ( $tempfile{rev_a}, $tempfile{rev_b} );
} else {
    # compare against git HEAD version mode

    # does $cmpfile exist?
    if ( ! -r $cmpfile ) {
        print( STDERR "File \"$cmpfile\" is not readable" );
        exit( 1 );
    }

    # get current repository version of file
    my $rev_b = undef;
    if ( @revisions == 1 ) {
        $rev_b = $revisions[0];
    } else {
        my $gitlatestlog = `$git log --full-index --summary -1 "$cmpfile"`;
        my $repositoryversion = undef;
        if ( $gitlatestlog =~ m/^commit\s+([a-fA-F0-9]{40})/m ) {
            $rev_b = $1;
        } else {
            die( "Could not obtain current repository version" );
        }
    }

    my $srev_b = safe_fname( $rev_b );
    $tempfile{rev_b} = "temp.file.$$.ver.$srev_b.temp";
    eval {
        patch_copy( $cmpfile, $tempfile{rev_b},
                    $rev_b, $tempfile{patch} );
    };

    if ( $@ ) {
        my $error = $@;
        clean_tempfiles( \%tempfile );
        die( "Could not retrieve repository version: $error" );
    }

    @filediff = ( $tempfile{rev_b}, $cmpfile );
}

# run diff
system( $diff, @filediff );

# clean up
clean_tempfiles( \%tempfile );

exit( 0 );

###############################################################################
# subroutines
#

sub clean_tempfiles( $ ) {
    my ( $tempfileref ) = @_;

    foreach my $key ( keys %{$tempfileref} ) {
        unlink( $tempfileref->{$key} ) if ( -f $tempfileref->{$key} );
    }
}

sub patch_copy( $ $ $ $ ) {
    my ( $fileorig, $filedest, $revision, $patchfile ) = @_;

    # copy original file to destination
    my $result = `$cp "$fileorig" "$filedest"`;
    die ( "Could not copy original file" ) if ( ! -r $filedest );

    # create patch file
    $result = `$git diff -r $revision -u -- "$cmpfile" >"$patchfile"`;
    if ( ! -r $patchfile ) {
        die( "Could not create patch file revision $revision from cvs" );
    }

    $result = `$patch -R -p0 $filedest $patchfile`;
}

History

I wrote cvs-tkdiff.pl in 2006 and have subsequently written the variants – git-tkdiff.pl was written in 2009.