newspaint

Documenting Problems That Were Difficult To Find The Answer To

Monthly Archives: April 2013

Best Sort Codes in the UK

Want to open a bank account in the UK? Well if you’re a bit of a number nerd you might want to plan ahead which bank and branch you visit to open your account.

This is because the bank/branch is allocated a Sort Code – a 6 digit number made up of 3 pairs of digits – and there are some visually interesting combinations. Once you are allocated your first ever account number all subsequent accounts you open will generally come from the same branch – so it pays to choose carefully which sort code you want for a long time!

A great site to pick through (if it still exists at the time you read this) is http://sortcode.a1feeds.com/.

Some interesting sort codes and branches are:

Sort Code  Bank                  Branch
40-01-01   HSBC                  Acton High Street, London
40-01-04   HSBC                  Fenchurch Street, London
40-04-01   HSBC                  Kensington High Street, London
40-04-04   HSBC                  Kilburn High Road, London
11-10-09   Halifax               Queens Arcade, Cardiff
11-11-00   Halifax               Hounslow High Street, London
11-12-13   Halifax               Commercial Way, Woking
30-99-30   Lloyds TSB            Welshpool, Birmingham
60-60-60   National Westminster  Westminster House, Gibraltar

Getting To The Bottom Of Why A PhantomJS Page Load Fails

For this post I’m using PhantomJS version 1.9.

Quite frustratingly I occasionally have a call to page.open() where my callback receives a status of “fail”. This isn’t very helpful as it doesn’t describe what went wrong. Was it an SSL handshake problem (using the --ignore-ssl-errors=true command line argument may solve such problems)? Something else?

Unfortunately the PhantomJS API, at present, doesn’t appear to have an ability to determine the reason for the failure of the page to load. But there are a number of callbacks we can hook into to generate a lot of debugging messages to allow us to determine the reason for the failure.

Simplified Reason Tracking

Just before calling page.open() add the following code (after creating the page variable):

    page.onResourceError = function(resourceError) {
        page.reason = resourceError.errorString;
        page.reason_url = resourceError.url;
    };

Now you can print out the reason for a problem in your page.open() callback, e.g.:

var page = require('webpage').create();

page.onResourceError = function(resourceError) {
    page.reason = resourceError.errorString;
    page.reason_url = resourceError.url;
};

page.open(
    "http://www.nosuchdomain/",
    function (status) {
        if ( status !== 'success' ) {
            console.log(
                "Error opening url \"" + page.reason_url
                + "\": " + page.reason
            );
            phantom.exit( 1 );
        } else {
            console.log( "Successful page open!" );
            phantom.exit( 0 );
        }
    }
);

This script outputs the following:

Error opening url "http://www.nosuchdomain/": Host www.nosuchdomain not found
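
The handler above keeps only the most recent error, so if several resources fail before the callback runs the earlier reasons are lost. A minimal sketch of recording all of them follows. It assumes only the url and errorString fields already used above; trackResourceErrors(), describeErrors() and page.errors are hypothetical names for illustration, not part of the PhantomJS API.

```javascript
// Hypothetical helper: record every resource error on the page object,
// not just the last one. Works on any object that lets us assign an
// onResourceError callback (as a PhantomJS webpage does).
function trackResourceErrors(page) {
    page.errors = [];
    page.onResourceError = function (resourceError) {
        page.errors.push({
            url: resourceError.url,
            reason: resourceError.errorString
        });
    };
    return page;
}

// Format the recorded errors, one per line, for printing in the
// page.open() callback when status is not 'success'.
function describeErrors(page) {
    return page.errors.map(function (e) {
        return 'Error opening url "' + e.url + '": ' + e.reason;
    }).join('\n');
}
```

In a real script you would call trackResourceErrors( page ) once after creating the page, then print describeErrors( page ) in the failure branch of the callback.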

Detailed Logging

Just before calling page.open() add the following code (after creating the page variable):

    page.onResourceRequested = function (request) {
        system.stderr.writeLine('= onResourceRequested()');
        system.stderr.writeLine('  request: ' + JSON.stringify(request, undefined, 4));
    };

    page.onResourceReceived = function(response) {
        system.stderr.writeLine('= onResourceReceived()' );
        system.stderr.writeLine('  id: ' + response.id + ', stage: "' + response.stage + '", response: ' + JSON.stringify(response));
    };

    page.onLoadStarted = function() {
        system.stderr.writeLine('= onLoadStarted()');
        var currentUrl = page.evaluate(function() {
            return window.location.href;
        });
        system.stderr.writeLine('  leaving url: ' + currentUrl);
    };

    page.onLoadFinished = function(status) {
        system.stderr.writeLine('= onLoadFinished()');
        system.stderr.writeLine('  status: ' + status);
    };

    page.onNavigationRequested = function(url, type, willNavigate, main) {
        system.stderr.writeLine('= onNavigationRequested');
        system.stderr.writeLine('  destination_url: ' + url);
        system.stderr.writeLine('  type (cause): ' + type);
        system.stderr.writeLine('  will navigate: ' + willNavigate);
        system.stderr.writeLine('  from page\'s main frame: ' + main);
    };

    page.onResourceError = function(resourceError) {
        system.stderr.writeLine('= onResourceError()');
        system.stderr.writeLine('  - unable to load url: "' + resourceError.url + '"');
        system.stderr.writeLine('  - error code: ' + resourceError.errorCode + ', description: ' + resourceError.errorString );
    };

    page.onError = function(msg, trace) {
        system.stderr.writeLine('= onError()');
        var msgStack = ['  ERROR: ' + msg];
        if (trace) {
            msgStack.push('  TRACE:');
            trace.forEach(function(t) {
                msgStack.push('    -> ' + t.file + ': ' + t.line + (t.function ? ' (in function "' + t.function + '")' : ''));
            });
        }
        system.stderr.writeLine(msgStack.join('\n'));
    };

It is important that this block is only reached after the page and system variables have been defined, e.g.:

var system = require('system');
var page = require('webpage').create();

PhantomJS Exit Doesn’t Exit Where You Expect It To Exit

Here’s a little trick using PhantomJS 1.9:

var system = require('system');

phantom.exit();
system.stdout.writeLine( "This is printed after exit!" );

Here’s what it does – it prints out:

This is printed after exit!

This doesn’t happen if you use the console.log() function but it does with the system.stdout and system.stderr objects.

It appears that PhantomJS finishes running the current function before actually terminating the JavaScript interpreter. So one should always take preventative actions to ensure no more code can run after a call to phantom.exit().
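
One way to take that precaution is to remember that an exit has been requested and check the flag before doing further work. The following is only a sketch: makeExitGuard() is a hypothetical helper, and the exit function is passed in so the logic can be tried outside PhantomJS.

```javascript
// Hypothetical guard: wraps an exit function (phantom.exit in PhantomJS)
// and remembers that exit was requested, so later code can bail out early.
function makeExitGuard(exitFn) {
    var exiting = false;
    return {
        exit: function (code) {
            if (exiting) { return; }   // ignore repeated calls
            exiting = true;
            exitFn(code);
        },
        isExiting: function () {
            return exiting;
        }
    };
}

// Usage sketch inside a PhantomJS script (assumed, not run here):
//   var guard = makeExitGuard(function (code) { phantom.exit(code); });
//   guard.exit(0);
//   if (!guard.isExiting()) { system.stdout.writeLine("not printed"); }
```

Inside a callback, simply following the call to phantom.exit() with a return statement is often enough; the guard is mainly useful when the code that might still run lives in a different function.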

First Great Western Refund Fees Outrageous

So I made an online booking for a ticket but accidentally chose the wrong day (I picked today instead of 2 days into the future). It was through First Great Western’s web booking site and so I simply logged on expecting to be able to change the date or refund the ticket (I hadn’t collected it from a ticketing machine, so the ticket hadn’t even been issued yet).

Here is what I saw:

First Great Western Fee For Ticket Refund


Incredibly, a £5.90 ticket, which would cost exactly the same two days later, would cost a whopping £10 to refund – and the site will refuse to let me cancel the ticket.

Oh, and it won’t let me change the date on the ticket:

Booking Details Page Refusing Amendment To Ticket


So TEN POUNDS just to perform a few actions through an automated service. This is what is legally defined as a penalty fee – which is illegal in the United Kingdom. There is no possible way it costs First Great Western TEN POUNDS to process an unissued ticket refund.

How To Find A Human Being At The DVLA To Discuss Vehicle Matters

Want To Contact A Person At The UK DVLA About Your Vehicle?

2017-08-22 Update

As you’ll read from a lot of the comments below – the trick seems to be to dial the DVLA – then just simply wait when asked to enter an option. Just wait, wait, and wait. Eventually the number will put you through to an operator. Might take you several minutes but just don’t press any buttons after you’ve dialed. Let the automated operator get as frustrated as it wants!

Original Article

This link, from the DFT (Department For Transport), has a phone number, 0300 790 6802 (03007906802) that will actually take you through to a call centre agent. I found them very helpful and they dealt with my issue quickly – to my surprise, after pulling out my hair in frustration at the Internet and other phone options that couldn’t help me.

I had wasted a lot of time calling 0300 1234 321 (03001234321) but this was an automated line only and couldn’t deal with my problem. That number might be fine for anything you could already do on the Internet/web but, infuriatingly, had no option to talk with a person.

Waiting For Page To Load In PhantomJS

Here is a function I’ve created which waits for an element in the DOM to appear in PhantomJS.

Some modern JavaScript-dependent pages will accept your form submission then dynamically load the desired response – but this can take some time.

Syntax

The following function takes four parameters:

  • page – reference to the PhantomJS webpage object
  • selector – a string to pass to document.querySelector() to wait for
  • expiry – milliseconds past epoch at which waiting should cease
  • callback – the function to call on expiry or selector element found

Example

For example:

    // click button
    page.evaluate(
        function () {
            document.querySelector("button[name=do]").click();
            document.querySelector("form[name=theform]").submit();
        }
    );

    waitFor(
        page,
        "span.from", // wait for this object to appear
        (new Date()).getTime() + 5000, // timeout at 5 seconds from now
        function (status) {
            system.stderr.writeLine( "- submission status: " + status );

            if ( status ) {
                // success, element found by waitFor()
                page.render( "/tmp/results.png" );
                process_rows( page );
            } else {
                // waitFor() timed out
                phantom.exit( 1 );
            }
        }
    );

Implementation

The waitFor() function is defined as:

function waitFor( page, selector, expiry, callback ) {
    system.stderr.writeLine( "- waitFor( " + selector + ", " + expiry + " )" );

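    // check whether the desired element exists in the page; return a boolean
    // because page.evaluate() can only pass back simple, serialisable values
    var result = page.evaluate(
        function (selector) {
            return document.querySelector( selector ) !== null;
        }, selector
    );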

    // if desired element found then call callback after 50ms
    if ( result ) {
        system.stderr.writeLine( "- trigger " + selector + " found" );
        window.setTimeout(
            function () {
                callback( true );
            },
            50
        );
        return;
    }

    // determine whether timeout is triggered
    var finish = (new Date()).getTime();
    if ( finish > expiry ) {
        system.stderr.writeLine( "- timed out" );
        callback( false );
        return;
    }

    // haven't timed out, haven't found object, so poll in another 100ms
    window.setTimeout(
        function () {
            waitFor( page, selector, expiry, callback );
        },
        100
    );
}

A couple of notes. The function polls every 100 milliseconds. When it detects the desired object in the DOM it waits a further 50 milliseconds before calling the callback, as a precaution in case the page was still in the middle of being generated at the moment the element was detected.
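
The decision made on each pass of waitFor() can be written as a small pure function, which makes the order of the checks easy to see and to try out without a page. This is only an illustrative sketch; waitForStep() is a hypothetical name and is not used by the implementation above.

```javascript
// Hypothetical pure helper mirroring the three outcomes of each waitFor() pass.
// found:  whether the selector matched on this pass
// now:    current time in milliseconds past epoch
// expiry: time in milliseconds past epoch at which waiting should cease
function waitForStep(found, now, expiry) {
    if (found) {
        return "found";      // callback( true ) after the 50ms settle delay
    }
    if (now > expiry) {
        return "timeout";    // callback( false )
    }
    return "poll";           // check again in 100ms
}
```

Because the element check comes first, an element that appears on the final pass still counts as a success even if the expiry time has already passed. With the 5 second timeout from the example and a 100 millisecond interval, that works out to roughly 50 polls at most.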

How Do I Get The Current Page URL In PhantomJS?

If you simply do the following in PhantomJS:

console.log( "- current url is " + document.URL );

then you will see the filename of the JavaScript file you are running with PhantomJS.

If you want to see the URL of the currently loaded page, however, then you have to do it within the loaded page’s sandbox:

var url = page.evaluate(
    function () {
        return document.URL;
    }
);

console.log( "- current url is " + url );

Passing Variable Number of Command Line Arguments in BASH

The Problem

Sometimes you want to write a script that calls another program but passes all the command-line arguments to that other program. The problem is that the arguments might contain spaces that you would ordinarily enclose in quote marks.

For example, imagine if you wanted to write a script that put a vertical line between the arguments. You might do this as follows:

me@host:~# perl -e "print( join('|',@ARGV) )" one "four hundred" twelve
one|four hundred|twelve

Now you want to wrap this in a shell script so that you don’t have to remember the Perl code:

#!/bin/bash

perl -e "print( join('|',@ARGV) )" $@

But this doesn’t work because if you try running this you’ll get the following:

me@host:~# myscript.sh one "four hundred" twelve
one|four|hundred|twelve

Clearly this is a problem. There should have been a space between “four” and “hundred”.

The Solution

The trick is to use the expression ${1+"$@"} as this wraps quote marks around each argument if there is at least one argument.

So if we change the script to the following:

#!/bin/bash

perl -e "print( join('|',@ARGV) )" ${1+"$@"}

.. then the script will work as expected now:

me@host:~# myscript.sh one "four hundred" twelve
one|four hundred|twelve

How Does This Work?

Any expression ${variable+alternative} results in alternative if variable is set, and in nothing if it is not. The expression "$@" expands to each parameter as a separate word, as if each one had double quotes around it, so embedded spaces are preserved. The problem is that in some older Bourne shells "$@" results in a single zero-length parameter if there are no arguments (bash and other POSIX shells expand it to nothing, so there a plain "$@" is enough). And sometimes you need to pass no parameters to a command.

So instead of passing "$@" to the command, we test whether the first argument exists with ${1+alternative} and provide "$@" only if it does (i.e. ${1+"$@"}).
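
To see why the quoting matters, here is a toy model, written in JavaScript like the other examples on this page, of how the script's arguments turn into the arguments Perl receives. It only illustrates word splitting on the default separators (space, tab, newline) and ignores filename globbing; it is not how bash is implemented, and expandUnquoted() and expandQuoted() are hypothetical names.

```javascript
// Toy model of unquoted $@: each argument is split again on whitespace
// (space, tab, newline) and empty pieces vanish.
function expandUnquoted(args) {
    var words = [];
    args.forEach(function (arg) {
        arg.split(/[ \t\n]+/).forEach(function (piece) {
            if (piece !== "") { words.push(piece); }
        });
    });
    return words;
}

// Toy model of "$@": every argument is passed through unchanged.
function expandQuoted(args) {
    return args.slice();
}

var args = ["one", "four hundred", "twelve"];
console.log(expandUnquoted(args).join("|")); // one|four|hundred|twelve
console.log(expandQuoted(args).join("|"));   // one|four hundred|twelve
```

The first line of output matches the broken script and the second matches the working one.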

Adblock for PhantomJS

Starting from version 1.9 of PhantomJS there exists the ability to abort a request for a URL.

The below code is an example of how to do this (blocking by site name):

// extract domain name from a URL
function sitename( url ) {
    var result = /^https?:\/\/([^\/]+)/.exec( url );
    if ( result ) {
        return( result[1] );
    } else {
        return( null );
    }
}

// add a callback to every request performed on a webpage
function adblock( page ) {
    page.onResourceRequested = function ( requestData, networkRequest ) {
        // pull out site name from URL
        var site = sitename( requestData.url );
        if ( ! site )
            return;

        // abort requests for particular domains
        if (
            ( /\.doubleclick\./.test( site ) ) ||
            ( /\.pubmatic\.com$/.test( site ) )
        ) {
            console.error( "  - BLOCKED URL from " + site );
            networkRequest.abort();
            return;
        }
    };
}

var page = require('webpage').create();
adblock( page );
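
If the list of blocked domains grows, it may be tidier to keep the patterns in an array and test against all of them. A sketch under that assumption follows; hostOf() repeats the regular expression from sitename() above so that the block is self-contained, and isBlocked() and blocklist are hypothetical names.

```javascript
// extract domain name from a URL (same regular expression as sitename() above)
function hostOf(url) {
    var result = /^https?:\/\/([^\/]+)/.exec(url);
    return result ? result[1] : null;
}

// Hypothetical helper: true if the URL's host matches any pattern in the list.
function isBlocked(url, patterns) {
    var host = hostOf(url);
    if (!host) { return false; }   // not an http(s) URL, leave it alone
    return patterns.some(function (re) { return re.test(host); });
}

// Example blocklist using the two patterns from the code above.
var blocklist = [ /\.doubleclick\./, /\.pubmatic\.com$/ ];
```

Inside onResourceRequested you would then call networkRequest.abort() when isBlocked( requestData.url, blocklist ) is true. Note that the captured host includes any port number, so an anchored pattern such as /\.pubmatic\.com$/ will not match a host written as ads.pubmatic.com:8080.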

If, for example, you wanted to prevent images from being loaded, you could define adblock() to be:

function adblock( page ) {
    // use a regex literal: inside a quoted string the backslashes would be lost
    var regexpImg = /\.(jpe?g|png|gif|svg)(\?.*)?$/i;

    page.onResourceRequested = function ( requestData, networkRequest ) {
        if ( regexpImg.test( requestData.url ) ) {
            console.error( "  - BLOCKED URL: " + requestData.url );
            networkRequest.abort();
            return;
        }
    };
}
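
Before relying on a pattern like this it is worth trying it against a few sample URLs. The sketch below does that with a regular-expression literal and also shows the doubled backslashes needed to express the same pattern through new RegExp() with a string; isImageUrl() and the variable names are hypothetical.

```javascript
// Regular-expression literal: matches URLs ending in an image extension,
// optionally followed by a query string.
var imagePattern = /\.(jpe?g|png|gif|svg)(\?.*)?$/i;

// The same pattern built from a string needs each backslash doubled,
// because the string literal consumes one level of escaping.
var imagePatternFromString = new RegExp('\\.(jpe?g|png|gif|svg)(\\?.*)?$', 'i');

// Hypothetical helper for trying the pattern out.
function isImageUrl(url) {
    return imagePattern.test(url);
}

console.log(isImageUrl("http://example.com/logo.PNG"));        // true
console.log(isImageUrl("http://example.com/photo.jpeg?v=2"));  // true
console.log(isImageUrl("http://example.com/script.js"));       // false
```

Matching on the file extension is only a heuristic: images served from URLs without one of these extensions will still be loaded.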

Launching EC2 Instance From Command-Line But Can’t Find AMI

I was using the command ec2-run-instances to create an EC2 instance. However I kept getting the following error message:

Client.InvalidAMIID.NotFound: The image id '[ami-xxxxxxxx]' does not exist
Error starting (256):

I couldn’t understand it. I was trying to launch the standard Ubuntu Server 64-bit 12.04.1 instance in eu-west-1 (ami-f2191786) but kept getting the above error. I tried other AMIs and kept getting the same error.

It wasn’t until I explicitly added the region option to the command line that the launch worked. Given that I was specifying a subnet (VPC) that was specific to the eu-west-1 region it puzzles me that this error was being thrown in the first place. Oh well.

To summarise, make sure you add the following flag to your launch:

--region eu-west-1

.. or whatever your region happens to be for that AMI (AMIs are region-specific but ec2-run-instances cannot deduce your region from other region-specific flags such as --subnet).

Note that the --region flag also needs to be specified to many other commands such as ec2-create-tags and ec2-terminate-instances.