Posts tagged sorn

‘Tax Dodge’ bikers vindicated – so where are the apologies?

The all new 2007 figures from the Department for Transport for Vehicle Excise Duty (aka Road Tax) evasion have just been released.

You can tell something is going to be different, when you see statements like this:

Substantial improvements in the way that the roadside survey data are collected mean that evasion estimates for 2007 are not directly comparable with those from previous years.
Analyses of this year’s survey data also suggest that misread registration marks do not have a neutral effect on estimates as previously thought and, instead, tend to inflate estimates of evasion.

This all adds up to a breathtaking conclusion – the evasion estimates reported last year for motorcycles were probably overestimated by a staggering 300% (or thereabouts – effectively the stats were done in such a different way that it is impossible to do a direct comparison. Note also that the figures for cars were overestimated by a similar percentage – but with less dramatic effect or tabloid outrage.)

Put another way, the headline 38% evasion figure reported last year, and repeated last month with some anti-biker vitriol by MP Edward Leigh, was roughly 4 times higher than it should have been.

At least.

In fact there are still some problems with the reported figure of 9.8% evasion for bikers.

First, the sample size is still very small. That makes the error margin over 50%, so (even taking nothing else into account) the figures for bikers could be as low as 4.7%.
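The arithmetic behind those two claims can be checked in a few lines. This is only a sketch using the percentages quoted above; the function names are made up for illustration and nothing here comes from the DfT's own calculations.

```python
# Figures quoted in the post (percent of motorcycles evading VED).
OLD_HEADLINE = 38.0   # figure reported the previous year
NEW_ESTIMATE = 9.8    # revised 2007 figure
LOWER_BOUND = 4.7     # low end of the range quoted above


def times_higher(old, new):
    """How many times larger the old figure is than the new one."""
    return old / new


def relative_margin(estimate, bound):
    """Distance from estimate to bound, as a fraction of the estimate."""
    return abs(estimate - bound) / estimate


factor = times_higher(OLD_HEADLINE, NEW_ESTIMATE)
margin = relative_margin(NEW_ESTIMATE, LOWER_BOUND)

print(f"old figure is {factor:.1f}x the new one")
print(f"lower bound is {margin:.0%} below the estimate")
```

The old headline comes out at a little under four times the new estimate, and the 4.7% lower bound sits just over half the estimate below it, which is where "roughly 4 times" and "over 50%" come from.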

Second, the change to the survey methodology that had the biggest impact was the switch to using Automatic Number Plate Recognition (ANPR) cameras. Using these they were able to check misread plates for the first time, and found they were incorrectly matching to vehicles removed from the roads much more often than they had expected.

BUT, for collecting the data on motorcycles they did not use ANPR, but instead relied on contractors stood by the side of the road with a clipboard. It seems inevitable that a guy with a clipboard by a motorway, trying to jot down the number of a bike travelling at 70mph (let’s assume bikers don’t ever break the speed limit), is going to write down the wrong number more often than an ANPR computer, which takes a still photo of the same vehicle and then uses Optical Character Recognition software to match up the letters. The reason is simple: the computer doesn’t have to deal with the effects of high-speed movement.

Someone might have picked up on this, had the DfT not glibly stated in the previous year’s report that they had computed the effect as a ‘slight upward bias.’ The admission that they got this so badly wrong will be little comfor

The DfT also notes they made a number of other changes to the statistical methodology, in line with the Southampton University report into their previous methods and assumptions.

It is therefore my opinion that the figures for tax evasion by motorcyclists, although markedly reduced and only a quarter of what was previously being claimed, are still a considerable overestimate.

If next year they manage to use ANPR to record motorcycles as well as cars, and also collect some hard data about relative mileages traveled by taxed vs untaxed motorcycles (which currently they only have for trucks), my bet is that the numbers will dramatically fall again.

But in light of this publication, where are the apologies?

Miscalculations of this magnitude represent some serious bungling by the ‘top statisticians’ we pay our taxes to employ. I think, at the very least, bikers are owed some major apologies from Edward Leigh, the House of Commons Public Accounts Committee, the Department for Transport and National Statistics. The DVLA probably deserve an apology too – they were castigated for their poor performance in managing tax evasion, even though their own figures suggested they were collecting more tax than ever.

See also MCN’s story.

Government wrongly brands bikers as tax cheats.

This is a summary of my more detailed post on Friday, outlining the key points.

  • The DfT surveyed traffic in June and July.
  • They waited until September to ensure no late corrections to the data.
  • By this time many of the bikers surveyed had taken their bikes off the road for the winter.
  • At this point the DfT checked the figures against the DVLA’s VED database.
  • The bikes which had been removed from the road were mistakenly assumed to be evading tax.
  • The error was then amplified by a “corrective” assumption: because tax dodgers would use their bikes less and so get missed by the survey, the number of evaders would be underestimated. This one step doubled the number of bikers assumed to be riding without tax. It did so by assuming that bike mileage figures would relate to evasion in the same way as they do for trucks (the only category of vehicle they have stats for). This doesn’t take account of the important fact that the average motorcycle covers many, many fewer miles than the average truck. That means the bikers you see most often are not necessarily those travelling furthest, as they would be for trucks, but are much more likely to be people who just live near a survey site. So the assumption that you will see lower than actual levels of evasion (because tax dodgers travel less far) is undermined, as only a very few of the people on bikes are travelling large distances.
  • Because of the small number of motorcyclists surveyed, the DfT’s own figures show that the margin for error would be at least 20% either way, even if the assumptions had been true.


What does this mean?

These are quite major flaws in the methodology of the survey and (I think) blow apart the reported figures. The headline figure was extrapolated from an “observed” figure of only 16% on the basis that more tax-evading motorcyclists would have been missed, as they don’t travel as far, which I’ve shown above is almost certainly a flawed assumption. If only around 10% of the riders had SORNed their bike at the end of August or during early September, the vast majority of the “untaxed” bikers would disappear from the stats. Add to that a 20% margin of error because of the small survey size, and the figures may well be comparable with the rates for cars. A precise figure is going to be very difficult to arrive at, as no-one currently has the relevant data that could quantify the errors more precisely.

Whose fault is it?

The mathematics used in the statistical modelling was all applied correctly. The errors arose because of mistaken assumptions about how motorbikes are used and would probably have been spotted if a single representative of the motorcycling community had been consulted at the design stage of the survey. What probably should have been spotted is the ridiculously high figure of 38% evasion, which should, I believe, have raised alarm bells. I suspect that this is why Southampton University were asked to double check the result, but they only checked the statistical techniques used, and did not carry out an assessment of the way the VED data had been obtained nor of the validity of the underlying assumptions.

So the blame for all of this lies with whoever designed the survey and data processing methodology, and not with anyone who actually carried it out.

I think at the very least all bikers are owed an apology from Edward Leigh MP, of the Public Accounts Committee, for his intemperate remarks. And another apology is due, I feel, from the DfT, for managing to balls up the figures in quite such a spectacular fashion.

Evading Road Tax is wrong – but not as wrong as the Government Statistics

You probably saw on the news the shocking headline that 40% of motorcycle users are riding around on untaxed bikes. The head of the House of Commons Public Accounts Committee (PAC) (download pdf of the PAC’s Vehicle Excise Duty report here), Edward Leigh, went as far as to comment:

Large parts of the biking community are cocking a snook at the law.

Which would be fine if it bore any resemblance to reality. Anyone who is actually a part of the biking community will have probably been scratching their heads trying to work out who these evaders might be, as have our friends at MAG – they say:

Anecdotal visual studies carried out by the group at motorcycle events do not reflect anything remotely like this level of non compliance. (MAG article)

So what’s going on?

Well, a press release by the Motor Cycle Industry Association (MCIA) (download it in Word format here) questions some of the methodologies used by the government statisticians. David Taylor, head of the MCIA, says:

We are expected to believe that motorcycle VED evasion rose by 47 per cent from an already highly unlikely figure the previous year. Common sense suggests that the estimate of nearly 40 per cent is wildly inaccurate, or they would surely be very easy to catch.

That seems obvious enough, but what isn’t at all obvious is the methodology and statistics adopted by the Department for Transport (DfT), who commissioned the traffic survey, the private company that carried out the survey, National Statistics who analysed it, DVLA who provided large amounts of data and the Commons PAC (the group of MPs who have to interpret all this), who caused all the fuss.

Now there’s no requirement to be trained in statistics if you are an MP on the PAC, so despite Mr Leigh’s intemperate outbursts against bikers, we can’t blame him or the rest of the committee for taking the DfT report at face value.

When you take a deeper look at the DfT report and the National Statistics report that underlies it, though, it’s obvious that there are some big assumptions regarding the data and the statistical techniques that have been carried out on it.

This might get quite heavy, but bear with me – there’s not too much maths.

Now I’m assuming that not many people reading this have any kind of statistics qualification, so I’ll try to summarise what I’ve been able to work out without using too much maths. Unfortunately, the relevant bodies above haven’t been that kind, so if anyone is better at this kind of maths than me, here are the original documents. Feel free to post comments:

  1. National Statistics VED stats 2006
  2. National Statistics / DfT report of above VED stats.(pdf)
  3. Statistical Review of the VED figures by Southampton University

In what follows I’ll refer to these documents by the number I’ve given each one.

Looking at the full published figures, you will find hidden away in an appendix a list of confidence ranges for the final estimate of the percentage of vehicles with no VED (road tax) (1, appendix E7 / Table 18). Confidence range is a statistical term, but it’s not that hard to grasp, using the figure in question as an example. The average evasion recorded is running at 37.8 percent, but this is only an estimate. What the statisticians can say, based on the data collected and assuming all their prior assumptions are true, is that they are 95% confident the true figure lies somewhere between 29.9% and 45.7%. This is a truly massive margin for error (for comparison the equivalent 95% confidence figures for cars are 4.0%-4.6%) and shows that even in the best case scenario the margin for error in the bike figures is going to be running at ±20% of the estimate (about eight percentage points either way).
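To see how the bike and car ranges compare, here is a small sketch that turns each quoted range into a half-width and a relative margin. The helper name is hypothetical, and because the car estimate itself isn't quoted here, the code assumes it sits at the midpoint of the car range.

```python
def interval_summary(low, high, estimate=None):
    """Return (half_width, relative_margin) for a confidence range.

    If no point estimate is supplied, the midpoint of the range is
    used instead (an assumption, not a published figure).
    """
    centre = estimate if estimate is not None else (low + high) / 2
    half_width = (high - low) / 2
    return half_width, half_width / centre


# Motorcycles: range and estimate as quoted from document (1).
bike_half, bike_rel = interval_summary(29.9, 45.7, estimate=37.8)

# Cars: only the range is quoted, so the midpoint stands in for the estimate.
car_half, car_rel = interval_summary(4.0, 4.6)

print(f"bikes: +/-{bike_half:.1f} points, {bike_rel:.0%} of the estimate")
print(f"cars:  +/-{car_half:.1f} points, {car_rel:.0%} of the estimate")
```

On these numbers the bike range is about 7.9 percentage points either side of the estimate (roughly 21% of it), against about 0.3 points (roughly 7%) for cars.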

Why is this figure so massive? Well, there are a few reasons, but it mainly comes down to the fact that the number of bikes counted in the survey was much, much smaller than the number of cars. For every bike they counted, they saw 110 cars (1, Table 10). Whenever you have a small sample, your uncertainty will be larger. Small uncertainties in a sample also tend to balloon when you perform other operations on that sample, as you introduce extra uncertainties which multiply through.

As I mentioned in passing above, all of these error margins are “best case scenarios” and rely very much on the assumptions made by the method used to derive the figures. If there are incorrect estimates made due to these assumptions the final figures will be seriously distorted.
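The effect of sample size can be sketched with the textbook formula for the uncertainty of a proportion, where the width of the range shrinks with the square root of the number of sightings. This assumes a simple random sample, which is far simpler than whatever National Statistics actually did; the sample size and evasion rate below are invented purely for illustration, and only the 110-to-1 ratio comes from the report.

```python
import math


def ci_half_width(p, n, z=1.96):
    """Approximate 95% half-width for a proportion p observed in n sightings.

    Uses the normal approximation sqrt(p * (1 - p) / n); this is a
    textbook simplification, not the survey's own method.
    """
    return z * math.sqrt(p * (1 - p) / n)


n_bikes = 1_000            # hypothetical number of bike sightings
n_cars = 110 * n_bikes     # 110 cars seen for every bike, per (1) Table 10
p = 0.10                   # hypothetical evasion rate, same for both

ratio = ci_half_width(p, n_bikes) / ci_half_width(p, n_cars)
print(f"bike range is about {ratio:.1f} times wider than the car range")
```

Under these assumptions the ratio depends only on the 110-to-1 sighting ratio, not on the invented sample size or rate: the bike range is sqrt(110), a bit over ten, times wider.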

For this survey I believe that the assumptions made are in many cases entirely wrong, and I believe this has played a large part in inflating the figures. And although the figures have been independently checked by statisticians, those doing the checking have done so on the basis that these assumptions are true (3), as they rightly state at the start of the analysis.

Assumption (b) is a crucial one. In statistical language it is this:

the observed sample of vehicles sighted in the Roadside Traffic Observation Survey is a simple random sample with replacement of the registered vehicles(3)p13

This is not immediately obvious, and uses technical terms, but what it means is this. Every time the person at the side of the road takes a measurement, the chance of seeing a particular vehicle pass by is the same as if he were picking registration numbers at random from a massive lottery machine containing a single ball for each vehicle in Britain. This would be justifiable if you were recording the traffic on every road in Britain, but in practice, with only 249 sites, this method has the potential to be badly skewed. Anyone who happens to live near one of the sampling sites has a much higher probability of being ‘picked’ (and probably picked multiple times at that) than someone who never passes by.

What this means is that the selection of sample sites is going to have a large bearing on what is recorded. The sample sites chosen represent 1 of each kind of road (as defined by the DfT) per police force area (49 of those) with London getting an extra 3 of each. Motorways are sampled by local government region, and fewer of those are picked.
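A toy calculation shows how quickly this can skew a sample. Every number below is invented: a population of 1,000 bikes, 50 of which belong to people who live near a survey site and pass it once a day, while the rest pass it only rarely. None of these figures come from the survey.

```python
def share_of_sightings(groups):
    """Expected share of sightings contributed by each group.

    groups maps a name to (number of vehicles, expected passes per
    vehicle per survey day).
    """
    expected = {name: count * passes for name, (count, passes) in groups.items()}
    total = sum(expected.values())
    return {name: value / total for name, value in expected.items()}


# Hypothetical population: locals are 5% of bikes but pass the site daily.
groups = {
    "local": (50, 1.0),
    "elsewhere": (950, 0.01),
}

shares = share_of_sightings(groups)
for name, share in shares.items():
    print(f"{name}: {share:.0%} of sightings")
```

In this made-up case the locals are 5% of the bikes but account for over 80% of the sightings, so whatever is true of the people living near the chosen sites would dominate the result.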

How have they selected which roads to measure? Well they left that “to the discretion of contractors” who had to reach a set minimum number of vehicle sightings at each location. We can guess they probably chose fairly busy examples of each type of road. It seems likely that the type of person who is going to dodge road tax is more likely to be of a lower social class than average, and live in a scummier area, probably closer than average to a busy road. This is supposition, but it is reasonable, and there is nothing in the results the DfT have presented to suggest that this kind of sampling bias has been effectively eliminated.

Another assumption taken by the survey is that stated quite clearly in (3):

One of the most important assumptions in the model is that the average number of sightings of a given vehicle is proportional to its mileage. This hypothesis is not testable from the survey data itself because the mileage of individual vehicles is not directly observed through the survey process. However, the first time that this working assumption was adopted – see §4 in Appendix C of (Department of Transport, 1984) – a postal survey of the keepers of heavy goods vehicles was used to test the adequacy of this hypothesis. Given that this research was carried out some time ago and for a limited sample of vehicles in a single tax class, the Department for Transport should investigate whether alternative data sources exist, or could be obtained, which could be used to re-examine the validity of this crucial assumption.

Or, put differently, it seems unlikely that motorcycles on the road today are being used in the same way as trucks were used in 1984! Estimated mileages for bikes are therefore likely to be way out of kilter with actual figures.

All of these assumptions seem likely to overestimate the proportion of bikes appearing to evade road tax, but there is potentially a far bigger problem, which is not even mentioned in any of the DfT documentation – bikes are much more likely than the average vehicle to be taken off the road.

Many bikers, as we all know (but perhaps the DfT doesn’t) are fairweather bikers. Many more people own a bike but might, like my Dad, keep it locked in a garage for years at a time. Many of these people will have notified the DVLA that the bike is stored off-road via a SORN form, but I would guess that a lot of people don’t. We probably all know someone with a bike in their garage that they probably didn’t use at all last year.

Notice there are two different figures for motorbikes without VED. The figures are 16% of motorbikes in traffic, and 37.8% of vehicle stock (i.e. all bikes with a reg number). Why is the second figure more than double? The logic runs like this. The vehicles they spot on the road tend to be those that travel more miles, so they will have not counted lots of vehicles that have a fairly low mileage. Because they also know that untaxed vehicles have lower average mileages, they apply a corrective figure.

What they don’t seem to have taken account of, though, is that at any given time, a very large number of motorcycles are sat in a garage not being used for months at a time, hence, for those bikes with a mileage of 0, they may be estimating high levels of tax avoidance!

If the assumptions about motorbike usage are corrected, we might find that this doubling effect will vanish.
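To get a feel for how much work the mileage correction is doing, here is a deliberately crude two-group sketch. It assumes sightings are exactly proportional to mileage and that every untaxed bike covers the same fixed fraction of a taxed bike's mileage. That is my simplification, not the model in documents (1) to (3), and the function names are hypothetical.

```python
def stock_rate(traffic_rate, relative_mileage):
    """Untaxed share of all registered bikes, given the share seen in traffic.

    relative_mileage is the untaxed bikes' average mileage as a fraction
    of the taxed bikes' average mileage. Each sighting is weighted by
    1 / mileage to undo the assumption that sightings track mileage.
    """
    untaxed_weight = traffic_rate / relative_mileage
    taxed_weight = 1 - traffic_rate
    return untaxed_weight / (untaxed_weight + taxed_weight)


def implied_relative_mileage(traffic_rate, stock_rate_value):
    """Relative mileage that would turn traffic_rate into stock_rate_value."""
    odds_traffic = traffic_rate / (1 - traffic_rate)
    odds_stock = stock_rate_value / (1 - stock_rate_value)
    return odds_traffic / odds_stock


r = implied_relative_mileage(0.16, 0.378)
print(f"implied untaxed mileage: {r:.0%} of taxed mileage")
print(f"check: {stock_rate(0.16, r):.1%} of stock")
print(f"no correction: {stock_rate(0.16, 1.0):.1%} of stock")
```

In this simplified model, going from 16% in traffic to 37.8% of stock needs untaxed bikes to cover only about a third of the mileage of taxed ones. If the two groups actually ride similar distances, the figure falls straight back to 16%.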

But there is one last doubt I have about these figures, that might potentially show massive levels of VED avoidance where very little exists.

It’s to do with the way the survey has been carried out and the figures derived.

All of the survey data was collected in June and July of 2006, a time of the year when a lot of bikes are on the road. At some point the registration numbers recorded were checked against the DVLA’s VED database. The survey results were first published on January 25th 2007. Following that a review was carried out on the figures, and the PAC finally got round to studying them just a few days ago.

The crucial question is this – when were the data compared against the DVLA database, and what figures did they use?

The survey contractors almost certainly didn’t have the means to check the DVLA database in real time, and probably didn’t have access to the data, anyway. It seems likely that they would have returned all the data at once, at the conclusion of the survey period. But they probably carried out their own checks on the data before they did. So it seems likely that some time between July 06 and January 07 they checked against DVLA records. It then takes the DfT and National Statistics a further six months to process the data – they probably have a lot of validation and checking work to do, but exactly what and in what order we do not know.

So consider the following scenario. I get my bike out of storage on March 1st and tax it for six months. During June I ride by one of the government surveys and am counted as “on the road” and “in traffic”. Come August 1st the bike is back in the garage and SORNed for the winter. If the DfT didn’t get round to checking the VED database until September I’ll probably be recorded as not being taxed.
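The scenario can be written out as a date check. This is a sketch of the suspected failure only: it assumes the database lookup returns the bike's status on the day of the check rather than on the day of the sighting, which is exactly the point I don't know. The sighting and check dates are hypothetical days within the months mentioned.

```python
from datetime import date


def is_taxed(on, tax_start, tax_end):
    """True if the bike held valid VED on the given day."""
    return tax_start <= on <= tax_end


# The bike in the scenario: taxed from 1 March, SORNed from 1 August.
tax_start = date(2006, 3, 1)
tax_end = date(2006, 7, 31)

sighting = date(2006, 6, 15)   # hypothetical day the survey saw the bike
db_check = date(2006, 9, 15)   # hypothetical day the database was queried

taxed_when_seen = is_taxed(sighting, tax_start, tax_end)
taxed_at_check = is_taxed(db_check, tax_start, tax_end)

# Assumed failure mode: status is judged at check time, not sighting time.
flagged_as_evader = not taxed_at_check

print(f"taxed when seen on the road: {taxed_when_seen}")
print(f"taxed when database checked: {taxed_at_check}")
print(f"counted as evading: {flagged_as_evader}")
```

A bike that was perfectly legal when it passed the survey point ends up in the evasion column purely because of the gap between the two dates.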

Did this happen? I don’t know. But realistically, a huge number of bikers are going to put their bikes in storage through the winter and claim a VED refund by way of SORN, so if it did, the effect could be a huge overestimate of the number of bikers dodging tax.

Again I don’t know if this has happened, but I have asked National Statistics for more detail on how the VED status was derived and when (if) they reply, I will post here.

Well, this is a mammoth post already, but we’ve reached the end. There are other areas where an unintended bias may have crept into the sample, but we’ve considered what I suspect are the biggest sources of potential error.

The bottom line is that I think it is extremely unlikely that as many as 40% of bikers are evading VED and what this shows more than anything is the danger of placing total faith in your statistics when the underlying assumptions are not a realistic model of the situation you are trying to assess, as well as the difficulty of designing a survey.

And the motorbike insurance angle, of course, is that all of these people without tax presumably have no insurance. Given the prevalence of Automatic Number Plate Recognition vehicles run by the police forces now, you’d think they would have noticed if 40% of the motorbikes going by were uninsured, wouldn’t you?

************************************UPDATE***************************

I’ve received my reply from National Statistics to my query about when the survey data was checked against the DVLA database.

Here is the text of the email:

David,

Thank you for your enquiry.

The data for the VED survey in June 2006 was checked against the DVLA system in September 2006 to allow for late updates to be made.

P*** S****
www.dft.gov.uk/transtat/vehicles

So as I guessed above, anyone who SORNed their taxed bike before or during part of September will have been assumed to be dodging tax. This is going to have had a massive impact on the figures as reported in the press and the true figures for uninsured riding are much smaller than the ones the government has taken at face value.