Defuse census outrage with independent oversight of data-handling

The ongoing argument over the 2016 Australian Census highlights a broader problem, argues opposition MP Tim Watts: the lack of any kind of national information policy. In this post at Medium, Watts notes that the information reform agenda was announced by the party during 2015, and would comprise a systematic scan of “ …

  1. Anonymous Coward

    Optional

    Actually, the problem here is that Australia doesn't need an independent data broker because this information shouldn't be held by the government. It doesn't matter whether it's the ABS, a data broker, the old Australia card proposal, or anything else.

    Until now the ABS hasn't been much of a target for attacks for information theft because the analysed data is available anyway. There isn't much point in stealing the raw data to run your own statistical analysis when you can wait a few months/years and let the ABS burn their CPU time doing all the work for you. This proposal means that the ABS (or an independent data broker) now holds information that is valuable enough to be stolen because it has the identifying information.

    How would thieves steal the data? I have no idea. From my experience in IT I have realised that I don't need to know and I may well never understand fully. However, sometime in the not too distant future, someone a lot smarter than me (or not, if they get lucky) will find a way. It doesn't matter if they (ABS, Data Broker, whoever) are using the industry best practices for security right now. The problem is hopefully in the future. As someone often quoted once said: "prediction is very hard, particularly about the future".

    There is a simple remedy for this threat. Don't store the information at all. Ever. Problem solved. You can't steal something that's not there.

    Why is the information needed? The Australian Bureau of Statistics needs to take a look at their name. The third word is a bit of a giveaway: "Statistics". Unless you need statistics like "how many red-heads living in Queensland are called 'Bruce'?", you don't need names. Nobody wants those statistics; knowing that information doesn't improve government services or better enable businesses to understand their customers. You can't do maths on names and you don't need to use them for classification. They are worthless for statistics.

    Connecting census data with other records is unnecessary. If you want to know how many times in the last year I visited some medical specialist, then ask on the census form. If you think I won't provide the information then that says something else that they should be considering and maybe they don't need to know.

    1. veti Silver badge

      Re: Optional

      As I understand it (and I admit, I had the same misgiving when I saw the article's terminology), it's not suggesting that the 'data broker' actually hold any data itself. All it has to do is provide a secure 'cloud' platform for other people to hold data, and give them the tools to manage access to it and the rules for using them. There's no reason why the broker itself would ever need to access that data - indeed, it'd be better if they have no way to decrypt it at all.

      "Connecting census data with other records is unnecessary" ... yee-eess, unless you want to, y'know, USE the census data for something. Like projecting future road use, capacity for public services such as schools, parks, libraries, demographic projections - you know, the things that are the whole purpose of having a census in the first place.

      You are, of course, quite right that there's no earthly reason why personally identifiable information should be stored with census data. Last time I filled in a census form I don't think it even asked for names, although I don't know what the Australian census collects.

      1. Kratoklastes

        Dunning Kruger is no excuse.

        <blockquote>projecting future road use, capacity for public services such as schools, parks, libraries, demographic projections - you know, the things that are the whole purpose of having a census in the first place.</blockquote>

        I'll preface this by saying that the next paragraph is not just pure 'skiting': it's a potted way of establishing that I'm qualified to make declarative assertions about data accuracy and quality, and the extent to which any attempt to create 'noiseless' data will help in formulating 'accurate' projections.

        I've done a bunch of stuff on projects that did projections for housing demand (and supply of residential and industrial land), demography (regional migration; catchments for major retailers; proximity models for house prices near proposed railway stations; changes in household composition by small-area aggregates). Geospatial analysis is one of the things I understand reasonably well - up to and including analysing annual changes by individual cadastre parcel for the 31 Melbourne metro LGAs for 2004-2012, and the 5 Geelong-area LGAs for 2006-2015. My 'strongest suit', though, is the statistical analysis of data. My 'formal' training - Honours, Masters and PhD (incomplete) - was in Economics and Econometrics. I won the ABS prize in my Honours year, and got an RBA cadetship (one of only 4 offered in the entire country) and the Vice-Chancellor's undergraduate research award (the only student in the faculty who got one). I got straight Firsts for my Masters coursework subjects. One of the papers I co-authored resulted in Treasury asking our team to help them implement rational expectations in their macroeconometric model (TRYM). My PhD dissertation spent several sections demonstrating how using central-tendency measures as 'exogenous' inputs to a non-linear model was a waste of time[1].

        Phew...

        So with that by way of background... let's get to the idea that using the census data gives a better estimate of forward numbers, than a standard exponential curve with completely artificial noise (of the form x[t]=x[t-1]+e[t] where x[t] is the log of the variable of interest X at time t, and e[t] is a lognormal random variate).

        In other words, the key question is

        <blockquote>how much additional accuracy in projections would be obtained by using 'accurate' census data, versus modelling percentage changes in literally any metric of interest by dlog(X)=e where e is a vector of lognormal variates?</blockquote>

        Award yourself dix points if you realised that I was sneaking up on the idea that the correct answer is "None. There is literally zero reduction in forecast MAPE from using historical survey data, over a Monte Carlo simulation using 'sensible' estimates for the conditioning parameters for the distribution of e."
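
        A minimal sketch of the simulation described above, taking the comment's specification literally: x[t] = x[t-1] + e[t] in logs, with e[t] drawn as a lognormal variate. The starting level, horizon, the parameters mu and sigma, and the helper names are all illustrative assumptions rather than values from the thread. The sketch does not itself demonstrate the claim about MAPE; it only shows how such a comparison could be set up.

```python
import math
import random
import statistics


def simulate_log_paths(x0, periods, n_paths, mu, sigma, seed=0):
    """Simulate levels X[t] where log X follows x[t] = x[t-1] + e[t].

    e[t] is lognormal(mu, sigma), as stated in the comment. Lognormal
    draws are strictly positive, so every simulated path rises.
    mu and sigma are hypothetical 'sensible' conditioning parameters.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        x = math.log(x0)  # work in logs
        path = []
        for _ in range(periods):
            x += rng.lognormvariate(mu, sigma)  # x[t] = x[t-1] + e[t]
            path.append(math.exp(x))  # convert back to levels
        paths.append(path)
    return paths


def median_path(paths):
    """Period-by-period median across all simulated paths."""
    return [statistics.median(column) for column in zip(*paths)]


def mape(forecast, actual):
    """Mean absolute percentage error, in percent."""
    errors = [abs(f - a) / abs(a) for f, a in zip(forecast, actual)]
    return 100.0 * statistics.fmean(errors)


if __name__ == "__main__":
    # Assumed inputs: start at 100, 10 periods, 1000 paths,
    # median increment exp(-4), roughly 1.8% growth per period.
    paths = simulate_log_paths(100.0, 10, 1000, mu=-4.0, sigma=0.5)
    print(median_path(paths))
```

        To test the claim, one would compute mape() for this median path and for a census-based forecast against the same realised series, then compare the two numbers.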

        Award yourself another soixante points if you understand what variables cause the correct answer to be the case. Those variables are technological and preference changes and policy variables. Future values of these variables are literally impossible to estimate at an aggregate level, and even more impossible-er at a sectoral level... and they are not geographically constant (so Frankston and Brisbane will not have common tech change, preference and policy parameters in a regional model).

        If you have snaffled all the points on offer up to now, you are barely at 'HIIB' level, which means that I would not listen to you if you were a government advisor (most government advisors are IIAs, but that's still a very low bar).

        Another dix points will get you an HI (but only in one subject). These can be garnered by grokking the footnote.

        Footnote[1]... This is also true in a linear model, because linear models are not bijective from the exogenous variable space to any subset of the endogenous variables; policy analysis is only ever interested in 'key' subsets of the entire endogenous variable matrix.

        To see why this non-bijectivity is the case a fortiori, change the closure (swap the endogenous variables-of-interest for the same number of 'naturally' exogenous variables - so the system remains mathematically solvable).

        Force the swapped endo-vars to remain unchanged, then perturb the rest of the exogenous variables by some arbitrary percentage and solve the model. Do that several hundred times, and you will have several hundred sets of all variables where the endogenous variables of interest take the same value, but the exogenous variables are different. Bijectivity... categorically rejected.

        Congratulations... you just proved that there are multiple vectors of values for the 'exo-vars' that are consistent with the same vector of values for the endo-vars of interest. (This is why I stopped being interested in 'point' (or 'single-path') forecasting: to say anything meaningful about the statistical properties of the endogenous variables of interest, requires a stochastic sensitivity analysis).
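
        The closure-swap experiment in the footnote can be sketched on a toy linear model. Everything here is a hypothetical stand-in: the coefficient matrix A, the three exogenous variables, the choice of y1 as the endogenous variable of interest, and the 10% perturbation range are assumptions made for illustration, not anything taken from TRYM or the comment.

```python
import random

# Hypothetical linear model y = A x with 3 exogenous and 2 endogenous variables.
A = [
    [0.5, 0.3, 0.2],  # row for y1, the endogenous variable of interest
    [0.1, 0.6, 0.4],  # row for y2, not of interest
]


def solve_forward(x):
    """Standard closure: exogenous x in, endogenous y out."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def solve_swapped(y1_target, x2, x3):
    """Swapped closure: y1 is held fixed and x1 is solved for instead."""
    x1 = (y1_target - A[0][1] * x2 - A[0][2] * x3) / A[0][0]
    return [x1, x2, x3]


def perturbation_experiment(base_x, n_draws=300, max_pct=0.10, seed=0):
    """Hold y1 fixed, perturb the remaining exogenous variables, re-solve."""
    rng = random.Random(seed)
    y1_target = solve_forward(base_x)[0]
    results = []
    for _ in range(n_draws):
        x2 = base_x[1] * (1 + rng.uniform(-max_pct, max_pct))
        x3 = base_x[2] * (1 + rng.uniform(-max_pct, max_pct))
        x = solve_swapped(y1_target, x2, x3)
        results.append((x, solve_forward(x)))
    return y1_target, results


if __name__ == "__main__":
    y1, results = perturbation_experiment([1.0, 2.0, 3.0])
    distinct_x = {tuple(x) for x, _ in results}
    print(f"y1 held at {y1}; {len(distinct_x)} distinct exogenous vectors")
```

        Every draw reproduces the same y1 from a different exogenous vector, which is the many-to-one mapping the footnote describes; y2 is free to move across draws.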

        Award yourself the last dix points and join the Firsts. You still need a further douze points to finish next to me in 4th year. (OK, so that last bit was pure skiting).

  2. Diogenes

    The answer to more government is NOT more government

    Title says it all.

  3. Bubba Von Braun

    Just say NO

    I think Bill McLennan's article sums it up well: https://www.privacy.org.au/Papers/ABS-Census_2016_and_Privacy_v8.pdf

    They don't have the statutory authority to ask for your name, and further, the risk is that government will just change whatever rules to suit itself, usually in the name of protecting us from some threat.

    For those who don't know who Bill is, he is a past Australian Chief Statistician (i.e. the head of the ABS!!)

  4. Anonymous Coward

    Census night tonight

    Big upswing in guys called Malcolm Turnbull....

    1. MrDamage Silver badge

      Re: Census night tonight

      Or just for something different;

      Malcontent Turdball

      1. Anonymous Coward

        Re: Census night tonight

        Jobson Growth

        (with apologies to 1000s of AGE letter writers who thought of it before I did)

    2. Bubba Von Braun

      Re: Census night tonight

      Malcolm can afford to keep us all. ;-)

  5. dan1980

    When I read Richard's articles I am nearly always stuck wondering whether he is an incorrigible optimist or whether he is pointing out the yawning gulf between sensible, considered policy that respects those it covers and what we actually get, which is frequently quite the opposite.

  6. Kratoklastes

    This is the type of “throw more money at it” response that should be expected by those who spend their life on the tax tit.

    Let’s get to the core argument of this tax-eater’s thought-bubble…

    Government fails after spending 10-figures on something that a couple of 2nd-quintile undergraduates could accomplish in 3 days… so that means government has to set up yet another trough and hire a bunch more ASO5s and 6s who will work in a stultifying cubicle-farm (overseen — let’s be honest — by someone who is mates with the Minister).

    The result will — always and everywhere — be a boondoggle staffed by semi-competents, which will fail to achieve its objectives (even though those objectives will be low-balled by the political-parasite class).

    Bear these two things at the front of your mind…

    (1) — the ABS data repository is not secure. Anyone with any experience in pen-testing can verify that for themselves, and if I was ever dragged into court for refusing to participate in the census I would prove it in real-time. The ABS’s ad for its intrusive addition to government surveillance programs is typical of government advertising — it has the same truth content as a shampoo or cosmetics commercial.

    (2)— The data will be handed to Five Eyes — the Australian government’s surveillance sharing program with the US, UK, NZ, and Canada. Anyone who thinks otherwise is a naïf or a shill.

    inb4 “If you have done nothing wrong, you have nothing to fear”. If that’s true, what wrong has .gov perpetrated that makes it require its gigantic squads of obese half-wits in faux-military drag at airports? (I speak here of the ‘Border Farce’, another corrupt crony-infested boondoggle). Ditto all the security at courts and every major .gov installation. They clearly fear us - so by their own logic they must have done something wrong, right?

    The NBN is another example that really does show what government contracting is all about. It was birthed in corruption, and was never, ever going to come in on budget or be delivered on time, and it was always always going to be obsolete before it was finished.

    As it stands, every Australian household is on the hook for at least $10k before a single byte of data is downloaded. That’s roughly 10 years of ‘full-whack ADSL plus calls to locals and mobiles’… before anyone connects to the thing.

    And NBN will be obsolete before it’s launched (it’s obsolete now).

    And some politically-connected vermin will buy mansions in Potts Point based solely on the dough they have snaffled in the crony-fest.

    Government failure is a far more important drag on economic activity than market failure — when governments fail to educate children, furnish health-care, or prevent property crime, the government answer is to reward that failure with greater budgets: the private sector withdraws capital from failed projects and lets them be replaced by someone with better ideas.

    Let’s change the system so that the franchise only extends to those who are net tax payers — whose tax payments more than offset the goods and services they obtain from government. I refer to these people as NTP (Net Tax Payers), to be contrasted with NTR (Net Tax Recipients).

    That means no politician or bureaucrat would have a vote, because their entire salary has to be funded by the net taxes from NTP private sector workers. Sure, they give some back (i.e., they pretend to pay tax), but that is simply the return of some portion of the taxes that are used to pay them in the first place… and they still get .gov-furnished goods and services.

    Better yet: make .gov a subscription service. I get absolutely nothing from .gov that I could not get at a better price from the private sector. Monopoly always results in low-quality, expensive output, whether it’s one-size-fits-all Mao suits, or politically-monopolised justice and law enforcement.
