Dems are at it again, trying to break open black-box algorithms

Democratic lawmakers once again have proposed legislation to ensure that the software source code used for criminal investigations can be examined and is subject to standardized testing by the government. On Thursday House Representatives Mark Takano (D-CA) and Dwight Evans (D-PA) reintroduced the Justice in Forensic …

  1. An_Old_Dog Silver badge
    Devil

    Smoke and Mirrors

    If this law does not make it illegal for said vendors to obfuscate or encrypt the code they make available for review, then this new law is effectively useless.

    (Icon for, "The Devil is in the Details")

    1. FeRDNYC

      Re: Smoke and Mirrors

      Eh, it doesn't really get that deep in the weeds, and the bill isn't just about source-code review of the tools.

      What (the 2021 version of) the bill actually proposes is:

      1. To direct NIST (the National Institute of Standards and Technology) to draft a set of testing guidelines "to be known as the Computational Forensic Algorithm Testing Standards", by which tools used in generating criminal trial evidence are evaluated for biases and error rates, and to set "requirements for publicly available documentation by developers of computational forensic software of the purpose and function of the software, the development process, including source and description of data used to develop the tool, and internal testing methodology and results, including source and description of testing data".

      2. To remove the trade-secret protection that developers typically hide behind, when refusing to provide that information.

      3. To mandate that defendants against whom forensic-software evidence is used are to be furnished with "Any results or reports resulting from analysis by computational forensic software", "and the defendant shall be accorded access to both an executable copy of and the source code for the version of the computational forensic software—as well as earlier versions of the software, necessary instructions for use and interpretation of the results, and relevant files and data—used for analysis in the case and suitable for testing purposes".

      It doesn't have to be illegal to try to duck those restrictions, it's sufficient that any software so concealed fails to meet the NIST standards. Because an additional key point is:

      4. Make results from any software that doesn't meet the new NIST standards inadmissible as evidence.

      ...Of course, the bill will never pass anyway, heck it'll never make it out of committee. But it's nice to dream.

      1. AVR Bronze badge

        Re: Smoke and Mirrors

        In 2019 and 2021 the bill didn't even make it into a committee, let alone out of one. That's as firm a rejection as you can get and there's nothing to suggest it'll do better this time round. I assume the lobbyists have their arguments and other forms of persuasion ready.

        It does sound like a good idea FWIW.

      2. lukewarmdog

        Re: Smoke and Mirrors

        Does the defendant really need to see the source code? Run the same assessment done by company black box A on a competitor's black box B. Same inputs, different outcome is grounds for questioning the validity of one of the black boxes without having to see inside it. I'm assuming rigorous testing has taken place already so alleged bias has been measured for all of the software solutions on the market. And you'd be mad to put your source code in the hands of a defendant who'd then have to hire your competition to tell you how it works.

        1. veti Silver badge

          Re: Smoke and Mirrors

          Yeah, "access to the source code" does sound like a sticking point. I'm guessing the bill's authors would be willing to take that clause out, if it would result in their bill getting further through the process.

          But let's face it, even without that, the level of accountability proposed here would make most software vendors - in just about every business - shit their pants.

        2. An_Old_Dog Silver badge

          Re: Smoke and Mirrors [A vs B]

          The problems with giving the same inputs to System A and System B, and then comparing their outputs are:

          * System A might exist, but a System B might not exist.

          * System A and System B might not provide the exact same functions.

          * System A and System B might not use the same algorithms (quite likely, that).

          * Even System A compared with itself might give differing results, depending on which compiler / interpreter is used, and which compile-time / run-time options are selected.
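
          A minimal sketch of the A-versus-B comparison and of the last point above, in Python. Everything in it is invented for illustration: `vendor_a`, `vendor_b`, the inputs `x` and `y`, and the weights are hypothetical stand-ins, not any real product. The `accumulate` helper shows one way build options could matter: if they change the order of floating-point additions, the same numbers can produce a different result.

```python
from typing import Callable, Iterable

# Hypothetical types: a "case" is a dict of numeric inputs,
# and a "system" maps a case to a score.
Case = dict[str, float]
System = Callable[[Case], float]


def disagreements(system_a: System, system_b: System,
                  cases: Iterable[Case], tolerance: float = 0.0):
    """Return (case, score_a, score_b) for every case where the scores differ."""
    found = []
    for case in cases:
        score_a, score_b = system_a(case), system_b(case)
        if abs(score_a - score_b) > tolerance:
            found.append((case, score_a, score_b))
    return found


# Invented stand-ins for two vendors' black boxes (not real products).
def vendor_a(case: Case) -> float:
    return 0.5 * case["x"] + 0.25 * case["y"]


def vendor_b(case: Case) -> float:
    return 0.5 * case["x"] + 0.5 * case["y"]


def accumulate(values: Iterable[float]) -> float:
    """Add values strictly left to right, as one particular build might."""
    total = 0.0
    for value in values:
        total += value
    return total


cases = [{"x": 2.0, "y": 0.0}, {"x": 2.0, "y": 4.0}]
print(disagreements(vendor_a, vendor_b, cases))

# Same numbers, different evaluation order: are the results equal?
print(accumulate([0.1, 0.2, 0.3]) == accumulate([0.3, 0.2, 0.1]))
```

          A disagreement only says that at least one box is wrong (or that they answer different questions); it cannot say which one, which is part of why the comparison alone is not a substitute for inspection.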

        3. martinusher Silver badge

          Re: Smoke and Mirrors

          We all know that seeing source code isn't a viable option -- you'd have to reverse engineer it to figure out what it does and how it does it and "that might take some time". What is important, though, is to standardize how the code works and especially to identify systematic biases. The code and possibly the platform it's run on have to be clearly identified -- it's not enough to just produce a printout (it's the court system -- it might even need to use a fax somewhere) and state blithely that the information is correct; it has to be provably correct and consistent. (Bit of a stretch for a lot of modern software....)

          1. Michael Wojcik Silver badge

            Re: Smoke and Mirrors

            > We all know that seeing source code isn't a viable option

            No, we most certainly do not "know" that. I'd be perfectly happy with a law requiring the source code for any software product used to make any decision regarding the treatment of criminal defendants be published.

            If that drives firms like Northpointe out of business, well, that's a consequence I'll accept.

        4. FeRDNYC

          Re: Smoke and Mirrors

          Full disclosure: I'm an open-source developer and advocate. I firmly believe that it's possible to both create world-class software and to make money doing so, without having to hide your code in order to secure your revenue stream. There are sufficient examples of companies demonstrating that model's workability: Red Hat, Qt, Mozilla, sorta-Google, etc. I believe that security through obscurity is no security at all, so the concept of "trade secret" protections and software patents leaves me cold. Those are the biases I bring to this conversation.

          > I'm assuming rigorous testing has taken place already so alleged bias has been measured for all of the software solutions on the market.

          You shouldn't assume that, because the only requirement that these products be subjected to any independent testing AT ALL is in this bill that's never going to pass. Right now, for software being used to produce evidence in trials today, no testing whatsoever is required to have been performed, so how much do you really think has been done? The free market ain't gonna incentivize the elimination of biases here, heck the market probably favors biased systems. (Doesn't it always?)

          > Does the defendant really need to see the source code?

          IMHO, yes. It's the surest way to evaluate the algorithm being employed to make decisions that are, quite literally in some cases, matters of life and death. The stakes here are not exactly low.

          To be clear, not every defendant will be expected to evaluate the software being used in their prosecution. But the path towards making it possible to do so, for those with the means and motivation, is to provide everyone with that option.

      3. An_Old_Dog Silver badge

        Re: Smoke and Mirrors [Lying for the Money in it]

        Programmer: "Boss, why do we have to lie on all these documents we're generating for the government?"

        Boss: "Because our program has a 'slightly'* elevated false-positive rate, which in turn makes a conviction more likely. More convictions make prosecutors happy, they make police happy (See how well we're doing!), they make the prison industry happy (We're building more prisons! And, we're getting more ultra-cheap prisoner labor!), and in the end, allow us to sell more copies of our software, which makes our Board of Directors happy."

        *A run-time parameter which can be adjusted by the software seller's field engineers from "less-aggressive" to "more-aggressive".
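
        To see why a single adjustable parameter matters, here is a hedged sketch in Python. The scores and both threshold values are made up for illustration; the only point is that moving a decision threshold changes how many known-negative samples get flagged.

```python
def false_positive_rate(scores, threshold):
    """Fraction of known-negative samples that the tool would still flag."""
    flagged = sum(1 for score in scores if score >= threshold)
    return flagged / len(scores)


# Invented scores for samples known to be true negatives.
negative_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

# Hypothetical "less-aggressive" and "more-aggressive" settings.
less_aggressive = false_positive_rate(negative_scores, threshold=0.75)
more_aggressive = false_positive_rate(negative_scores, threshold=0.45)
print(less_aggressive, more_aggressive)
```

        With these invented numbers the flagged share goes from one in eight to four in eight, without a single line of the scoring code changing.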

  2. M.V. Lipvig Silver badge

    I see no reason here

    to force a company to reveal its sources. Just do not allow their software to be used as evidence in any trial or any sort of policing action if they don't. Let the company make the decision on whether or not they release the information.

    1. Neil Barnes Silver badge

      Re: I see no reason here

      Why is it even legal to make this sort of assessment anyway? What happened to a jury of your peers?

      1. OldSod

        Re: I see no reason here

        This is my concern as well. A judge, issuing a sentence, elucidates his/her reasoning in a publicly available document. I don't know if this is actually legally required (I am not a lawyer). The use of software to aid in "legal reasoning" should require the same sort of transparency, at least in the US. I am not sure how this would interact with the UK "code is correct" unless proven otherwise assumption in legal cases as brought out in the Post Office debacle. Would an individual, brought in as an expert to aid in sentencing, be allowed to hide his/her calculations, and just issue a plain "the defendant should be sentenced to x years" statement that the court then enacts directly? Is this "trade secret" nonsense supported by the (frustrating to me) belief by some members of the public that if a computer says it, it must be so?

        1. Anonymous Coward

          Re: I see no reason here

          > A judge, issuing a sentence, elucidates his/her reasoning in a publicly available document.

          Curiously enough, there is a way to make software that, after it reaches a conclusion, also generates a nice report that explicitly details how it arrived at that conclusion.

          But that sort of thing is far too much work for our Modern World[1].

          Why bother to write, say, an Expert System (which would be a great fit for this sort of Decision Making Advisor) when you can just code COMPAS as a couple of Excel sheets (probably with at least a dozen cells showing bad-reference errors "oh, don't worry about those, we'll just pop that sheet out of sight")?

          Daft thing is, if the systems actually just bothered printing out all the reasoning in the first place, along with suitable references, then these demands for "show us the source code" would not even need to happen! They could keep their precious trade secrets about how they generated that reasoning: just let the *content* of the print-out be challenged and overturned, not the mechanisms by which it was generated.

          [1] cue rant about how LLMs and yer bog standard Neural Nets are incapable of any explanatory exposition compared to the "too much hard work needed to make them useful" designs, such as XPS, which are explanatory[2] by their very nature.

          [2] opaquely, if you don't bother to make it format the explanations well, but still present
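
          The kind of self-explaining output described above can be sketched in a few lines of Python. This is a toy rule-based scorer, not how COMPAS or any real product works; the rules, point values and "guideline" references are all hypothetical. The idea is only that every rule records whether it applied and cites its source, so the printed report can be challenged line by line.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    reference: str          # citation shown in the report (hypothetical here)
    points: int
    applies: Callable[[dict], bool]


def assess(facts: dict, rules: list[Rule]) -> tuple[int, list[str]]:
    """Score the facts and return the total plus a line-by-line explanation."""
    total = 0
    report = []
    for rule in rules:
        fired = rule.applies(facts)
        if fired:
            total += rule.points
        status = "APPLIED" if fired else "skipped"
        report.append(f"{status}: {rule.name} ({rule.points:+d}) [{rule.reference}]")
    report.append(f"TOTAL: {total}")
    return total, report


# Invented rules for illustration only.
RULES = [
    Rule("More than two missed hearings", "hypothetical guideline 1.1", 2,
         lambda facts: facts["missed_hearings"] > 2),
    Rule("No fixed address", "hypothetical guideline 1.2", 1,
         lambda facts: not facts["has_fixed_address"]),
]

total, report = assess({"missed_hearings": 3, "has_fixed_address": True}, RULES)
print("\n".join(report))
```

          A skipped rule is reported as well as an applied one, so the reader can see what was considered and rejected, not just what counted.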

          1. Anonymous Coward

            I can see where this is heading

            Hey Alexa/ChatGPT/Bard - write me a judgement summary to justify this decision.

          2. An_Old_Dog Silver badge

            Re: I see no reason here

            @AC 2024-02-18/01:17 -- Your paragraph #5 (Daft thing is ...) hits the nail on the head.

      2. JoeCool Silver badge

        Re: I see no reason here

        This is more about the evidence and the rest of the process that brought you to the trial, and that jury of peers.

        Note that juries do not examine the science and lab practice behind DNA evidence. The justice system has established a threshold.

      3. Michael Wojcik Silver badge

        Re: I see no reason here

        > What happened to a jury of your peers?

        In the US, juries do not sentence criminal defendants. They convict them (or acquit, or fail to do either, resulting in a hung jury and mistrial). Judges determine the sentences.

        And that's assuming the defendant requested a jury trial. It's a right; it's not compulsory.[1] And statistically criminal defendants do better without a jury, though obviously this depends greatly on the specifics for any individual case.

        [1] And sometimes even famous defendants who have a history of getting away with, say, fraud, and have reason to believe they might do better in front of a jury, hire incompetent lawyers who forget to request one.

        1. DryBones

          Re: I see no reason here

          As someone who served as a juror on a criminal trial through to completion, I can assure you that they do.

          It may vary by situation, like how there can be a bench trial or a jury trial. But where I am, at the county level, the jury deliberates guilt, then sentencing within the guidelines. The judge can overrule, but if the sentence is not grossly outside the provided range, it stands.

  3. heyrick Silver badge

    Simple fix

    If it cannot be disclosed and analysed as a regular part of the due process of sharing evidence, then it is inadmissible as evidence.

    One should never, ever, rely on "a computer says so" as any statement of fact. Refer to Horizon for a perfect example of why.

    1. Wokstation

      Re: Simple fix

      I was going to post about Horizon too; have American legislators not noticed it, or do they just not care?

      1. bombastic bob Silver badge
        Devil

        Re: Simple fix

        American legislators, with few exceptions, are in it for the money and the power and care VERY little about things that do not affect either of those (when it comes to us PEASANTS).

        I was also considering that a simple NDA, whether explicit or implied, is all they should need to protect intellectual property. Just legislate THAT and all should go well.

        (But it takes a gummint to inflate something simple into a money-laundering scheme for your donors)

        1. Michael Wojcik Silver badge

          Re: Simple fix

          Most legislators are also scared to death of the "soft on crime" bugbear, which has been used extensively by members of both parties for decades. The carceral fetish in the US is entirely decoupled from actual crime rates or any sort of rational critique. Most people enjoy being scared, and they enjoy taking revenge on their imagined enemies (rather than their actual enemies). See also the immigration "crisis".

    2. Michael Wojcik Silver badge

      Re: Simple fix

      > If it cannot be disclosed and analysed as a regular part of the due process of sharing evidence, then it is inadmissible as evidence.

      The output of the specific software package mentioned in the article, COMPAS, is not admitted as evidence. It's used in determining a sentence (by making a highly dubious[1] estimate of the probability of recidivism), not in conviction. That is a process under the control of the judge, modulo statutory and judicial requirements such as truth-in-sentencing laws and sentencing guidelines established by the legislature and courts.

      Certainly the rule you suggest helps with software results used during trial, and I agree it's a good rule, but it doesn't solve the larger problem.

      [1] I'm aware of the recent study showing that judges using COMPAS and following its recommendation had a somewhat higher accuracy (in the sense of assigning sentences which subsequently correlated to actual recidivism) than judges who consulted COMPAS and overrode its evaluation. I don't think that's a particularly strong conclusion, but more importantly it has no bearing on the issues at hand. We know human judges aren't very good at assigning fair, just, and proportionate sentences.[2] Using secret algorithms is a problem in itself, regardless of outcome. Bias in the results of those algorithms is a problem, regardless of overall results.

      [2] And then there's the whole problem with America's incarceration fetish, grossly excessive sentencing, the prison-industrial complex, and so on.

  4. Anonymous Coward

    Hierarchy Not Recognised.....Funny That......

    Quote: "...barring defense attorneys from reviewing source code relevant to criminal cases...."

    So:

    (1) Software requirements written in English

    (2) Software design written in English

    (3) Various architecture diagrams (remember UML anyone?)

    (4) Database design (you know ERDs and so on)

    (5) Source code (oh dear....various.....C, shell scripts, SQL........)

    (6) Assembler

    (7) Actual machine code

    ......so why are we hearing here that there's a SINGLE LEVEL place to "review the source code" to determine "what the system is doing"?

    ......it's a fantasy................................a fantasy believed by PEOPLE WHO KNOW NOTHING AT ALL ABOUT COMPUTERS IN THE REAL WORLD!!!!!!!

    ......and that's even before we get to the nightmare of "The Agile Manifesto", scrum, "user stories"..................and so on...................

    Why am I not surprised?

    1. Anonymous Coward

      Re: Hierarchy Not Recognised.....Funny That......

      Well, ignoring your inclusion of assembler and binary (which even you admit are beyond the point we'd call it "source code"), are you telling us that your source code doesn't include within it comments that contain all the relevant parts of the requirements, design, ERDs etc etc? That you *don't* consider all of that as being part of your project's sources?

      And you don't believe that it ever could be (or that the NIST requirements can not require these products to be) that professionally presented?

      If you have never had the chance to work on a project that properly and professionally comments its sources and how that makes said sources comprehensible to those already working on the project, those being onboarded onto the project (saving time and money) and those providing external auditing of the project, you may want (need) to expand your horizons.

      You can get a (cheap) start by, say, grabbing a copy of Doxygen[1] and running a small demo project with it: write out your req specs as Doxygen input (and hand out the generated PDFs to the various parties for review and amendments - well, as this is a demo, read the PDFs yourself). Ditto the functional specs (and marvel at how you can now also directly reference back from the FS to the RS without fudging anything - or forgetting to do so - and can even do the inverse!). Drop all your design notes into the Doxygen files (and cross-ref back to the FS), add in all your ERDs and UML or whatever other diagrams you want[2]. As you get to it, add your SQL and/or Python and/or Java and/or batch files, marking them up to reference all the prior materials. At the end of it, you have one "project source tree", containing absolutely everything about the project (all in the version control system of your choice - and, at least if you've stuck to the tools suggested here for your demo project, all in nice plain text formats that really make sense when dropped into a VCS - e.g. are diff'able). And your top-level build should be spitting out not only a set of binaries, neatly packaged for the relevant target system installers, but also a wodge of documentation that contains everything it is possible to know about those binaries, from the source code (neatly presented with hyperlinks between functions) to all the relevant diagrams, designs and original requirements.
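
      As a rough illustration of "marking them up to reference all the prior materials", here is a small Python file with Doxygen-style `##` comment blocks. The module name, the function and the requirement ID `REQ_RET_001` are all hypothetical; the sketch assumes the requirement is written up elsewhere in the Doxygen input and marked with an `@anchor` of the same name, so that `@ref` can link back to it.

```python
## @file retention.py
#  @brief Hypothetical module showing a requirement cross-reference in Doxygen markup.


## @brief Decide whether a record is past its retention period.
#
#  Implements the hypothetical requirement @ref REQ_RET_001, assumed to be
#  defined on a separate requirements page with a matching @anchor.
#
#  @param age_days    Age of the record in days.
#  @param limit_days  Retention limit in days (365 is an assumed default).
#  @return True if the record is older than the limit.
def is_expired(age_days: int, limit_days: int = 365) -> bool:
    return age_days > limit_days
```

      The code itself is trivial on purpose; the interesting part is that the generated documentation would carry a hyperlink from the function back to the requirement it claims to implement.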

      Now, I realise this idea is alien to you - and, sadly, to many, many people and companies[4] - but it is possible to organise yourself. And for NIST to demand that level of decent presentation be a requirement IN THE CASE WHERE THE SOFTWARE IS BEING USED TO DIRECTLY AND IMMEDIATELY AFFECT AN INDIVIDUAL'S LIFE.[5]

      PS

      Yes, I have done this: the very best time, the sole final deliverable was a DVD with an autostart that opened up the index.html generated by Doxygen, which just had hyperlinks to entry points for the various docs (User Manuals, Installation Manuals - which indicated the directories on the DVD for binaries and PDFs, Specs, C/C++ code with hyperlinks etc) and a note that, to recreate *everything* (binaries, all levels of docs as HTML and PDF etc), just copy the entire DVD to your PC (purely for speed!), then type one cd and a make (copies of all relevant compilers and tools included on the DVD). The man in the neatly pressed uniform was happy with the result.

      [1] other tools are available - I did say "cheap" as in free!

      [2] preferably as graphviz or Mermaid sources, not as JPEGs dumped out of your drawing package[3]

      [3] reminder, we are demoing how to get all of this material gathered together and are using free tools, so you can learn how it *can* work; if you have the dosh you'll be able to find some products that will work with your GUI diagramming tools, if that is all you are capable of using.

      [4] far too many are fixated on writing important documents as Word files, no matter how much time they waste on recreating frontispieces, copy and pasting glossaries (let alone re-inventing glossaries from proposal to proposal) and looking down on VCS and anything not immediately WYSIWYG as "that stuff the grunts in the coding pool worry about".

      [5] a pipedream, this whole thing will be fought by the companies who don't want to admit that their COMPAS is just an Excel 1998 spreadsheet with over a dozen cells that show error messages, but a lively dream nonetheless.

      1. mattaw2001

        Re: Hierarchy Not Recognised.....Funny That......

        Amen to the Excel spreadsheet being the core of the product - we have a couple of light to medium duty CNC lathes, and they have a visual programming system built into the control so you can quickly set up basic jobs.

        We wanted a modification added as at the moment it spits out code that doesn't retract the turret on tool change, often leading to crashes if it's not manually edited into the right spots, and someone eventually forgets. The manufacturer was really cagey and our internal investigations turned up that the whole system is running from VBA in an Excel spreadsheet, and that all the people who knew how it worked have left the company so they don't dare change it.

        There are also reports of a gunshot detection system which is horrendously inaccurate and useless; it sells well because police can call up the support call center and ask if the system has detected anything near an address, which they always agree to. It's probable-cause-as-a-service!

  5. PapaPepe
    Linux

    Geese, ganders and hockey sticks

    Should the same criteria that apply to software models that can determine the fate of one man be likewise applied to software models that can determine the fate of multitudes?

    Just asking...

    1. Anonymous Coward

      Re: Geese, ganders and hockey sticks

      > Should the same criteria that apply to software models that can determine the fate of one man be likewise applied to software models that can determine the fate of multitudes?

      Yes.

      But, sadly, we need a big, noisy, politically driven, easy for Joe Public to comprehend, hardened point driven into the software industry to start a crack in their stonewall that can then be levered open, finally allowing somewhere for Good Practice to finally seep in.

  6. Tron Silver badge

    Add source code analysis to the lawyer's fees.

    quote: COMPAS to be biased against African Americans.

    One man's bias is another's ML.

    If computers are trained on what happened yesterday to guesstimate tomorrow, they will reinforce the past. That may be legitimate (more men than women rape women) or it may replicate the consequences of prejudice.

    You either train systems on the past, or you program them directly. Take your pick.

    There is no objective result possible from human-derived data, because humans are subjective.

    Over time, does a detective develop legitimate experience or forbidden prejudice? They may amount to the same thing. And they may both be accurate.

    Computers don't interface well with humanity. It might be better not to assume that they do, or that 'artificial intelligence' is actually any form of intelligence, as in SF movies. Because it isn't.

  7. captain veg Silver badge

    ambiguous?

    "Dems are at it again, trying to break open black-box algorithms"

    I read that as criticising "Dems" (whoever they might be) for their antipathy to open (source) algorithms in the field of black boxes.

    That's not it, is it?

    See also "Police rape claim woman in court".

    -A.

    1. DryBones

      Re: ambiguous?

      The best Reg article headlines are an ongoing battle between tongue and cheek.

      1. anonymous boring coward Silver badge

        Re: ambiguous?

        Sounding like Fox doesn't work.

  8. Groo The Wanderer Silver badge

    Far too much worry in Canada and the US about the "rights" of the CRIMINALS and not enough attention paid to the VIOLATED rights of their victims.

    1. Michael Wojcik Silver badge

      Aaaannnnd ... there's the fetish, everyone! Thanks to Groo for playing the part of today's village idiot.

    2. sabroni Silver badge
      Boffin

      re: Far too much worry in Canada and the US about the "rights" of the CRIMINALS

      Are you really that bad at thinking or just a shit stirrer?

      1. Groo The Wanderer Silver badge

        Re: re: Far too much worry in Canada and the US about the "rights" of the CRIMINALS

        No, I'm a Canadian tired of seeing hardened criminals given rotating jail sentences and committing crime after crime after crime until they have rap sheets multiple pages long.

        1. anonymous boring coward Silver badge

          Re: re: Far too much worry in Canada and the US about the "rights" of the CRIMINALS

          Doesn't sound like USA at all, though.

        2. Cav Bronze badge

          Re: re: Far too much worry in Canada and the US about the "rights" of the CRIMINALS

          Which has nothing to do with this debate.

    3. joyful

      You mean like the rights of that shop which was deprived of a pair of precious socks as opposed to the rights of the defendant who was locked up for 25 years for the crime?

  9. anonymous boring coward Silver badge

    "Dems are at it again"

    You wrote that as if it was a bad thing...

  10. Anonymous Coward

    Didn't even know software was used to make these decisions. It's lunacy that any such software can be closed source.

  11. AnnieO1966

    All Democrats are CRIMINAL by nature. SO, WHAT'S NEW?

    Any time good Software Engineers create something GOOD for society, Democrats take it and ruin it for the rest of us AND THEN USE IT AGAINST US.

    ALWAYS.
