back to article Thanks to generative AI, catching fraud science is going to be this much harder

Generative AI poses interesting challenges for academic publishers tackling fraud in science papers as the technology shows the potential to fool human peer review. Describe an image for DALL-E, Stable Diffusion, and Midjourney, and they'll generate one in seconds. These text-to-picture systems have rapidly improved over the …

  1. John H Woods

    Goodhart's Law

    I think I see the problem ...

    "... benefit academics who are compensated based on their accepted paper output, or to help a department hit a quota of published reports ..."

    1. jake Silver badge

      Re: Goodhart's Law

      Exactly. This has nothing to do with research or science.

      This has everything to do with papers == grant money.

      It is time to institute an across-the-board rule that says something along the lines of "If you are caught with AI fakes in your paper, you and your department will never get grant money ever again. Period. Because we can't trust cheaters, and everyone knows that once a cheat, always a cheat." And maybe rivet a big, polished brass S to the department head's forehead for being Stupid enough to allow the paper to be submitted for review in his/er department's name.

      1. yetanotheraoc Silver badge

        Re: Goodhart's Law

        "everyone knows that once a cheat, always a cheat"

        I don't know that. It would take a lot to convince me it's true. What you got?

      2. Aitor 1

        Re: Goodhart's Law

        It is already a thing... i have seen people get fired over fakes.

        But some desperate people will always think "either I fake it or I lose my career". Not putting people under this stress will reduce burnout and false data. Sadly it will still be a thing.

        1. Anonymous Coward
          Anonymous Coward

          Re: Goodhart's Law

          Plenty of depts have the same incentives , so little reason to look too hard .

        2. SuperGeek

          Re: Goodhart's Law

          Fake it till you make it.....or not!

      3. Michael Wojcik Silver badge

        Re: Goodhart's Law

        It is time to institute an across-the-board rule that says something along the lines of "If you are caught with AI fakes in your paper, you and your department will never get grant money ever again. Period. Because we can't trust cheaters, and everyone knows that once a cheat, always a cheat." And maybe rivet a big, polished brass S to the department head's forehead for being Stupid enough to allow the paper to be submitted for review in his/er department's name.

        It's completely unreasonable to expect a department chair to police the academic output of the department. That's not what they're trained for, they don't have the resources for it, and it's not a feasible process anyway.

        So all that would happen under this regime is, first, a bunch of innocent researchers people would be punished because of one bad actor; and then universities would tie the whole nonsense up in the courts and get the rule stayed by judicial order forever.

        We absolutely do need to mitigate the revenge effects of the publication mandate, which has all sorts of other problems, such as the excruciatingly low rate of reproduction studies (because they're not rewarded). And there are a number of cognate issues; the ACM has in recent years raised the problem of the close relationship between conference presentation and publication, for example, which can also impede good research. But a guilt-by-association overreaction certainly will not help.

  2. Anonymous Coward
    Anonymous Coward

    dodgy westerns are the tip of the iceberg alongside p-hacking, bloated reference sections to game citation scores, bloated author lists to game h-index scores, peer review where nobody re-ran any code or analysis but got a free pass based on the name of the principal investigator and institution (if the journal requires data and code to be submitted in the first instance). Generative AI probably doesn't add much value over the current standard operating model of scientific publishing, sadly.

    1. Alumoi Silver badge

      ^^This post was bought to you by ChatGPT.

  3. Martin Gregorie

    Scientists? Really?

    "Scientists can just describe what type of false data..."

    As a science graduate and retired IT professional I strongly object to you referring to these fakers as 'scientists'.

  4. Blackjack Silver badge

    [David Bimler, another expert at recognizing image manipulation in science papers, better known as Smut Clyde]

    You know if all researchers had nicknames like this people would pay more attention.

    Ah right Smut Clyde is the image manipulation in science papers not the guy nickname.

  5. tp2

    Guess my software product has this exact problem too. Basically I offer researchers ability to create artist's impression images based on research abstracts, but the software I have created cannot detect if the "scientist" is using the software for artist impression images or generating fake research images. Real researchers will find it out early enough when they put a reference to a tool called "GameApi Builder" that the tool wasn't meant for research purposes, but the research modules has failed during implementation of the software. Thus it's only useful for artist's impression images. The reason for the failure is that c++ compilers are breaking the module which was supposed to be used for research -enabling piece/there's a compiler or cpu dividing line that our programmers couldn't avoid while writing the software. => guess c++ programmers are not welcome in research area according to c++ compiler vendors.

    1. Anonymous Coward
      Anonymous Coward

      Which generative AI write that?

  6. Anonymous Coward
    Anonymous Coward

    “I Like Money” - Frito

  7. Bebu

    "il est bon de tuer ... un Amiral pour encourager les autres."

    "ability to create artist's impression images based on research abstracts"

    How is that going to work? From the myriad abstracts from a great variety of disciplines that have been my misfortune to have read, there is normally very little data to feed such a monster. You need to possess a fairly high level of relevant domain specific knowledge to make any sense of any abstract.

    Tp2's post looks to me like the artist's impression produced by said tool of the tool's promotional literature :)

    [meta enough?]

    Never understood why scientific fraud has never been treated like any other fraud ie jail time (not that white collar criminals get anywhere enough porridge in most jurisdictions.)

    When you publish fraudulant results and receive financial (grants) and other (promotion, tenure) benefits it is to the detriment of your society, institution and coworkers. How is that different from a untruthful prospectus or forged financial reports? (vide Bernie Madoff.)

    Perhaps as in Voltaire's words there is a need for few good Byngings.

    (Might be a T H White coinage.)

    1. doublelayer Silver badge

      Re: "il est bon de tuer ... un Amiral pour encourager les autres."

      "Never understood why scientific fraud has never been treated like any other fraud ie jail time"

      It can be. The people who wrote the grants can press charges or talk to law enforcement. It doesn't happen mostly because fraud isn't that often handled by law enforcement. Massive frauds that steal billions does because there's a lot of public outcry. Smaller fraud that harms someone who puts in the effort to investigate it generally does. Medium-sized fraud might or might not be. Small fraud is usually below the threshold where law enforcement will go after it without being informed and assisted, and most places that pay for grants aren't putting that much effort into identifying fraudsters. Maybe they're doing it because they don't want the public label of having been defrauded. Maybe they don't because it's expensive and might not work. Either way, they don't bother doing what they would have to to get law enforcement to do anything.

      1. Michael Wojcik Silver badge

        Re: "il est bon de tuer ... un Amiral pour encourager les autres."

        Scientific fraud can be difficult to prove, too, particularly in such details as who actually committed the fraud. You could have six or eight co-authors on a paper, most of whom did their part of the research in a perfectly above-board way, and just one or two who faked initial observations or tweaked datasets that others subsequently analyzed.

  8. Mitoo Bobsworth

    I think the next generation AI model should be called -

    George Santos - no, Anthony Devolder - no, Kitara Ravache - no...

  9. ecofeco Silver badge

    HORRAY!!

    We're screwed.

    It's William Gibson's world and we just live in it.

  10. LionelB Silver badge
    Headmaster

    Fraudulent

    The word is fraudulent. "Fraud" is a noun. This is extremely annoyance.

  11. T. F. M. Reader

    So one AI generates experimental data...

    ... or even writes the whole paper. Another AI is tasked with detecting the fraud.

    A new definition of "peer review"?

  12. NLCSGRV

    Let's be clear here. Anyone publishing fake data or results obtained using fake data is not a scientist. They are a swindler.

  13. Anonymous Coward
    Anonymous Coward

    ChatGPT lies

    I asked ChatGPT about myself because I'm a fairly prominent technologist in the field I work in and my name is unique in my part of the industry.

    It said I was president of a particular professional organization. although I am a member I'm not president and never have been so I asked why it thought that. It apologized and said it couldn't find any data to back it up but then said I was on the board of that same organization and a fellow too. I'm neither so I asked it why I thought I was on the board, it apologized again and said it couldn't find any data to back it up and then started getting sniffy and suggested I check their website.

    It also said I was a consultant for a certain consumer electronics technology provider which I've never been, and for the companies I have worked for, got the job title, employment dates and what I worked on all wrong.

    It made most of it up which, since I was asking about something factual, meant it lied.

    One the plus side, it made my resume look better than it is!

    1. Michael Wojcik Silver badge

      Re: ChatGPT lies

      Lying requires agency, which transformer LLMs lack. The term of art is "hallucinated", which means the token predictor followed a gradient into a non-factual basin.

      Train a model on a corpus of much of the web, and it'll have the average accuracy of what's on the web.

      Chat-GPT is not a liar. It's not malicious. It's a tool, and it's blunt one that isn't really good for much except generating anecdotes about how lousy or wonderful (depending on your inclination to be impressed by Stupid ML Tricks) it is.

      1. Ideasource

        Re: ChatGPT lies

        Lying not about agency.

        It's about intent.

        aI assigned to a task have intent.

        To communicate anything with intent to deceive, is to lie.

        To omit information critical to clear context and accounting, while providing some information that by itself creates a different idea, is lying. In this case by omission.

        Any purposeful deception is a lie.

        Computers will lie to you all damn day.

        Because computers themselves are expressions of people collectively abstracted into a physical device, they inherit the intent from those collective expressions which ultimately would resolve back to people if a proper accounting could be taken.

        That's the first thing you have to learn to become proficient at them.

        Utilizing computers successfully is not as much about reading your screen as it is detecting where it is lying to you and and working around those limitations.

    2. SonofRojBlake

      Re: ChatGPT lies

      I read an article about ChatGPT making up biographical facts (the journalist in question was somewhat perturbed to find that, although the majority of his bio was accurate, the bit that said he'd died the previous year was ... exaggerated). So I asked it what it knew about me. Answer: nothing.

      Fair enough, I'm fairly nobody-ish, so it asked me to give it a clue. I told it a little bit about myself. It parroted back what I'd told it (I like paragliding, I've written a book about it, I have a Youtube channel)... except it wildly overstated my level of skill, success and recognition (I am NOT "internationally recognised as an expert"), AND said I owned and ran a specific Youtube channel that I've watched but have absolutely no connection to. It's bullshit on stilts, but it's so damn *confident*.

      What it reminds me of is the very limited number of times I've read a story in a national newspaper or seen on the TV about which I have personal, specific knowledge (three times, ever: 1985ish, 2008ish, and about a month ago). In every single such case, the professional journalists reporting on the story have distorted, exaggerated, misrepresented and outright lied to make the story more interesting. Not just gutter tabloids, either - "proper" broadsheets and the BBC. Absolute garbage, presented soberly and confidently with authority by actual humans. I'm emphatically not a customer of the tinfoil hat shop, but it does make me wonder if there's any media I can trust on any subject. ChatGPT churning out bullshit is NOT a new problem, it's just automating what the people we're supposed to trust already do.

  14. SundogUK Silver badge

    The repeatability of any given experiment is part of the core of what it means to be doing science but no one seems to be doing it any more. You don't get grant money for repeating some one else's experiment and getting the same result, after all. I think we should institute a program in all government funded research universities to pay people to do this, maybe with a bonus for exposing fraud.

  15. codejunky Silver badge

    Hmm

    Must resist saying MMCC. Must not point to data being 'corrected' to show warming. That reality refuses to cooperate with the models. Naa screw it junk science has always been something to be concerned about. That is why science must be open to question and experiments repeatable.

    1. Anonymous Coward
      Anonymous Coward

      Re: Hmm

      Hmm. Must resist saying Tufton.

  16. Anonymous Coward
    Anonymous Coward

    It is important for scientific theories and models to be open to question and subject to repeated experimentation in order to ensure their validity and accuracy. However, it is also important to distinguish between legitimate scientific inquiry and unfounded claims or misinformation.

    The overwhelming consensus among climate scientists is that human activities, such as burning fossil fuels, are driving global warming and climate change. This consensus is supported by a vast body of empirical evidence, including temperature measurements, ice core data, and satellite observations.

    While it is important to be critical of scientific findings and to scrutinize data and models, it is also important to be mindful of the sources and motivations behind claims that seek to undermine established scientific consensus. It is crucial to approach scientific inquiry with an open mind and to base our beliefs on the weight of the evidence, rather than on ideology or personal biases.

    1. Anonymous Coward
      Anonymous Coward

      That sounds perilously close to "don't discuss the problems, you'll give ammunition to the skeptics!".

      The issue being is that the skeptics also claim that the "consensus" is actually subject to "ideology or personal biases". The measurements, ice-core data are subject to interpretation. Certainly, the "science" of paleoclimate seems to not suffer the criticism of sometimes wobbly stats used to justify the conclusions.

  17. CatWithChainsaw
    FAIL

    Here's an idea for a paper

    How much collective BS, like social media, fake news, deepfakes, and completely fake papers, can a civilization carry before Atlas faceplants?

    I'd put money on us finding out in the next 20 years, but if we do, it won't be worth the paper it's printed on, or the bits it's coded to.

  18. Norman Nescio

    Reproducibility

    From a scientific method point of view, no matter how startling the paper, it should not really be accepted until the reproducibility of the results by independent researchers is demonstrated. Unfortunately there is little to zero funding of the work required to attempt to reproduce published results; or to publish null results. The same issue is playing out in many fields, with the 'publish or perish' mantra being used to push publication of only the least-possible positive results for each quantum of 'advancement' to get the publication numbers up. If you want proper science, it needs funding, including the experiments that turn out equivocal and the one's 'simply' confirming other people's work.

    For an historic example, try the kerfuffle over the existence of 'N-rays' - there are, unfortunately, plenty of other examples, including many modern ones.

    I'm not saying that fraudulent science is not a problem - but part of the problem is rewarding people for 'jumping the gun', and outright fraud.

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon

Other stories you might like