
OpenAI on Tuesday announced the qualified arrival of GPT-4, its latest milestone in the making of call-and-response deep learning models and one that can seemingly outperform its fleshy creators in important exams. According to OpenAI, the model exhibits "human-level performance on various professional and academic benchmarks …"

British Citizenship Test
How will it fare?
"Currently the citizenship test is a 45-minute written test that features 24 questions on British traditions and customs."
> British Citizenship Test
> How will it fare?
A good question!
I looked at some of the online practice tests (though the language of the questions is so broken, I doubt these are "official" questions). The first question was: "Do You Know When did the First World War end?"
Now, strictly speaking the answer is "yes". Which was not one of the four options provided.
Even more strictly speaking, WW1 did not end with the armistice, which was signed on [omitted so as not to be accused of a spoiler]; it actually ended when the Treaty of Versailles was signed in 1919.
So one question worth asking is whether the person who sets those citizenship questions would pass their own test.
how dare you offend our fully! qualified! staff!!!!
btw, yesterday I signed a consent form for my son which said:
"Cadets will be dismissed at 21.30 hrs, long sleeves must be warn"
On second thought I decided against amending it to 'warm' and went for 'warned' instead. I don't think anyone will bother anyway, which is going to be VERY disappointing!
> How will it fare?
It will almost certainly be able to prove it's more British than I actually am, and I'll be the one on the next plane to Rwanda - which might be doing me a favour, seeing how this country is going.
The whole notion that 'these are things a Brit would or should know, and must be known to be considered truly British' is deeply flawed.
I would expect it to beat almost everyone in any exam - it's not a level playing field. Give human candidates access to the web, time to look things up and formulate an answer, and they would do much better.
Where I have found Chat AI excels is in producing convincing bullshit and lies as quick as the guy I know down the local who always has the answer for everything, and every politician I have known.
TBH, I wouldn't be surprised if they already were, to some extent.
A politician could already fire up ChatGPT and practice a few debates with a virtual opponent.
"I am Rishi Sunak talking at PMQs, you are Kier Starmer and the Labour Front Bench. I have just announced my new policy to require government access to all Internet-connected devices, and a ban on unsigned or self-signed open-source operating systems which could circumvent such measures. We will also be installing automated antipersonnel weaponry along the beaches of Kent." What might be a typical opening question from my opponents?
A typical opening question might be: "These kinds of policies sound like something from 1930s Germany! Has this government gone completely stark-raving mad, or is it truly plotting to turn Britain into a Nazi police state?"
OK and what would be a winning response which gets the media on my side despite tyrannical new policies?
A winning response would be to accuse Mr. Starmer of minimising the Holocaust, just like his predecessor Jeremy Corbyn....
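For what it's worth, that kind of debate practice is only a few lines of code. Here's a minimal sketch, assuming the pre-1.0 openai Python package and an OPENAI_API_KEY in the environment; the prompt text is abridged from the exchange above:

```python
# A minimal sketch of the debate-practice idea above, assuming the
# pre-1.0 openai Python package (openai.ChatCompletion) and an
# OPENAI_API_KEY set in the environment.
import openai

messages = [
    {"role": "system",
     "content": "You are Keir Starmer and the Labour front bench at PMQs. "
                "Reply in character to the Prime Minister."},
    {"role": "user",
     "content": "I am Rishi Sunak. I have just announced a policy requiring "
                "government access to all Internet-connected devices. "
                "What might be a typical opening question from my opponents?"},
]

response = openai.ChatCompletion.create(model="gpt-4", messages=messages)
print(response.choices[0].message.content)
```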
Given that it's likely that most of the answers are factual and easily findable on the Internet, I wouldn't be surprised if it could ace a citizenship test.
I'm equally unsurprised that it aced the bar exam. For all the fancy talk, most law cases hinge on lawyers being able to pore through vast volumes of written law and case law, and find precedents that meet the pattern. If a model has been fed the laws and a bunch of legal texts as part of its training it should do well enough. And doing well in mathematical tasks should be trivial, as long as it just needs to spout out results and not explain its reasoning.
Having said all that, it's still pretty impressive!!!
Most exams just test memory and ability to learn templates.
If you give answers that are correct but not in the answer sheet, there is a chance the tutors wouldn't be able to figure out that your answer is correct, because likely they don't understand the topics they teach well enough (otherwise they wouldn't be teaching...)
So yes, GPT-4 and other a"I"s will be good at this, as it does not involve thinking or reasoning, just finding matching patterns and spitting out whatever fits.
Physics lecturer used ChatGPT on one of his exams: The results will shock you!
(Well actually they won't, it got the answers wrong.)
And wrong to the point where the Prof. said "if anyone gets *this* wrong then my whole course has failed" - i.e. after fluking its way to a few sort-of good answers it failed on THE significant idea!
Which perfectly demonstrates that the decent answers came from anywhere *but* understanding the material.
I am one of the admins for the LinkedIn "Mathematical Olympiads" subgroup. One of the members fed a (relatively) simple mathematical question to ChatGPT. It started reasonably well, then made a series of mathematical and arithmetical howlers, failing several sanity checks on the way through.
ChatGPT is not good at mathematics.
It has such a large data store that it is basically like taking a test with Google available to you: you can find formulas and look up definitions of words (and synonyms, antonyms, etc.).
Actually, it's better than having Google available, because with Google you have to be able to assess whether you are looking at the right search result, while the training data given to GPT-4 can have all the confusing or incorrect stuff weeded out. It is like having a curated Google available to you.
> The GPT series by its very nature is a family of regurgitation engines, drawing upon material it was trained on and reassembling it to address your query.
Which is what most people do, most of the time.
Few people create unique responses, and then only rarely. Most of us rely on knowing (through education/experience) what to regurgitate in response to a given situation, and simply do that.
So what's the difference from people, who can make mistakes and whom we use to teach other people? All you need is somewhat narrower verification and, voilà, the UK gov solves the problem of missing teachers at a fraction of the cost, etc., etc. Repeat across the 160 or so countries of the world. First they came not for the teachers: they have already come for the translators, and probably a few other obscure professions.
Well, I can easily see GPs in the crosshairs too, another workforce shortage. Given that the most common suggestion for a variety of issues is paracetamol / Calpol / get X off the shelf because it's cheaper than a prescription / rest and drink plenty of fluids / etc., even ChatGPT v0.1 could handle that.
"It is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it," OpenAI CEO Sam Altman acknowledged, referring to GPT-4.
"Still seems more impressive” is amusing and quite disarming, isn’t it .... and almost human like in its own coy self-deprecation.
> Finally, Brockman set up GPT-4 to analyze 16 pages of US tax code to return the standard deduction for a couple, Alice and Bob, with specific financial circumstances. OpenAI's model responded with the correct answer, along with an explanation of the calculations involved.
Sueball incoming from tax preparation software companies in 3...2...1...
They should feed it the whole code and ask it to find as many contradictions as it can. Then we feed the output to the politicians and tell them to stop prattling on with ideological battles and start fixing the shit that they've foisted upon us.
Repeat for all tax codes and the respective politicians, obviously.
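Something along those lines might even be automatable today. A rough sketch, again assuming the pre-1.0 openai Python package; tax_code.txt is a hypothetical input file and the chunk size is arbitrary:

```python
# A rough sketch of the contradiction hunt proposed above, assuming the
# pre-1.0 openai Python package. tax_code.txt is a hypothetical input
# file; the chunk size is arbitrary, and a serious pass would need
# overlapping chunks and cross-chunk comparison.
import openai

CHUNK_CHARS = 8000  # keep each request comfortably inside the context window

with open("tax_code.txt") as f:
    text = f.read()

chunks = [text[i:i + CHUNK_CHARS] for i in range(0, len(text), CHUNK_CHARS)]

for n, chunk in enumerate(chunks, 1):
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You are auditing legislation. List any internal "
                        "contradictions or ambiguities in this excerpt, "
                        "quoting the relevant passages."},
            {"role": "user", "content": chunk},
        ],
    )
    print(f"--- chunk {n} ---")
    print(response.choices[0].message.content)
```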
"Now all in g! A sonnet, trochaic hexameter, about an old cyclotron who kept sixteen artificial mistresses, blue and radioactive, had four wings, three purple pavilions, two lacquered chests, each containing exactly one thousand medallions bearing the likeness of Czar Murdicog the Head-less..."
"Grinding gleeful gears, Giggling gynecobalt-6o golems," began the machine, but Trurl leaped to the console, shut off the power and turned, defending the machine with his body.
- S. Lem, The Cyberiad, 1967
Mr Lem was scarily prescient about problems with AI: a highly recommended read.
If you can get hold of a copy of his non-fictional Summa Technologiae you'll find he was even more prescient than his fiction would suggest. A quote from Amazon (which hasn't got it any more):
After five decades Summa Technologiae has lost none of its intellectual or critical significance. Indeed, many of Lem’s conjectures about future technologies have now come true: from artificial intelligence, bionics, and nanotechnology to the dangers of information overload, the concept underlying Internet search engines, and the idea of virtual reality. More important for its continued relevance, however, is Lem’s rigorous investigation into the parallel development of biological and technical evolution and his conclusion that technology will outlive humanity.
My first reaction was "what has a Rust library got to do with it?" Now I know what libgen.rs is, yes you can find copies there. Definitely worth getting hold of to see just how much of modern computer technology Lem predicted in the 60s.
Seeing as 90% of folks couldn't pass a maths exam without cheating or looking up the answers.
Back in my days of study, it happened that 200 students each year, for 10 years, gave the wrong answer to a maths question. They all gave the answer that was printed at the back of the book and passed the exam. Unfortunately, around 2002, it was discovered that the printed answer was wrong.
The majority of these students went on to "secure" a job in the industry.
"GPT-4 can pass a simulated bar exam in the top 10 percent of test takers,"
That may have been a mistake. Midjourney et al came for the paint splatterers and coloured crayon brigade; GPT came for the writers; and there are upcoming toys out there with vocalists, musicians and composers in their sights. None of them really have enough clout to do anything about it, and anyway nobody really cares - why bother with a temperamental arty type when software can make a good enough picture in seconds, for free, without hassle?
But lawyers are a different breed. They make serious money, they (claim to) understand the law / legal system, and they are not going to be happy that said software can apparently do the same to them as it did to the artists.
They're cunning, too, in a limited way, and they won't like the implication that maybe they're not as special as they like to pretend if a mere machine can pass their exams. They will brush it off, pretend to be unconcerned, but suddenly the artists will find that lawyers are falling over themselves to take up their cause (without ever admitting that it's now their cause, too, of course).
Expect litigation against AI companies to skyrocket, along with instant heavyweight lobbying for regulation etc.
Yes, it's going to be an interesting boxing match, lawyers v. Big Tech. But I don't see how they could slow down this avalanche, let alone stop it. One way would be to argue on the copyright front, but unless they manage to provide a reasonably sound proof that 'AI stole this poor artist's potential USD 10bn tune!', there's not much they can do. And something tells me lawyers are no more popular with politicians than any lesser mortal. So, unless they convince politicians that AI is gonna remove politicians from the equation, the lawyers are gonna lose like everybody else. And, to speculate further: politicians, like all humans, are short-sighted; they see a low-hanging AI fruit and will try to use it to their own benefit. And once it becomes omnipresent, even if they suddenly realise it was a grave error, there's no politician in the world able to take it down or even curb its use. Think mobile phones, those evil slabs.
P.S. When I use the term 'AI', it's only for convenience; you might call it 'chat-bots' or whatever. I don't really care whether it is 'intelligence' or not. If I'm fired because a toaster does my job faster and better, what's the difference whether it's intelligent, omnipresent, or dumb? Interesting times ahead.
In other news, I see Hunt will be using AI to sanction UC claimants. I suspect this is because humans will ultimately give humans the benefit of the doubt, but the AI has no compassion and can be taught to sanction mercilessly. Any mistakes, well, you can appeal to another AI...
I need to throw it at our tests for becoming an "Authorised Person" or "Senior Authorised Person", which are the tests that are part of becoming qualified to work on HV equipment in the UK.
Much of the documentation and explanation that one would need to train on relies on combinations of drawings and written word that cannot be studied independently.
" GPT-4 can pass a simulated bar exam in the top 10 percent of test takers"
So they first feed the questions and correct answers to it, and then it can get about 90% of the answers right. Despite having been fed *all* the correct answers.
And the sellers actually believe that's a *good* result? "It's guessing BS only 10% of the time" ... literally.
I don't think this proves that AI is better than the human brain, and I don't think that is the point of the result... If anything, it proves how simple the tools are that humans use to categorise each other. We already knew that though; the whole academic system is built around tools that the top 1% use to sort the 99%... and keep the 1% where it is.
We've all worked with highly qualified simpletons before. I've worked with loads personally. So many, in fact, that I treat a Masters degree from certain places as a red flag. Not because the places are bad per se, but because the people that roll off the production line there have a massively inflated view of themselves and tend to be quite fragile. Especially when you find flaws in the "life's work" that they've been working on since year 1 of uni.
I found a really basic SQL injection attack vector in one of these people's "projects". The guy didn't understand it, was too proud to pay me to fix it (I am a lesser human to him, with no higher-level academic paperwork), and he was too embarrassed to disclose it to his business partners, not because of the bug itself, but because of who had found it. Pretty sure he initially didn't believe it existed... something about him having a Masters from Cambridge and therefore it being impossible... until I offered to demonstrate it, which resulted in around 30 awkward minutes of listening to him in denial.
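The post doesn't say what the hole actually was, but for the curious, the classic shape of a "really basic" injection vector, and its fix, looks something like this (hypothetical table and column names, standard library only):

```python
# Hypothetical illustration of a classic SQL injection vector; the
# table and column names are invented, not from the post above.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER, surname TEXT)")
conn.execute("INSERT INTO patients VALUES (1, 'Smith')")

user_input = "Smith' OR '1'='1"  # attacker-controlled form field

# Vulnerable: user input is spliced straight into the SQL string, so
# the injected OR '1'='1' clause matches every row in the table.
rows = conn.execute(
    "SELECT * FROM patients WHERE surname = '" + user_input + "'"
).fetchall()
print("vulnerable query returned:", rows)

# Fixed: a parameterised query treats the input as a literal value.
rows = conn.execute(
    "SELECT * FROM patients WHERE surname = ?", (user_input,)
).fetchall()
print("parameterised query returned:", rows)
```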
The software was a healthcare practice management platform. His partners eventually did offer me the work to help fix the problem, but "Masters of Cambridge" became an absolute cock to work with, he wouldn't push code to the repo, he wouldn't release new builds to me to test...he went out of his way to ensure it was incredibly difficult for me to test for bugs and vulnerabilities. He was eventually cast adrift and a team of "non-academic" developers and engineers were hired to finish (or rather re-write) the entire product. His credit was therefore completely removed...last I heard (about 10 years ago), he was working in an ISP call centre as a second line support technician...probably works in Starbucks now.