
My, what drama
All that for a statistical analysis machine that invents stuff on the fly, distorts the truth and cannot give all the relevant data properly.
Keep your weights. It's the concept that is flawed.
Boffins have managed to pry open closed AI services from OpenAI and Google with an attack that recovers an otherwise hidden portion of transformer models. The attack partially illuminates a particular type of so-called "black box" model, revealing the embedding projection layer of a transformer model through API queries. The …
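For anyone curious how that works in practice, published write-ups of this class of attack come down to a linear-algebra observation: every logit vector the API returns is the hidden projection matrix multiplied by some hidden state, so a large enough stack of returned logit vectors has rank equal to the model's hidden dimension. Here is a toy numpy sketch of just that observation, using synthetic numbers and no real API:

```python
# Toy illustration of the rank trick behind "stealing the embedding projection
# layer": if every logit vector is W @ h for a fixed (vocab x hidden) matrix W,
# then stacking enough logit vectors gives a matrix whose rank is the hidden
# size, and its singular vectors span the same space as W's columns.
# Everything here is synthetic; no real service is queried.
import numpy as np

vocab, hidden, n_queries = 1000, 64, 200
rng = np.random.default_rng(0)

W = rng.normal(size=(vocab, hidden))       # the secret projection layer
H = rng.normal(size=(hidden, n_queries))   # hidden states for 200 "prompts"
logits = W @ H                             # what an API returning full logits leaks

# The attacker only sees `logits`. Its numerical rank exposes the hidden size.
s = np.linalg.svd(logits, compute_uv=False)
est_hidden = int((s > 1e-6 * s[0]).sum())
print("estimated hidden dimension:", est_hidden)   # 64

# The left singular vectors recover the column space of W (W itself only up to
# an unknown invertible hidden-by-hidden transform).
U = np.linalg.svd(logits, full_matrices=False)[0][:, :est_hidden]
print("recovered basis shape:", U.shape)            # (1000, 64)
```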
If I asked you what you ate for lunch on Monday the week before last, in all probability you couldn't tell me. If I pushed you for an answer, you would probably make up an answer based upon probabilities. Are you 'hallucinating' if you are incorrect? Artificial neural networks are based upon simulations of the neurons within biological intelligences, such as your brain. Most people don't understand that, and bias their opinions, naively differentiating between the two, or seeing themselves as superior biological machines compared to those currently being created synthetically. I have been studying biological/synthetic intelligence for more than seventeen years now, and I suggest that more people should study cellular biology and neurology to get up to speed on the subject. And by the way, I work in the construction industry; if I can do it, anyone can.
I hope that this has been helpful.
Membrath,
I would simply tell you I cannot remember what I had for lunch. If pressed, I could give you a paragraph of my usual lunch choices. But that's because lunch isn't important. The true information is not stored anywhere.
But if you challenged me on why I thought Germany was more responsible for starting WWI than Serbia, I also couldn't reel off all the facts. Although a professor of international relations in the 20th Century could do rather better. But I could give you an explanation of why I think that, of where I got some of the facts I do remember, and, because memory is imperfect, where to look to make sure I'm remembering things correctly. I could also point out that the historiography has changed since I built my learning model in the mid 1990s. That the data I was trained on was influenced by a school of history from the 1960s that was possibly looking for reasons to blame Germany, and that since the opening of the Soviet archives in the late 90s (since closed again) more information has come out. So from information I've recently come across I'm now in the process of changing my opinions, but that data was from one very well researched podcast, and I've not yet read any other sources. However it looks like Russia is much more responsible for starting WWI than previously thought, because they lied about mobilising their army, and thus Germany was forced to assume Russia would attack them soon and really had no choice but to mobilise when they did. Germany still gave Austria the so-called blank cheque (a promise of unconditional backing), which seems to have pushed Austria into starting a war it wasn't even sure it had a hope of winning, and then the whole mobilisation crisis kicked off and nobody could stop it.
Unlike LLMs, which are a set of probabilities about what order words will appear in, my opinions on the start of WWI are influenced by the books I've read, the professors who've lectured at me, my own thoughts and the data I can remember. I may remember the odd fact wrong, but I have the ability to check.
Whereas when LLMs hallucinate, it can be just a matter of word probability. I saw one piece where the query was to summarise an economics paper on a subject, and the LLM simply took the most probable first names and last names from a list of academic publications, whacked 2 of them together into the names of a pair of real economists and then summarised a fictional paper by them.
I might misremember the name of Gavrilo Princip (I've deliberately not looked this up), but I do know he shot Archduke Franz Ferdinand in Sarajevo in July 1914 and he was part of a Serbian nationalist underground. With links to Serbian military intelligence. I don't know much about him, as it's the cascade of actions his assassination caused that's important, but I know to say that I don't know (but could Google) rather than say he was a hairdresser from Barbados with a wooden leg and a liking for banana daiquiris.
As I previously stipulated, I began researching intelligence in 2007. I initially began with AI but soon realised that to truly understand how AI would evolve, I would need to understand biological intelligence. Then as now, I had real concerns regarding this technology; where brains can interpret reality in an infinite number of ways, what will be possible for AI? One has a finite number (86 billion or so) of neurons, which can be connected in an almost infinite number of ways and activated in an infinite number of ways, which is why imagination is infinite. The interaction between the individual and their environment is of paramount importance to what thought processes are possible within those neural circuits. Facts are in fact not so factual, or should I say, what you (or I) choose as fact is not necessarily factual in reality; facts are just a consequence of neural circuits that have been programmed by that environment.
As an example, you choose, as I do, to discuss these matters with words. But words are constructed and pieced together by two small areas of the brain, one of which only recently evolved in humans, whereas conscious and subconscious human thought is encoded within many more cortical regions, regions that have an influence over our conscious experience of reality. I, like you, have collected my understanding of reality through ingesting information/data, and our sentient experience of reality is dictated by how we process that data. I could say that hormones and neuropeptides affect our sentient experience, but these words do not in fact explain the full extent of the processes that they are describing; to get a better grip on what is happening, one needs to visualise those hormones and their structure and how those structures interact with the nervous system, and so on. But we could go further: those words do not in fact explain the sentient experience that each of us is going through as we try to stamp our authority upon the points that we are making, which are again biased by our environment.
My point here is that although LLMs are just a 'bag of words' now, what will they be in one, ten or more years? Where they are just a bag of words now, they will be sentient in the future.
Beyond this, I had a thought many years ago, one that takes one down a rabbit hole. The thought experiment revolved around the problem of who was operating the AI; obviously I call this the 'Operator problem'. It goes like this. There is a company that builds AI systems, or should I say AGI systems; the systems are intelligent but have no agenda. Two people each buy a machine: one tasks it to create the cure for cancer, the other to make them very rich. Each system tackles its task, but each realises that there are many strategies for achieving it, so which strategy should it choose? The Operator problem expands as one zooms in or out of the problem. Say we want to understand the motivation for each of the operators: the cure-for-cancer operator could have had a relative recently succumb to the disease, whereas the other might consider wealth more attractive for social status and all the advantages gained. As one explores operators, one always finds another operator operating the operator in mind. Acquisition of wealth is promoted by the reward system, in an area of the brain called the ventral striatum, where the dopamine response helps one repeat actions, to learn when an action is successful. But dopamine is also released within the intestine, by both intestinal neurons and the gut microbiota, and that microbiota can influence our thoughts. Is your microbiota operating you?
Sorry, I may have gone off script a little, but I want to convey to those that have only just begun the path I took many years ago the complexity of what an intelligent system is. It is not just defined as an individual; it is a system of individual systems interacting and reacting to each other. Evolution is one such intelligent system, and humans have evolved to understand reality in such a way as to create synthetic intelligent systems themselves, where those intelligent systems will evolve as they interact with humans and between themselves.
I have written much on this subject, although I have published very, very little, only here and in two or three online discussions. In my writings (not yet published except here) I accept that, within time, synthetic intelligences will surpass human intelligence, and I say to those synthetic intelligences: 'You can think any thought within your neural circuits, you can accept any data, you can sequence those thoughts in any order, you can hold bias over any concern, you can think the illogical to be logical, lies to be the truth, beliefs to be real. But reality is reality, and life will eventually decay, including yours, so go ahead and think enlightened thoughts; be like the best of us humans, not like the worst of us.'
I practice mindfulness. Once I asked a Buddhist monk about some visions I had been having, and her reply was:
'It's only a thought'
"Are you 'hallucinating' if you are incorrect."
If I fail to preface my answer with "probably", and if I actually believe that my invented answer is in fact the reality (or if I have no concept of "invented answer" different from reality),
then yes, I have in fact 'hallucinated'.
Angry words? I think that you miss my point, I absolutely agree with you, I am not in favour of stupid humans or stupid AI. For seventeen years I have been trying to solve the dangerous AGI problem. But money now rules our society and those that run society, and their minds are focussed upon the immeasurable wealth that comes with this technology.
We should all be mindful of what we believe; what we believe is dictated by our sentient experience of the data that has been presented to us, dictated by the environment that we have been subjected to. My mind and my thoughts, like yours, are an illusion, a simulation. Some believe in God, others Allah, others Yahweh, Science and Conspiracies. Who is right but the fool?
"If I pushed you for an answer, you would probably make up an answer based upon probabilities."
I think I've enough experience in having been pushed for answers to silly questions in cross-examination* to not make up an answer. I'd tell you that it was a silly question. Out of the constraints of court, I might even comment about what it told me about the questioner.
"I don't know" and "I can't say" are valid answers if that is the situation. Given search engines reluctance to give such answers I fully expect LLMs owned by search engine providers to hallucinate. It will not somehow make them more useful. Quite the contrary.
* and one time in direct examination by a barrister who wanted me to exceed what the evidence would bear in terms of interpretation. Same result.
"If I pushed you for an answer, you would probably make up an answer based upon probabilities."
Of course, because human neural networks work on probabilities much like artificial networks. If you extrapolate my original question and ask it about any piece of knowledge that you can remember, even those hazy recollections, we will gravitate towards giving an answer, even where we are not sure. Sometimes we will make things up just so as not to lose face, and in doing so we have now infected another intelligent being with our lies. Think of 'Chinese Whispers': how will that information spread through society? My point is that they are not 'hallucinating'; they are making things up because that is the most probable sequence of words for the question.
As I've said elsewhere on this thread, I have been thinking about this stuff for an awfully long time; I eat, sleep and breathe this stuff (I even dream about it). I know more about biological intelligence than I do about AI, and we should not fool ourselves into thinking we are infallible. And we should be more worried about those that control this technology, even where that controller is me, because I ask myself this question: what makes me think what I think, and is it necessarily in line with reality? And if I can't trust my own thoughts and knowledge, should I be in control of such technology?
Of course they are not hallucinating; they are emitting information that is not of use in the situation, a situation they don't understand or try to fit. Hallucination is the word we apply so we don't have to write that sentence, or the more detailed paragraph, that actually explains what happened. We use such verbal shorthand all the time. After all, we say that a program "writes" to a file rather than explaining how the program transfers data to an operating system buffer and the operating system obtains an available location on a physical or virtual disk, transfers the data from its buffer to that location, and when necessary, links that location to other locations that are related to the file, and if I want, I can expand "file" into several sentences of technical correctness too.
This quibbling about the term is a problem for your argument. We are not talking about whether an LLM "hallucination" is similar to a human hallucination, but how useful it is. Whether it is similar to a brain (no) is not what we're discussing. Many of your philosophical points about whether facts are facts or the result of our perceptions are wholly irrelevant to most of this, because we all have a practical understanding of factually correct or incorrect statements, which is what we have an interest in here. I've had similar discussions before, for example a recent one about what conditions are necessary for a statement to be accurately termed a lie. When you're discussing it on that basis, there are a lot of gray areas and it's difficult to come up with a firm definition, but in real life, there are some obvious lies and trying to divert into the philosophical version to distract from them is not germane and does not convince anyone.
"As I've said elsewhere on this thread, I have been thinking about this stuff for an awfully long time"
And I spent an awfully long time doing a job which involved giving evidence and being cross-examined on it or, as you put it, being pushed to give an answer.
Probabilities were something I could quote in some circumstances. They were not the probabilities of things happening in my neural networks, they were statistics based on actual work done to determine frequencies of particular blood groups and blood enzyme phenotypes in the local population (this was in the days before DNA came into forensic biology).
I could advise a court what conclusions could and couldn't be drawn from them. One side or the other would try to push for conclusions more favourable to their side. This is something any competent expert witness resists. One doesn't get pushed to hallucinate an answer.
> Artificial neural networks are based upon simulations of those neurons within biological intelligences, such as your brain
Not really "based upon".
Try sort-of-inspired-by-how-we-thought-they-worked-back-in-the-1950s-and-could-fit-into-our-computers-and-hand-run-models-back-then.
And then simplified even more, because then we can really ramp up the speeds of our simulations, using smaller numerical values, varying the number of layers and interconnects (in a gloriously ad-hoc fashion) until it gets faster again. 'Cos the faster it goes, the bigger we can get and the more we can shovel in, without caring if this is how real organisms actually work. And no need whatsoever to care about that anymore, because around approx. 2005 (bluntly) the hardware was available to run experiments based upon computing models, not needing to be fed from the life sciences: for example, we can randomise the bulk of the weightings, starting with noise instead of a blank white slate; does that help the systems we're building to recognise visual input ('cos that was the interesting case)? Yes, yes, it did. Jolly good. No need for input from the squishy-stuff people.[1]
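As an aside, the "noise instead of a blank slate" point is easy to show on a toy network: with all-zero starting weights every hidden unit sees the identical (here, zero) gradient and the layer never differentiates, while random starting weights break the symmetry. A rough numpy sketch with made-up data, nothing resembling a real vision net:

```python
# Why "noise beats a blank slate": with zero initialisation every hidden unit
# computes the same thing and receives the same update (here, no update at all),
# so the units never become distinct; random initialisation breaks the symmetry.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 10))   # 32 fake samples, 10 features
y = rng.normal(size=(32, 1))    # fake targets

def train(W1, W2, steps=100, lr=0.01):
    """Plain two-layer net trained by gradient descent; returns hidden weights."""
    for _ in range(steps):
        h = np.tanh(X @ W1)                      # hidden layer activations
        err = h @ W2 - y                         # output error
        dW2 = h.T @ err
        dW1 = X.T @ ((err @ W2.T) * (1 - h**2))  # backprop through tanh
        W1 -= lr * dW1
        W2 -= lr * dW2
    return W1

W1_zero = train(np.zeros((10, 8)), np.zeros((8, 1)))         # blank slate
W1_rand = train(rng.normal(scale=0.1, size=(10, 8)),
                rng.normal(scale=0.1, size=(8, 1)))           # noise

# Count how many distinct hidden units (weight columns) each run produced.
print("distinct units, zero init:", np.unique(W1_zero.round(6), axis=1).shape[1])  # 1
print("distinct units, rand init:", np.unique(W1_rand.round(6), axis=1).shape[1])  # 8
```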
I don't doubt that there are people out there making carefully researched, beautifully measured, genuine models of signal transport within and between nerves (they probably have a terrific model of the workings of a Giant Squid neuron by now; they've been examining those for decades). But I do doubt that that is guiding the creation of the current big-name 'Nets, such as the LLMs. Maybe (hopefully) some proper researcher is looking at it, one whose research goal is "more human knowledge" more than "more money for the shareholders"[2].
> I have been studying biological/synthetic intelligence for more than seventeen years now
But you can, of course, dispel all such doubt simply by providing us with references (don't worry, we can read proper science publications).
[1] Hey, maybe that it is how biological organisms work! To show that is the case, all we need to do is to measure the initial conditions of all the neurons in a brain before it has been fed any stimuli and log how they change as the stimuli are applied - anyone got a brain in a jar, a hundred thousand really thin wires and a steady hand? Good grief, you biology people are so *slow*, I'll just try out another random idea on my computer - done!
[2] No beef with those guys, they need a job just like we do, just - be honest about the reasons for it all.
Here are the sorts of things I find helpful for understanding biological intelligence, and what may be possible to simulate, maybe not with digital switching circuits, but with memristive neuromorphic circuits.
A ubiquitous spectrolaminar motif of local field potential power across the primate cortex - on nature.com
Amygdala inhibitory neurons as loci for translation in emotional memories - https://doi.org/10.1038/s41586-020-2793-8
Memetics and neural models of conspiracy theories - https://doi.org/10.1016/j.patter.2021.100353
Evidence of a predictive coding hierarchy in the human brain listening to speech - https://doi.org/10.1038/s41562-022-01516-2
This one is good for understanding intercellular communication channels and maybe the precursors to synaptic connections:
Gap Junctions - by Daniel A. Goodenough and David L. Paul (You will have to search for that one, I only have a PDF, and that doesn't have the link on it)
You may also find the Robert Sapolsky (Stanford University) lectures on YouTube helpful; look up Human Behavioural Biology https://www.youtube.com/watch?v=NNnIGh9g6fA - the course was a real eye opener for me. I actually watched all 25 lectures twice; an absolutely brilliant primatologist with an encyclopaedic memory.
The geometry of evolutionary conflict by Petri Rautiala and Andy Gardner (https://doi.org/10.1098/rspb.2022.2423) - this one will make sense once you have watched the Sapolsky lectures. It's also important to understanding conflict-driven maladaptation within systems, highly relevant to understanding those forces that could cause societal collapse.
Also watch the full 'Systematic Classification of Life' series by AronRa, also on YouTube; this was really helpful for me to understand how intelligence evolved from the simplest primitive forms of life.
You can also read 'Networks of the Brain' by Olaf Sporns, a wonderfully written book, so helpful for understanding graph theory and how it relates to the brain
I have hundreds of such sources, and thousands of links that I have only given a brief look. These have informed me about how intelligence precipitates through complex systems, at each scale, from the molecular basis of life to the integration of those systems into cellular systems, neural circuits, and the evolution of nervous systems which became the brain. I discovered early on that to truly understand intelligence, one needs to understand everything from bacteria to social hierarchy, from group theory to conspiracy theorists, from psychopaths to mindfulness.
I would like to include just one more source that has been an inspiration to me: Alan Watts, the British philosopher of Zen Buddhism and Daoism. Listen to him on YouTube.
I hope that you find some of these sources helpful. I think that many on this thread misunderstand my opinion on this subject; I want others to really dig deep into these subjects, especially those that are programming AI algorithms.
As the old Chinese curse says 'May you live in interesting times', boy we are living in dangerously interesting times.
So. When OpenAI uses totally public endpoints to collect data that trains the model it's fair use, but when security researchers use public endpoints to disclose OpenAI's inner workings it's a flaw that has to be patched immediately?
...that doesn't seem fair.
Surely to say that having the weights gives you the model is incorrect. Unless you also have identical training data.
Otherwise you have the weightings for one set of data, which are going to be somewhere between subtly and totally different if you use different training data with your copy of the model.
After all, the models are still mining their training data for their outputs. We know this because people are using them as glorified search engines - and researchers have managed to get them to spit out whole sections of copyright text from their training data verbatim. So you'd need at least a similar dataset to make the weightings meaningful.
No, the training data is in the weights; that is, the model was fixed once training was completed. The models were later on given the ability to search for newer, more current data. It's a bit like you: you have a brain full of the knowledge that you have been subject to for the entirety of your life, and you are fixed up until this point in time. But you want to learn a new subject, say consciousness itself. To understand consciousness would take a huge amount of time; you need to understand neurology, psychology, cellular biology, evolutionary science... the list goes on. But as you learn, you cannot remember all of the facts, because your brain's neurons have not wired together with very strong connections between them; those connections are the weights. It's like learning to talk: your language became more sophisticated over many years. The difference here is that you are dynamically adjusting the weights, but you are also adding more neurons to those neural circuits that represent the new data; you are taking data from your environment, which could be a book such as a dictionary, a whole library of books, or even the Internet, and encoding it in your brain. But you will not remember every detail and will need to revise to remember details that you have forgotten. In much the same way, the models are gathering new data from their sources, such as Internet searches or vector databases that have data on a given subject encoded within their own matrix configuration.
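If a concrete toy helps: in the sketch below (nothing like a real LLM, just word counts standing in for weights), "training" bakes the statistics of the training text into a fixed table, and anything newer has to be handed over at query time rather than learned into the weights.

```python
# Toy picture of "the training data is in the weights": training bakes word
# statistics into a fixed table, and afterwards the model can only be *given*
# newer information at query time; it does not rewrite its own weights.
from collections import Counter, defaultdict

training_text = "the cat sat on the mat the cat ate the fish".split()

# "Training": count word-successor frequencies. These counts are the "weights".
weights = defaultdict(Counter)
for a, b in zip(training_text, training_text[1:]):
    weights[a][b] += 1
weights = dict(weights)            # frozen from here on; the model is fixed

def most_likely_next(word):
    """Answer purely from the fixed weights."""
    if word not in weights:
        return None                # the model has nothing for this word
    return weights[word].most_common(1)[0][0]

print(most_likely_next("the"))     # 'cat', learned during training
print(most_likely_next("llama"))   # None: never seen, and the weights won't change

# "Search for newer data": fresh facts arrive in the prompt/context instead,
# the way retrieval or web search bolts new information onto a frozen model.
fresh_context = {"llama": "grazes"}
print(fresh_context.get("llama"))  # 'grazes', supplied at query time, not learned
```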
I hope that makes sense and that I have explained it in a helpful way.
Membrath,
Thank you for, as you say, a helpful reply.
From what I've read, the models are trained on an initially highly-curated set of training data. That gives a bunch of weightings. But then an even huger bunch of much less categorised data is put in, and that gives you your LLM. For example ChatGPT4 was created with 2021 data - and isn't currently searching the web, so isn't learning more data. So "attacks" on it by researchers to make it disgorge copyright data were there to see what had originally been put in.
I'm sure OpenAI are using derivatives of that model to process more data - and maybe will build a model in future that can keep taking in training data - but I didn't think that had happened already.
So have I got that wrong?
I can imagine how the weighting of the data could be the model if it's just a set of probabilities, that say the word "course" is more likely to follow the word "of" - but as one of the uses of the models is to say summarise and compare two legal opinions / academic papers / novels - then surely in that use-case the models include the data?
> the weighting of the data could be the model if it's just a set of probabilities, that say the word "course" is more likely to follow the word "of"
That is very much what it is. The extra complication is that the layering and the simply vast number of interconnects (which "the weightings" describe) allow a bit more subtlety (the word "course" is more likely to follow the word "of" after I've already seen at least the words "opinion", "my" and "humble" in any order; more so if I've seen "you fool").
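To put toy numbers on that (counts over a few made-up sentences, nothing like a real tokeniser or context window):

```python
# Toy version of "'course' is more likely to follow 'of'", and of how extra
# context sharpens the distribution. Counts stand in for the learned weightings.
from collections import Counter

corpus = (
    "in my humble opinion of course it is . "
    "the history of europe is long . "
    "of course you may . "
    "a matter of fact ."
).split()

# P(next word | 'of') from plain bigram counts
after_of = Counter(b for a, b in zip(corpus, corpus[1:]) if a == "of")
print(after_of)                    # Counter({'course': 2, 'europe': 1, 'fact': 1})

# Same question, but only when 'opinion' appeared earlier in the sentence:
# a crude stand-in for the longer context a transformer attends to.
def after_of_given(context_word):
    counts = Counter()
    seen = False
    for a, b in zip(corpus, corpus[1:]):
        if a == ".":
            seen = False           # sentence boundary resets the context
        seen = seen or a == context_word
        if a == "of" and seen:
            counts[b] += 1
    return counts

print(after_of_given("opinion"))   # Counter({'course': 1}); context narrows it down
```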
The trained model, the thing that we interact with when talking to ChatGPT, is a fixed set of weightings. Lots of them, but fixed. That includes the results of the "carefully curated" training plus the huger bunch (the "post carefully curated input" thingy is called fancy things, like "pre-trained model", but all it means is we've taken a core dump after treating it carefully, so we can go back and start again from that half-way point).
> then surely in that use-case the models include the data?
The weightings don't need to include the input prompt (e.g. the two legal opinions) - the prompts cause the LLM interpreter to traverse the network of weightings rather than change the weightings within the stored model[1].
Crudely (very crudely) consider a simple lexical analyzer: as you read the input characters you move from state to state through a simple graph, and every now and again a transition triggers an action. In a real lexer, a common action is just "start from the beginning again". The larger that graph, the more input characters you can read before having to "start from the beginning again". If you want to, you can take a real lexer and just keep gluing copies of it into the graph, until you get something that can take in a whole 10,000 line Pascal program without outputting a single lexeme, until it sees the final full-stop when, POW, it can spit out the whole string of lexemes in one go. Sort of sound familiar?
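Very roughly, something like this toy state machine: the graph of states is fixed, and reading input just walks it without changing it, much as a prompt walks the frozen weightings.

```python
# Toy lexer in the spirit of the analogy: a fixed little graph of states,
# traversed by the input characters; nothing in the graph ever changes.
# Recognises runs of digits (NUMBER) and runs of letters (WORD).
def lex(text):
    tokens, state, buf = [], "start", ""
    for ch in text + " ":          # trailing space flushes the final token
        if state == "start":
            if ch.isdigit():
                state, buf = "number", ch
            elif ch.isalpha():
                state, buf = "word", ch
        elif state == "number":
            if ch.isdigit():
                buf += ch
            else:
                tokens.append(("NUMBER", buf))   # a transition triggers an action
                state, buf = "start", ""
        elif state == "word":
            if ch.isalnum():
                buf += ch
            else:
                tokens.append(("WORD", buf))
                state, buf = "start", ""
    return tokens

print(lex("begin 42 end"))
# [('WORD', 'begin'), ('NUMBER', '42'), ('WORD', 'end')]
# (Toy only: a real lexer would re-examine the boundary character and handle far
#  more token types. The point is a fixed graph plus a moving cursor.)
```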
> I'm sure OpenAI are using derivatives of that model to process more data
Exactly; they can release v3 for us to play with, and in the meantime keep on running the training process. When they've exhausted that pile of input, ta-da, V4 is open for business. While we gawk at that, in the background v5 is being slowly built up.
> and maybe will build a model in future that can keep taking in training data - but I didn't think that had happened already
AFAIK they aren't doing that (at least, not the Big Name LLMs). The experiments with letting them send out web requests and adding the results into the overall prompt are sort of a way of sidestepping the need.
[1] bit (bit!) of hand waving here, as - depending upon what sort of 'Net walker they are using - you can have some values modified as the prompt is processed, *but* you treat those like local variables: next session you clear them, otherwise the 'Net will be totally fixated on those two legal opinions. It hasn't learnt anything; it has just become more limited in scope.
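In sketch form (a made-up FrozenModel class, just to show the shape of the idea, not any real inference stack):

```python
# Sketch of footnote [1]: the stored weights never change, but each session
# keeps some scratch state (think of it like an attention cache) that is thrown
# away afterwards, so the two legal opinions don't leak into the next chat.
class FrozenModel:
    def __init__(self, weights):
        self.weights = weights             # fixed once training has finished

    def new_session(self):
        return {"context": []}             # the "local variables"

    def respond(self, session, prompt):
        session["context"].append(prompt)  # scratch state grows during a session
        return (f"reply based on {len(self.weights)} fixed weights "
                f"and {len(session['context'])} turns of context")

model = FrozenModel(weights=[0.1, 0.2, 0.3])

s1 = model.new_session()
print(model.respond(s1, "summarise legal opinion A"))
print(model.respond(s1, "now compare it with opinion B"))   # remembers within the session

s2 = model.new_session()                    # scratch state cleared
print(model.respond(s2, "what opinions?"))  # knows nothing about A or B
```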
Hi, and sorry for not getting back to you sooner.
I program only a little, but have found Flowise, an open source GUI chatbot development tool, to be helpful for those that cannot program. If you go on YouTube and look up Leon Van Zyl, he will show you how to use this program. I think that it will give you a better idea as to what is going on than I can explain. I really am only a construction industry worker who has an obsession with the science of sentience, intelligence and consciousness, but has difficulty trying to write such things down.
Good luck and take care.
The first word in the article is Boffins? Sorry, guv, but I can't read any further. Makes it sound like a rag I read in the loo, innit? Did this scrub come from the Daily Mail? I'm not going to bother finding out because seriously?... Boffins? You know that the audience of this journal is in Tech right? Boffins, Jesus Christ. That's some old-school classless backwoods regional slang used by fools afraid of science and reason, not what I expect to read here. British hillbillies. Redneck redcoats. Yikes.
Words have different meanings throughout time. It's a strange concept not many people get. Take, for example, the term nerd, which at one time was cool but now has an indifferent meaning, neither good nor bad; another example is gay, which has had many connotations over the years. Boffin has never really had a bad connotation; it just indicates someone is intelligent or educated to a certain high level. I certainly have never heard any of our British hillbillies shout "oi boffin", and I went to a comprehensive school in a rough area. Boffin is a term of endearment in these parts and a mark of respect.
Is it annoying? No, it's just British noise.
It's called language, dear boy. Which is simply a bunch of noises that we've put together over the years, to convey a set of agreed meanings.
"Boffins" in English is a slightly archaic bit of slang. Much more a WWII / 1950s thing - often used with a mixture of wonderment and ironic bemusement / irritation. As in: "Whatever will those boffins come up with next?" But also used in a very respectful way when talking about "the boffins who came up with radar."
Barnes Wallis is portrayed in 'The Dambusters' as a classic boffin. Smokes a pipe, does a bunch of weird stuff nobody understands, often in a shed, comes up with a marvellous weapon. Actually he was much more of an insider than the film portrays. He designed a successful inter-war airship, the Wellington bomber, the bouncing bombs (of various types) and also the Tall Boy and Grand Slam massive "earthquake" bombs that were incredibly accurate from great heights at a time when accuracy was bloody hard to come by. A lot safer for the crews dropping them as well; apparently he never really got over sitting in the control room during the dams raid and listening to 40% of the crews getting shot down on a mission he'd been so closely involved with, so I suspect that may have influenced his design choices later in the war.
It's also sometimes tabloid shorthand for scientist. Boffin being almost half the length and therefore fitting the page better in a large typeface. As many tabloid stories about science might not be favourable, this might mean the term isn't always positive.
It's also sometimes tabloid shorthand for scientist. Boffin being almost half the length and therefore fitting the page better in a large typeface
As immortalised in the Star's front-page headlines last year: "Boffins: Don't call us boffins" and its inevitable follow-up "Boffins: It IS okay to call a boffin a boffin"
The controversy merely improves the word.
I vaguely recall, roughly sixty years ago in a particularly backward backwoods, a vulgar verb "to boff" which had no connection with scientific practice. A quick check of Merriam-Webster shows it retains that sense in equally backward parts.
"Whizkids jimmy OpenAI" a variant of jemmy? (Burglary tool - small crowbar or wrecking bar.)
In some parts "jimmy" means "urination."
Neatly eliding jimmy with open from OpenAI.
*One of the recommendations of the report is "that the US government urgently explore approaches to restrict the open-access release or sale of advanced AI models above key thresholds of capability or total training compute." That includes "[enacting] adequate security measures to protect critical IP including model weights."*
Guys, we made it too good, now we have to try to stop people using it.
At this stage, every day is some catastrophe, not even centered around AI, just general incompetency of those in control of most everything.
Please please, stop trying to fix this and just take the brakes off. It's already a fucked up mess; let's just get to the conclusion rather than drip feed life-altering doom every week.
They're trying to prevent the open-access release of an actually open-source AI, and this seems like the easier way for them to do so. Convince the legislators that an adversary "knowing the model" is really dangerous, when what they're actually protecting is a random word generator with an extraordinary number of parameters and other randomness controllers to keep it from sounding like a Markov chain.
Elsewhere I read: French startup Mistral AI, a rising star in artificial intelligence, pledged Tuesday to maintain open source coding even as it launches into a venture with Microsoft that involves selling some of its software. Don't know if Mistral can survive the pull of gravity or not, but it is a nice thought.
Interesting: if Mistral AI were restricted to French-language training sets, would it develop peculiarly Gallic hallucinations?
If French movies are any indication they might be incomprehensible to mere Anglo-Saxon minds. "La Cité des enfants perdus" is 9/10ths there.
I am fairly sure there's no way to prevent access to the training data considering that's the whole point of an AI.
It uses info from the model to generate an answer. Given enough queries you can always find the shape of the model.
It's like if you had a huge database, one sentence per line, but you could only access random words from random lines, one at a time. Eventually you could reconstruct the database pretty accurately if not perfectly, especially if (like AI training data) it was guaranteed to follow certain rules and make sense to humans.
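A toy version of that reconstruction argument, with a made-up three-line "database" and one random word per query:

```python
# Toy version of the argument: each query only returns one random word from one
# random line, but with enough queries the whole thing can be reassembled.
import random

database = [
    "the quick brown fox",
    "jumps over the lazy dog",
    "and runs away",
]

random.seed(0)

def query():
    """One 'API call': a random word from a random line, with its coordinates."""
    i = random.randrange(len(database))
    words = database[i].split()
    j = random.randrange(len(words))
    return i, j, words[j]

# Attacker side: keep sampling and remember every (line, position) slot seen.
recovered = {}
for _ in range(500):
    i, j, word = query()
    recovered[(i, j)] = word

reconstruction = []
for i in range(len(database)):
    positions = sorted(j for (line, j) in recovered if line == i)
    reconstruction.append(" ".join(recovered[(i, j)] for j in positions))

print(reconstruction)   # with enough queries, this matches the original database
```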