Clever
Pity the output is grammatically incorrect.
</pedantry>
To publish online and remain anonymous, boffins from Bulgaria and Qatar advise being mediocre. And if you can't manage that on your own, they have a technique to make your prose less scintillating. Distinctive writing tends to point to a specific author. That's what stylometry, the study of linguistic patterns, aims to reveal …
"In the original, the writer has pride in him/herself. In both modified versions, he/she has pride an a mysterious 'them'."
Beat me to it. It changes the meaning of literally every part of the quote. "I am proud" is not the same as "I am proud of them", "I carry my love with me" is not the same as "I carry my beloved", and "he shall never know it" is definitely not the same as "he shall ever know it". Overall it goes from "Because I am proud, I will die without ever letting someone know I love them" to "I am proud of someone, and also carrying the corpse of the person I love who will always know it". It's not dull, it's just wrong.
They've been doing exactly this sort of thing for decades.
More recently a company called Ntrepid was hired by the US military to spread pro-American propaganda in mainly Arabic-speaking social networks. I believe they describe this service as "persona management", since the idea is that one person posts many comments/articles that convincingly appear to come from many different people.
Or in other words what we old-timers call sockpuppets.
I could careless, I have carelessed, and I will careless. Yesterday I sat on the cat asleep on my comfy chair. Today I put my wine glass on a table that had been moved and wasn't there. Tomorrow? Who knows what kind of carelessness I will commit. And I'm not a yank, but I do live there (here?) Is it catching?
"There is nothing wrong with 'I could care less' it's just a valid as 'I couldn't care less' it just doesn't mean the same thing and the two phrases are not interchangable..."
With the exception of David Mitchell's rather excellent Soapbox piece, in which it is used only to raise the issue, the only times I have ever heard any Overpuddlians use the phrase, they clearly were using it where "I couldn't care less" made more sense.
Yah, that should be "I couldn't care less" but as people got lazy saying it they totally reversed the meaning. There's a point when the words don't mean anything anymore, it becomes a Set Phrase with an encapsulated meaning.
Correction: Nobody here says "I could careless", they all get the separation of Care and Less correct.
As one of those "Yanks," I must say that I don't believe I've ever seen that phrase before. It doesn't make any sense. "Careless" is an adjective, but in this sentence it's not modifying a noun as is the function of an adjective. Instead, it's serving as a main verb to the auxiliary verb of "could".
Not so much because of the nonsensical expression itself, than the extremely annoying tendency to compound words that should really be separate, to form new, even more nonsensical words. Particularly annoying examples include "I do this alot" and "I do this everyday". So basically you apportion commonplace items? Are you trying to say that you're an online groceries picker at Tesco? Parlay voo onglaze, mon sherry?
Another equally annoying Americanism is the use of adjectives as adverbs, e.g. "that is super important". I really hate that one.
But even when not inappropriately compounded, "I could care less" is a frankly bizarre expression (or should that be a "super bizarre expression"?). What exactly is that supposed to mean, anyway: that it's possible that you could care less? So you care more than you might otherwise, for some unspecified reason? Are you deliberately choosing not to care less, because that's exactly what they'll be expecting you to do, and anyway you picked a bad week to give up amphetamines, so caring less than is absolutely necessary might be pushing things a little too far?
If these cheeky colonists insist on having their own language, they could at least make an effort to have it make sense. They also probably shouldn't be calling it "English". Maybe they could call it "Dublish".
If this process is - or can be - automated, then I think we are missing the bigger picture: improving online posts.
After all, surely abysmal prose can be identifiable as well, so to provide useful obfuscation any such tool must be able to move a piece of text towards 'average' from either direction.
Going further, could such technology be packaged as a browser plug-in, enabling the viewer to translate Youtube comments into passable English on the fly?
> The sheer processing grunt needed for such an operation does not exist and probably won't for another 1000 years
It depends upon what the input is. Sometimes we use language to describe something fairly objective, such as how to assemble a desk. Sometimes we might seek to use language so formally, so free of ambiguity, that computers could follow our instructions. Or people with whom we have a business contract.
And at another point of the spectrum, we have poetry and jokes.
Deed zeey test it veet zee incheffereeser? Bork Bork Bork!
"[...] and what you read, [...]"
That's more difficult. We unconsciously pick up information and style from what we read in our lifetime. We also use explicit quotes or book references in the course of a comment. El Reg threads often contain allusions that are only recognised by the cognoscenti.
> The problem is that in doing so it will make it unreadable so no one would be interested in what you are going to say anyway.
Yep, that'd be a problem for activists and political figures, less so for criminals.
E.g: "if you shoot T. Hancock of 23 Railway Cuttings East Cheam I will deposit 3 BTC in your wallet".
Shit, if the hitman had a webform (like Amazon) instead of an email address, there would be less scope still for idiosyncratic language.
You could always use an automated translation routine to translate it to another language, then another, then back to English. There was a site years ago that would use babelfish.altavista.com to do this, with often hilarious results. To say it bears little resemblance to the original language is understating it.
Opposite of the original text's meaning, so not very good at keeping that.
I do know one thing : with text like that, I would very quickly recognize that I am absolutely not interested in reading the rest.
So yeah, makes you anonymous. It also makes you unknown. Not sure that is what people actually want.
"So yeah, makes you anonymous. It also makes you unknown. Not sure that is what people actually want."
Yeah, but when I post AC (for demonstration purposes here) there's usually a good reason. Unfortunately from any reasonable length of prose you'd pick up my style cues, and the formatting of the post. I can see a handful of things in this short post that could give me away. Care to guess?
"Care to guess?"
1) "Yeah"
2) Use of parentheses to add an aside.
3) British English spelling of "handful" - is US spelling "handfull"?
4) use of "you'd" - not that common.
5) "prose" is an "educated" word.
6) "my style cues, and the formatting of the post." - the placing of the comma?
However - I'm not in the mood to match that to other posts.
"However - I'm not in the mood to match that to other posts."
Zigackly. That's for serious investigators in pursuit of a high-value prize. Or else for software tools. But not casual commentards in the virtual pub.
I'm not the AC in question either. But if anyone here is playing, I did post an earlier comment as AC, also for reasons of making a point. The pint is for the commentard who identifies it.
Certain spelling mistakes, certain repeated phrases, added to timestamps and known websites would really really narrow it down.
There have been a few Anons called out on this site for example, and that is without a computer program checking the statistics for cold hard mathematical probability and certainty.
Anon, just to see if it works. ;)
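For anyone wondering what "checking the statistics" might look like, here is a minimal sketch, not the researchers' method. It counts how often a handful of marker words turn up in each text and picks the known author whose profile is closest. The marker list and the names `profile`, `cosine` and `closest_author` are made up for illustration; a real stylometry tool would use far more features (spelling, punctuation, timestamps) and far more text.

```python
import math
import re
from collections import Counter

# Hypothetical marker list: a few function words plus informal tells.
MARKERS = ["the", "of", "and", "to", "that", "which", "yeah", "kinda", "you'd"]

def profile(text):
    """Return the relative frequency of each marker word in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[m] / total for m in MARKERS]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def closest_author(anon_text, known):
    """Return the key of known (author -> sample text) with the most similar profile."""
    target = profile(anon_text)
    return max(known, key=lambda name: cosine(target, profile(known[name])))
```

Feed `closest_author` an anonymous post and a dict of named samples and it returns the best-matching name. With samples this short the answer is little better than a guess, which is rather the point about needing plenty of prose.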
"Certain spelling mistakes, certain repeated phrases, added to timestamps and known websites would really really narrow it down."
On the contrary, that's a well-known and easy-to-implement distraction. Used it myself in the past when having fun on Usenet (dynamic IP and an open usenet server helped).
"In reality, other clues - such as your posts' time stamps as regards your likely timezone - could be used to rapidly shrink down the areas of t'web to be examined."Or in the Git's case that he needed to get up in the wee hours... to wee! IOW he's just an old fart ;-)
"your posts' time stamps"
As a counter-point - more than once have I wondered whether employees of online shops ever doubt the fundamental compatibility of the shipping address I gave with the time the order was made at... I know I certainly would.
One fundamental problem, no matter what the confidence level in an identification: that which does not fit will often carry more weight than what does. Much like when a suspect fits a crime perfectly, except that their skin colour does not match what witnesses described.
Open with "I have a doubt" and the suspicion is that the author is not a native English speaker. Using "english" would add to that. Throw in a few "color", "can not", or suggestions of having voted for Trump or Clinton, and the notion the author is British drops quite quickly. A few deliberate misuses of "their", "there", "they're" and doubt quickly creeps in. Switching "that" and "which", increasing or decreasing their use, swapping "they" for "it" for a collective institution, changes things subtly but significantly.
Just look at the debate on whether Obama wrote his own speech when he said "back of the queue" and not "back of the line".
Anonymity is used to create plausible deniability and that seems reasonably easy to achieve and increase. I'm mostly a Windows user ->
"A few deliberate misuses [...]"
It is hard to be consistently wrong. A good analysis would probably detect it as an attempt at obfuscation. Do it too often and it becomes an identifiable style.
IIRC there was an Inspector Morse plot that revolved round the deliberate? misuse of "s" or "z" in some words in a written note.
Possibly we could use a quick Google search to match a sentence to the most common and bland version of it? Or if everyone used a large generic dictionary. Then it would take away the specific footprint. But still leave the discussed items/times/places as identifiable, but not the prose style.
"IIRC there was an Inspector Morse plot that revolved round the deliberate? misuse of "s" or "z" in some words in a written note."
Translating that into Laevopudlianese: There was a Perry Mason plot that hinged upon the ignorant misuse of "mimento" for "memento" in a typed letter. The perp immediately confessed to the murder. Today 60 years on, it would be " ... mento? So what? Get your own candy."
Search Register forums for "kinda" and multiple uses of commas, separating individual words. All caps, but light use. Possible jokes, and similar timestamps.
While this post may be a one off, the temptation to do it again, or the chance you did it in the past is high. Then it's about building up information from that.
But hacking the actual forums, or getting a court order for your IP/login is much, much easier. So it depends on the target and the effort wanted/needed.
You can easily obfuscate your writing style by translating your words into another language, then back again.
So: from English to Croatian to Italian to English:
You can easily blur the writing style translate their words into another language and then again.
And if the result is gibberish (frfljanje - yes, really!) then it doesn't matter. Nobody ever reads this stuff, anyway.
So, I will appropriately recapitalise synergistic schemas to intrinsically pursue transparent technology by appropriately re-architecting cloud-ready testing procedures to leverage progressively benchmark focused products delivering seamlessly matrix value-added sprints with distinctively architectured diverse fungibility while assertively supplying robust meta-services which dynamically synergise intuitive bandwidth.
"As can be seen from the example above, last year's transformation of stands out as odd.
The researchers' revised approach reads better. This particular sample may only feature only minor variations on the original text, but if it can defy stylometric analysis, it has accomplished its job."
Was the article run through the tool it describes?
I found this a little while ago. I ran the test on myself and it was amazingly accurate.
See our best guess as to which world English you speak.
As for improving your writing, it is always a good idea to keep the words simpler. Long $10 words that one can stretch between trees may seem impressive, but many people will trip over those words.
I have used the Word Counter etc in the past. I tested it again yesterday. I am not impressed. It stated that my test contains plagiarism. Total bullshit. What I posted in the counter not only is not stolen, it is extremely unlikely that anybody else would post something similar. It was a few sentences about my personal educational history at Berkeley including working on fusion power at age 14.
I have complained that they have a bug, that much is certain. Shall see if they reply.
Other than that the word counter does work well.