Does it have to be anything
Except a way of causing security experts to waste time looking for a non-existent needle in a haystack.
Antivirus experts have called on cryptographers and other clever bods for help after admitting they are no closer to figuring out the main purpose of the newly discovered Gauss supervirus. While it's known that the complex malware features many information-stealing capabilities, with a specific focus on capturing website …
From the description on Kaspersky's blog, this is a textbook implementation of Bruce Schneier's "clueless agents" idea [1]. Virus writers had discovered it on their own in the early DOS days, but the encryption used then was sloppy (essentially a trivial Vigenère variant) and easily breakable [2]. The people behind the Gauss thingy were obviously pros and implemented it properly - as I predicted would happen in a paper of mine presented at the RSA crypto conference in Tokyo in 2004. [3]
There is no hope of breaking the code except by luck (i.e., the anti-virus researchers happen to stumble upon an infected system that contains the file names the virus is looking for) or by breaking the RC4 cypher, which isn't doable by amateurs (i.e., it requires the resources of a nation-state). That, or unexpected advances in cryptanalysis, discovering holes in the RC4 cypher - but I wouldn't bet on that happening any time soon, either.
[1] James Riordan and Bruce Schneier, "Environmental Key Generation Towards Clueless Agents," Mobile Agents and Security, Springer-Verlag, 1998, pp. 15-24.
[2] Dmitry Gryaznov, "Analyzing the Cheeba Virus," EICAR Conference, 1992, pp. 124-136.
[3] Dr. Vesselin Bontchev, "Cryptographic and Cryptanalytic Methods Used in Computer Viruses and Anti-Virus Software," RSA Conference, 2004.
There is no hope of breaking the code except by luck (i.e., the anti-virus researchers happen to stumble upon an infected system that contains the file names the virus is looking for)
So why don't they improve on luck by creating a "Could It Be Me" web page and invite all those interested in this sort of thing to try their luck. The page could provide a mechanism for the user to check their own system for files with the required characteristics.
They may then hit on some good candidates for the decryption key.
Something like this has most probably been done already - a custom program that checks the user's file system for file names that would produce the correct hash, offered to the victims. This is exactly the first step Gryaznov took when trying to crack the Cheeba code - and it yielded nothing, so he used better means. Even if this succeeds, I would still classify it as "luck" and wouldn't rely on it.
"There is no hope breaking the code except by luck"
...provided the creators of Flame have not made an implementation mistake, such as enciphering two plaintexts with the same key (which they haven't, according to the Kaspersky webpage).
As you are a crypto expert, could the RC4 weakness, where the first few bytes of the keystream are strongly correlated with the key, be used in this case?
It is a pleasure, sir. Genuinely.
And I second your argument. While RC4 does have some known weaknesses, none really apply to this particular style of implementation.
About the best chance possible right now is a Gauss@Home Project in which people donate processing power to a distributed brute-force attempt.
"About the best chance possible right now is a Gauss@Home Project in which people donate processing power to a distributed brute-force attempt."
Given the possible state involvement of this thing, and that breaking open the package might actually yield a clue to that, there could actually be a lot of interest from people in participating in such a project. How easy would it be to set up the software and system to try and brute-force this?
The very same. :-)
Wow, back in the day, as a sub-teen kid, reading the stories about what kind of tricks [the ax=13h, int 21h (virus friendly interrupt)] viruses employed was so damn exciting. I bet I can still quote some phrases.
I wonder if any DOS virus actually reprogrammed the 8259 PIC (in today's terms that would be the perfect keylogger)?
And no, I never wrote a virus myself.
Presumably the encrypted parts must be unencrypted at some point to be of use. This is a genuine question. How possible is it to monitor this thing and see what they are when they are opened up? Presumably the keys to the package are stored elsewhere. Is it possible to run this thing in a VM under a variety of different circumstances that might trigger it to go get the keys and do whatever it is it's supposed to do, and see what the RAM contains at that point or else grab the keys as they are retrieved?
I won't be the first person who has ever thought of that, so what stops it working?
That is true, but we (I am the first author) suggested a number of fairly precise targeting mechanisms that would require knowledge of the intended execution environment (e.g. the secret is _which_ environment is targeted). The paper is available at
http://www.schneier.com/paper-clueless-agents.html
and I think it is pretty readable as crypto papers go.
I suppose the virus payloads have to be deposed in the pattern of a REGULAR PENTAGRAM around the AXIS OF EVIL, which currently (as per decree of our WISE OVERLORDS including BLACK POTUS, runs through TEHRAN with LEY LINES into DAMASCUS and possibly BEIJING) upon which the STARS WILL BE RIGHT and the simultaneous opening of the CRYPTO PAYLOAD will cause a STRANGE AEONS EVENT ushering in WORLD DOMINATION of the BLUE FORCES allied with the nethermost kraken of DREAMS.
I hope you have CASE NIGHTMARE GREEN one phonecall away.
Don't they have a debugger that they can run the virus under until it has unencrypted itself - then they should be able to see what it is looking for (and satisfy its search so they can see what it does when it finds what it is looking for!)
Mine's the one with the assembler card in the pocket...
> the so-called Duqu Framework was developed using plain old Object-Oriented C
Well, "plain old Object-Oriented C" does not really exist because it's not a common way of doing things, is it?
The last I heard was the Duqu framework was written in something extremely similar to SOOC and that SOOC was open-sourced after parties unknown (*cough*) developed Duqu. I don't know whether anyone followed up on this bizarre reverse causality. Maybe someone did and has "fallen off a balcony" or something.
Can someone explain for interested armchair spectators: what exactly is used as the key? (e.g. what different filenames is it trying, everything in a particular folder perhaps, everything longer than a certain length), and how does the program know it has succeeded? (I assume it doesn't continually attempt to execute gibberish, is it testing for a short string?).
I'm surprised Vesselin (that name rings a dim bell for me too) wrote off a website check; I mean, I bet plenty of people looked at the JavaScript-generated message on Securelist's page that told them they weren't infected; if the extra check for the payload conditions could be included in the JavaScript then that sounds like a better option than waiting for a nation state or botnet owner to take interest.
PS Installing a new font is strange, any theories?
Jeez, mate!
El Reg wrote in TFA: More details and a technical description of the problem are available in a blog post here.
That's a clue for you to move your mouse cursor to the pretty blue underlined word 'here' and click the left mouse button. Or the right button if you've got it set up for left-handers. If you're reading with lynx, ignore the mouse. Use cursor keys to move the cursor before the word 'here', and press Enter. That should set you on the path!
Oh, and BTW, you can't look for the file name with a Web page. Web pages aren't allowed to access the files on your machine - and for very good reasons. It has to be either an ActiveX object or some other program that you download and run explicitly. A Web page can make the process easier, but the whole thing can't just be done in-page with a few lines of JavaScript, for security reasons.
You really should read the explanation and description of the algorithm on Kaspersky's blog (referenced near the end of the ElReg article). It can't be explained simpler than that, sorry.
The virus knows that it has found the right file because the cryptographic hash of the file name matches a value hard-coded in the virus. But since crypto hashes are not reversible, we can't know what the file name is just by knowing the hash. And when the right name is found, the virus uses a DIFFERENT crypto hash of it as a decryption key. So, we can't find the key without finding the file name.
It is like this. Suppose that a secret agent has been given a locked case with instructions what to do. He doesn't have a key to the case, and doesn't know where to find it, but he's given a pretty good description of the key. So, he wanders around aimlessly, looking for the key. You have captured the agent and have interrogated him. He has told you everything he knows - but he can't tell you what he doesn't know. He's clueless regarding his secret instructions. So, you have two choices. Either start wandering aimlessly around, looking for the key by its description (which the agent has told you), or try to break the locked case, which is very hard to do.
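For anyone who prefers code to analogies, here is a minimal sketch of the check-then-derive idea described above. It is not the Gauss implementation: SHA-256, the `KEY_SALT` value, and the function names `check_hash`, `key_hash`, `build_agent` and `try_unlock` are all assumptions made for illustration. Only the overall shape (one hash to recognise the name, a different hash of the same name as the RC4 key) comes from the description in this thread.

```python
import hashlib

# Hypothetical salt: makes the key hash different from the check hash.
KEY_SALT = b"illustrative-salt"


def check_hash(name: str) -> bytes:
    """Hash stored in the agent; it only lets the agent recognise the name."""
    return hashlib.sha256(name.encode("utf-8")).digest()


def key_hash(name: str) -> bytes:
    """A different (salted) hash of the same name, used as the cipher key."""
    return hashlib.sha256(KEY_SALT + name.encode("utf-8")).digest()


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:  # keystream generation, XORed with the data
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def build_agent(secret_name: str, payload: bytes):
    """What the author ships: a check hash and ciphertext, never the name."""
    return check_hash(secret_name), rc4(key_hash(secret_name), payload)


def try_unlock(stored_check: bytes, ciphertext: bytes, candidate_names):
    """What the agent does on each machine: test names, decrypt on a match."""
    for name in candidate_names:
        if check_hash(name) == stored_check:
            return rc4(key_hash(name), ciphertext)
    return None  # the agent stays clueless


stored, blob = build_agent("secret-report.dat", b"do the thing")
print(try_unlock(stored, blob, ["notes.txt", "secret-report.dat"]))
print(try_unlock(stored, blob, ["notes.txt"]))
```

An analyst holding only `stored` and `blob` is in the position of the interrogator in the analogy: the check hash says when a name is right, but gives no shortcut to the key.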
You seem to be saying that there is no point looking for the right filename as it is just a matter of luck, and also that there is no point trying to crack the encryption.
That seems to exhaust the possibilities of a direct approach, so what, then, do you suggest?
> The virus knows that it has found the right file because the cryptographic hash of
> the file name matches a value hard-coded in the virus
OK, I've not read the blog post, so this might be a somewhat misguided comment, but is this the sort of thing that could be crowdsourced?
If we've got the hash - and that's the bit I've not checked - it would be entirely possible to write a hash-checker, to test each file in the system against that hash and report any matches. Distribute that program - with source, to satisfy us paranoid types - and see who reports matches, and against what...
It's a targeted brute-force attack; we can be reasonably sure that the hash will match a file on someone's computer.
Vic.
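A rough sketch of the kind of hash-checker suggested above, for illustration only. The hash function (SHA-256 over the bare name) and the names `name_digest` and `find_matches` are assumptions; a real checker would have to reproduce exactly whatever the malware computes, which this does not attempt.

```python
import hashlib
import os


def name_digest(name: str) -> str:
    """Hex digest of a bare file or directory name (assumed: SHA-256)."""
    # surrogateescape keeps undecodable names from raising on encode
    return hashlib.sha256(name.encode("utf-8", "surrogateescape")).hexdigest()


def find_matches(root: str, target_hex: str) -> list:
    """Walk the tree under root and report paths whose name hashes to target_hex."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if name_digest(name) == target_hex:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical target: in practice this would be the value taken from the sample.
    target = name_digest("example-target.bin")
    for path in find_matches(".", target):
        print("match:", path)
```

Only names are hashed, never file contents, so a volunteer who gets no match reveals nothing, and one who does get a match can decide for themselves whether to report it.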
@Vic
The AV company did try millions of file names they have in their database. The filename needs a special character (basically anything non-ASCII, except '~') in the path, so it is likely to be targeting a non-English-speaking country, which greatly shrinks the available crowd.
However, there is an easy way to protect against the virus: make sure all files have ASCII names (no ~, though), even without running the hash checker.
> The filename needs a special character (basically anything non-ASCII, except '~') in the path
Yes, that's why I suggested crowd-sourcing it. That gives you a much higher probability of having the target file on your system than doing the test in a single locale...
> it is likely to be targeting a non-English-speaking country
Indeed. It would make sense to look for it in that sort of locale, then, wouldn't it?
Vic.
Heh. I had to smile reading that, given that Mr. Bontchev has been posting responses here atm. I'm thinking, of course, of his paper "Possible Virus Attacks Against Integrity Programs and How to Prevent Them":
http://www.people.frisk-software.com/~bontchev/papers/attacks.html (search for "Kuang").
As for the concept of multi-partite, oblivious agent-style viruses... super interesting. Even though the concept is very old, there are lots of fairly new techniques that could be applied. Chaffing and Winnowing (perhaps together with an all-or-nothing transform, and/or time-dependent hashing algorithms or cryptographic time servers) looks like one way of approaching it. Secret sharing schemes (Shamir, Rabin) with cryptographic accumulators (to validate a collection of parts as constituting a whole) is another. Then there's homomorphic encryption combined with polymorphic engines, but I don't think that's practical yet, despite recent advances.
It is very interesting to consider how a swarm of agents can combine to become greater than the sum of their parts and survive as a collection even when individual components are being teased apart and eradicated. Mathematically and architecturally, at least. It's equally important to remember, though, that these "perfect" (in some senses of the word) systems are being controlled by external agents, increasingly for nefarious purposes, as opposed to latter-day virus writers who did it purely for the technical challenge. That, in my opinion, is the weakest point. Sure, it would be nice to crack the key in this case, but wouldn't it be even nicer if we could trace the swarm back to its controllers?
It searches for a pair of file/directory names which hash to a particular value, then uses a salted hash of that name as the key. Because the added salt changes the hash completely, there's no realistic way to get from the (known) unsalted hash value to the unknown hash which is used as a key besides finding the filename it's looking for.
One interesting aspect is that the path entries themselves seem to be being used as a sort of salt, or one of them at least - further obfuscating the file it's actually looking for in %PROGRAMFILES%.
Parts of it are eerily familiar to me - I implemented something vaguely similar a few years ago as a simple licensing hack (the published code checks for a filename with a particular SHA1 hash). Of course, it's not the first time this has been used, malware was doing this (with CRC checksums of filenames) back in DOS days...
Nobody at this level is that stupid. Windows developers still use ASCII 8.3 naming (habit, tradition) and Unix developers won't touch anything UTF in a filename, so someone really thinks others are dumb.
Hebrew (Mossad) or Arabic (who? Iran uses Persian) is a really lame joke.
The problem I see with the approach being suggested is that it may not be a commonly found file or even set of files, and thus would not exist on standard home/office systems. Given the sophistication of the virus, it's more likely to be something specialist, i.e. a secure file system, that it's looking for, so this approach would be a fruitless, time-wasting action.
In that case, perhaps the better approach would be to enquire with developers of such high-end custom/proprietary applications for a list of file names/paths, rather than finding the unknown target end-user.
Unless, that is, the CIA/others would have objections to the wider web knowing about their "Destabilise Government.exe" or "Stuxnet for Dummies (Hebrew).pdf"...
....the 'uncommon' filename is that of a file which has already been distributed amongst the target parties, like a report. Seems like an easy way to cherry pick all the people at or above a certain (fairly high) security clearance level who are thus likely to be codgers with f all knowledge of IT.
The spy bit is knowing that this file would be in circulation amongst the people they're interested in (or that they'd be interested in people who were interested in this file) or even planting it themselves in that agency's intel 'network'. Perhaps it's just a report about a new, top secret asymmetric screw design and some smartarse wants to see how it spreads through an enemy's intel network to get an idea of who's connected to who. Social traffic analysis, if you like.
If it's in foreign characters and might have a few directories in front of it, it still shouldn't be that hard to build a list of all possibilities, especially as this trigger is likely to be shared by maybe a few thousand people.
Is it possible to see it attempting to match it? Does it really not glance at stuff and then move on to another check, but instead hash everything possible and see if it matches?
'cos the filename it's looking for is encrypted. It hashes (non-reversibly encrypts) all the filenames in the system it's on and tests them to see if they match the stored hash. Currently the only way to find what that filename is is to generate a hash of every possible filename and test if it matches the stored hash in the virus.
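To put "every possible filename" in perspective, here is a back-of-the-envelope sketch. Every number in it is hypothetical: the alphabet size, the maximum name length and the hashing rate are placeholders chosen for illustration, not properties of Gauss.

```python
def total_candidates(alphabet_size: int, max_length: int) -> int:
    """Number of distinct names of length 1..max_length over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


def years_to_exhaust(candidates: int, hashes_per_second: float) -> float:
    """Time to try every candidate once at a fixed rate, in 365-day years."""
    seconds = candidates / hashes_per_second
    return seconds / (365 * 24 * 3600)


if __name__ == "__main__":
    # Hypothetical inputs: 64 usable characters, names up to 12 characters,
    # one billion hash tests per second across all volunteers.
    count = total_candidates(64, 12)
    print(f"{count:.3e} candidate names")
    print(f"{years_to_exhaust(count, 1e9):.3e} years to try them all")
```

The count grows by a factor of the alphabet size with every extra character, which is why a blind search over names is only realistic if the name is short or drawn from a known list.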
Amateur here so bear with me...
If the virus hashes every filename on the system until it finds a match, then would it not just be possible to do the same on computers that are known to be infected, as the file it is looking for should be on there? Would it not be common on all computers that are infected?
Could it be possible that the specific file is a temp file created as part of the custom font installation module, as it seems that Lagrange is the least encountered module? Isn't it the consensus that this program has very specific targets?
If someone wrote a work units package like SETI@home or protein folding I would lend Kaspersky or whoever my processors and GPUs while I'm not using them.
I don't think I'd be alone either.
There must be enough nerds, with enough compute power, even around here, to brute force this eventually
(for a given value for "eventually" of course)
Presumably the likely targets know or suspect who they are. The obvious prophylactic is the simple, though inconvenient and tedious, exercise of renaming all the program files on sensitive computers to start with a Latin ASCII character. So if the virus hasn't done its job already, it isn't going to succeed now, and cracking the encryption is just an interesting and entertaining exercise.
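As a sketch of how one might find the names that would need renaming under that suggestion: the rule used here (flag any name whose first character is not an ASCII letter) is just one reading of the comment above, not the exact condition the malware tests, and `needs_rename` and `flag_names` are made-up helper names.

```python
import string

# Assumed rule: a name is "safe" only if it starts with an ASCII letter.
SAFE_FIRST_CHARS = set(string.ascii_letters)


def needs_rename(name: str) -> bool:
    """True if the name is empty or does not start with an ASCII letter."""
    return not name or name[0] not in SAFE_FIRST_CHARS


def flag_names(names):
    """Return the names that would have to be renamed, in their original order."""
    return [name for name in names if needs_rename(name)]


if __name__ == "__main__":
    # Hypothetical directory names, for illustration only.
    sample = ["Common Files", "\u00c4rger", "~backup", "7-Zip"]
    print(flag_names(sample))
```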
To block further spread, inoculate infected systems, or to understand the scope of the damage that may already have been done?
I think the people at whom this has been targeted are going to want some idea of what information has been compromised and by whom. And the only way of doing that is to find the systems which the encrypted payload targeted. Then, use information from those systems to decrypt the payload and examine it.
Since the owners of targeted systems are highly motivated to understand the scope of the impact, they will probably volunteer some system information to Kaspersky to assist them in their analysis. This assistance will probably be provided under some non-disclosure terms.
One thought that seems to have not occurred to people yet is simple - perhaps the file with the correct hash has not been released into the wild yet?
Look at the timescales for things like Stuxnet to get to its intended target, and the sheer number of infections required to get the penetration into the right place.
Perhaps the author is waiting for enough propagation of Gauss in the wild, perhaps for this initial "hoo-haa" to die down a bit, then release "THE FILE" that everyone is speculating about.
What are the odds that "THE FILE" is another virus-like self-propagating thing? When it meets Gauss, the payload is unlocked and away we go....
Just a guess....