I have a better way to save space on phones
Stop shipping Facebook garbage in the firmware so the read-only partition can be smaller.
Facebook has developed promising asymmetric compression technology and aspires to share it with the world. Dubbed "Superpack" and detailed in a post by Facebook Engineering software engineer Sapan Bhatia, the tech "combines compiler analysis with data compression to uncover size optimizations beyond the capability of …
As usual, there's nothing original in what Facebook is doing.
In Windows, compressing DLLs and putting the compressed result in an executable's resource area has been around almost as long as Windows itself. It was used more back in the days of floppies, when space was at a premium. When the exe starts, it inflates the embedded compressed DLLs into the exe's local directory, and when the exe finishes, it deletes them. This lets you still architect using custom dynamic-link libraries yet give the appearance of the program being a single exe and, more importantly, saves space on the media. The only penalties are that the main exe has to contain the decompression routine (or calls to the Windows Compression API on Windows 8 or later), which isn't a big deal, and that the program temporarily requires room for the decompressed DLLs.
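For anyone who hasn't seen the trick, here is a rough sketch of the inflate-on-start, delete-on-exit pattern in Python rather than Win32. Everything named here is made up for illustration: the `EMBEDDED` dict stands in for the exe's resource area, `helper.dll` is a pretend payload, and it unpacks to a temp directory instead of the exe's own directory.

```python
import atexit
import shutil
import tempfile
import zlib
from pathlib import Path

# Hypothetical stand-in for compressed DLLs stored in the exe's resource area.
EMBEDDED = {
    "helper.dll": zlib.compress(b"pretend library code " * 200, 9),
}


def unpack_embedded(embedded):
    """Inflate every embedded payload into a fresh temporary directory."""
    workdir = Path(tempfile.mkdtemp(prefix="unpacked_"))
    # Mirror "when the exe finishes, delete them": clean up at interpreter exit.
    atexit.register(shutil.rmtree, workdir, ignore_errors=True)
    for name, blob in embedded.items():
        (workdir / name).write_bytes(zlib.decompress(blob))
    return workdir


workdir = unpack_embedded(EMBEDDED)
packed_size = len(EMBEDDED["helper.dll"])
unpacked_size = (workdir / "helper.dll").stat().st_size
print(f"packed: {packed_size} bytes, unpacked on disk: {unpacked_size} bytes")
```

The two penalties mentioned above show up directly: the program carries the decompressor (here `zlib`), and it needs scratch space for the unpacked copy while it runs.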
Is it a better mousetrap?
It depends. Arch Linux recently moved to .xz from .bz2 or .tar.gz (can't remember). Anyway, the additional compression is a roughly 10% saving. However, these things take something like 15x longer to create.
Now, the distro sources are created once and then downloaded millions of times, so a small saving in size repays the additional compression effort many times over. I think they also take longer to decompress, but with SSDs these days - who knows? I don't really notice. The even more recent parallel download thing offsets even that.
.xz (lzma2) is *significantly* faster to decompress than either .gz (gzip) or .bz2 (bzip2), and it improved install time along with ISO size. Big win all around. The old standbys might be familiar, but they're still '70s tech.
But they moved to .xz in... 2010? The one they recently moved to is .zst (Zstandard), at the end of 2019, which is very slightly larger than .xz but ten times faster to decompress, again making for faster installs, especially on high-end SSDs that can soak up data that quickly.
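If you'd rather measure than argue, a quick sketch like the one below compares size, compression time and decompression time for the three formats that ship in Python's standard library. The input is synthetic and made up for illustration, so the numbers will not match real packages, and zstd is left out because it is not assumed to be available in the standard library.

```python
import bz2
import gzip
import lzma
import time

# Synthetic, mildly repetitive input; real package contents will behave differently.
data = b"".join(
    b"package file %d: some repetitive payload\n" % (i % 500) for i in range(20_000)
)


def measure(name, compress, decompress):
    """Time one compress/decompress round trip and record the packed size."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    assert restored == data  # the round trip must be lossless
    return {
        "name": name,
        "size": len(packed),
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


results = [
    measure("gzip", gzip.compress, gzip.decompress),
    measure("bzip2", bz2.compress, bz2.decompress),
    measure("xz", lzma.compress, lzma.decompress),
]

for r in results:
    print(
        f"{r['name']:>5}: {r['size']:>8} bytes, "
        f"compress {r['compress_s']:.3f}s, decompress {r['decompress_s']:.3f}s"
    )
```

For a distro, the decompress column matters far more than the compress column, since the package is built once and unpacked on every install.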
In addition: Zstandard (zstd) is also from Facebook and is known for being fast. It compresses and decompresses really fast, unless you use the higher level compressions, beyond 15, more or less. Default compression level is only 3.
I highly recommend it as a replacement for gzip. I use it to compress 16 GB files at level 6 with multithreading (-T0).
If you're running Arch Linux and building packages yourself, you may want to lower the default compression level and enable multithreading in your /etc/makepkg.conf with:
COMPRESSZST=(zstd -czv -T0 -9)
> Packing the executable with DLLs wasn't the novelty. The novelty was in how they packed it, removed redundancy and encoded entropy to improve compression ratios.
From reading the article, it sounds like Facebook have come up with the idea of embedding some sort of programming language into the compression system, so that it can run decompression routines which have been perfectly tailored for the content, rather than a generic "one size fits all" algorithm.
But that's not something new - the RAR format has had a VM system for years, that lets people hand-craft code to improve compression ratios. And as is ever the case, someone even worked out how to write a Hello World program within said VM.
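To show why content-aware packing can beat a generic compressor, here is a toy sketch. It is not Superpack's method and not RAR's VM, just a plain delta transform applied before zlib. The `values` table is hypothetical, standing in for something like a list of increasing offsets inside a binary.

```python
import struct
import zlib

# Hypothetical content: a table of steadily increasing 32-bit offsets.
values = [1000 + 12 * i + (i % 7) for i in range(5000)]


def pack_u32(nums):
    """Serialise integers as little-endian unsigned 32-bit values."""
    return struct.pack(f"<{len(nums)}I", *nums)


def delta_encode(nums):
    """Replace each value with its difference from the previous one."""
    prev = 0
    out = []
    for n in nums:
        out.append((n - prev) & 0xFFFFFFFF)
        prev = n
    return out


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    total = 0
    out = []
    for d in deltas:
        total = (total + d) & 0xFFFFFFFF
        out.append(total)
    return out


generic = zlib.compress(pack_u32(values), 9)
tailored = zlib.compress(pack_u32(delta_encode(values)), 9)

print(f"raw table:         {len(pack_u32(values))} bytes")
print(f"zlib only:         {len(generic)} bytes")
print(f"delta, then zlib:  {len(tailored)} bytes")
```

The decompressor has to know to run `delta_decode` afterwards, which is the whole point: the unpacking logic is tailored to the content instead of being one size fits all.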
To be fair, when you've got around 2 billion people using your app on a daily basis, even knocking a few kilobytes off each download makes a difference. Especially when at a glance, they've rolled out over 30 updates just for Android phones in the last 9 months.
And it's not like Facebook is short on resources. But I'd still be inclined to get someone to take a long hard look at why the app itself is so "bloated", rather than faffing around with ways to reduce the impact of said bloat.
I thought they only started writing to disk when page protection FINALLY appeared on XP SP2. Before that, they would just decompress to RAM, and change the pointer to the EXE/DLL to the new location, since there was zero protection and all memory was read/write/execute outside of the kernel. Meant they couldn't page it out, so it took up more memory, but it's a tradeoff when disk is also at a premium.
You beat me to it. It is probably also going to make it harder to work out what their apps are really doing versus their proclaimed purpose, and I trust Zuck about as much as the average heroin dealer. Not that I know any, but I don't know Zuck either.
If they are offloading all the monitoring and decoding of your facial and voice expressions to your device, assisting you when you go to the toilet (private time for you to be with Zuck), there is a lot of code that just won't fit on traditional flash-based devices.
I can still remember this well from the C64, where every bit counted. Android is so “efficient” that this is coming back.
Only a few years ago, a machine with 10 cores, 12 GB of RAM and 256 GB of disk space, would have been called a “mainframe”. It would have had dozens of users working on it. ;-)
Just what I was thinking. If FB wants to learn about packing get some of those coding/hacking groups from the 80s/90s in to teach these guys how to "hack, crack 'n pack" something.
Those guys were incredible, they could take games with 100+ files over 2 floppies, pack them down into a single executable file that ran off 500kb without the need for any temp unpack space, all done in memory, in real time on CPUs with a fraction of the power in a modern phone.
When I started writing programs I was working in about 48kb of memory, the BIOS and CP/M using the rest of it and virtually everyone in that world wrote programs that worked well, even if they had to use half a floppy disk to store the data - I was writing in assembler with a Z80 and using an 8048 when I built hardware devices.
There are quite a few factors that have resulted in massive program size increases - for example think about adding 1+1+1+1+1 in a 64-bit environment where the default calculations are always floating point.
First there were applications and data and they were small to fit the space available
Then there was too much data so compression was invented
Then there was more storage and bloatware was developed to fit the space available
Then there was slow internet and compression was reinvented
Then there was broadband and bloatware development was enhanced to fit the bandwidth available
Then there was too much data and compression was reinvented ...
Giving developers super fast machines with max everything means they generally don't care about efficiency.
Give them yesteryear's base models to work with and that will reduce bloat, as they will have to work within the limits of what they have.
That's what I'd do!
Back in the days of Spectrums, C64s etc., software developers had to be so much more efficient to work within the limitations they had.
This is exactly what they did for the game MDK many years back. They developed it on good machines, but then had to test it on machines that were bottom rung of the ladder. If it ran crap, they had to go back and work out why.
It's why it was one of the smoothest games of its time.
Giving super fast machines isn't a problem, the problem is either:
1) Crap devs
2) Crap management overruling the devs doing things right in order to ship the lowest quality product the company can get away with.
If you give devs low-quality machines, they will walk, or your costs will go up as everything will take longer due to either the hardware or the devs moaning about how shit it is.
And your developers will leave for better jobs.
Making developers' lives harder with slow dev machines is not the answer, especially since we're talking Android here and you don't develop Android apps on Android. Mid-range devices and older devices are useful in the test drawer, but you also need the latest devices so you can test against the latest OS features.
The answer is to teach junior devs to care about resource usage, not tie one hand behind their back to hamper them.
At Steve Kerr. I'd give them decent machines to code on, but then force the resulting software to be run on the lowliest machine possible. If they get the job done faster because of their high-powered machines, so much the better, but if it runs like shit (or not at all) on, say, a Raspberry Pi with 1 GB of RAM & a 32 GB SD card for file storage, then make them recode the program until it runs (properly/at all).
"But we're running it on modern smartphones with a zillion cores able to deal with multiple threads & half a TB of RAM, so why code for something so crap?" Because you can't ensure that the person who wants/needs to run your software will be using it on such a device. What if it's in a third-world nation where the average smartphone is over a decade old? Do you personally pay for each of those people to get a modern handset, or do you write your code so it runs worth a damn on said limited hardware?
*Hands you a pint* Drink up. It'll help lubricate your brain & limber up those coding muscles to do some serious optimization. =-)
Given the latest revelation that FB has 5.8 million VIPs on a whitelist who can do whatever they like without caring about FB's terms and conditions, I think if you don't have an account, don't get one; if you have one, get rid of it and tell your friends and family that you all need to move to something that isn't the cesspit of pond scum that is FB.
Proprietary compression means it may take a while for AV scanning services to be able to accurately scan inside these executables for malicious code. Best to just blocklist anything with that file signature / magic number.
"Bhatia explains that Facebook needs compression because its apps keep bloating"
Maybe that's the first problem to address. I have a few well-written specialist applications that have excellent performance and will run on almost any Windows box, and they have one other common property - they have a really small code base. For example, a powerful optimising C compiler for PIC with an advanced drag-and-drop IDE - the whole thing is a few tens of megabytes apart from the device specification files, and it runs on everything back to XP SP2.
Bloat primarily occurs as a result of "tweaking" as opposed to re-engineering. In modern applications there's often a lot of code that never gets executed, either because it's part of a massively redundant library or because it got left in when it was no longer needed by the next version. In either case it never gets called at run time, so it might as well not be there.
Compressing the redundant stuff - however expertly - seems to be merely first aid rather than medicine.
Biting the hand that feeds IT © 1998–2022