Re: we don't have to keep running the same tests over and over
Either way, Unison is absolutely aware of whether a test (or any code) is able to observe or manipulate the outside world.
6 publicly visible posts • joined 26 Sep 2019
Hello, I’m one of the creators of Unison. It’s true that 512 bits (64 bytes) is a bit larger than what today’s CPUs typically use for pointers (8 times larger to be exact), but this alone is not going to contribute significantly to the memory footprint of the typical Unison program (user data is going to do that). We think this is a fair price to pay for the abilities we get from content-addressing code.
Regarding renaming... if you have e.g. a Java library where you’ve named something x, and lots of user code refers to it as x, then if you rename it to y and republish your library you’re going to break everyone’s code. Names are really important in traditional languages. But in Unison, the name is just metadata. You can rename a function from x to y, republish your code, and everyone else’s code still works! Because they weren’t referring to the name anyway. Their code was referencing the hash.
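The renaming point can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Unison is actually implemented: it assumes definitions are stored under a SHA3-512 digest of their source text, and `Codebase`, `add`, `rename`, and `resolve` are hypothetical names made up for the example.

```python
import hashlib


class Codebase:
    """Toy content-addressed store: definitions keyed by hash, names kept as metadata."""

    def __init__(self):
        self.definitions = {}  # hash -> source text
        self.names = {}        # name -> hash (metadata only)

    def add(self, name, source):
        # Assumption: the hash covers the source text alone, never the name.
        digest = hashlib.sha3_512(source.encode("utf-8")).hexdigest()
        self.definitions[digest] = source
        self.names[name] = digest
        return digest

    def rename(self, old, new):
        # Only the name table changes; the stored definition and its hash do not.
        self.names[new] = self.names.pop(old)

    def resolve(self, digest):
        return self.definitions[digest]


codebase = Codebase()
x_hash = codebase.add("x", "n -> n + 1")
caller_reference = x_hash  # user code holds the hash, not the name
codebase.rename("x", "y")
still_works = codebase.resolve(caller_reference) == "n -> n + 1"
```

After the rename, `caller_reference` still resolves, because nothing the caller depends on was touched.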
Because of the hashes, Unison knows a lot more about the structure of your codebase than a typical IDE, so we can make refactoring a very controlled experience. The typical workflow in most languages is that you make a change and your codebase is broken until you finish propagating that change (manually) throughout. But a Unison codebase is never broken that way, even in the middle of a refactoring.
Hello, I’m one of the creators of the Unison language. Happy to answer your questions!
While it’s true that a hash collision is theoretically possible, the chances are extremely remote. If you found one, you would win some kind of cryptography reward and you would be famous. We don’t expect any two Unison programs to have the same hash for at least a few trillion trillion trillion trillion trillion trillion years. :)
And if they do, the language isn’t “hosed”; we just outlaw the offending hash and move on. Every program turns out to have an infinite number of equivalent programs, each with a different hash.
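For a rough sense of scale, the standard birthday approximation puts the chance of any collision among n random hashes of b bits at about n^2 / 2^(b+1), which holds while n is far below 2^(b/2). The Python sketch below applies it to a 512-bit hash. The function name and the workload of 2^80 definitions are assumptions chosen for illustration, not figures from the Unison team.

```python
import math


def log2_collision_probability(num_definitions, hash_bits=512):
    """Birthday approximation p ~ n^2 / 2^(bits + 1), returned as log2(p)."""
    return 2 * math.log2(num_definitions) - (hash_bits + 1)


# Hypothetical workload: 2**80 distinct definitions hashed into 512 bits.
exponent = log2_collision_probability(2 ** 80)
print(f"collision probability is roughly 2^{exponent:.0f}")
```

Working in log2 avoids floating-point underflow, since the probability itself is far too small to represent as a float.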
I’m not sure I understand the second problem. If you want to have a shared library and publish an update, it’s always going to be true that users of your library have to change (or at least relink) their code to get your latest version. In Unison you accomplish this by publishing a patch that upgrades users’ code. They just apply the patch and now their code references the latest hash. Maybe I’m misunderstanding what you mean.
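One way to picture such a patch is as a mapping from old hashes to new ones. The Python sketch below is a simplification under that assumption. `content_hash` and `apply_patch` are hypothetical helpers, the source strings are placeholders, and a real patch would also have to re-hash any dependent definitions.

```python
import hashlib


def content_hash(source):
    # Assumption: a definition is identified by the SHA3-512 digest of its source.
    return hashlib.sha3_512(source.encode("utf-8")).hexdigest()


def apply_patch(references, patch):
    """Swap every reference named in the patch for its replacement; leave the rest alone."""
    return [patch.get(ref, ref) for ref in references]


old_version = content_hash("parse v1")
new_version = content_hash("parse v2")
unrelated = content_hash("render v1")

# The library author publishes this mapping alongside the new definition.
patch = {old_version: new_version}

user_references = [old_version, unrelated]
upgraded = apply_patch(user_references, patch)
```

The old definition is never overwritten, so code that has not applied the patch keeps working against the old hash.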
Hello, I'm one of the creators of Unison. You are of course correct that if a test has a side-effect then we cannot cache the results. This is going to be the case for things like integration tests. So Unison will not cache the results of all tests. But we can actually tell which tests are going to have side effects and which won't (Unison's type system gives us this ability), and it turns out that most unit tests can be run without any side effects at all.
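The caching rule described here can be sketched in Python. This is a toy model, not Unison's test runner: `ResultCache` is a hypothetical class, and the `has_side_effects` flag stands in for information that Unison would get from its type system rather than from the caller.

```python
import hashlib


class ResultCache:
    """Cache results of side-effect-free tests, keyed by a hash of the test's source."""

    def __init__(self):
        self.results = {}

    def run(self, source, test_fn, has_side_effects):
        if has_side_effects:
            # Tests that touch the outside world are always re-run.
            return test_fn()
        key = hashlib.sha3_512(source.encode("utf-8")).hexdigest()
        if key not in self.results:
            self.results[key] = test_fn()
        return self.results[key]


runs = {"pure": 0, "effectful": 0}


def pure_test():
    runs["pure"] += 1
    return 1 + 1 == 2


def integration_test():
    runs["effectful"] += 1
    return True


cache = ResultCache()
for _ in range(3):
    cache.run("1 + 1 == 2", pure_test, has_side_effects=False)
    cache.run("ping the server", integration_test, has_side_effects=True)
```

Across three rounds the pure test executes once and the effectful one executes every time. Because the key is a hash of the test's content, changing the test produces a new key and forces a fresh run.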
Your comment about this being an "extreme form of dynamic linking" got me thinking. I too have nightmares about dynamic linking and "DLL hell". But I think Unison's approach is actually a form of extreme *static* linking. That is, the problem with dynamic linking is that the address of linked code isn't known in advance--you only have a link, which is maybe a name, a version, and an offset. Links can be broken, so dynamic linking can fail and routinely does. But hashes in Unison are essentially static pointers into a vast shared memory space. So the referent of a given hash is always going to be at the same address in that space, just like in a statically linked program.