What is needed to make C safe
C (and, by extension, C++) is unsafe in so many ways that making a compiler + runtime system for C that makes it safe is bound to make programs run slower. So while this may be a solution for compiling "dusty deck" programs without modification for applications where a ×2 to ×5 slowdown is not important, I can't see a way around replacing C by languages that are safe by design for applications where speed is important.
It has long been known how to make C safe(r):
- Add garbage collection. Replace free() by a no-operation and free memory by GC. Because C can do all sorts of stuff with pointers, this requires conservative GC: Any value that could be a pointer into the heap is considered to be a pointer into the heap. So if an integer by chance happens to be in the range of values that (if it were cast to a pointer) points to the heap, we must preserve the object it points to. But C pointers need not point to the header of a heap-allocated object: A pointer can point anywhere from the start of an object to one element past its end. Anything else is undefined behaviour. So to identify objects, we need to know where objects start. This can be done by a global table of start and end addresses of heap objects, where the GC compares a value to these to find the header of the object. This gets expensive if there are many heap-allocated objects. Alternatively, every heap object starts with a 64-bit "magic word", which is a value that is unlikely to be generated by computation. You can then search backwards in memory until you find a magic word, and you have found the header of the object. Not 100% safe, but works most of the time. Alternatively, use fat pointers.
- Fat pointers are represented by two machine words: One that indicates the start of the object into which the pointer points, another that is the actual value of the pointer. This makes it easy to find the headers of objects, and you can also do range checking (as the headers indicate the size of objects). It makes pointers bigger, and range checking costs, but it allows precise GC. Casting integers to pointers (and back) is a problem, though. As above, you can search for the header of the object to which the new pointer points (and report an error if it doesn't point to any), but this is costly and doesn't give strong guarantees. In addition to explicit casts, storing a non-pointer value into a union and taking it out as a pointer is problematic. So unions should be tagged with field indicators and checked when you store and read values from the union. And since any integer can be cast to a pointer, you can never be sure when a heap object is dead: It may be accessed later when an integer is cast to a pointer. There are coding tricks, such as using XOR to traverse lists bidirectionally, that will make this happen, so you cannot guarantee 100% memory safety.
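As a rough illustration of the fat-pointer idea, here is a minimal sketch in plain C. All names (header_t, fatptr_t, fat_alloc, fat_add, fat_load, fat_store) are made up for this example, and the layout (a header holding only the size, followed by the payload) is an assumption. A real safe-C implementation would generate such checks in the compiler rather than rely on the programmer to call them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object header: records the payload size in bytes. */
typedef struct {
    size_t size;
} header_t;

/* Fat pointer: two machine words. */
typedef struct {
    header_t *base;  /* start of the object (its header) */
    char     *cur;   /* the actual pointer value */
} fatptr_t;

/* Allocate a header plus payload; base is NULL if allocation fails. */
static fatptr_t fat_alloc(size_t size) {
    fatptr_t p = { NULL, NULL };
    header_t *h = malloc(sizeof(header_t) + size);
    if (h != NULL) {
        h->size = size;
        p.base = h;
        p.cur = (char *)(h + 1);  /* first payload byte */
    }
    return p;
}

/* Pointer arithmetic changes only cur; base still identifies the object.
   This sketch assumes the result stays inside the underlying allocation
   (or one past its end), since anything else is undefined in C. */
static fatptr_t fat_add(fatptr_t p, ptrdiff_t n) {
    assert(p.base != NULL);
    p.cur += n;
    return p;
}

/* Range check: do len bytes starting at cur lie inside the payload? */
static bool fat_in_bounds(fatptr_t p, size_t len) {
    if (p.base == NULL) return false;
    char *start = (char *)(p.base + 1);
    char *end = start + p.base->size;
    return p.cur >= start && p.cur <= end && len <= (size_t)(end - p.cur);
}

/* Checked byte load: returns false instead of reading out of bounds. */
static bool fat_load(fatptr_t p, unsigned char *out) {
    if (!fat_in_bounds(p, 1)) return false;
    *out = (unsigned char)*p.cur;
    return true;
}

/* Checked byte store: returns false instead of writing out of bounds. */
static bool fat_store(fatptr_t p, unsigned char v) {
    if (!fat_in_bounds(p, 1)) return false;
    *p.cur = (char)v;
    return true;
}
```

With an 8-byte object, fat_store(fat_add(p, 7), 42) succeeds, while a load through fat_add(p, 8) or a store through fat_add(p, -1) is rejected. Because base always leads to the header, a collector would not need a table lookup or a backwards search to find the object.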
So, it is a better solution to design a language where you cannot cast integers (or any other value) to pointers, and where pointers always point to the headers of objects. This allows single-word pointers, and range checks can be made by reading size information from object headers. You can no longer just increment a pointer in a loop to traverse an array (you have to use offsets from the base pointer), but that is a small cost -- usually, base+offset addressing is supported in instruction sets. And the compiler may do strength reduction to transform pointer+offset to direct pointer when it is safe to do so.
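The header-pointing scheme can be imitated in C to show what the generated code would roughly look like. This is only a sketch under assumed names (int_array, array_new, array_get, array_set, array_sum): the single-word pointer always refers to the header, and every access is base plus a checked offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical array object: a header holding the length, then the elements. */
typedef struct {
    size_t length;
    int    elems[];  /* flexible array member (C99) */
} int_array;

/* Allocate a zero-initialised array; returns NULL on failure or overflow. */
static int_array *array_new(size_t length) {
    if (length > (SIZE_MAX - sizeof(int_array)) / sizeof(int)) return NULL;
    int_array *a = calloc(1, sizeof(int_array) + length * sizeof(int));
    if (a != NULL) a->length = length;
    return a;
}

/* Checked read: the bound comes from the header the pointer refers to. */
static bool array_get(const int_array *a, size_t i, int *out) {
    assert(a != NULL);
    if (i >= a->length) return false;
    *out = a->elems[i];
    return true;
}

/* Checked write. */
static bool array_set(int_array *a, size_t i, int v) {
    assert(a != NULL);
    if (i >= a->length) return false;
    a->elems[i] = v;
    return true;
}

/* Traversal by base + offset instead of an incremented pointer. The loop
   condition already establishes i < length, so no extra check is needed
   in the body; a compiler could also strength-reduce the indexing. */
static long array_sum(const int_array *a) {
    assert(a != NULL);
    long sum = 0;
    for (size_t i = 0; i < a->length; i++) {
        sum += a->elems[i];
    }
    return sum;
}
```

For a four-element array, array_get(a, 3, &v) succeeds and array_get(a, 4, &v) is refused. In the language proposed here the check would be inserted by the compiler, so the programmer could not forget or bypass it as they can in this C imitation.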
Unchecked unions should also be avoided, as should null pointers. You can use option types instead. Compilers can compile these into values where 0 means "none" and any non-zero value is a real pointer, so there is no run-time overhead (apart from checking if the value is 0, which is required to avoid following null pointers). Rust does this.
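A checked union can be sketched in C as a union paired with a field indicator that is set on every store and tested on every read. The names below (field_t, checked_union, store_int, store_ptr, read_int, read_ptr) are invented for this example. In a language that is safe by design, the compiler would enforce the check instead of leaving it to convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Field indicator: records which member was stored last. */
typedef enum { FIELD_INT, FIELD_PTR } field_t;

/* Hypothetical checked union: the tag travels with the value. */
typedef struct {
    field_t field;
    union {
        long        i;
        const char *p;
    } as;
} checked_union;

/* Storing sets the indicator together with the value. */
static checked_union store_int(long i) {
    checked_union u;
    u.field = FIELD_INT;
    u.as.i = i;
    return u;
}

static checked_union store_ptr(const char *p) {
    assert(p != NULL);  /* absence would be expressed by an option type */
    checked_union u;
    u.field = FIELD_PTR;
    u.as.p = p;
    return u;
}

/* Reading succeeds only if the requested field is the one that was stored. */
static bool read_int(const checked_union *u, long *out) {
    if (u->field != FIELD_INT) return false;
    *out = u->as.i;
    return true;
}

static bool read_ptr(const checked_union *u, const char **out) {
    if (u->field != FIELD_PTR) return false;
    *out = u->as.p;
    return true;
}
```

After store_int(4096), a call to read_ptr fails rather than handing back the integer reinterpreted as an address, which closes the union loophole for turning integers into pointers.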
Implicit casts should also be avoided. An explicit cast need not have any run-time cost, and leaving casts implicit is a sign of programmer laziness. Null-terminated strings are not exactly safe either.
Some will say that GC is costly. Well, malloc() and free() are not exactly free either, and they are prone to fragmentation, which cannot be avoided as long as you can cast integers to and from pointers, as this prevents compacting the heap to close gaps.