Re: Resiliency – we've heard of it
If you write an XML schema properly, and use proper tools that fully understand XML Schema (a rarity), then you can define valid content in the schema itself. For example, you can constrain the valid values of an integer, or the length of a list, and the parsers / serialisers generated by proper tools will check that such constraints are honoured.
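For instance, a sketch of what such constraints look like in XSD (the element names here are invented for illustration): a list of at most 10 integers, each restricted to the range 0..100.

```xml
<!-- Hypothetical fragment: a validating parser generated from this
     schema would reject an 11th entry, or a value of 101. -->
<xs:complexType name="Readings">
  <xs:sequence>
    <xs:element name="value" maxOccurs="10">
      <xs:simpleType>
        <xs:restriction base="xs:int">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="100"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>
  </xs:sequence>
</xs:complexType>
```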
So far, so good. As I hinted, most XML XSD code generators I've come across ignore the constraints and generate no code for them. Examples include Microsoft's xsd.exe, with a lame excuse (I found their reasoning buried deep in some docs) that amounted to "computers do not constrain the values of integers or the lengths of lists". Well duh, that's why we have software and write it to impose constraints when we want values and lengths constrained.
However, so far as I know XML XSD is far from "complete". I'll explain what I mean. Whilst you can express constraints in an XSD file, the generated code gives the program that uses it no way to discover what the constraints actually are. So, touching on your "That data-program may crash because you have coded a stack overflow" point, it's hard for the program or developer to know, for example, how much memory to allow for the length of a list when parsing some XML data. That's perhaps not a problem in a server / desktop application, where memory is bountiful and allocation is automatic in the generated code, but it's more of a problem in an embedded system, where the developer may have had to decide in advance how much heap a process will need.
A serialisation technology that is "complete" is ASN.1, though not all ASN.1 tools implement the whole standard. With ASN.1 you can define messages, setting value and length constraints as you do so, just like XML XSD or JSON schemas. What is unusual is that you can also define static constants, including integer constants, which can be used as the constraints in the definitions of other messages, and these static constants also show up in the generated code. So for the developer, the generated code contains 1) parsers / serialisers for objects (messages), 2) automatic constraint checking whilst parsing / serialising, and 3) constants that describe the extents of the constraints, which can then be used for all sorts of purposes in the code the developer writes.
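A sketch of that in ASN.1 notation (module and names invented for illustration): maxReadings is defined once as a schema-level constant, reused as a SIZE constraint, and emitted into the generated code for the application to use.

```asn1
Telemetry DEFINITIONS AUTOMATIC TAGS ::= BEGIN

    -- A schema-level constant; a good ASN.1 compiler emits this
    -- into the generated code as well as using it for checking.
    maxReadings INTEGER ::= 10

    Reading ::= INTEGER (-40..125)

    ReadingList ::= SEQUENCE (SIZE(1..maxReadings)) OF Reading

END
```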
One such purpose might be iterating over the length of a received list. If the list is defined as containing 10 entries, then the for loop can be from 0..listlen-1, where listlen is a constant that comes from the schema, not from the developer.
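In a language like Python the idea might look like this (the constant name is hypothetical; a real schema compiler would emit its own generated names):

```python
# Sketch: MAX_READINGS stands in for a constant that a schema compiler
# would emit into generated code (the name is invented for illustration).
MAX_READINGS = 10  # schema-derived, not developer-chosen

# Stand-in for a fully-populated parsed message.
readings = list(range(MAX_READINGS))

total = 0
for i in range(MAX_READINGS):  # bound 0..MAX_READINGS-1 comes from the schema
    total += readings[i]
```

If the schema later changes the list to 20 entries, regenerating the code updates the loop bound with no edit to the developer's source.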
The consequences can be quite profound. System behaviours related to the constraints on valid / invalid data can all be driven from the schema, not from developer-written source code. This means that all such constants have a single definition - irrespective of programming languages used across the whole system. Change the schema, rebuild the system, and the entire system is updated with the new constants and thus the new behaviour.
This can have profound consequences on how you run projects. If you have a risk of stack overflow due to the amount of data to be received, you can have the stack size driven by the constants in the schema. Either it works (there is enough memory), or the code throws an exception when it can't get enough stack in advance of needing it. If you need to extend the length of a list to contain more items, and you need programs to generate / process the extra items, then so long as they're using the constraint constants from the schema-derived generated code, a single change in the schema brings about the required code change system wide.
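That sizing discipline can be sketched like so (the constants and sizes here are invented for illustration, standing in for what generated code would provide): the worst-case allocation is computed from schema constants, so rebuilding after a schema change resizes it automatically.

```python
# Hypothetical schema-derived constants (a real generator would emit these).
MAX_READINGS = 10       # maximum list length, from the schema
BYTES_PER_READING = 4   # worst-case encoded size of one entry, from the schema

def worst_case_buffer() -> bytearray:
    # Allocate up front: if there isn't enough memory we find out
    # before parsing starts, not halfway through handling a message.
    return bytearray(MAX_READINGS * BYTES_PER_READING)

buf = worst_case_buffer()
```

The key point is that neither number is a developer's guess; both are single-sourced from the schema.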
That means you no longer need the developer to make the change, the schema author can safely make the change, at any point in the project life cycle, even quite late. You can be agile with the definition of messages in the system right throughout the project development cycle, because changes to message definitions do not have to result in any re-work.
Pretty neat for a useful, old technology. Especially as it can emit / consume binary formats as well as JSON and XML...
It's possible that JSON Schema can, in some circumstances, pull off the same trick (I'm less familiar with JSON schemas, but JSON is essentially JavaScript object-literal syntax, so who knows what can be done!). However, when I survey the vast array of serialisers out there, it's remarkable how bad a lot of them are in terms of what they can actually do for developers. For example, Google Protocol Buffers is much lauded and widely used, but it does absolutely nothing to help developers validate inputs; there are no constraints (apart from an independent alpha/beta quality extension), so developers (if they bother to validate messages at all) have to rely on email, or a Word document, or comments in the .proto file to learn what the valid ranges in a message are. Most serialisers out there have not considered the role of serialisers in project development, or in project management.
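To illustrate the status quo being described (message and field names invented): in plain protobuf the valid range lives only in a comment, which nothing enforces.

```proto
syntax = "proto3";

message Reading {
  // Valid range: -40..125 — documented here, checked nowhere.
  // Every producer and consumer must re-implement this check by hand.
  int32 temperature_c = 1;
}
```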