What do you optimise for?
"Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away."
A long time ago, when XML was brand new, my client had a client that wanted to send them jobs to process in XML.
Up until that time work might have been received in fairly simple formats, maybe even as simple as CSV. They would take apart the various elements of data and store them as separate columns in a database that probably only needed a single table. They'd accumulate enough jobs to run a production batch. The data that was slated to arrive by XML was more complex than anything they had received before. It was going to require more tables. But they wanted to keep the same approach. We were to optimise for familiarity at the cost of added complexity.
My solution was to use XSLT. A style sheet would read the XML and write SQL to add the data. XML was new at this time and there were no existing tools for shredding it into a database. This worked fine in a quick PoC. The trouble was that they were planning to send XML documents with up to 1,000 jobs in them. XSLT used the Microsoft DOM and that would require at least one order of magnitude more memory than the XML it was transforming. Testing this at production scale burst the available memory. We also needed to optimise for memory.
The solution to that was to add a SAX parser as a front end to split the incoming documents into individual jobs. It needed only to recognise the beginning and end of the element that enclosed the job. It didn't need to recognise anything in between, just shove it in the bag. Each job could then be wrapped up and sent to the existing code one at a time. Actually it recognised one or two other elements which represented housekeeping information and passed those on as well. We were optimising for memory at the expense of a little more complexity.
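A minimal sketch of such a splitter, using Python's standard xml.sax module rather than the tooling of the time. The element name job and the names JobSplitter and split_jobs are assumptions made up for illustration, and the housekeeping elements are left out to keep it short.

```python
import io
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class JobSplitter(xml.sax.ContentHandler):
    """Hand each job element to a callback as its own small XML string."""

    def __init__(self, on_job, job_tag="job"):
        super().__init__()
        self.on_job = on_job    # called once per complete job
        self.job_tag = job_tag  # hypothetical name of the enclosing element
        self.depth = 0          # 0 means we are outside any job
        self.parts = []         # text of the job currently being collected

    def startElement(self, name, attrs):
        if self.depth == 0 and name != self.job_tag:
            return  # outside a job: nothing to recognise
        self.depth += 1
        attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
        self.parts.append(f"<{name}{attr_text}>")

    def characters(self, content):
        if self.depth:
            self.parts.append(escape(content))

    def endElement(self, name):
        if self.depth == 0:
            return
        self.parts.append(f"</{name}>")
        self.depth -= 1
        if self.depth == 0:
            # The job is complete: pass it on and forget it.
            self.on_job("".join(self.parts))
            self.parts = []


def split_jobs(source, on_job, job_tag="job"):
    """Stream a file name or file-like object through the splitter."""
    xml.sax.parse(source, JobSplitter(on_job, job_tag))
```

Calling split_jobs(io.BytesIO(data), process_one_job) invokes the hypothetical process_one_job once per job. Only the job currently being collected is held in memory, which is the point of putting the splitter in front of the DOM-based transform.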
Realising that there would be more like this in the future, I wrote the SAX parser to use a simple name-space I devised for my client. I added a further step using SAXON to transform the client's name-space into my client's name-space. All it needed was a new style sheet for a new client. I was optimising for re-usability at the expense of a little more complexity.
The next problem was that the database design used multi-column keys to stitch the tables together. It wasn't fast enough. I had to re-do it using surrogate keys. The XSLT had to insert strings which would then be substituted by a sort of macro-processor. That meant some really complex code. It was also a maintenance problem for my client because the client kept adding more stuff to the jobs that required adding more tables, more code to handle the loading and, consequently, more code to get the data out again to format the output for production. We were optimising for speed at the expense of complexity.
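To give a flavour of the marker substitution, here is a small sketch in Python with an in-memory SQLite database. The @@name@@ marker syntax, the table layout and the function names expand and load are all assumptions for illustration, not the original design.

```python
import re
import sqlite3

MARKER = re.compile(r"@@(\w+)@@")  # hypothetical marker syntax, e.g. @@job@@


def expand(sql, keys):
    """Replace each @@name@@ marker with the surrogate key stored under name."""
    return MARKER.sub(lambda match: str(keys[match.group(1)]), sql)


def load(conn, statements):
    """Run (key_name, sql) pairs in order, remembering generated keys."""
    keys = {}
    for key_name, sql in statements:
        cursor = conn.execute(expand(sql, keys))
        if key_name is not None:
            # Remember the key the database just generated for this row.
            keys[key_name] = cursor.lastrowid
    return keys


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, ref TEXT)")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, job_id INTEGER, name TEXT)")

# Stand-in for the SQL a style sheet might have generated for one job.
generated = [
    ("job", "INSERT INTO job (ref) VALUES ('A-1')"),
    (None, "INSERT INTO item (job_id, name) VALUES (@@job@@, 'widget')"),
    (None, "INSERT INTO item (job_id, name) VALUES (@@job@@, 'gadget')"),
]
keys = load(conn, generated)
rows = conn.execute("SELECT job_id, name FROM item ORDER BY id").fetchall()
```

Every extra table means more markers and more ordering rules between statements, which is where the maintenance burden comes from. Splicing values into SQL text mirrors the macro-processor idea; bound parameters would be the safer choice in new code.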
As expected, the client had a new contract. Even from the start it would have to handle a variety of types of job (more were added later). The answer was to chuck out all the detail tables and replace them with a single text column. The style sheet would simply generate the appropriate format directly from the XML. It only required the selection of the correct style sheet for the job type, and writing the style sheets was straightforward. The re-usability optimisation meant that only a new style sheet was needed for the first transform and the SAX parser needed no changes at all. The horrible macro-processing stuff was chucked out, as was all the code to compose the production format from the multiple tables. We had optimised for flexibility by making it simpler overall because of, not despite, that extra step at the front to transform the name-space.
TL;DR If you know what to add and what to take out, the general-purpose solution can be simpler than one dedicated to a specific task.