Why would you use this?
When it becomes "business essential", Oracle will charge you (a lot) to use it, and Google will drop it...
Google is promising to capture data logs from Oracle and other on-prem SQL systems for monitoring, data integration and ML pipelines. Among the Chocolate Factory’s latest concoctions is Datastream, a new serverless service designed to capture changes in data and replicate them where desired. Gerrit Kazmaier, the …
Every single tool that extracts data from Oracle is a potential candidate for one of the many, many different licensing scenarios that Oracle keeps changing and adapting as technology evolves (multi-core CPUs, VMs, cloud...) in order to keep its revenue stream growing. Just bear that in mind before dreaming up a system that gets data out of your application and into a less expensive (or even licence-free) platform, or into many, many more hands. Oracle has a dedicated team devoted to evaluating how big the "gap" is between what it thinks you should be paying and what you're actually paying. If that gap is big enough, Oracle has no problem suing its own customers. Google for recent examples and you'll see what I mean.
The only way of avoiding Oracle licensing headaches is to steer completely clear of all its products. Even the free ones.
While I can see that this kind of thing might occasionally be useful, I do worry about going outside the existing toolchain to do so, certainly when it comes to replication. If you need to replicate the DB, you'll presumably have the setup and the tools; this approach seems to depend on inferring changes from the log. Where's the guarantee that it works? Any change to the log file format, or junk in the log, could cause all kinds of problems and leave you in no man's land.
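To make that failure mode concrete, here's a minimal sketch of a CDC consumer that infers changes from a log it does not own. The JSON-lines record layout is entirely hypothetical; the point is that the only safe response to an unrecognised format is to stop loudly:

```python
# Minimal sketch, assuming a hypothetical JSON-lines log format: a CDC
# consumer replicating changes inferred from a log it does not control.
import json

KNOWN_VERSIONS = {1, 2}  # formats this reader was actually tested against

def apply_change(record: dict) -> None:
    # Stand-in for "replicate to the target"; real work would go here.
    print(f"replicating {record['op']} on {record['table']}")

def consume(log_lines) -> None:
    for line in log_lines:
        record = json.loads(line)
        if record.get("format_version") not in KNOWN_VERSIONS:
            # A vendor patch can change the format underneath you; stopping
            # loudly beats silently replicating garbage downstream.
            raise RuntimeError(
                f"unknown log format {record.get('format_version')}; "
                "replica is now in no man's land"
            )
        apply_change(record)

try:
    consume([
        '{"format_version": 1, "op": "UPDATE", "table": "orders"}',
        '{"format_version": 3, "op": "UPDATE", "table": "orders"}',  # post-upgrade
    ])
except RuntimeError as err:
    print(f"stopped: {err}")
```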
Similarly, you might want to kick off some analysis under certain conditions, but DBs already come with the tools to do this. I assume the idea is that you can use this approach to "scale up" when necessary, so the precious on-site resources don't suffer from a rogue CPU-hammer of a query. But anything that offloads data processing to another environment needs to get the data there first, and network latency is usually a bigger problem than CPU load, especially if there's enough data to warrant processing it elsewhere.
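A back-of-envelope check makes the point; all the figures below are illustrative assumptions, not measurements:

```python
# Back-of-envelope numbers for the latency point above. All figures are
# illustrative assumptions, not measurements.
data_gb = 50            # data the offloaded query needs shipped out
link_mbps = 200         # effective WAN bandwidth in megabits/second
local_query_min = 15    # the rogue CPU-hammer query, run locally

transfer_min = (data_gb * 8 * 1000) / link_mbps / 60
print(f"transfer alone: {transfer_min:.1f} min vs local query: {local_query_min} min")
# transfer alone: 33.3 min vs local query: 15 min -- the offload loses
# before the remote CPUs have done any work at all.
```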
But maybe the examples just aren't detailed enough?
Been there. At first I had the same opinion as you. But no, these kinds of systems are not triggers 2.0, because triggers have a number of drawbacks:
1- Trigger code is database-dependent. It is much easier to drop in a module of this type, already tested by someone else, than to develop DB-specific code to push a message onto some kind of message bus, for each DB engine and version (see the sketch after this list). And I've been in places with three different DB engines in at least two different versions each.
2- Depending on the DB, not all operations invoke triggers, or there are specific commands to disable triggers during the execution of other statements (bulk loads are the classic example). This is usually done for performance reasons, and as such...
3- Without rigorous inspection and review (and even with those, by accident), trigger code can easily become a bottleneck, something you do not want when you're interested in your DB transactions being committed as quickly as possible.
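To illustrate point 1: here is what "the same" change-notification trigger looks like on two engines. Both DDL strings are sketches for illustration, not tested production code:

```python
# Point 1 made concrete: "the same" change-notification trigger, per engine.
# Both DDL strings are illustrative sketches, not tested production code.
TRIGGER_DDL = {
    "postgresql": """
CREATE OR REPLACE FUNCTION notify_change() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify('changes', TG_TABLE_NAME || ':' || TG_OP);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_change AFTER INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE FUNCTION notify_change();
""",
    "oracle": """
CREATE OR REPLACE TRIGGER orders_change
AFTER INSERT OR UPDATE ON orders
FOR EACH ROW
BEGIN
    -- AQ enqueue, UTL_HTTP call, etc.: a different mechanism again,
    -- with its own grants, failure modes and version quirks
    NULL;
END;
""",
}

# Two engines, two dialects, two test suites to maintain; multiply by
# versions and you get the matrix described in point 1.
for engine, ddl in TRIGGER_DDL.items():
    print(f"-- {engine} --{ddl}")
```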
Triggers are wonderful when you have no other means of expressing a business domain constraint and want to enforce it at the database level (those who say that these constraints can be enforced at the app level have a special place in hell, filled with junior external consultants banging directly on the database, violating all business constraints and repeating "but it works for my use case").

Triggers as the foundation of a data bus are one of those ideas that look good in your head but become headaches in the medium term. That's why these kinds of tools exist.
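For what it's worth, that one legitimate use looks something like this: a domain constraint enforced at the database level, so it holds no matter which client writes the row. The PostgreSQL DDL and the orders/customers schema below are made up for illustration:

```python
# A sketch of the one good trigger use named above: a domain constraint that
# must hold no matter which client writes the row. Illustrative PostgreSQL
# DDL; the orders/customers schema is made up.
CREDIT_LIMIT_TRIGGER = """
CREATE OR REPLACE FUNCTION check_credit_limit() RETURNS trigger AS $$
BEGIN
    IF (SELECT COALESCE(SUM(amount), 0) + NEW.amount
          FROM orders WHERE customer_id = NEW.customer_id)
       > (SELECT credit_limit FROM customers WHERE id = NEW.customer_id)
    THEN
        -- Fires for the app, an ad-hoc session, and the consultant for whom
        -- "it works" alike: nothing gets past the database itself.
        RAISE EXCEPTION 'credit limit exceeded for customer %', NEW.customer_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_credit_check BEFORE INSERT ON orders
    FOR EACH ROW EXECUTE FUNCTION check_credit_limit();
"""
print(CREDIT_LIMIT_TRIGGER)
```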