Bulk data insert will be worth a try
FDW (Foreign data wrappers) are very convenient as long as they fit in memory. They are not very useful for ETL.
Open-source database fans were given reason to celebrate this week with the release of PostgreSQL 14, an iteration of the RDBMS featuring performance improvements for heavy and distributed workloads. The seasonal autumn release was posted yesterday, and though PostgreSQL 14 may not include a set of big wow new features, more …
Hi W.S.Gosset
Yugabyte YSQL supports both SERIALIZABLE and SNAPSHOT isolation levels (they map to the PostgreSQL isolation level syntax of SERIALIZABLE and REPEATABLE READ respectively). Snapshot Isolation level is the default for YSQL.
SNAPSHOT Isolation guarantees that all reads made in a transaction will see a consistent snapshot of the database, and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made by transactions that committed since that snapshot.
SERIALIZABLE Isolation guarantees that transactions run in a way equivalent to a serial (sequential) schedule.
This is powered by the underlying architecture where the core database uses a modified RocksDB with a hybrid real-time clock and MVCC to ensure transactional consistency.
I don't see retailers like the giant Krogers in the US committing themselves to Yugabyte if what you claim is true - the new generation of PostgreSQL API compatible, distributed databases are going to dominate the largest scale deployments and this great new release of PostgreSQL will consume the smaller and medium size implementations.
Hi datamgmt and thanks for that. But I'd already read through it and more importantly How they'd actually Implemented it. (See my previous comment.)
They've split the logic into 2, with the Transaction aspects handled in (both) the interface layers rather than the engine. The engine only supports locking existing data. Their transaction layer does not lock (can't lock) non-existent data because the engine has no concept of it -- it is tuple-based not set-based. As such, it fails to comply with SQL92's definition of Serializability: multiuser DML can create Integrity/Consistency problems. Doesn't matter what the marketing department badge it as, it's not actually full Serialisation.
Re market acceptability... Well that's a very different story. Oracle sold marvellously well into that segment, and couldn't do Serialisation unless Granularity was Table(!), ie essentially single-user. That was still the case the last time I looked (though marvellously obfuscated in their documentation -- see if there's still the Serializable=Yes option in the engine Settings, impact= all locks are Table, the information is scattered across multiple entries and manuals). Sybase sold on its speed despite massive semantic bugs. And NoSQL (ISAM for the modern age) meets a performance need despite being utterly wide-open to data corruption issues. Hasn't stopped them booming, because very few people understand it, and performance needs are big and visible and immediate for the business.
Yugabyte looks great --I like the intelligence of grafting PostgreSQL syntax and a lot of relational ability on a super-fast engine-- and I wish them the best of luck in the market. It's not fully Serializable, but very few people understand Serializability properly so that shouldn't hold them back. Likewise, PostgreSQL is also moving very intelligently to address a very real and large segment of user/market needs. Great stuff.
Errr... disambiguation note:
> [Oracle] impact= all locks are Table, the information is scattered across multiple entries and manuals
That is, all locks are insta-Escalated (Escalation seems the more likely implementation than Promotion) to Tablelocks: if you touch a single row, you lock the entire table. The information documenting this is scattered etc.
...just musing on whether Yugabyte could hack in a (Settings-optional) hackaround...
Looking at/taking advantage of their split architecture: they could wrap the engine in a thin layer/API and talk to the layer rather than the engine. That layer caches all Serializable DML predicates (' lists of Keys) intra-transaction. All incoming Serializable DML predicates then compared post-retrieval from the engine, and if you get a non-zero set on {NewPredicate NOT IN ExistingPredicates} (ignoring anything basic like existing locks etc.: that should be passed through upstream for normal handling) then the incoming later transaction is rejected/paused (eg until orig.transaction Commit/Rollback/Timeout).
Everything else is already handled by the interface(s) or the engine; just pass-through: ~0 performance hit. This thin layer only handles non-existent data.
That's about the lightest hack-in I can think of offhand, which doesn't require serious architecture work. But even at that lightness, would probably still stuff performance at the margin relative to existing semi-serialised. Meh.
Marketing-wise, wouldn't mention it and don't run it as default. Just something to have up the sleeve in the very rare case of someone knowing enough to ask. In which case the asker will already be well aware just how few databases are fully relational and just how very-few of them can run at speed+scale. So the performance hit won't be a deal-breaker, just normal.
Or just do what everyone does and ignore it or bluster when put on the spot :D
I've worked with MS SQL for 20+ years. Only been working with PostgreSQL for the last year or two.
There's a lot to like, but the management side of things is much poorer. Why focus on scaling big when there are so many other things to fix?
I create a view with lots of comments - the view definition isn't saved, so the comments are lost. Also mucks about with the syntax - I prefer to use CAST(something, datatype) as mycol, it changes it to mycol::datatype. Small change, but one that I find more difficult to read.
When I modify a table, I have to drop and recreate every view that references that table. There are scripts around, but this shouldn't be an issue.
Case effing sensitivity. Or not - end up having to put quotes around everything.
It's so close, but so frustrating.