Leave SQL alone! Who tf is DBT?
We didn't even want R,
never mind Entity Framework.
Why can't people just stick to SQL?
(Then I don't have to learn new shit.)
Fivetran, the automated data integration company once valued at $5.6 billion, has received a volley of criticism for ending direct support for ubiquitous data language SQL, leading to a frank mea culpa from its CEO. In an email shared on Twitter, the company said that from April 2023, it planned to end support for SQL for …
Depending on whether you maintain just a transactional DB or also a warehouse, you should probably get to know DBT.
DBT is an open source project that adds modularity and templating to SQL, making it much easier to write and read very complex analytical queries.
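For anyone wondering what the templating buys you, here is a rough sketch in Python. It is not dbt's code: dbt uses Jinja, and models point at each other with `{{ ref('...') }}`. The `MODELS` dict, the `render` helper, and the `analytics` schema name are all made up for illustration.

```python
import re

# Hypothetical models: each one is a SELECT that names other models
# through a {{ ref('name') }} placeholder, in the style dbt uses.
MODELS = {
    "stg_orders": "SELECT id, customer_id, amount FROM raw.orders",
    "customer_totals": (
        "SELECT customer_id, SUM(amount) AS total "
        "FROM {{ ref('stg_orders') }} GROUP BY customer_id"
    ),
}

# Matches {{ ref('model_name') }} and captures the model name.
REF_PATTERN = re.compile(r"\{\{\s*ref\('(\w+)'\)\s*\}\}")


def render(model_name, schema="analytics"):
    """Replace every ref() placeholder with a schema-qualified table name."""
    def replace(match):
        return f"{schema}.{match.group(1)}"
    return REF_PATTERN.sub(replace, MODELS[model_name])


if __name__ == "__main__":
    print(render("customer_totals"))
```

The point is that each query stays a small, readable SELECT, and the tool decides where the referenced tables actually live.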
Fivetran is only one of a million different ways of running ETL jobs into a data warehouse. An easy-to-use one for sure, but also expensive.
Seems like it is to SQL what C is to ASM.
It can also infer tables and relationships from the desired SELECT statement.
I am definitely not the most clever when it comes to RDBMS design, but I don't really see the benefit either. Inferring the design from the SELECT statement seems a bit short-sighted and a recipe for a janky structure that is inflexible for future needs. It seems like you need to know all your statements up front. It also complicates data recovery when you need to look at the tables and they all have obtuse names like 's_user_3738bec' because they were programmatically designed.
Or maybe it's great, I don't know.
There is room for improvement.
SQL is backwards. After typing "SELECT ..." I'd like to get some code insight, but the editor does not know which table(s) I'm working with. I have to put in something, add "FROM myTable", go back to the first line, and only then get the list of available fields.
With Entity Framework:
"myTable.Select(t => t.<ctrl+space>" and the list of available fields appears instantly. No fuss.
EF has its share of pitfalls, but I have strong types from the database backend all the way to my JSON DTOs. It is a valuable tool to have around.
No, they're dropping it because a company that implements its business logic in SQL can fuck off anywhere else on the market. A company implementing its business logic in DBT needs a DBT provider, which at this point is basically DBT (the company) or Fivetran. Fivetran's core business of shoveling shitloads of stuff into your platforms is good, and they're trying to abuse that to lock their customers in further. Total bullshit. Run away.
So did I misread the article, or are they actually dropping direct use of SQL in favor of letting their customers use this (formerly separate) dbt software for free, which *also* supports SQL?
I mean, I agree, "don't mess with things", but if it's a matter of being able to keep using SQL, just using it through dbt rather than directly, then it seems like it's not really a problem. Of course it could be like two very different SQL dialects, and the management decided "it's SQL" and it's close enough -- THAT could be a PITA for sure.
Having only played with DBT for a little bit last year, the constraint with DBT is it's not general-purpose SQL. You write Python to template SQL transformations, but they can't move the data to another database. It reads and writes and then you can in theory move the data you wrote somewhere else.
But we only used the OSS version; perhaps the newer/paid-for versions do more.
Correct. DBT is SQL-ish. SQL if you squint. You're meant to template out the queries, which is, er, nice if you're into that kind of thing. This is honestly one of the biggest points of friction, because Jinja2 (j2) templates are an arse way of managing the encapsulation and instantiation of queries. dbt is moving to fix this by enabling Python-based models. But this ultimately ends up feeling like you're working in a hobbled version of Spark (spoiler: you are) rather than a first-class transformation language.
The main difference versus pure SQL is that DBT has an opinionated view on how queries and execution should be structured. In particular, DBT requires you to work in terms of materialised models, with each "model" being a select statement plus some rules about how/whether to persist it and some constraints. The "magic" is that DBT enables you to perform optimisations across your entire DAG of jobs, and likewise to encapsulate and independently test parts of the DAG, in theory enabling a "software-like" workflow for data people. These are the bits 5tran are looking to inherit for minimum effort. Otherwise, building a SQL-centric solution would be much harder for them, and less sticky.
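To make the DAG idea concrete, here is a minimal sketch of how a run order can be derived from the `{{ ref('...') }}` calls inside each model. It assumes the general technique is a topological sort and is not dbt's actual implementation; the model names and the `dependencies` and `build_order` helpers are hypothetical. It needs Python 3.9+ for `graphlib`.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models; each SELECT names its upstream models via ref().
MODELS = {
    "stg_orders": "SELECT id, customer_id, amount FROM raw.orders",
    "stg_customers": "SELECT id, name FROM raw.customers",
    "customer_totals": (
        "SELECT c.name, SUM(o.amount) AS total "
        "FROM {{ ref('stg_orders') }} o "
        "JOIN {{ ref('stg_customers') }} c ON c.id = o.customer_id "
        "GROUP BY c.name"
    ),
    "big_spenders": (
        "SELECT name FROM {{ ref('customer_totals') }} WHERE total > 1000"
    ),
}

# Matches {{ ref('model_name') }} and captures the model name.
REF_PATTERN = re.compile(r"\{\{\s*ref\('(\w+)'\)\s*\}\}")


def dependencies(sql):
    """Return the set of model names a query refers to."""
    return set(REF_PATTERN.findall(sql))


def build_order(models):
    """Order models so that every model comes after the ones it reads from."""
    graph = {name: dependencies(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in build_order(MODELS):
        print(name)
```

Because the graph is known up front, a tool can rebuild or test just `customer_totals` and whatever sits downstream of it, instead of rerunning everything.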
Ultimately though I think both dbt and 5t and the whole "modern data stack" are just fads. There are serious limitations both to their "unbundled" architecture and to their technology implementations. A platform built on unbundled components is always going to have its lunch eaten by one that is able to introspect itself and self-optimise, which more mature/established players like Databricks (DLT), AWS (Glue) etc. are all doing.