Leaving Spark behind, Databricks enters new territory as it eyes 2021 IPO

Anonymous Coward

Apache Spark is a batch data processing engine, not a data lake. A data lake is a logical architecture centered on sharing or federating raw data on demand, rather than on curated marts or warehouses. A data lake is also inherently not a "schemaless architecture"; it cannot exist without rigorous schema governance.
