Be thankful Microsoft isn't on board - their version would be "just a little different" enough to ensure lock-in.
Big data vendors introduce Apache Iceberg features in same week
Apache Iceberg has gained renewed momentum in the last week after leading vendors in data warehousing and analytics all announced new features around the open source table format. AWS, Cloudera, Google, and Snowflake came out in support of Apache Iceberg. Iceberg faces off against contenders including Databricks' Delta Lake – also an …
COMMENTS
-
Tuesday 15th October 2024 07:53 GMT Anonymous Coward
Starter options?
Hi,
I think I need a basic orientation here. If I've understood properly, the Apache Iceberg project is just focused on the file format. They don't provide libraries to read and write the format on a local filesystem or other (no implementation). Each application that uses Apache Iceberg basically rolls its own implementation? So Spark can write in various table formats, of which Apache Iceberg is one, and Spark SQL provides an SQL interface to the table data. Spark also manages a cluster implementation, with a catalogue? So the catalogue is a JDBC database, basically?
So Spark provides a REST API to applications (and python scripts)?
Sorry, I'm a bit lost. Anonymous so future generations don't embarrass me ...
-
Tuesday 5th November 2024 00:15 GMT rgtheregister
Re: Starter options?
Iceberg table data and metadata are stored in a single place. A Spark session can read and write Iceberg tables directly. Some Iceberg-compatible catalogs, like Project Nessie and Apache Polaris, are also evolving to fill this catalog gap. In future these catalogs may replace the Hadoop and Hive catalogs.
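To make the catalog part a bit more concrete, here is a rough sketch of the settings a Spark session typically needs to see an Iceberg catalog. The property keys and class names are the ones Iceberg's Spark integration documents; the helper name `iceberg_catalog_conf`, the catalog name `demo`, the warehouse path and the REST URL are all made up for illustration.

```python
from typing import Dict, Optional

# Class names shipped with the Iceberg Spark runtime.
ICEBERG_CATALOG_IMPL = "org.apache.iceberg.spark.SparkCatalog"
ICEBERG_EXTENSIONS = (
    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
)


def iceberg_catalog_conf(
    name: str,
    catalog_type: str,
    warehouse: Optional[str] = None,
    uri: Optional[str] = None,
) -> Dict[str, str]:
    """Build Spark properties for one Iceberg catalog (hypothetical helper).

    catalog_type is e.g. "hadoop" (files under a warehouse directory),
    "hive" (Hive Metastore) or "rest" (a REST catalog service).
    """
    prefix = f"spark.sql.catalog.{name}"
    conf = {
        "spark.sql.extensions": ICEBERG_EXTENSIONS,
        prefix: ICEBERG_CATALOG_IMPL,
        f"{prefix}.type": catalog_type,
    }
    if warehouse is not None:
        conf[f"{prefix}.warehouse"] = warehouse
    if uri is not None:
        conf[f"{prefix}.uri"] = uri
    return conf


# Filesystem-only catalog: data and metadata both live under the warehouse path.
hadoop_conf = iceberg_catalog_conf(
    "demo", "hadoop", warehouse="/tmp/iceberg-warehouse"  # assumed path
)

# REST catalog: Spark talks to a separate catalog service (assumed URL).
rest_conf = iceberg_catalog_conf("demo", "rest", uri="http://localhost:8181")

# With pyspark installed and the iceberg-spark-runtime jar on the classpath,
# the properties would be applied roughly like this:
#   builder = SparkSession.builder
#   for key, value in hadoop_conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
#   spark.sql("CREATE TABLE demo.db.events (id BIGINT) USING iceberg")
```

Only the `type` and the location settings change between the Hadoop-style and the REST-style catalog; the table data and metadata files stay in the same Iceberg layout either way.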
-