Big data vendors introduce Apache Iceberg features in same week

Apache Iceberg has secured renewed momentum in the last week after leading vendors in data warehousing and analytics all announced new features around the open source table format. AWS, Cloudera, Google, and Snowflake came out in support of Apache Iceberg. Iceberg faces off contenders including Databricks' Delta Lake – also an …

  1. Groo The Wanderer Silver badge

    Be thankful Microsoft isn't on board - their version would be "just a little different" enough to ensure lock-in.

  2. Anonymous Coward

    Starter options?

    Hi,

    I think I need a basic orientation here. If I've understood properly, the Apache Iceberg project is focused only on the file format. They don't provide libraries to read and write the format on a local filesystem or elsewhere (no implementation). Does each application that uses Apache Iceberg basically roll its own implementation? So Spark can write in various table formats, including Apache Iceberg, and Spark SQL provides an SQL interface to the table data. Spark also manages a cluster implementation, with a catalogue? So the catalogue is a JDBC database, basically?

    So Spark provides a REST API to applications (and Python scripts)?

    Sorry, I'm a bit lost. Anonymous so future generations don't embarrass me ...

    1. rgtheregister

      Re: Starter options?

      Iceberg table data and metadata are stored in a single place. A Spark session can read and write Iceberg tables directly. Some Iceberg-compatible catalogs, such as Nessie and Polaris, are also evolving to fill this catalog gap. In future these catalogs may replace the Hadoop and Hive catalogs.
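
To make the "Spark session plus a catalog" point a bit more concrete, here is a rough sketch of the settings a Spark session typically needs to see Iceberg tables through a plain filesystem (Hadoop-type) catalog. The property keys are the ones Iceberg's Spark integration uses; the catalog name `local`, the warehouse path and the helper `iceberg_spark_conf` are made up for illustration, and it assumes the matching iceberg-spark-runtime jar is on Spark's classpath.

```python
def iceberg_spark_conf(catalog_name: str, warehouse_path: str) -> dict:
    """Build Spark properties for a filesystem-backed Iceberg catalog.

    Hypothetical helper: it only assembles settings and does not start Spark.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Enables Iceberg's SQL extensions in Spark SQL
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers a Spark catalog implemented by Iceberg
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # "hadoop" means table data and metadata live under one warehouse directory
        f"{prefix}.type": "hadoop",
        f"{prefix}.warehouse": warehouse_path,
    }


# Assumed example values, not taken from the thread
conf = iceberg_spark_conf("local", "/tmp/iceberg-warehouse")
for key, value in conf.items():
    print(f"{key}={value}")
```

With PySpark installed, each pair would be passed to `SparkSession.builder.config(key, value)`, after which a statement along the lines of `CREATE TABLE local.db.events (id bigint) USING iceberg` goes through that catalog. Swapping the catalog type for a Hive or REST catalog changes only these properties, not the queries.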
