Why I think Hive Metastore is still unbeatable, even by modern solutions like Unity or Polaris

Open table formats like Apache Iceberg and Delta are evolving rapidly. Developers worldwide are creating custom formats, both open-source and proprietary, for specific tasks such as data streaming, graph data, and embeddings. On top of that, there are numerous legacy and highly specific data sources, such as logs in custom formats or collections of old Excel files. This diversity is precisely why I believe that extensibility, meaning the ability to implement custom input and output formats, is crucial. Unfortunately, this feature is present in Hive Metastore but missing from modern data catalogs like Unity or Polaris.

October 22, 2024 · 7 min · Sem Sinchenko