Building a management layer for your data lake: structured and unstructured data

Lecturer: Einat Orr, 18.6.2024

Intro: The challenges in managing a data lake for structured and unstructured data.

Achieving manageability:
1. The components of the architecture and their role.
    Open table formats
    Catalogs
    Data version control systems
2. How it all fits together
    Example using Databricks technologies
    Example using Apache Iceberg
    Example using AWS technologies
3. Discussion

Language: English

About the lecturer: Einat Orr is the CEO and Co-founder of Treeverse, the company behind lakeFS, an open-source platform that delivers a Git-like experience to object-storage-based data lakes. She received her PhD in Mathematics from Tel Aviv University, in the field of optimization in graph theory. Einat previously led several engineering organizations, most recently as CTO at SimilarWeb.

Video

Slides


——————————————————————————————————————————
I put a lot of thought into these blog posts so that I can share the information in a clear and useful way.
If you have any comments, thoughts, or questions, or you need someone to consult with, feel free to contact me via LinkedIn – Omid Vahdaty:
