BIG DATA CONFERENCE

EUROPE 2022

November 23-24

Online

Nir Barazida

Data Scientist

DagsHub, Israel

Biography

**Nir Barazida, Data Scientist at DagsHub.**

Nir focuses his research on improving workflows for data science teams that work in a production-oriented environment. He combines this experience with 8 years as a keynote speaker for global non-profit organizations and a passion for data science to deliver an engaging session.

Nir leads the data science, advocacy, and outreach activity of DagsHub worldwide. DagsHub is a platform where data scientists host and collaborate on their projects using powerful open-source tools for data versioning, experiment tracking, extensive file diffing, and much more.

Nir earned his bachelor’s degree in Structural Analysis with honors from Ben Gurion University and is currently pursuing a master’s degree in Data Science at Reichman University.

Talk

Notebook To Production

Born out of IPython in 2014, Jupyter Notebooks have seen enthusiastic adoption among the data science community. Jupyter has become so prominent that it’s now the default environment for research.

But are Jupyter Notebooks really the best home for data scientists to develop production-ready projects? The non-linear workflow, lack of versioning capabilities, missing IDE integration, and inadequate debugging tools make it laborious to productionize a project created in a Jupyter Notebook environment.

Should we just throw our Jupyter Notebooks out the window and move to classic IDEs? Probably not – Jupyter Notebooks are, after all, a great tool that gives us superhuman abilities. We can, however, be more production-oriented when using them. How does this look in practice? That is exactly what we’ll cover in this talk.

**What you’ll learn**

– The blind spots of using Jupyter Notebooks in a production-oriented environment and how to avoid them.
– When to use Jupyter Notebooks to develop a project and when it is time to move to an IDE.
– How to productionize the code from a Jupyter Notebook and which tools can help.

Session Keywords

🔑 Jupyter Notebook

🔑 Data Versioning

🔑 Experiment Tracking

🔑 MLOps
