MesosCon Asia 2017 has ended

Presentation Language: each talk title ends with a tag for the session language, [C] for Chinese, [C,E] for Chinese with English slides, or [E] for English.

Wednesday, June 21 • 14:20 - 15:05
Jupyter and Spark on Mesos: Best Practices [C,E] - Shuai Lin, Scrapinghub

Many companies use Apache Spark and Apache Mesos for offline data analytics. Before these jobs go to production, engineers need a place to test their Spark code interactively. At Scrapinghub, we have set up a Jupyter notebook server for this purpose. Engineers can launch Spark jobs interactively from Jupyter on top of Mesos with no extra configuration, and with full access to various stateful services (e.g., HDFS, HBase, Kafka). This architecture is used by multiple teams in the company. For instance, the backend application team uses it to aggregate logs and metrics, and the data science team relies on it for model training while developing their code. In this talk, Shuai Lin will share in detail what the team has learned, including the caveats they encountered when deploying and maintaining the notebook server.


Shuai Lin

Shuai Lin is an Infrastructure Engineer at Scrapinghub, a company that provides a PaaS for running web crawlers. He graduated from Tsinghua University in 2010 and has worked in the software industry since then. He has been contributing to Apache Mesos and Apache Spark for years...

Wednesday June 21, 2017 14:20 - 15:05
Room 309B
