May 16, 2016

San Francisco, CA

The future is already here; it is just not evenly distributed. But it clearly shows in our 150 talks, spanning 7 conferences across a 5-day conference matrix, with 50+ founders, CEOs, and CTOs speaking.

In-depth talks from Google (BigQuery and Translate), Baidu Research, MetaMind, StitchFix (Deep Learning), Microsoft, Bloomberg, Quora, Kaggle, Dato (Machine Learning), Netflix (Recommender Systems), IBM (Watson), Facebook, ClearStory (DataViz), LinkedIn, Yahoo, H2O, Confluent, Mesosphere (Data Pipelines), Samsung, Automatic (IoT), AMPLab, Databricks, Salesforce, Workday, Cloudera (Spark), Pivotal (OSS), Zillow, Pandora, Nitro, Lucidworks, Mattermark, Credit Karma, Alpine Labs, University of California-Berkeley, Stanford University, City of San Francisco, and many others.


Only 300 tickets will be available for each day, to keep a truly intimate technical community atmosphere.

View the Data By the Bay speakers and schedule.

Data Pipelines By the Bay started in 2015 as Big Data Scala. The key idea is that real-time data science at scale requires data engineering as an integral part of a reactive loop, and effective data scientists must be great data engineers, and vice versa. Streaming and distributed architectures supporting new algorithms such as deep learning are radically different from the typical database-centric solutions of the past. We're proud to present talks from our core constituents of SF Scala, SF Spark, and Reactive Systems companies, as well as global leaders and the best-funded startups in big and fast data, most headquartered here By the Bay.

Conference News

Registration is open

We are using an advanced pass system that allows you to select a pass for several days, from 2 to 5. Each day's capacity is 400, and there are 100 Very Early/Early passes available. Please see the TICKETS page for full details.

Schedule is Published

The full schedule is published for all five days, all seven conferences. Very Early Bird registration is in effect, ending March 15.

Data Pipelines By the Bay
 

The emerging realization in both the data science and software engineering communities is that data-driven applications need both to work in synchrony. At the Data Science Summit in 2015, Carlos Guestrin, the CEO of Dato, predicted that ML-enabled applications will dominate all others in the near future, and that the way to make ML available is to build services that provide it in a scalable and composable way. Data analysis has the most value when the data is fresh, and real-time, streaming approaches are attracting the most interest. The By the Bay conferences and communities are among the first to recognize the emerging phenomenon of Data Pipelines, to build them, and to establish best practices. The Reactive.Community and noetl.org practitioners use reactive, strongly typed systems where data is always in transit and available for querying. Approaches such as Spark Streaming, Kafka Streaming, and Flink, a streaming-first framework, turn out to be the best for IoT and similar verticals with sensor data arriving at high volume and speed. Big Data becomes Big and Fast Data. Since so many companies By the Bay and globally are establishing data pipelines as the new backbones of their businesses, we've received an unprecedented number of high-quality submissions and added a whole new conference to the already stellar sequence. Jay Kreps, the creator of Kafka, the system at the heart of most of the pipelines presented, will keynote.

Last year, we started with a two-day, three-track, 50-talk conference. We've put together an inspiring program centered around language, Big Data, text and images, deep learning, UI, social networks, and much more.

This year, we're running the first data grid conference sequence, with seven verticals over five days. Each day's attendance is limited to only 400 seats, and it will be full. We hope you join us in May By the Bay!

Keynote Speaker
 

Jay Kreps

Jay (@jaykreps) is co-founder and CEO at Confluent. Prior to Confluent, he was the initial developer on several open source projects, including Apache Kafka, Apache Samza, and Voldemort. He was the lead architect for data infrastructure at LinkedIn.

Data Pipelines By the Bay

May 16, 2016

Building on Big Data Scala, this is the first conference showing end-to-end unity of Data Engineering and Data Science for big, fast, streaming data.

Text By the Bay

May 17-18, 2016 (Day 2 parallel with Democracy By the Bay and Law By the Bay)

The first applied NLP conference for the Bay Area, building on the highly-acclaimed 2015 edition: 50 talks from 50 top companies, all online at functional.tv.

Democracy By the Bay

May 18, 2016 (parallel with Law By the Bay and Text By the Bay)

NLP and Data Science with focus on politics, society, and government.

Law By the Bay

May 18, 2016 (parallel with Democracy By the Bay and Text By the Bay)

NLP and Data Science with focus on legal data and processes.

Legal search (100% recall), case-specific NLP, ambiguity analysis, etc.

AIoT By the Bay

May 19, 2016

Not everything is text. Multiple talks at Text By the Bay dealt with multi-modal data such as images with text. AI and IoT day is all about sensor data streams, images, vision, speech, music.

Life Sciences By the Bay

May 20, 2016 (Parallel with Data UX By the Bay)

There are several major categories of data mining related to life and health. First, genomics: the Bay Area leads with Spark and ADAM. Second, medical sensor and imaging data, with companies like Enlitic.

Data UX By the Bay

May 20, 2016 (Parallel with Life Sciences By the Bay)

Data should be visualized, with massive datasets distilled into a clear and actionable display that calls attention to what's really important. And then UX should naturally lead to the appropriate action.

Data By the Bay – Common Thread

May 16-20, 2016

For each conference, we'll have common horizontal themes: platforms and algorithms.

Our Sponsors
 

Power Sponsors

Friend Sponsors

Media Sponsors


Technology Insights and Events

Be a supporting member of San Francisco's premier Data/AI conference. We want to hear from you! Contact us for a prospectus and sponsorship agreement, or to talk about how we can help you be a contributing sponsor for the Data By The Bay conference!

CONTACT US

The Agenda
 

Come to Data Pipelines By the Bay well-rested and ready to meet your fellow developers. We'll have a full day of talks (keynotes, full-length, and lightning) and build a startup-centric data engineering community for the Bay Area!

Get Updates

Stay informed with the Data Pipelines By the Bay conference news and event updates.

If you'd like to sponsor Data Pipelines By the Bay, contact sponsors@bythebay.io

Map

Conference Schedule
 

View the Data By the Bay schedule & directory.

Conference Tickets
 

You can buy tickets for two or more days of the conference as passes. Once you buy a pass, you will receive an email with instructions on how to redeem the days you want. Each day has a capacity of 400 and will automatically be disabled once full. We'll list sold-out days on the TICKETS page as soon as they become unavailable.

Currently available days: Day 1, Day 2, Day 3, Day 4, Day 5.

Pricing works as follows: regular admission is $500/day. Very Early Bird is $400/day, Early Bird is $450/day, and Late Bird is $550/day. We will only allocate 100 Very Early/Early Bird tickets for each day, since our capacity is limited and the word is only getting out. The passes are 2/3/4/5-day bundles, discounted $50 for each extra day (so the 2-day Very Early Bird Bundle is $750, the 2-day Early Bird Bundle is $850, the 2-day Regular Admission Bundle is $950, etc.). We use Stripe directly to process all payments.
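
For anyone who wants to work out a longer bundle, the arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the ticketing system's actual logic: it assumes the $50 discount applies once for every day beyond the first, which matches the three 2-day examples given, and it assumes Late Bird bundles follow the same rule. The names `DAILY_RATE`, `EXTRA_DAY_DISCOUNT`, and `bundle_price` are hypothetical. The TICKETS page remains the authoritative source for prices.

```python
# Per-day rates in USD, taken from the pricing paragraph above.
DAILY_RATE = {
    "very_early_bird": 400,
    "early_bird": 450,
    "regular": 500,
    "late_bird": 550,
}

# Assumption: $50 off for each day beyond the first day of a pass.
EXTRA_DAY_DISCOUNT = 50


def bundle_price(tier: str, days: int) -> int:
    """Estimate the price in USD of a multi-day pass (2 to 5 days)."""
    if tier not in DAILY_RATE:
        raise ValueError(f"unknown tier: {tier!r}")
    if not 2 <= days <= 5:
        raise ValueError("passes are sold as 2-, 3-, 4-, or 5-day bundles")
    # Full per-day price for every day, minus the discount on each extra day.
    return DAILY_RATE[tier] * days - EXTRA_DAY_DISCOUNT * (days - 1)


if __name__ == "__main__":
    for tier in ("very_early_bird", "early_bird", "regular"):
        print(tier, bundle_price(tier, 2))
```

Run as a script, this reproduces the quoted 2-day prices of $750, $850, and $950; figures for 3- to 5-day passes depend on the linear-discount assumption stated above.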

Full-time students inquiring about discounts: please email proof of enrollment and dates of interest.