May 16, 2016
San Francisco, CA
The future is already here, it is just not evenly distributed. But it clearly shows in our 150 talks, spread across 7 conferences in a 5-day conference matrix, with 50+ founders, CEOs, and CTOs speaking.
In-depth talks from Google (BigQuery and Translate), Baidu Research, MetaMind, StitchFix (Deep Learning), Microsoft, Bloomberg, Quora, Kaggle, Dato (Machine Learning), Netflix (Recommender Systems), IBM (Watson), Facebook, ClearStory (DataViz), LinkedIn, Yahoo, H2O, Confluent, Mesosphere (Data Pipelines), Samsung, Automatic (IoT), AMPLab, Databricks, Salesforce, Workday, Cloudera (Spark), Pivotal (OSS), Zillow, Pandora, Nitro, Lucidworks, Mattermark, Credit Karma, Alpine Labs, University of California-Berkeley, Stanford University, City of San Francisco, and many others.
Only 300 tickets will be available for each day, to keep a truly intimate technical community atmosphere.
Data Pipelines By the Bay started in 2015 as Big Data Scala. The key idea is that real-time data science at scale requires data engineering as an integral part of a reactive loop, and effective data scientists must be great data engineers, and vice versa. Streaming and distributed architectures supporting new algorithms such as deep learning are radically different from the typical database-centric solutions of the past. We're proud to present talks from our core constituents of SF Scala, SF Spark, and Reactive Systems companies, as well as global leaders and the best-funded startups in big and fast data, most headquartered here By the Bay.
Schedule is Published
Data Pipelines By the Bay
The emerging realization in both the data science and software engineering communities is that data-driven applications need both to work in synchrony. At the Data Science Summit in 2015, Carlos Guestrin, the CEO of Dato, predicted that ML-enabled applications will dominate all others in the near future, and that the way to make ML available is to build services providing it in a scalable and composable way. Data analysis has the most value when the data is fresh, and real-time, streaming approaches are attracting the most interest. By the Bay conferences and communities are among the first to recognize the emerging phenomenon of Data Pipelines, build them, and establish the best practices. The Reactive.Community and noetl.org practitioners use reactive, strongly typed systems where data is always in transit and available for querying. Approaches such as Spark Streaming, Kafka Streaming, and Flink, a streaming-first framework, turn out to be the best for IoT and similar verticals with sensor data coming at high volume and speed. Big Data becomes Big and Fast Data. Since so many companies By the Bay and globally are establishing data pipelines as the new backbones of their businesses, we've received an unprecedented number of high-quality submissions and added a whole new conference to the already stellar sequence. Jay Kreps, the creator of Kafka, the system at the heart of most of the pipelines presented, will keynote.
Last year, we started with a two-day, three-track, 50-talk conference. We've put together an inspiring program centered around language, Big Data, text and images, deep learning, UI, social networks, and much more.
This year, we're running the first data grid conference sequence, with seven verticals over five days. Each day's attendance is limited to only 400 seats, and it will be full. We hope you join us in May By the Bay!
Jay (@jaykreps) is co-founder and CEO at Confluent. Prior to Confluent, Jay Kreps was the initial developer on several open source projects, including Apache Kafka, Apache Samza, and Voldemort. He was the lead architect for data infrastructure at LinkedIn.
Data Pipelines By the Bay
May 16, 2016
Building on Big Data Scala, this is the first conference showing end-to-end unity of Data Engineering and Data Science for big, fast, streaming data.
Text By the Bay
May 17-18, 2016 (Day 2 parallel with Democracy By the Bay and Law By the Bay)
Democracy By the Bay
May 18, 2016 (parallel with Law By the Bay and Text By the Bay)
NLP and Data Science with focus on politics, society, and government.
Law By the Bay
May 18, 2016 (parallel with Democracy By the Bay and Text By the Bay)
NLP and Data Science with focus on legal data and processes.
Legal search (100% recall), case-specific NLP, ambiguity analysis, etc.
AIoT By the Bay
May 19, 2016
Not everything is text. Multiple talks at Text By the Bay dealt with multi-modal data such as images with text. AI and IoT day is all about sensor data streams, images, vision, speech, music.
Life Sciences By the Bay
May 20, 2016 (Parallel with Data UX By the Bay)
There are several major categories of data mining related to life and health. First, genomics, where the Bay Area leads with Spark and ADAM. Second, medical sensor and imaging data, with companies like Enlitic.
Data UX By the Bay
May 20, 2016 (Parallel with Life Sciences By the Bay)
Data should be visualized, with massive datasets distilled into a clear and actionable display that calls attention to what's really important. And then UX should naturally lead to the appropriate action.
Data By the Bay – Common Thread
May 16-20, 2016
For each conference, we'll have common horizontal themes: platforms and algorithms.
Technology Insights and Events
Be a supporting member of San Francisco's premier Data/AI conference. We want to hear from you! Contact us for a prospectus and sponsorship agreement, or to talk about how we can help you be a contributing sponsor for the Data By The Bay conference!
Come to Data Pipelines By the Bay well-rested and ready to meet your fellow developers. We'll have a full day of talks (keynotes, full-length, and lightning) and build a startup-centric data engineering community for the Bay Area!
Stay informed with the Data Pipelines By the Bay conference news and event updates.
If you'd like to sponsor Data Pipelines By the Bay, contact firstname.lastname@example.org
You can buy tickets for two or more days of the conference as passes. Once you buy a pass, you will receive an email with instructions on how to redeem the days you want. Each day has a capacity of 400 and will automatically be disabled once full. We'll mark sold-out days on the TICKETS page as soon as they become unavailable.
Currently available days: Day 1, Day 2, Day 3, Day 4, Day 5.
Pricing works as follows: regular admission is $500/day. Very Early Bird is $400/day, Early Bird is $450/day, and Late Bird is $550/day. We will only allocate 100 Very Early/Early Bird tickets for each day, since our capacity is limited and the word is only getting out. The passes are 2/3/4/5-day bundles, discounted $50 for each extra day (so a 2-day Very Early Bird Bundle is $750, a 2-day Early Bird Bundle is $850, a 2-day Regular Admission Bundle is $950, etc.). We use Stripe directly to process all payments.
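For anyone working out a multi-day pass, the bundle arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the ticketing system's logic: the function name `pass_price` and the tier keys are hypothetical, and it assumes the discount is a flat $50 for every day beyond the first and that Late Bird bundles follow the same rule. Only the three 2-day figures are confirmed by the text; prices for longer passes are an extrapolation under that assumption.

```python
# Per-day rates in USD, as quoted in the pricing paragraph.
DAILY_RATE = {
    "very_early_bird": 400,
    "early_bird": 450,
    "regular": 500,
    "late_bird": 550,
}

# Assumption: a flat $50 comes off for each day beyond the first.
EXTRA_DAY_DISCOUNT = 50
MAX_DAYS = 5  # the conference runs for five days


def pass_price(tier: str, days: int) -> int:
    """Return the price in USD of a pass covering `days` days at `tier`."""
    if tier not in DAILY_RATE:
        raise ValueError(f"unknown tier: {tier!r}")
    if not 1 <= days <= MAX_DAYS:
        raise ValueError(f"days must be between 1 and {MAX_DAYS}")
    return DAILY_RATE[tier] * days - EXTRA_DAY_DISCOUNT * (days - 1)


if __name__ == "__main__":
    # The 2-day bundles quoted in the text.
    for tier in ("very_early_bird", "early_bird", "regular"):
        print(tier, pass_price(tier, 2))
```

A single day is simply the daily rate, since there are no extra days to discount.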
Full-time students inquiring about discounts: please email proof of enrollment and dates of interest.