BeeScala 2016: Holden Karau - Ignite your data with Spark 2.0

This talk was recorded at BeeScala 2016 in Ljubljana, Slovenia. Follow along on Twitter @BeeScalaConf and visit the website at http://bee-scala.org for more information.

Abstract: This talk will start with a quick introduction to the two different building blocks of distributed computing in Apache Spark, along with their relative performance differences. It will then cover the performance impact of Datasets, which are becoming the core building block of much of Apache Spark starting with Spark 2.0, as well as considerations for the RDD API. The talk will finish by exploring the new structured streaming API. Prior knowledge of Spark isn’t required, but some background in Spark will make it more exciting.