I have written a book! Scala for Data Science introduces the major libraries for building pipelines to analyze, process, and visualize data in Scala.

If you already know a bit of Scala, this book will guide you through:

  • Manipulating arrays of data with Breeze.
  • Querying web APIs in parallel.
  • Accessing SQL and NoSQL databases.
  • Setting up REST APIs to distribute your data.
  • Integrating Scala with D3.js to build data visualizations.
  • Processing large datasets with distributed, in-memory computation using Apache Spark.
  • Training a spam filter with MLlib.
  • Building a web crawler using Akka.

The full table of contents and a sample chapter introducing parallel collections and futures are available on the publisher's website.

The code samples are available on GitHub.

You can buy the book on Amazon or directly from Packt Publishing. If you do, send me feedback!