Distributed Deep Learning on Spark (using Yahoo’s Caffe-on-Spark)
Read the article Large Scale Distributed Deep Learning on Hadoop Clusters to learn about Distributed Deep Learning using Caffe-on-Spark:
To enable deep learning on these enhanced Hadoop clusters, we developed a comprehensive distributed solution based upon open source software libraries, Apache Spark and Caffe. One can now submit deep learning jobs onto a (Hadoop YARN) cluster of GPU nodes (using `spark-submit`).
Caffe-on-Spark is the result of Yahoo's early steps in bringing the Apache Hadoop ecosystem and deep learning together on the same heterogeneous (GPU+CPU) cluster. It may be open-sourced depending on interest from the community.
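The article does not show the exact submission command, but a generic sketch of how such a job might be launched on YARN is given below. The `spark-submit` flags are standard Spark options; the jar name, driver class, and prototxt file names are hypothetical placeholders, not the actual Caffe-on-Spark artifacts.

```bash
# Hedged sketch: submit a distributed deep-learning job to a Hadoop YARN cluster of GPU nodes.
# The jar, driver class, and Caffe config file names are illustrative placeholders;
# only the spark-submit options themselves are standard.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 8 \
  --executor-cores 1 \
  --executor-memory 8g \
  --files solver.prototxt,train_net.prototxt \
  --class com.example.DistributedCaffeTraining \
  distributed-caffe-example.jar \
  -train -devices 1
```

In a setup like this, each executor would typically map to one GPU node, and the Caffe solver/network configuration files would be shipped to the workers via `--files` so every executor can read them locally.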
In the comments on the article, some readers mentioned plans to use it with AWS GPU clusters.