Machine Learning Frameworks

Machine Learning relies on algorithms. Unless you’re a data scientist or ML expert, these algorithms are very complicated to understand and work with.

A machine learning framework, then, simplifies machine learning algorithms. An ML framework is any tool, interface, or library that lets you develop ML models easily, without understanding the underlying algorithms.

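There are a variety of machine learning frameworks geared toward different purposes. Many of them, both those we discuss here and those we don't, are written in Python or expose a Python API, although the core of a framework is often implemented in another language such as C++, Java, or Scala. Python is the predominant machine learning programming language.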

TensorFlow
Caffe
H2O
Apache Spark
Microsoft CNTK
Accord .NET
Apache Mahout
MXNet
ONNX
DASK
MLflow
Chainer

Overview

Library	Developer(s)	Initial Release	Written In	Type
TensorFlow	Google Brain Team	2015	Python, C++, CUDA	Machine Learning
Caffe	Berkeley Vision and Learning Center	2013	C++	Deep Learning
H2O	SriSatish Ambati, Cliff Click	2017	Java, Python, R	Statistics
Apache Spark	Matei Zaharia	2014	Scala	Data analytics, Machine Learning Algorithms
Microsoft CNTK	Microsoft Research	2016	C++	Machine Learning, Deep Learning
Accord .NET	César Roberto de Souza	2010	C#	Data Analytics, Machine Learning
Apache Mahout	Apache Software Foundation	2009	Java, Scala	Machine Learning
MXNet	Apache Software Foundation	2015	C++, Python, R, Java, Julia, JavaScript, Scala, Go, Perl	Machine Learning, Deep Learning
ONNX	Facebook, Microsoft	2017	C++, Python	Artificial Intelligence Ecosystem
DASK	Matthew Rocklin	2015	Python	Data Analytics
MLflow	Databricks	2018	Python	Machine Learning
Chainer	Seiya Tokui	2015	Python	Deep Learning

TensorFlow

TensorFlow is one of the most widely used frameworks for machine learning in Python. Developed by Google, TensorFlow aims to make building ML models accessible to beginners and professionals alike.

With TensorFlow, you can create and train ML models on ordinary computers and then run them elsewhere: TensorFlow Lite targets mobile devices, and TensorFlow Serving targets high-performance servers.

Caffe

Caffe (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework originally developed at the University of California, Berkeley.

Caffe supports many types of deep learning architectures geared toward image classification and image segmentation, including CNN, RCNN, LSTM, and fully connected neural network designs. It also supports GPU- and CPU-based acceleration through computational kernel libraries such as NVIDIA cuDNN and Intel MKL.

H2O

H2O implements algorithms from the fields of statistics, data mining, and machine learning (generalized linear models, K-Means, Random Forest, Gradient Boosting, and Deep Learning). The software can run on top of the Hadoop Distributed File System, which is intended to give it a performance advantage over other analysis tools on large data sets.

H2O can be operated graphically through a web browser, or used via interfaces with R, Python, Apache Hadoop, and Spark. It can also be pulled into projects built with Maven.

Apache Spark

Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
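
The idea of data parallelism can be illustrated without Spark itself. The following is a minimal sketch using only the Python standard library; it is not Spark's API, and the helper names (partition, count_words, parallel_word_count) are hypothetical. It splits the data into partitions, applies the same function to each partition concurrently, and merges the partial results. Fault tolerance is not modeled here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(items, n):
    """Split a list into n roughly equal partitions (round-robin)."""
    return [items[i::n] for i in range(n)]


def count_words(lines):
    """Count words within a single partition."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, n_partitions=3):
    """Run count_words on every partition concurrently, then merge."""
    parts = partition(lines, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(count_words, parts))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


lines = [
    "spark runs on clusters",
    "spark handles failures",
    "clusters run spark",
]
counts = parallel_word_count(lines)
print(counts.most_common(2))
```

The caller only writes the per-partition function; how the work is split and scheduled stays hidden behind parallel_word_count, which is the sense in which the parallelism is implicit.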

Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.

Microsoft CNTK

Microsoft Cognitive Toolkit, previously known as CNTK and sometimes styled as The Microsoft Cognitive Toolkit, is a deprecated deep learning framework developed by Microsoft Research. Microsoft Cognitive Toolkit describes neural networks as a series of computational steps via a directed graph.
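
To make the "directed graph of computational steps" idea concrete, here is a small sketch in plain Python. It is not the Cognitive Toolkit API; the graph layout, the node names, and the evaluate helper are assumptions made for illustration. Each node names an operation and the nodes it takes as inputs, and evaluating the output node walks the graph back to the inputs.

```python
import math

# Hypothetical toy graph for y = sigmoid(w * x + b).
# Each entry maps a node name to (operation, list of input node names).
GRAPH = {
    "x": ("input", []),
    "w": ("input", []),
    "b": ("input", []),
    "wx": ("mul", ["w", "x"]),
    "z": ("add", ["wx", "b"]),
    "y": ("sigmoid", ["z"]),
}

OPS = {
    "mul": lambda a, b: a * b,
    "add": lambda a, b: a + b,
    "sigmoid": lambda a: 1.0 / (1.0 + math.exp(-a)),
}


def evaluate(node, feeds, cache=None):
    """Compute the value of a node by first evaluating its inputs."""
    cache = {} if cache is None else cache
    if node in cache:
        return cache[node]
    op, inputs = GRAPH[node]
    if op == "input":
        value = feeds[node]  # input nodes read their value from feeds
    else:
        args = [evaluate(name, feeds, cache) for name in inputs]
        value = OPS[op](*args)
    cache[node] = value
    return value


feeds = {"x": 2.0, "w": 0.5, "b": -1.0}
z = evaluate("z", feeds)
y = evaluate("y", feeds)
print(z, y)
```

Because the network is data rather than code, a framework can inspect the same graph to compute gradients or to place steps on different devices; this sketch only performs the forward evaluation.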

Accord .NET

Accord.NET is a machine learning framework for .NET, written in C#. It comprises a set of libraries that are available as source code as well as via executable installers and NuGet packages. The main areas covered include numerical linear algebra, numerical optimization, statistics, machine learning, artificial neural networks, signal and image processing, and support libraries (such as graph plotting and visualization).

Apache Mahout

Apache Mahout is a project of the Apache Software Foundation that produces free implementations of distributed or otherwise scalable machine learning algorithms, focused primarily on linear algebra. In the past, many of the implementations used the Apache Hadoop platform; today the project focuses primarily on Apache Spark. Mahout also provides Java/Scala libraries for common math operations (focused on linear algebra and statistics) and primitive Java collections. Mahout is still a work in progress, although a number of algorithms have already been implemented.

MXNet

Apache MXNet is an open-source deep learning framework used to train and deploy deep neural networks. It is scalable, allows fast model training, and supports a flexible programming model and multiple programming languages (including C++, Python, Java, Julia, MATLAB, JavaScript, Go, R, Scala, Perl, and Wolfram Language).

ONNX

The Open Neural Network Exchange (ONNX) is an open-source artificial intelligence ecosystem of technology companies and research organizations. Its members establish open standards for representing machine learning algorithms and software tools, with the aim of promoting innovation and collaboration in the AI sector.

DASK

Dask is a library composed of two parts. First, it includes a task scheduling component for building dependency graphs and scheduling tasks. Second, it includes distributed data structures with APIs similar to pandas DataFrames or NumPy arrays. Dask has a variety of use cases: it can run on a single node and scale to thousand-node clusters.
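
The task scheduling part can be sketched with the standard library alone. The example below is a toy, not Dask's scheduler: the graph is loosely modelled on the dictionary-of-tasks form that Dask documents for its graphs (a key maps either to a literal value or to a tuple of a function followed by its arguments), and the helpers is_task, is_key, and run are hypothetical. It resolves the dependency graph and runs the tasks one at a time in a valid order, whereas a real scheduler would run independent tasks in parallel.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from operator import add, mul


def inc(x):
    return x + 1


# Toy task graph: "b" depends on "a", "c" on "b", and "d" on both "b" and "c".
dsk = {
    "a": 1,
    "b": (inc, "a"),
    "c": (mul, "b", 10),
    "d": (add, "b", "c"),
}


def is_task(value):
    """A task is a non-empty tuple whose first element is callable."""
    return isinstance(value, tuple) and len(value) > 0 and callable(value[0])


def is_key(arg, graph):
    """An argument refers to another task if it is a string key of the graph."""
    return isinstance(arg, str) and arg in graph


def run(graph, key):
    """Execute every task in dependency order and return the value for key."""
    deps = {
        k: [a for a in v[1:] if is_key(a, graph)] if is_task(v) else []
        for k, v in graph.items()
    }
    results = {}
    for k in TopologicalSorter(deps).static_order():
        v = graph[k]
        if is_task(v):
            func, *args = v
            resolved = [results[a] if is_key(a, graph) else a for a in args]
            results[k] = func(*resolved)
        else:
            results[k] = v  # literal value, nothing to compute
    return results[key]


result = run(dsk, "d")
print(result)
```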

MLflow

MLflow is a framework that supports the machine learning lifecycle. It has components to monitor your model during training and in production, store models, load a model in production code, and create a pipeline. The framework introduces three distinct components (Tracking, Projects, and Models), each with its own capabilities.

Chainer

Chainer is an open-source deep learning framework written purely in Python on top of the NumPy and CuPy libraries. Development is led by the Japanese venture company Preferred Networks in partnership with IBM, Intel, Microsoft, and Nvidia.

Chainer is notable for its early adoption of the "define-by-run" scheme, in which the computational graph is built as the code runs rather than declared up front, as well as for its performance on large-scale systems. The first version was released in June 2015, and the framework has since gained considerable popularity in Japan. In 2017, KDnuggets listed it among the top 10 open-source machine learning Python projects.
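
The "define-by-run" idea can be shown with a tiny, self-contained sketch. This is not Chainer's API: the Var class and the model function are hypothetical, and only addition and multiplication are supported. The point is that the graph is recorded while ordinary Python code runs, so loops and if statements that depend on the data shape the graph, and gradients are then computed by walking the recorded graph backwards.

```python
class Var:
    """A scalar value that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local gradient)
        self.grad = 0.0

    @staticmethod
    def _wrap(other):
        return other if isinstance(other, Var) else Var(other)

    def __add__(self, other):
        other = Var._wrap(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = Var._wrap(other)
        return Var(
            self.value * other.value,
            ((self, other.value), (other, self.value)),
        )

    def backward(self):
        """Propagate gradients from this node back to every input."""
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)  # parents are appended before children

        visit(self)
        self.grad = 1.0
        for node in reversed(order):  # children first, then their parents
            for parent, local_grad in node.parents:
                parent.grad += local_grad * node.grad


def model(x, n_steps):
    """Plain Python control flow decides which operations get recorded."""
    y = x
    for _ in range(n_steps):
        y = y * x if y.value < 10 else y + x
    return y


x = Var(2.0)
y = model(x, 4)  # with x = 2 this records y = x*x*x*x + x
y.backward()
print(y.value, x.grad)
```

With a different input value the same model function would take different branches and record a different graph, which is what sets this style apart from frameworks that require the graph to be declared before any data flows through it.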
