Skip to content

Machine Learning Frameworks

Machine Learning relies on algorithms. Unless you’re a data scientist or ML expert, these algorithms are very complicated to understand and work with.

A machine learning framework, then, simplifies machine learning algorithms. An ML framework is any tool, interface, or library that lets you develop ML models easily, without understanding the underlying algorithms.

There are a variety of machine learning frameworks, geared at different purposes. Nearly all ML the frameworks—those we discuss here and those we don’t—are written in Python. Python is the predominant machine learning programming language.


Overview

Library Developer(s) Initial Release Written In Type
TensorFlow Google Brain Team 2015 Python, C++, CUDA Machine Learning
Caffe Berkeley Vision and Learning Center 2017 C++ Deep Learning
H2O SriSatish Ambati, Cliff Click 2017 C++, Python Statistics
Apache Spark Matei Zaharia 2014 Scala Data analytics, Machine Learning Algorithms
Microsoft CNTK Microsoft Research 2016 C++ Machine Learning, Deep Learning
Accord .NET César Roberto de Souza 2010 C# Data Analytics, Machine Learning
Apache Mahout Apache Software Foundation 2009 Java, Scala Machine Learning
MXNet Apache Software Foundation 2015 C++, Python, R, Java, Julia, JavaScript, Scala, Go, Perl Machine Learning, Deep Learning
ONNX Facebook, Microsoft 2017 C++, Python Artificial Intelligence Ecosystem
DASK Matthew Rocklin 2018 Python Data Analytics
MLFlow Databricks 2015 Python Machine Learning
Chainer Seiya Tokui 2015 Python Deep Learning


TensorFlow

TensorFlow is one of the best framework available for working with Machine Learning on Python. Offered by Google, TensorFlow makes ML model building easy for beginners and professionals alike.

Using TensorFlow, you can create and train ML models on not just computers but also mobile devices and servers by using TensorFlow Lite and TensorFlow Serving that offers the same benefits but for mobile platforms and high-performance servers.

Website python

GitHub python


Caffe

CAFFE (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework, originally developed at University of California, Berkeley.

Caffe supports many different types of deep learning architectures geared towards image classification and image segmentation. It supports CNN, RCNN, LSTM and fully connected neural network designs.Caffe supports GPU- and CPU-based acceleration computational kernel libraries such as NVIDIA cuDNN and Intel MKL

Website python

GitHub python


H2O

H2O implements algorithms from the field of statistics , data mining and machine learning ( generalized linear models , K-Means , Random Forest , Gradient Boosting and Deep Learning ). The software is based on the Hadoop Distributed File System , so that a performance gain is achieved compared to other analysis tools.

H2O can be viewed graphically using a web browsercan be operated or used via interfaces with R , Python , Apache Hadoop and Spark and executed in Maven .

Website python

GitHub python


Apache Spark

Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.

Website python

GitHub python


Microsoft CNTK

Microsoft Cognitive Toolkit, previously known as CNTK and sometimes styled as The Microsoft Cognitive Toolkit, is a deprecated deep learning framework developed by Microsoft Research. Microsoft Cognitive Toolkit describes neural networks as a series of computational steps via a directed graph.

Website python

GitHub python


Accord .NET

The framework comprises a set of libraries that are available in source code as well as via executable installers and NuGet packages. The main areas covered include numerical linear algebra, numerical optimization, statistics, machine learning, artificial neural networks, signal and image processing, and support libraries (such as graph plotting and visualization).

Website python

GitHub python


Apache Mahout

Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily on linear algebra. In the past, many of the implementations use the Apache Hadoop platform, however today it is primarily focused on Apache Spark. Mahout also provides Java/Scala libraries for common maths operations (focused on linear algebra and statistics) and primitive Java collections. Mahout is a work in progress; a number of algorithms have been implemented.

Website python

GitHub python


MXNet

Apache MXNet is an open-source deep learning software framework, used to train, and deploy deep neural networks. It is scalable, allowing for fast model training, and supports a flexible programming model and multiple programming languages (including C++, Python, Java, Julia, Matlab, JavaScript, Go, R, Scala, Perl, and Wolfram Language.)

Website python

GitHub python


ONNX

The Open Neural Network Exchange (ONNX) is an open-source artificial intelligence ecosystem of technology companies and research organizations that establish open standards for representing machine learning algorithms and software tools to promote innovation and collaboration in the AI sector.

Website python

GitHub python


DASK

Dask is a library composed of two parts. It includes a task scheduling component for building dependency graphs and scheduling tasks. Second, it includes the distributed data structures with APIs similar to Pandas Dataframes or NumPy arrays. Dask has a variety of use cases and can be run with a single node and scale to thousand node clusters.

Website python

GitHub python


MLFlow

MlFlow is a framework that supports the machine learning lifecycle. This means that it has components to monitor your model during training and running, ability to store models, load the model in production code and create a pipeline. The framework introduces 3 distinct features each with it's own capabilities.

Website python

GitHub python


Chainer

Chainer is an open source deep learning framework written purely in Python on top of NumPy and CuPy Python libraries. The development is led by Japanese venture company Preferred Networks in partnership with IBM, Intel, Microsoft, and Nvidia.

Chainer is notable for its early adoption of "define-by-run" scheme, as well as its performance on large scale systems. The first version was released in June 2015 and has gained large popularity in Japan since then. Furthermore, in 2017, it was listed by KDnuggets in top 10 open source machine learning Python projects.

Website python

GitHub python