By KDnuggets.
Getting into Machine Learning and AI is not an easy task. Many aspiring professionals and enthusiasts find it hard to establish a proper path into the field, given the enormous amount of resources available today.
The field evolves constantly, and keeping pace with it is crucial. A good way to stay current with advances in ML is to engage with the community by contributing to the many open-source projects and tools that advanced professionals use daily.
TensorFlow has moved to first place with triple-digit growth in contributors. Scikit-learn dropped to second place but still has a very large base of contributors.
- TensorFlow, 169% up, from 493 to 1324 contributors
- Deap, 86% up, from 21 to 39 contributors
- Chainer, 83% up, from 84 to 154 contributors
- Gensim, 81% up, from 145 to 262 contributors
- Neon, 66% up, from 47 to 78 contributors
- Nilearn, 50% up, from 46 to 69 contributors
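The growth percentages above follow from the two contributor counts given for each project. As a minimal sketch, assuming the figures are plain percentage changes rounded to the nearest whole number (the helper name `percent_growth` is ours, not part of the original analysis), they can be reproduced like this:

```python
def percent_growth(before: int, after: int) -> int:
    """Percentage change from `before` to `after`, rounded to the nearest integer."""
    return round((after - before) / before * 100)


# (earlier, current) contributor counts, copied from the list above.
counts = {
    "TensorFlow": (493, 1324),
    "Deap": (21, 39),
    "Chainer": (84, 154),
    "Gensim": (145, 262),
    "Neon": (47, 78),
    "Nilearn": (46, 69),
}

# Assumption: each "% up" figure is simple relative growth between the two counts.
growth = {name: percent_growth(before, after) for name, (before, after) in counts.items()}

for name, pct in growth.items():
    print(f"{name}: {pct}% up")
```

Under that assumption, the computed values match the percentages in the list.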
Also new in 2018:
- Keras, 629 contributors
- PyTorch, 399 contributors
Fig. 1: Deep Learning projects on GitHub.
Size is proportional to the number of contributors, and color represents the change in the number of contributors: red is higher, blue is lower. Snowflake shape is for Deep Learning projects, round for other projects.
We see that Deep Learning projects like TensorFlow, Theano, and Caffe are among the most popular.
We hope you enjoy going through the documentation pages of each of these to start collaborating and learning the ways of Machine Learning using Python.
- TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization. The system is designed to facilitate research in machine learning, and to make it quick and easy to transition from research prototype to production system.
Contributors: 1324 (169% up), Commits: 28476, GitHub URL: TensorFlow
- Scikit-learn provides simple and efficient tools for data mining and data analysis that are accessible to everybody and reusable in various contexts. It is built on NumPy, SciPy, and matplotlib, and is open source and commercially usable under the BSD license.
Contributors: 1019 (39% up), Commits: 22575, GitHub URL: Scikit-learn
- Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.
Contributors: 629 (new), Commits: 4371, GitHub URL: Keras
- PyTorch provides tensors and dynamic neural networks in Python with strong GPU acceleration.
Contributors: 399 (new), Commits: 6458, GitHub URL: PyTorch
- Theano allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.
Contributors: 327 (24% up), Commits: 27931, GitHub URL: Theano
- Gensim is a free Python library for scalable statistical semantics: it analyzes plain-text documents for semantic structure and retrieves semantically similar documents.
Contributors: 262 (81% up), Commits: 3549, GitHub URL: Gensim
- Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and community contributors.
Contributors: 260 (21% up), Commits: 4099, GitHub URL: Caffe
- Chainer is a Python-based, standalone open-source framework for deep learning models. Chainer provides a flexible, intuitive, and high-performance means of implementing a full range of deep learning models, including state-of-the-art models such as recurrent neural networks and variational auto-encoders.
Contributors: 154 (83% up), Commits: 12613, GitHub URL: Chainer
- Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests. An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics is available for different types of data and each estimator.
Contributors: 144 (33% up), Commits: 9729, GitHub URL: Statsmodels
- Shogun is a machine learning toolbox that provides a wide range of unified and efficient machine learning (ML) methods. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general-purpose tools.
Contributors: 139 (32% up), Commits: 16362, GitHub URL: Shogun
Source: KDnuggets, GitHub