Yes. Another ML Library.
But this one is different! The hacker community is wickedly excited about it, and you should be too! TensorFlow was released by Google today, and it looks to be a really exciting step forward for open source machine learning, and perhaps for the broader computational mathematics community.
What is it? From TensorFlow’s introduction, it is “an open source library for numerical computation using data flow graphs”!
What is “an open source library for numerical computation using data flow graphs”?
Sounds like a mouthful, but “data flow graphs” are just a more general term for the kind of model that neural networks use. And the library is described that way for a reason: TensorFlow is designed not only to provide flexible, highly optimized neural networks, but to perform any sort of computation that can be organized in a similar graph-like structure.
More on data flow graphs
These graphs are composed of two primary components, nodes and edges.
Nodes are the squares, circles, or ellipses you see in a typical flow chart. They represent any sort of mathematical operation or function. In a neural network, these are your activation functions (such as the sigmoid).
Edges are the connections between the nodes. They are directional: data flows out of one node and into the input of the next node (or several nodes) along these edges. Edges carry the “tensors”, or multi-dimensional arrays, that one node outputs and the next consumes; in a neural network, the weights themselves are tensors flowing into each operation.
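To make this concrete, here is a minimal sketch of a data flow graph in plain Python and NumPy (this is a conceptual illustration, not TensorFlow’s API): two operation nodes, with tensors flowing along the edges between them.

```python
import numpy as np

# Each node is an operation; each edge carries a tensor
# (multi-dimensional array) from one node's output to the next's input.
def matmul_node(x, w):
    # node: matrix multiply
    return x @ w

def sigmoid_node(z):
    # node: sigmoid activation
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[1.0, 2.0]])      # input tensor on an incoming edge
w = np.array([[0.5], [-0.25]])  # weight tensor on another incoming edge

# Data flows along the edges: x and w into matmul, its output into sigmoid.
out = sigmoid_node(matmul_node(x, w))
```

Here the graph has two nodes (`matmul_node`, `sigmoid_node`) and three edges (the tensors `x`, `w`, and the intermediate product), which is exactly the shape of a single neuron’s forward pass.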
Compare that with a typical neural network model, and you can see how a neural network is just a specialized version of a data flow graph. Back to TensorFlow!
So what exactly is there to get excited about with TensorFlow? There are a jillion machine learning libraries out there, so how does this one stick out from the crowd (other than being created by Google)? Well, in a fair number of ways, actually. Here are some of the things I’m most excited about:
- Easier Transition from Research to Production: Something that has always been troublesome in machine learning, especially with neural networks, is taking a model crafted in research and applying it in a real production setting. Much research is done in Python, R, or MATLAB (with accompanying libraries), which allows for faster iteration through the design and testing phases. Previously, that code would hardly be touched once the model moved to production, as it needed to be reimplemented in a faster language such as C++ or Java. Because of the way TensorFlow is designed, we should be able to take what we built in research and bring it directly to production with minimal, if any, code changes.
- Flexibility: This is both a great thing and something to keep in mind. TensorFlow is not a neural network library; it is a data flow graph library. This makes it capable of handling much more nuanced, hand-modeled graphs, but it will require more finagling. While it doesn’t appear too difficult to create a simple neural network now, I expect that higher-level libraries will be built on top of TensorFlow to make it extremely easy.
- Automatic CPU/GPU Integration: This might be the most exciting one for me. GPUs, or graphics processing units, have enabled much faster learning (especially for neural networks), and taking advantage of them is crucial to building robust models. The problem, however, is that most machine learning libraries out there don’t have GPU support, and those that do are either hard to use or much less flexible. For example, scikit-learn, one of the most popular machine learning libraries, is extremely useful for testing out ideas but has no plans for GPU support in the near future. TensorFlow promises to bring both flexibility and power by taking advantage of all of your computing resources.
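To make the flexibility point concrete, here is a sketch of the kind of higher-level wrapper I’d expect to see built on top of a graph library. The names (`Dense`, `Sequential`) are hypothetical illustrations in plain NumPy, not TensorFlow’s actual API: each layer bundles a small subgraph of operations behind an easy interface.

```python
import numpy as np

class Dense:
    """One fully connected layer: a matmul + add + sigmoid subgraph."""
    def __init__(self, weights, bias):
        self.w = np.asarray(weights)
        self.b = np.asarray(bias)

    def __call__(self, x):
        # The layer's internal data flow graph: matmul -> add -> sigmoid
        z = x @ self.w + self.b
        return 1.0 / (1.0 + np.exp(-z))

class Sequential:
    """Chains layers so tensors flow through each subgraph in order."""
    def __init__(self, layers):
        self.layers = layers

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

# A one-layer "network" built from the wrapper
model = Sequential([
    Dense(weights=[[0.5], [-0.25]], bias=[0.0]),
])
pred = model(np.array([[1.0, 2.0]]))
```

The underlying graph library stays free to represent arbitrary computations; the wrapper just packages the common neural-network patterns.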
I’ll be digging into this more over the next few weeks! Check out Google Research’s blog post if you’re interested in reading more about it.