Overview
In this tutorial, we will go through the basics of TensorFlow. By the end of this series, you will have the background needed to use TensorFlow for deep learning models.
TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph are called ops (short for operations), while the graph edges represent the multidimensional data arrays (tensors) communicated between them. An op takes zero or more tensors, performs some computation, and produces zero or more tensors.
A deep model is essentially a composite function, so you can see why TensorFlow is a natural fit for implementing one. But that is not the only reason: it offers several extras that set it apart from other libraries.
- You can visualize the computational graph with TensorBoard, which makes debugging much easier.
- It provides checkpoints that let you save, restore, and manage your experiments.
- It doesn’t require you to calculate the derivatives yourself, which can be a pain in the neck (see the sketch after this list).
- It is general enough to be applicable in a wide variety of domains (it has been used for everything from language translation to early detection of skin cancer and preventing blindness in diabetics).
- It is used by the leading AI companies.
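To illustrate the point about derivatives, here is a minimal sketch of automatic differentiation using tf$gradients (it runs inside a session, a concept we cover below):
library(tensorflow)

# y = x^2, so dy/dx = 2x; tf$gradients builds the ops for us,
# no manual calculus required
x <- tf$constant(3, dtype = tf$float32)
y <- x * x
grads <- tf$gradients(y, x)

with(tf$Session() %as% sess, {
  print(sess$run(grads))  # a list containing 6 (= 2 * 3)
})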
In this tutorial, we will go through the structure of a computational graph, its execution, the basic operations, and its visualization in TensorBoard.
In TensorFlow, execution is separated from the definition of the computational graph. We first define the structure of the graph, and then we use a session to execute ops in it. It is important to remember that data is represented as tensors. To get data into and out of arbitrary operations, TensorFlow uses feeds and fetches.
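As a minimal sketch of that mechanism (sessions are explained in the next sections): a placeholder receives its value through a feed at run time, and the tensor we pass to sess$run() is the fetch.
# A placeholder has no value until we feed one at run time
x <- tf$placeholder(tf$float32, name = "x")
y <- x * 2  # the op we will fetch

with(tf$Session() %as% sess, {
  # Fetch y while feeding a value for x; prints 10
  print(sess$run(y, feed_dict = dict(x = 5)))
})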
Structure of The Graph
A computational graph is a series of TensorFlow operations arranged into a graph of nodes. Each node takes zero or more tensors as inputs and produces a tensor as an output. Let’s build a simple computational graph. The simplest type of node is a constant: it takes no inputs and outputs a value it stores internally. We can create two floating-point tensors (a and b) as follows:
library(tensorflow)

# Define two constant ops; nothing is evaluated yet
a <- tf$constant(3, dtype = tf$float32)
b <- tf$constant(5, dtype = tf$float32)
c <- list(a, b)
print(c)
## [[1]]
## Tensor("Const:0", shape=(), dtype=float32)
##
## [[2]]
## Tensor("Const_1:0", shape=(), dtype=float32)
Notice that it doesn’t print out the values of the constants, but instead the structure of the tensors.
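If you only need the metadata, you can inspect a tensor’s attributes directly, without running anything (the attribute names mirror the underlying Python API):
print(a$shape)  # a scalar shape
print(a$dtype)  # float32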
Run The Graph in a Session
To actually evaluate the nodes, we must run the computational graph within a session. A session encapsulates the control and state of the TensorFlow runtime.
One way of launching the graph is:
# Launch the default graph.
a <- tf$constant(3, dtype = tf$float32, name = "a")
b <- tf$constant(5, dtype = tf$float32, name = "b")
c <- list(a, b)
sess <- tf$Session()
print(sess$run(c))
## [[1]]
## [1] 3
##
## [[2]]
## [1] 5
# Close the Session when we're done.
sess$close()
When you are done with the computation, you need to close the session to release its resources. Another way of launching the graph (and my personal preference) is:
a <- tf$constant(3, dtype = tf$float32, name = "a")
b <- tf$constant(5, dtype = tf$float32, name = "b")
c <- list(a, b)
with(tf$Session() %as% sess, {
  print(sess$run(c))
})
## [[1]]
## [1] 3
##
## [[2]]
## [1] 5
You can launch the graph using the “with” block, and the session closes automatically at the end of the block.
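The graph we just built can also be exported for TensorBoard, which we mentioned in the overview. A minimal sketch, assuming a writable “logs” directory of your choice:
with(tf$Session() %as% sess, {
  # Write the graph definition so TensorBoard can render it
  writer <- tf$summary$FileWriter("logs", sess$graph)
  print(sess$run(c))
  writer$close()
})
# Then point TensorBoard at the same directory:
# tensorboard(log_dir = "logs")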
In general, you are not required to specify which compute resources run the computations. If you have no GPU, everything runs on the CPU. If you have a single GPU, most of the ops run on that GPU. If you have several GPUs, you have to explicitly assign ops to each GPU card.
Devices are specified with strings. The currently supported devices are:
- “/cpu:0”: The CPU of your machine.
- “/gpu:0”: The GPU of your machine, if you have one.
- “/gpu:1”: The second GPU of your machine, etc.
Let’s run all the operations on the CPU. Note that tf$device() takes effect when ops are created, not when they are run, so the constants have to be defined inside the device block:
with(tf$device("/cpu:0"), {
  # Ops created inside this block are pinned to the CPU
  a <- tf$constant(3, dtype = tf$float32, name = "a")
  b <- tf$constant(5, dtype = tf$float32, name = "b")
})
with(tf$Session() %as% sess, {
  print(sess$run(list(a, b)))
})
## [[1]]
## [1] 3
##
## [[2]]
## [1] 5
Operations
Now that we have gone through the basics of graph structures, let’s dive into some of the most basic operations of TensorFlow: the ones that are absolutely necessary to construct a deep neural network.
Element-wise Mathematical Operations
Add
Returns a + b element-wise.
add <- tf$add(a, b, name = "add")
with(tf$Session() %as% sess, {
  print(sess$run(add))
})
## [1] 8
Mul
Returns a * b element-wise.
mul <- tf$multiply(a, b)
with(tf$Session() %as% sess, {
  print(sess$run(mul))
})
## [1] 15
Div
Returns a / b element-wise.
div <- tf$div(a, b)
with(tf$Session() %as% sess, {
  results <- sess$run(div)
  print(results)
})
## [1] 0.6
Matrix Operations
Matmul
Multiplies matrix a by matrix b, producing a * b.
Hold on here; recall that in order to multiply two matrices, the number of columns of the first must equal the number of rows of the second.
a <- tf$constant(c(1, 2), shape = c(1L, 2L), name = "a")
b <- tf$constant(c(3, 4), shape = c(2L, 1L), name = "b")
c <- tf$matmul(a, b, name = "mat_mul")
with(tf$Session() %as% sess, {
  sess$run(c)
})
## [,1]
## [1,] 11
Transpose
Transposes the matrix.
a <- tf$constant(c(1, 2, 3, 4), shape = c(2L, 2L), name = "a")
b <- tf$transpose(a, name = "transpose")
with(tf$Session() %as% sess, {
  sess$run(b)
})
## [,1] [,2]
## [1,] 1 3
## [2,] 2 4
Matrix_inverse
Computes the inverse of one or more square invertible matrices or their conjugate transposes.
a <- tf$constant(c(1, 2, 3, 4), shape = c(2L, 2L))
b <- tf$matrix_inverse(a, name = "inverse")
with(tf$Session() %as% sess, {
  sess$run(b)
})
## [,1] [,2]
## [1,] -2.0 1.0000001
## [2,] 1.5 -0.5000001
Matrix_determinant
Computes the determinant of the square matrices you feed into the function.
a <- tf$constant(c(1, 2, 3, 4), shape = c(2L, 2L))
b <- tf$matrix_determinant(a, name = "determinant")
with(tf$Session() %as% sess, {
  sess$run(b)
})
## [1] -2
Array Operations
Concat
Concatenates tensors along one dimension. Keep in mind that every dimension except the one you concatenate along must be the same size in both tensors.
a <- tf$constant(c(1, 2), shape = c(1L, 2L), name = "a")
b <- tf$constant(c(3, 4), shape = c(1L, 2L), name = "b")
c <- tf$concat(list(a, b), axis = 0L, name = "concat")
with(tf$Session() %as% sess, {
  sess$run(c)
})
## [,1] [,2]
## [1,] 1 2
## [2,] 3 4
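For illustration, concatenating the same tensors along the second dimension instead (axis = 1L) produces a 1 x 4 matrix:
c2 <- tf$concat(list(a, b), axis = 1L, name = "concat_cols")
with(tf$Session() %as% sess, {
  sess$run(c2)  # 1 x 4 matrix: 1 2 3 4
})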
Slice
Extracts a slice from a tensor. The parameters we need to specify are the starting point (“begin”) and the size of the sliced tensor (“size”). The slice size is represented as a tensor shape.
a <- tf$constant(c(1, 2, 3, 4, 5, 6), shape = c(1L, 2L, 3L), name = "a")
c <- tf$slice(a, begin = c(0L, 0L, 0L), size = c(1L, 1L, 3L), name = "slice")
with(tf$Session() %as% sess, {
  sess$run(c)
})
## , , 1
##
## [,1]
## [1,] 1
##
## , , 2
##
## [,1]
## [1,] 2
##
## , , 3
##
## [,1]
## [1,] 3
Split
Splits a tensor into sub-tensors. What is the difference from slice? Split divides the whole tensor along one axis into several sub-tensors, returned as a list, whereas slice extracts a single piece.
a <- tf$constant(1:20, shape = c(4L, 5L), name = "a")
l <- tf$split(a, c(1L, 2L, 2L), axis = 1L)
with(tf$Session() %as% sess, {
  sess$run(l)
})
## [[1]]
## [,1]
## [1,] 1
## [2,] 6
## [3,] 11
## [4,] 16
##
## [[2]]
## [,1] [,2]
## [1,] 2 3
## [2,] 7 8
## [3,] 12 13
## [4,] 17 18
##
## [[3]]
## [,1] [,2]
## [1,] 4 5
## [2,] 9 10
## [3,] 14 15
## [4,] 19 20
Shape
Returns the shape of a tensor.
a <- tf$constant(1:20, shape = c(4L, 5L), name = "a")
with(tf$Session() %as% sess, {
  sess$run(tf$shape(a))
})
## [1] 4 5
In this tutorial, we have learned how to execute a computational graph and some of the most popular operations used in deep learning models. (We will go through the rest in the upcoming tutorials.)