Videos

TensorFlow and deep reinforcement learning, without a PhD (Google I/O '18)



TensorFlow

At the forefront of deep learning research is a technique called reinforcement learning, which bridges the gap between academic deep learning problems and the way learning occurs in nature, in weakly supervised environments. This technique is heavily used in research on learning to walk, chase prey, navigate complex environments, and even play Go. This session will teach a neural network to play the video game Pong from just the pixels on the screen. No rules, no strategy coaching, and no PhD required.

Rate this session by signing in on the I/O website here → https://goo.gl/mh5Wi8

Watch more TensorFlow sessions from I/O ’18 here → https://goo.gl/GaAnBR
See all the sessions from Google I/O ’18 here → https://goo.gl/q1Tr8x

Subscribe to the TensorFlow channel → https://goo.gl/ht3WGe

#io18


34 thoughts on “TensorFlow and deep reinforcement learning, without a PhD (Google I/O '18)”
  1. From the code, refer to the following line:
    loss = tf.reduce_sum(processed_rewards * cross_entropies + move_cost)

    Could I know the reason processed_rewards is passed in as-is instead of being negated? Because, to my understanding, even if it is normalised, a negative or small reward indicates a lost point or a bad action, and that action should be discouraged. And since the optimization step minimizes this loss, doesn't it seem to encourage bad actions?

  2. But when they multiply the reward by the cross_entropy, won't a negative reward (a loss) turn the cross-entropy term negative? And by minimizing this, wouldn't they actually encourage the algorithm to lose? I notice in the slides that they write loss = -R( … ), but I can't see this reflected in the code?

  3. I can't quite get the loss function to work with TF 2.0:

    loss = tf.reduce_sum(R * cross_entropies)
    model.compile(optimizer="Adam", loss=loss, metrics=['accuracy'])

    TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.

    Anyone got some advice? Thanks!

  4. I must say something about the math. There are two ways of teaching ML:
    1. Require the underlying math as a prerequisite for your audience.
    2. Explain the needed math during the lecture.

    Ignoring the math, or showing it without explaining it carefully and in detail, is not helpful.
    I'm a 3rd-year college student and this is not a clear lecture for me.
    All I got is that TensorFlow can do reinforcement learning with NNs, and that we use softmax in the last layer.

    What I'm missing is a full understanding of the pipeline/graph and the derivation part.

  5. My parents' scarf store in San Francisco was just about to go bankrupt because nobody buys scarfs in SF. Then Martin Gorner moved there, and the business is thriving again!
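The sign question raised in the first two comments can be checked numerically. Below is a minimal plain-Python sketch (not the session's code; the three-way softmax and the learning rate are illustrative assumptions): the gradient of the cross-entropy with respect to the logits is p − one_hot(action), so multiplying it by a negative reward flips the update direction, making the penalised action *less* likely even though the reward is passed in un-negated.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def step(logits, action, reward, lr=0.5):
    """One gradient-descent step on loss = reward * cross_entropy(action).

    d(cross_entropy)/d(logit_i) = p_i - 1[i == action], so the sign of
    the reward decides whether the action is encouraged or discouraged.
    """
    p = softmax(logits)
    return [l - lr * reward * (p[i] - (1.0 if i == action else 0.0))
            for i, l in enumerate(logits)]

logits = [0.0, 0.0, 0.0]                        # three moves, uniform policy
rewarded = step(logits, action=0, reward=+1.0)  # action that won a point
penalised = step(logits, action=0, reward=-1.0) # action that lost a point

# Positive reward raises P(action); negative reward lowers it, so
# minimizing reward * cross_entropy does discourage bad actions.
assert softmax(rewarded)[0] > 1 / 3
assert softmax(penalised)[0] < 1 / 3
```

In other words, the negation the slides show as loss = -R(…) is implicit: for a good (positive) reward the optimizer lowers the cross-entropy of the chosen action, and for a bad (negative) reward it raises it.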

