
Connections between physics and deep learning



Center for Brains, Minds and Machines (CBMM)

Max Tegmark – MIT



22 thoughts on “Connections between physics and deep learning”
  1. Max is absolutely brilliant, and a scientist of the absolute highest caliber, but his categorization of the different tasks within machine learning is incorrect. Modeling a joint probability p(x, y) is categorically referred to as generative modeling, not unsupervised modeling, which is a different, though potentially overlapping, concept. Classification, correspondingly, is returning a class label for a given input; in standard notation, this is p(y|x). Prediction, or forecasting, is similarly p(x(t)|x(t-1), …, x(1)). Unsupervised learning, by contrast, does not have a conventional notation; it refers to a scheme where a class label y is not fed to the training system. The joint probability he wrote for unsupervised learning actually says nothing about the presence or absence of supervision, unless y is a label, in which case the formalism is just plain wrong.

    I say this because there are lots of students looking at the work of brilliant scientists like Max, and they owe it to the students to have consistent and correct formalism, given that the students may still be learning.
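
    To restate the notation in one place (x for the input, y for the label), a compact summary of the distinctions above:

    ```latex
    \begin{align*}
    \text{generative modeling:}    &\quad p(x, y) \\
    \text{classification:}         &\quad p(y \mid x) \\
    \text{prediction/forecasting:} &\quad p\big(x(t) \mid x(t-1), \dots, x(1)\big)
    \end{align*}
    % Unsupervised learning has no canonical formula of its own; it only means
    % that the labels y are never shown to the training procedure.
    ```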

  2. I personally see a similarity between physics and deep learning in the way the world is made of encapsulated layers of reality. For example, as shown in thermodynamics, the macroscopic layer doesn't need to know the position and velocity of every particle. It only needs to know certain computed features like temperature and pressure.
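
    A minimal sketch of that coarse-graining, assuming an ideal monatomic gas (the particle mass and counts are purely illustrative): the microstate is every velocity, while the macroscopic layer keeps only a couple of computed features.

    ```python
    import numpy as np

    # Coarse-graining sketch: the microstate is every particle velocity;
    # the macroscopic description keeps only computed features
    # (temperature, pressure). Assumes an ideal monatomic gas.
    k_B = 1.380649e-23            # Boltzmann constant, J/K
    m = 6.6e-26                   # particle mass, kg (roughly argon)
    N, volume = 100_000, 1e-3     # particle count, container volume in m^3

    rng = np.random.default_rng(0)
    velocities = rng.normal(scale=400.0, size=(N, 3))   # microstate, m/s

    mean_ke = 0.5 * m * np.mean(np.sum(velocities**2, axis=1))
    temperature = 2.0 * mean_ke / (3.0 * k_B)   # equipartition: <KE> = (3/2) k_B T
    pressure = N * k_B * temperature / volume   # ideal gas law

    print(f"T ≈ {temperature:.0f} K, P ≈ {pressure:.2e} Pa")
    ```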

  3. Max is a bit 'late', for neural networks' compression-bound nature has been known for quite a while:

    https://www.quora.com/How-are-hidden-Markov-models-related-to-deep-neural-networks/answer/Jordan-Bennett-9

    That said, we need to subsume larger problems, including Marcus Hutter's temporal-difference-aligned lemma, via hints from quantum mechanics, deep reinforcement learning (particularly the DeepMind flavour) and causal learning (i.e. uetorch):

    http://www.academia.edu/25733790/Causal_Neural_Paradox_Thought_Curvature_Quite_the_transient_naive_hypothesis

    A code sample that initializes the confluence of the temporal-difference regime around the causal horizon:

    https://github.com/JordanMicahBennett/God

  4. I think it's a fascinating summary of the tie between the power of neural networks / deep learning and the peculiar physics of our universe. The mystery of why they work so well may be resolved by seeing the resonant homology across the information-accumulating substrate of our universe, from the base simplicity of our physics to the constrained nature of the evolved and grown artifacts all around us. The data in our natural world is the product of a hierarchy of iterative algorithms, and the computational simplification embedded within a deep learning network is also a hierarchy of iteration. Since neural networks are symbolic abstractions of how the human cortex works, perhaps it should not be a surprise that the brain has evolved structures that are computationally tuned to tease apart the complexity of our world.

    When he says "efficient deep networks cannot be accurately approximated by shallow ones without efficiency loss," it reminds me of something I wrote in 2006: "Stephen Wolfram’s theory of computational equivalence suggests that simple, formulaic shortcuts for understanding evolution (and neural networks) may never be discovered. We can only run the iterative algorithm forward to see the results, and the various computational steps cannot be skipped. Thus, if we evolve a complex system, it is a black box defined by its interfaces. We cannot easily apply our design intuition to the improvement of its inner workings. We can’t even partition its subsystems without a serious effort at reverse-engineering." — from https://www.technologyreview.com/s/406033/technology-design-or-evolution/

  5. I am having difficulty understanding step 11 of the paper, in which Max goes from the Taylor series expansion form of the activation function to the multiplication approximator. Does anyone know of a more detailed explanation of this?
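
    My reading of that step, for what it's worth: Taylor-expand a smooth activation σ around 0 (it needs σ''(0) ≠ 0); in the four-term combination below, the constant and linear terms cancel and only the uv cross term survives, up to O(ε²) corrections. A quick numerical sketch, assuming σ = softplus (so σ''(0) = 1/4), which is not necessarily the activation used in the paper:

    ```python
    import numpy as np

    # Multiplication from a smooth nonlinearity via Taylor expansion:
    # sigma(a) + sigma(-a) - sigma(b) - sigma(-b), with a = eps*(u+v) and
    # b = eps*(u-v), cancels the constant and linear terms and leaves
    # 4*sigma''(0)*eps^2*u*v plus O(eps^4) corrections.
    def softplus(x):
        return np.log1p(np.exp(x))

    def approx_multiply(u, v, eps=1e-2, sigma=softplus, sigma_pp0=0.25):
        a, b = eps * (u + v), eps * (u - v)
        return (sigma(a) + sigma(-a) - sigma(b) - sigma(-b)) / (4.0 * sigma_pp0 * eps**2)

    u, v = 1.7, -2.3
    print(approx_multiply(u, v), u * v)   # ≈ -3.908 vs exact -3.91
    ```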

  6. The interleaving of linear evolution and non-linear functions is also how quantum mechanics works:
    1. The propagation step is perfectly linear, conservative, unitary, non-local and time-reversible. It is a continuous wave with complex amplitude specified by the Schrödinger equation. There is no loss of information. There are no localized particles in this step. There is no space in this step.
    2. The interaction step is discrete, non-linear, local and time-irreversible. It is a selection/generation/collapse of alternatives based on the Born Rule. There is a loss of information, as complex values are added and amplitudes squared to give non-negative real probabilities. The result is an interaction, the creation of space-time intervals from the previous interactions, identification of localized entities which might be called particles, and some outgoing waves that are correlated (entangled). Go to 1.

    Einstein complained that the non-locality of QM was "Spooky action at a distance", but in the Quantum Gravity upgrade, space is only created by interaction, so it becomes "Spooky distance at an action".
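
    The deep-learning side of that parallel is literally the same alternation; a minimal sketch with arbitrary weights (nothing here is taken from the talk):

    ```python
    import numpy as np

    # A feed-forward pass interleaves linear evolution (matrix multiplies)
    # with non-linear steps (pointwise activations), echoing the
    # propagate-then-interact alternation described above.
    rng = np.random.default_rng(1)
    sizes = [8, 16, 16, 4]
    weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]

    def forward(x):
        for W in weights[:-1]:
            x = np.tanh(W @ x)     # linear step, then non-linear step
        return weights[-1] @ x     # final linear read-out

    print(forward(rng.normal(size=sizes[0])))
    ```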

  7. In the beginning he describes classification as the probability of the pixel data given a label y, but then he shows a convnet classifying with the probability of a label given the pixel data (the usual formulation). Which is the correct way to look at this?
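
    Both views are legitimate, and Bayes' theorem is the bridge: a generative model of the pixels given the label, together with a prior over labels, determines the usual classifier. Presumably that is the point of writing it generatively first.

    ```latex
    % Bayes' theorem connecting the two views of classification:
    p(y \mid x) \;=\; \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}
    ```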

