Data Science Trends

ICML 2017 – Part One

Originally published 22/08/2017 as part of the Bulletproof Technical Blog

Machine Learning Hits Sydney

The 34th International Conference on Machine Learning was held, if you can believe it, in Sydney at the ICC in Darling Harbour from Sunday 6th August through to Friday the 11th. I knew this was on but assumed that, as is so often the case, it’d be over in California somewhere. But this time we got it, and believe me, there are worse places to spend a working week in winter…

Looking at the schedule, there was obviously going to be a lot of interest in the buzz phrase of the year, “Deep Learning”. However, it was also clear that other topics, just as important in Machine Learning, weren’t going to be left out. The week ahead promised Information Theory, Monte Carlo Methods, Bayesian Optimisation, Gaussian Processes, Reinforcement Learning, Causal Learning, Kernel Methods, Large Scale Learning and more. I would need several clones to listen to all the talks that interested me and several months to digest it all.

Tutorials, Talks and Tea Breaks

The conference started on Sunday with a series of tutorials. These weren’t tutorials in the sense that you actually do anything, but rather three-hour in-depth talks on a particular paper or series of papers (with a break in the middle for some fairly pedestrian coffee).

The first one I went to was called “Recent Advances in Stochastic Convex and Non-Convex Optimization” straight out of Microsoft Research at Redmond. This was an in-depth look at methods for speeding up Stochastic Gradient Descent (the training backbone of Deep Neural Networks or DNNs). Having only ever attended industry conferences I wasn’t clear what to expect of an academic one but the first few slides of the presentation gave me a pretty good idea.
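For anyone who hasn’t met SGD before, here is a minimal sketch of the idea. It is my own toy example, not anything taken from the tutorial: the function name `sgd_linear_fit`, the made-up data and the step sizes are all assumptions for illustration. It fits a straight line one sample at a time, with an optional momentum term, which is one textbook way of speeding SGD up (I’m not claiming it is the method the tutorial focused on).

```python
import random

def sgd_linear_fit(data, lr=0.01, momentum=0.0, epochs=200, seed=0):
    """Fit y = w*x + b by stochastic gradient descent on squared error.

    Hypothetical helper for illustration only. With momentum > 0 this is
    the classic "heavy ball" update; with momentum = 0 it is plain SGD.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    vw, vb = 0.0, 0.0              # velocity terms, inert when momentum == 0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)       # visit the samples in a fresh random order
        for x, y in samples:
            err = (w * x + b) - y  # prediction error on this single sample
            gw, gb = 2 * err * x, 2 * err   # gradient of err**2 w.r.t. w and b
            vw = momentum * vw - lr * gw
            vb = momentum * vb - lr * gb
            w += vw
            b += vb
    return w, b

# Toy, noise-free data drawn from y = 3x + 1 (invented for this sketch).
data = [(k / 10, 3 * (k / 10) + 1) for k in range(-10, 11)]
w, b = sgd_linear_fit(data, momentum=0.9)
print(f"w = {w:.4f}, b = {b:.4f}")
```

The only difference between the two variants is the velocity term: each step reuses a fraction of the previous step, which tends to damp the zig-zagging you get from noisy single-sample gradients.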

I followed this auspicious start with “Robustness Meets Algorithms (and Vice-Versa)”, which was interesting in that it focused on the algorithmic stability and statistical validity issues that arise when machine learning models are deployed in the real world at scale.

Outside in the main hall the sponsors were setting up booths for the week – these included NVIDIA, Facebook, Netflix, Google Research, Uber and AWS. It was a unique opportunity to meet some of the finest minds in the industry, to have some (really out there) discussions and see some demos of the cutting edge work these companies are doing.

NVIDIA was showing off their DIGITS framework running on the latest K80 GPU (Graphics Processing Unit) cards – in this case, a highly optimized DNN that lets them drive through carparks at full speed, detecting every car in detail and in real time (including number plates).

Google Research, Facebook, Netflix and AWS were also busy with programs for academic assistance, industry research roadmaps and generally contributing to our national brain drain with exciting job opportunities in a wide range of other countries.

Collapsing Black Holes

Bernhard Schölkopf from the Max Planck Institute for Intelligent Systems gave the opening talk on Causal Learning. This is an approach to Machine Learning that leverages causal domain knowledge, transfer learning and semi-supervised learning in addition to the underlying statistical models, resulting in predictions that are more robust in real world scenarios.

He’s an ex-physicist, so some of the best examples of applying these techniques came from cosmology. I sat there in disbelief as he explained how they were able to detect the gravitational waves released when two black holes, each of around 30 solar masses, collapsed into each other a billion light years away.

The signal, detected in 2015 and lasting only 0.2 seconds, revealed that as they rotated around each other at 250 times per second in their final death spiral, the power released peaked on the order of 10^50 Watts. For reference, that is more than the combined power of all the light radiated by every star in the observable universe.

Frankly I don’t care whether they used Machine Learning for this or not – it was the largest known event in the history of the universe since the Big Bang itself.

What a start to the week.

Short, Sharp and Sweet

The next few days consisted of nine parallel streams where researchers presented papers in 20 minute time slots. Even when broken up into three sessions a day, this was a lot to take in. Many of the topics were fairly complex and required a lot of background in Machine Learning so without going into unnecessary detail, here’s just a small selection of some of the more interesting presentations:

  • “Analytical Guarantees on Numerical Precision of Deep Neural Networks”
  • “Deep Tensor Convolution on Multicores”
  • “Failures of Gradient-Based Deep Learning”
  • “How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices?”
  • “On Calibration of Modern Neural Networks”

My personal award for the best paper and title of the week goes to a talk from the Game Theory and Multi-Agents stream called “Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability”. This research explores an approach that allows a group of autonomous agents (for example a fleet of robots with limited communication between them) to learn optimal policies for achieving tasks even when they cannot fully observe their environment, and despite possible interference by their teammates.

I kept a Google doc running throughout these sessions, compiling lists of phrases, nouns, people, companies and technology stacks to look up and understand later. It was really growing rather long…

On Wednesday night there were drinks and nibblies in the main hall after the sessions had completed. There was a wide mix of some very smart people from all over the world and I spent much of the evening arguing about the quality of Australian beers (and sampling without replacement) with a Norwegian mathematician who swears he’s going to ski to ICML 2018 in Stockholm.

Straight after that I had a long talk with one of the head researchers from Google Brain on their work using genetic algorithms to evolve neural network topologies for image classifiers. All in all it was great fun and I think I may have even understood a few parts of what he was saying.

At the risk of understating this, these weren’t really everyday conversations. ICML was a nexus of the finest minds in Machine Learning, coupled with research so new that the results in many of the papers were still being finalised.

The first half of the week was a mind-expanding overload. And if the second half was going to be anything like the first, it was time to go home and get some sleep.
