Learning From Data - Course Review

Jun 03, 2012

A must-have online course for beginners in machine learning.

Machine learning students and practitioners looking for a solid foundation on the subject have probably heard about Learning From Data, a real Caltech course taught by Professor Yaser Abu-Mostafa and broadcast live, for free.

As a contribution to this wonderful initiative, I gather here my impressions regarding some aspects of the course.

Prerequisites

Basic probability, matrices, and calculus. According to the course website, this is all you need to know. I would add programming to that list though, since it’s necessary to write some tricky programs in order to answer some of the questions.

Broadcast

The Ustream platform was chosen to broadcast the course and, apart from some small glitches, it worked nicely the times I tried it. This is how I’d watch the lectures if I lived somewhere with a more favorable time zone (2:30 P.M. to 3:30 P.M. in Brazil).

Material

Lucky for us, the lecture videos were made available in several formats – in low and high quality versions – just a couple of days after their broadcast. The recorded Ustream session was accessible shortly after each lecture. The lectures were also posted on YouTube and in the iTunes U course app for iOS devices. This was really impressive.

I liked the quality of the slides, which were made available in PDF format. All pictures, plots, and equations are sharp and easy to read. Another plus: the notation chosen to explain each concept and algorithm matches the notation adopted in popular articles on the same subject, for example, on Wikipedia. This makes things easy for those who seek to enrich their knowledge even further.

I ordered the textbook for the course, but unfortunately it didn’t arrive in time for this post. However, according to the few reviews on Amazon and the feedback given in the course forum, it does its job pretty well.

As the course website states, this really isn’t a watered-down course, and I’m glad for that.

The subjects I enjoyed the most: VC Theory (Growth Function, VC Dimension), Regularization, Validation, and Support Vector Machines.

Professors

I must say the course teacher, Professor Yaser Abu-Mostafa, is one of the most talented teachers I’ve ever had, online or otherwise. He truly masters the subject and knows what is relevant and what is noise when explaining something.

Another impressive thing was how he was available in the Q&A forum, together with the Associate Professors, answering students’ questions and collecting feedback in order to keep the course at its highest level. Associate Professor Malik Magdon-Ismail also helped a lot in some discussions by writing some real gems. Some of his posts could be turned into nice articles very easily.

Assignments

I enjoyed every single one of them. They were thorough, challenging, and fun to work on. Some of them made me “spend” the entire weekend in order to answer all the questions! Well, I wasn’t looking for an easy time anyway. :-)

In fact, I would stop using the word homework because what we did at the end of each week was actually an experiment. We played with every aspect involved in the design and implementation of simple machine learning systems, either by implementing everything from scratch or by using third-party packages.

The ambiguity present in several answers (intentional or not) made me more skeptical about what was coming out of my programs. It also encouraged students to engage in enlightening discussions in the Q&A forum.

To be honest, I don’t like vBulletin, the platform chosen for the Q&A forum. I think even more users would have joined the discussions if a friendlier (and reward-oriented) option had been chosen instead; some Stack Overflow clone would have been better.

Still, I’m impressed by the quality of the discussions started there. Thanks to the very gifted and dedicated colleagues who helped me clear my doubts – and even change my mind. I owe you! :-)

My Solutions

I chose Octave to be my programming language throughout the course, and this choice proved to be the right one. Some questions involved heavy matrix-based calculations and quadratic programming problems, and this is where Octave shines.

Unfortunately, I’m not allowed to post my solutions online because of the honor code, but ask me if you want to discuss any question in particular.

Update (Jan 20, 2014): I participated in the new edition of this course on edX so I could earn a certificate as a reward for my hard work. In this edition, the course staff encouraged students to discuss their solutions after each week’s deadline. My solutions are available here.