2016 Review | Pavel Surmenok

I treat this blog as my lab journal. I will keep posting random thoughts along with better-written articles on particular topics. This post is an attempt to summarize my activity in some areas in 2016 (see the overview of ML and AI in 2016 in general in my previous post).

I’ve been working on data-related things for the last two years. 2015 was a year of data: SQL Server, ETL, data architecture, and data integrity at JustAnswer, plus playing with machine learning at home.

Last year was still about data, but the emphasis shifted a bit. In 2016 I spent most of my time building chatbot software and making it smarter by applying machine learning to natural language understanding.

These two things, conversational interfaces (chatbots) and natural language understanding, became the main focus of my research. I also set a couple of more concrete goals:

Get two deep learning projects running in production.

Almost done. Two machine learning models are in production, but they are not quite deep learning. One of them uses a neural network, but with just one hidden layer. It was an important lesson: simple algorithms often outperform more complex models, especially if the problem is simple and the dataset is small. I started paying more attention to simpler models after reading a paper about character-level convolutional neural networks. See this blog post.

Understand how to build machine intelligence systems which contain multiple parts, e.g. multiple ML models or ML models combined with rule-based models.

I built some theoretical understanding of this. See my articles about the intelligence platform stack, chatbot architecture, and NLP for chatbots. I didn’t finish validating all these ideas. It is a topic for ongoing research. I expect to post more about it in 2017.

A good strategy should include things to do and things NOT to do. My 2016 strategy told me to stay away from the architecture and development of systems that are not related to chatbots, ML, or data. I did this part well.

I wrote nine new blog posts in 2016, the same as in 2015. I started cross-posting to Medium and to some Medium publications like HackerNoon. Thanks to Medium, Twitter, Facebook, and Hacker News, new blog posts get more attention now. Most popular posts in 2016:

Reading is important. I read or started reading a few books in 2016. AI-related books:

TensorFlow For Machine Intelligence. A comprehensive (271 pages) introduction to using TensorFlow for Machine Learning.
Machine Learning Yearning by Andrew Ng. The book is not finished yet, but Andrew published drafts of a few chapters for those who signed up on the website. This is very valuable and unique content that you won’t find anywhere else. Andrew focuses on tips and tricks for applying machine learning algorithms to real problems: how to split the dataset, how to analyze errors, and how to decide what to do to improve the model (get more training data, get more validation data, use a larger model, etc.).
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos. An overview of the 5 major schools of machine learning: symbolists, connectionists, evolutionaries, Bayesians, and analogizers. There is no math there, and no deep dive into any algorithm in particular. It is a good reminder that there is something else besides neural networks, and that the Master Algorithm may need to combine ideas from a few different schools of machine learning.
Superintelligence: Paths, Dangers, Strategies by Nick Bostrom. I heard a lot of criticism of this book from machine learning experts, so I avoided reading it for a while. But after reading this post by Yoshua Bengio, I decided to read the book to understand the context of the discussion. Neil Lawrence’s commentary should be as interesting as the book itself.

Management-related:

Good Strategy Bad Strategy: The Difference and Why It Matters. This book was interesting to read, and it was useful for me to think about strategy when I was building a Machine Learning roadmap at JustAnswer.
Under New Management: How Leading Organizations Are Upending Business as Usual by David Burkus. David describes a few innovative management ideas: put customers second (and employees first), make salaries transparent, close open offices, ditch performance appraisals, use an unlimited vacation policy, etc. I can agree with some of them, and some are controversial. It’s definitely worth thinking about the pros and cons of these policies.
Why Greatness Cannot Be Planned: The Myth of the Objective. An interesting read that challenges the myth of objectives. Often it is not efficient to have an objective if you don’t have a concrete path to reach it. Innovation is not driven by focused effort. It is better to embrace serendipitous discovery.

Science fiction:

Snow Crash by Neal Stephenson. I read it a few years ago in Russian translation; this time I decided to re-read it in the original. After all, it is one of the most influential science fiction books.
Oceanic by Greg Egan. Greg Egan writes the hardest science fiction I have ever seen. Mathematical and quantum ontology themes.
Glasshouse by Charles Stross. I would classify it as post-singularity science fiction, picturing the world after the technological singularity.
The Atrocity Archives by Charles Stross. This is the first book in The Laundry Files series. It is quite entertaining, but I feel that it is more of a spy thriller than science fiction.

Other:

Transcend: Nine Steps to Living Well Forever by Ray Kurzweil. Ray Kurzweil summarizes the state of research on longevity and provides recommendations ranging from diet and exercise to drugs and supplements. As immortality is one of my long-term goals, this book was quite interesting to me.
Prisoners of Geography: Ten Maps That Explain Everything About the World. The author uses geography to explain many events happening in the world.

Shorter reads: 1500+ articles on Medium and countless articles from other sources.

My favorite ML/AI podcasts in 2016: Talking Machines, TechEmergence, TWIMLAI.

How was your 2016?