I recommend checking this series out. It offers an insightful and more intuitive explanation of linear algebra concepts, with beautiful visualisations too.
I've heard it suggested that probabilistic programming will become bigger in the future as it develops further, although given its Bayesian emphasis I suspect it'll work best with smaller data sets and domain experts encoding priors. I have heard of people building hierarchical models that use domain expertise to help determine priors, although it'd be nice to see more examples of this work in order to learn the process.
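To make the "encoding priors" idea concrete, here's a minimal sketch of the simplest version of it: a conjugate Beta-Binomial update, where a hypothetical domain expert's belief ("rates are around 5%") is encoded as an informative prior and then updated with a small data set. The specific numbers are made up for illustration; a full hierarchical model would additionally share a learned prior across groups.

```python
# Minimal sketch: a domain expert's belief that a conversion rate is
# around 5% is encoded as a Beta(2, 38) prior (mean 2/40 = 0.05),
# then updated with a small data set via the conjugate Beta-Binomial rule.

def posterior(alpha: float, beta: float, successes: int, trials: int):
    """Return Beta posterior parameters after observing the data."""
    return alpha + successes, beta + (trials - successes)

# Expert prior: Beta(2, 38), prior mean 0.05
alpha0, beta0 = 2.0, 38.0

# Small data set: 3 conversions out of 20 trials (the MLE alone says 0.15)
a, b = posterior(alpha0, beta0, successes=3, trials=20)

# Posterior mean is shrunk from the MLE toward the expert's prior
print(a / (a + b))
```

With only 20 observations the posterior mean sits between the prior (0.05) and the raw estimate (0.15), which is exactly why this style works well on small data: the expert's knowledge does real work instead of being washed out.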
As others have echoed, this is a fairly good representation (get it?) of the history of AI up until the time this introduction was written. In particular, I enjoyed the emphasis on representation learning and how the representation of the data can greatly affect the results. I have experienced this personally in nearly every data project I've worked on: small tweaks to the representation of the data can often lead to drastic improvements in performance.
On that note, however, I wonder if greater abstraction is possible: if deep learning's strength is 'automatic' representation learning, and greater advances come from changing the structure of the network itself to better represent the data (CNNs, RNNs), is the next step abstracting that process too?
I have seen a few ideas related to automated neural architecture search, though I was wondering whether any of you have seen particularly promising results.
Interested. Thanks for organising.