With a global pandemic going on and all of us stuck at home, now is a great time to lock yourself away from the world and catch up on reading. While I enjoy many different genres of books, the ones I find most productive are the few-and-far-between examples that explain statistics and machine learning in spectacular detail. For a lifelong learner, books are a great medium for absorbing a wealth of information, and on certain topics they can even outpace what is available on the internet.
Data Science From Scratch
By Joel Grus
Data Science modules are a great tool to experiment and get familiar with Data Science. The big downside, however, is that it can be incredibly easy to get familiar with the modules rather than the models. What’s great about Data Science From Scratch is that it takes a deep dive into precisely how several industry-standard models work, in a comfortable 330 pages. Though opinions on this book seem to be mixed, it is certainly one I enjoyed a lot, and you might too.
First, the book offers a crash course in Python. While a ‘crash course’ might not be a great way to learn programming, I think Joel’s explanations are certainly apt for anyone coming from another programming language. In addition to the Python course, the book comes with a great rundown of linear algebra for Data Science, as well as statistics and probability, both basic and advanced.
Throughout the book, models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and unsupervised clustering are all discussed in vivid detail and, in my opinion, very well explained. The book finishes with an overview of natural language processing and network analysis, and is definitely one I would recommend!
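To give a feel for what "from scratch" means here, below is a minimal sketch of a k-nearest neighbors classifier written with only the Python standard library. This is my own illustration and not code taken from the book; the function names (`euclidean_distance`, `knn_classify`) and the toy data are assumptions made up for the example.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(k, labeled_points, new_point):
    """Predict a label for new_point by majority vote among its k nearest
    neighbors. labeled_points is a list of (point, label) pairs."""
    # Order the training points from nearest to farthest.
    by_distance = sorted(
        labeled_points,
        key=lambda pair: euclidean_distance(pair[0], new_point),
    )
    # Keep only the labels of the k closest points.
    k_nearest_labels = [label for _, label in by_distance[:k]]
    # Majority vote; on a tie, the label seen first among the neighbors wins.
    return Counter(k_nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two small clusters labeled "a" and "b".
training = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.5, 4.5), "b"),
]

prediction = knn_classify(3, training, (1.1, 1.0))
print(prediction)
```

Writing even a toy version like this makes it clear that the whole model is a sort followed by a vote, which is exactly the kind of understanding that is easy to skip when you only ever call a library.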
An Introduction to Statistical Learning
by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
As opposed to the last book I discussed, An Introduction to Statistical Learning takes an entirely different spin on things, using R instead of Python. This book presents some of the most important modeling and prediction techniques available in R, along with relevant applications for each. Compared to Data Science From Scratch, I would argue that this book is much more beginner-friendly. The book also combines nicely designed color graphics with real-world scenarios for an easy-to-comprehend machine-learning lesson.
Each chapter of this book has a great tutorial on implementing different modules for analysis and modeling in R. Two of these authors, Trevor Hastie and Robert Tibshirani, are also co-writers of our next book.
The Elements of Statistical Learning
by Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Many topics are covered very well in The Elements of Statistical Learning, including graphical models, neural networks, support-vector machines, classification trees, and gradient boosting. The book takes a deep dive into what you might expect, supervised learning, but also covers unsupervised learning. It is widely considered to be one of the first major comprehensive treatments of this topic in any book. Hastie, Tibshirani, and Friedman are all professors of statistics at Stanford University and prominent researchers in the field.
The Cartoon Guide To Statistics
by Larry Gonick and Woollcott Smith
Taking a full 180-degree turn from the previous books is The Cartoon Guide To Statistics, a book which uses beautiful illustrations and simple explanations to teach statistics in a really unique way. Not only did I have a lot of fun reading this book thanks to the sometimes quite hilarious illustrations, but there is also a lot of information inside. Though it might not have quite the statistical depth of the other books on this list, I think it definitely fits in with its charming art style and beginner-friendly nature.
The Algorithm Design Manual
By Steven S. Skiena
Though Data Science is often concerned more with statistics than with writing algorithms, there are many situations where being an algorithm wizard is a significant benefit for any Data Scientist. The Algorithm Design Manual takes a deep dive into creating advanced, complex algorithms from scratch.
What’s great about this book is that it takes genuine, practical experience and condenses it into only four hundred and eighty-six pages. The book is divided into two large parts, the first of which is a broad introduction to algorithm design and analysis. The second part is a reference section, a ‘glossary’ of sorts containing seventy-five of the most important algorithms to be familiar with. This is not only a great book to read once, but also a great one to keep around for reference!
Those were some of my personal favorite books, ranging in genre from statistics to machine learning and algorithm development. Overall, I’d say that these books are especially valuable over the long term, as I often find myself referencing books like Data Science From Scratch and The Algorithm Design Manual.
Fortunately, the world of Data Science literature is nowhere near as sparse as it once was, and new books with more information to digest are being written every day. Though this is definitely a great situation for a lifelong learner, the entire library of machine-learning books can be very daunting to go through. And with that, these are my favorite books about Data Science, machine learning, statistics, and computer science, but I’d be very interested to know what you recommend reading! Is there a particular book that you credit with a significant amount of the knowledge you’ve obtained?