Wednesday, July 10, 2013

Introduction to Data Science books and courses

    I was asked about books and courses that help with getting started in Data Science (Data Mining, Machine Learning or Data Analysis).
   My main toolchain is Python, NumPy/SciPy/Pandas/Scikit-learn, Hadoop and MRJob. Based on this, I put together a list of books that are good to start with:


Learning Python. Mark Lutz.
A book for learning Python before jumping into data science.

Python for Data Analysis. Wes McKinney.
Book from the author of the pandas module. Great for learning how to do descriptive stats with Python.

Programming Collective Intelligence. Toby Segaran.
Introduction to writing your own machine learning algorithms in Python.

Machine Learning in Action. Peter Harrington.
k-nearest neighbors, naive Bayes, SVM and decision trees, with examples in Python.


Definitive guide from one of the early contributors to the Hadoop source code, a person with vast experience working with it.

R & Stats

Data Analysis with Open Source Tools. Philipp K. Janert.
Sometimes Python is just not enough, and this book will help you start working with R.

Think Stats. Allen B. Downey.
If you are coming from a Computer Science major, you had better get this book about probability theory and stats.

Good read on Data Science

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die. Eric Siegel.
Good read on the philosophy of predictive analytics, with examples of real-world tasks that people have solved with it.


  • Introduction to Data Science - Good introduction to all the main concepts that a data scientist should know (SQL, NoSQL, Hadoop, R, machine learning algorithms, visualization, etc.).
  • Computing for Data Analysis - Course about learning R and solving real problems with it.
  • Machine Learning - Basics of machine learning from Andrew Ng (co-founder of Coursera and Director of the AI Lab at Stanford).
  • Computational Investment - Course that teaches how to build a trading robot for the stock exchange in Python, using all the tools that a data scientist uses (treat it as a set of practical examples).

    This list of books and courses will be updated when I find something worth reading or watching on this topic. If you know a good book that I should add to this list, please let me know.

