According to the Wikipedia definition, data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. Data science is the same concept as data mining and big data: "use the most powerful hardware, the most powerful programming systems, and the most efficient algorithms to solve problems".


As the Venn diagram above shows, we can define Data Science as the intersection of three main fields: 1. Maths and Statistics, 2. Computer Science/IT, 3. Domain and Business Knowledge.

To learn Data Science, you should know the basics of Maths and Statistics and have good knowledge of your domain. Of course, you also need to learn some software frameworks and technologies, such as machine learning libraries (Python scikit-learn, Spark ML, etc.) and programming languages like Python, R, etc.
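To make this a bit more concrete, here is a minimal sketch of how the three ingredients can meet in a few lines of Python. It is only an illustration, not a recommended workflow: the data set (advertising spend vs. units sold) is made up, and the names `ad_spend`, `units_sold`, `fit_line` and `predict` are hypothetical. The statistics part is an ordinary least-squares line fit, the programming part is plain Python from the standard library, and the domain part is knowing which business quantities are worth relating in the first place.

```python
from statistics import mean

# Hypothetical toy data (assumption, not real figures):
# advertising spend in thousands vs. units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 14.0, 16.0, 18.0, 20.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope is the (unnormalised) covariance of x and y divided by
    # the (unnormalised) variance of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line(ad_spend, units_sold)


def predict(x):
    """Predict units sold for a given spend using the fitted line."""
    return slope * x + intercept


print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"predicted units at spend 6.0: {predict(6.0):.1f}")
```

In practice you would usually reach for a library such as scikit-learn instead of writing the fit by hand, but doing it once manually shows where the Maths and Statistics come in.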

If someone asks me how to start learning Data Science, my suggestion would be to follow the steps below:

1. Basic Maths and Statistics

2. A programming language: Python or R

3. Machine Learning

4. Visualisation using dedicated tools or Python/R packages

5. Deep Learning

6. Natural Language Processing (NLP)

7. Artificial Intelligence

8. Computer Vision, Chatbot Development, Internet of Things (IoT), and so on; the list goes on

9. ETL (Extract, Transform, Load) Process/Data Warehousing

10. Big Data

Note: Domain knowledge is a must for solving any business problem.

We will go into more detail on each step in a future blog post.

