Description
Data scientists spend only 20 percent of their time building machine learning algorithms and 80 percent of their time finding, cleaning, and reorganizing huge amounts of data. That mostly happens because many rely on graphical tools such as Excel to process their data. However, if you use a programming language such as Python, you can drastically reduce the time it takes to process your data and get it ready for use in your project. This course will show how Python can be used to manage, clean, and organize huge amounts of data.
This course assumes you have basic knowledge of variables, functions, for loops, and conditionals. In the course you will be given access to a million records of raw historical weather data, and you will use Python in every single step of working with that dataset. That includes learning how to use Python to batch download and extract the data files, load thousands of files via pandas, clean the data, concatenate and join data from different sources, convert between fields, aggregate, apply conditions, and perform many more data processing operations. On top of that, you will also learn how to calculate statistics and visualize the final data. The course also includes a series of exercises in which you will be given sample data and practice what you have learned by cleaning and reorganizing that data using Python.
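To give a feel for the kind of workflow described above, here is a minimal sketch of loading several files with pandas, cleaning them, concatenating them, and aggregating the result. It is not taken from the course material. The file names, the `date` and `temp_c` columns, the `-9999` missing-value marker, and the helper functions are all hypothetical, and the "files" are small in-memory strings so the example runs without any downloads.

```python
import io

import pandas as pd

# Hypothetical stand-ins for raw weather files. A real workflow would
# read paths from disk instead, for example with glob.glob("data/*.csv").
RAW_FILES = {
    "station_a.csv": "date,temp_c\n2018-01-01,1.5\n2018-01-02,-9999\n2018-01-03,3.5\n",
    "station_b.csv": "date,temp_c\n2018-01-01,10.0\n2018-01-02,12.0\n",
}


def load_and_clean(files, missing_marker=-9999):
    """Load each CSV, tag rows with their station, and combine into one frame."""
    frames = []
    for name, text in files.items():
        # Assumption: missing readings are encoded with a sentinel value,
        # which read_csv converts to NaN through na_values.
        df = pd.read_csv(
            io.StringIO(text),
            parse_dates=["date"],
            na_values=[str(missing_marker)],
        )
        # Derive a station label from the file name (drop the extension).
        df["station"] = name.rsplit(".", 1)[0]
        frames.append(df)

    # Stack all per-file frames into a single table.
    combined = pd.concat(frames, ignore_index=True)

    # Drop rows whose temperature reading is missing.
    return combined.dropna(subset=["temp_c"])


def mean_temp_by_station(df):
    """Aggregate: average temperature per station."""
    return df.groupby("station")["temp_c"].mean()


weather = load_and_clean(RAW_FILES)
summary = mean_temp_by_station(weather)
print(summary)
```

The same pattern scales to thousands of files by swapping the in-memory dictionary for a list of real file paths; only the loop input changes, while the concatenation and aggregation steps stay the same.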
Who this course is for:
Those who come from any technology field that deals with any kind of data.
Those who want to leverage the power of the Python programming language for handling data.
Those who need to learn Python basics and want to quickly advance their skills by learning how to perform data cleaning, analysis, and visualization with Python, all in a single course.
Those who want to switch from programming languages such as Java, C, R, Matlab, etc. to Python.
Requirements
A working computer (Windows, Mac, or Linux)
No prior knowledge of Python is required
Last updated 4/2018