Today we’ll use Python’s Pandas library for Data Analysis, and Seaborn for Data Visualization. Often, when facing a Data problem, we must first dive into the Dataset and learn about it: its properties and its variables’ distributions. We need to immerse ourselves in the domain.
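As a rough idea of what learning about a dataset's properties and distributions can look like in Pandas, here is a minimal sketch. The toy data and the column names (`species`, `weight_kg`) are invented for illustration and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame(
    {
        "species": ["cat", "dog", "dog", "cat", "dog"],
        "weight_kg": [4.0, 20.0, 30.0, 5.0, 25.0],
    }
)

# Column names and their dtypes: a first look at the dataset's properties.
column_types = df.dtypes

# Count, mean, std, min, quartiles and max of the numeric column.
weight_summary = df["weight_kg"].describe()

# How often each category appears.
species_counts = df["species"].value_counts()

# Mean weight per species.
mean_weight_by_species = df.groupby("species")["weight_kg"].mean()

print(column_types)
print(weight_summary)
print(species_counts)
print(mean_weight_by_species)
```

From here, a plotting call such as Seaborn's `histplot` on the same column could show the distribution visually.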
Web Scraping with Scrapy comes into play whenever you need to generate your own dataset. Sometimes Kaggle is not enough.
Sometimes you open a big Dataset with Python’s Pandas, try to get a few metrics, and the whole thing just freezes horribly. Dask DataFrames may solve your problem.
As a Data Scientist, I spend about a third of my time looking at data and trying to get meaningful insights, a discipline some call exploratory data analysis. Today we will look at the two awesome tools I use the most for it, closely following the code I uploaded to this GitHub project. One is Jupyter Notebooks, and the other is a Python library called Pandas.
Whether you’re a Data Scientist, a Web Developer working on an API, or in any other of a long list of roles, chances are you’ll stumble upon Python at some point. If so, expect to run into List Comprehensions. Some of us love Python for its simplicity, fluidity and legibility. Others hate it for not being as performant as C or pure Assembly, for its Duck Typing, or for being single-threaded (ish). No matter what group you belong to, if you’re in…
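For readers who have not met them yet, here is a small sketch of a List Comprehension next to the explicit loop it replaces. The input list is made up for illustration.

```python
# Hypothetical input, invented for illustration.
numbers = [1, 2, 3, 4, 5, 6]

# Explicit loop: collect the squares of the even numbers.
squares_loop = []
for n in numbers:
    if n % 2 == 0:
        squares_loop.append(n * n)

# The same result written as a list comprehension.
squares_comprehension = [n * n for n in numbers if n % 2 == 0]

print(squares_comprehension)
```

Both versions build the same list; the comprehension simply folds the loop, the filter and the append into one expression.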