Unsupervised Learning has been called the closest thing we have to “actual” Artificial Intelligence, in the sense of General AI.
K-Means Clustering is one of its simplest, yet most powerful, applications.
When doing Statistical Analysis, curiosity and intuition are two of a Data Scientist’s most powerful tools. The third one may be Pandas.
Today we’ll leverage Python’s Pandas library for Data Analysis, and Seaborn for Data Visualization. Sometimes when facing a Data problem, we must first dive into the Dataset and learn about it: its properties and its variables’ distributions. In other words, we need to immerse ourselves in the domain.
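As a rough idea of what that first dive can look like, here is a minimal sketch using Pandas alone. The tiny dataset and its column names (`height_cm`, `weight_kg`, `group`) are made up for illustration and are not taken from any real project.

```python
import pandas as pd

# Hypothetical toy dataset, invented purely for illustration.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "weight_kg": [50.0, 58.0, 69.0, 80.0, 95.0],
    "group": ["a", "a", "b", "b", "b"],
})

# Properties: size, column types and missing values.
print(df.shape)
print(df.dtypes)
print(df.isna().sum())

# Distributions of the numeric variables (count, mean, std, quartiles).
summary = df.describe()
print(summary)

# How often each category appears.
group_counts = df["group"].value_counts()
print(group_counts)

# Pairwise linear correlation between the numeric columns.
corr = df[["height_cm", "weight_kg"]].corr()
print(corr)
```

From here, a Seaborn call such as `histplot` or `pairplot` on the same DataFrame would show those distributions visually.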
As a Data Scientist, I spend about a third of my time looking at data and trying to get meaningful insights from it, a discipline some call exploratory data analysis. Today we will look at the two tools I use the most for it, closely following the code I uploaded to this GitHub project. One is Jupyter Notebooks, and the other is a Python library called Pandas.