There is a Japanese word, tsundoku (積ん読), which means buying and keeping a growing collection of books, even though you don’t really read them all.
FuzzyWuzzy is a Python library used for measuring the similarity between two strings. Here’s how you can start using it too.
Unsupervised Learning has been called the closest thing we have to “actual” Artificial Intelligence, in the sense of General AI.
K-Means Clustering is one of its simplest yet most powerful applications.
When doing Data Analysis, curiosity and intuition are two of a Data Scientist’s most powerful tools. The third one may be Pandas.
Today we’ll use Python’s Pandas library for Data Analysis, and Seaborn for Data Visualization. Sometimes when facing a Data problem, we must first dive into the Dataset and learn about its properties and its variables’ distributions: we need to immerse ourselves in the domain.
Sometimes Kaggle is not enough, and you need to generate your own Dataset.
Sometimes you open a big Dataset with Python’s Pandas, try to get a few metrics, and the whole thing just freezes horribly. Dask DataFrames may solve your problem.