5 interesting things (8/6/2020)

DBScan practitioners guide – DBScan is a density-based clustering method. One important advantage comparing to K-means is DBScan’s ability to identify noise and outliers. I feel that  DBScan is often under-estimated. See this guide to learn more on how DBScan works, how to choose hyperparameters and more.

https://towardsdatascience.com/a-practical-guide-to-dbscan-method-d4ec5ab2bc99

Fourier Transform for Data Science – When I was in undergrad school I learned FFT out of context, it was just an algorithm in the textbook and I didn’t understand what it was good for. Later I was asked about it in an oral in grad school and was able to mumble something. Much later I tried to pull some analysis on ECG waves and then I finally understood what it was about.Read this post if you want to demystify Fourier transform. 

https://medium.com/swlh/fourier-transformation-for-a-data-scientist-1f3731115097

Bonus – OpenCV tutorial on Fourier Transform

https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_transforms/py_fourier_transform/py_fourier_transform.html

Dataset shift – dataset shift happens when the test set and the train set come from different distributions. There are multiple expressions of this phenomenon, such as covariate shift, concept shift, prior distribution shift. I believe that every data scientist working in the industry came across at least one of those manifestations. This post provides a very good introduction to the topic and useful links if you want to delve.

https://towardsdatascience.com/understanding-dataset-shift-f2a5a262a766

A Practical Framework for AI Adoption: A Five-Step Process – having several years of experience as a data scientist I have noticed that data products are often not deployed, do not meet stakeholders’ expectations, not used as the data scientist intended, etc. This post introduces a framework that tries to remedy some of those problems.

https://datascientia.blog/2020/05/23/framework-for-ai-adoption-a-five-step-process/

Diagrams as code – a code-based tool to draw system diagrams, possibly easier than fighting draw.io. It contains many icons including several cloud providers (AWS, GCP, Azure, etc). common servicers (K8S, Elastic, spark), etc. All in all, seems very promising.

https://diagrams.mingrammer.com/

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s