Today I talked about working with missing data at PyCon IL. We started with a bit of theory about mechanisms of missing data –
- MCAR (missing completely at random) – The fact that the data are missing is independent of both the observed and the unobserved data.
- MAR (missing at random) – The fact that the data are missing is systematically related to the observed data but not to the unobserved data.
- MNAR (missing not at random) – The fact that the data are missing is systematically related to the unobserved data, i.e. to the missing values themselves.
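
To make the three mechanisms concrete, here is a minimal simulation sketch. It is not the example from the talk: the `age` and `income` columns, the missingness probabilities, and the sample size are all made up for illustration. It only assumes NumPy and pandas, and it removes `income` values under each mechanism so the naive mean of what remains can be compared with the true mean.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: age is always observed; income is the column that goes missing.
age = rng.integers(20, 70, size=n)
income = 30_000 + 500 * age + rng.normal(0, 10_000, size=n)
df = pd.DataFrame({"age": age, "income": income})

# MCAR: every row has the same 20% chance of losing its income value.
mcar_mask = rng.random(n) < 0.2

# MAR: the chance of missingness depends only on the observed age column.
mar_prob = np.where(df["age"].to_numpy() > 50, 0.4, 0.1)
mar_mask = rng.random(n) < mar_prob

# MNAR: the chance of missingness depends on the income value itself.
median_income = df["income"].median()
mnar_prob = np.where(df["income"].to_numpy() > median_income, 0.4, 0.1)
mnar_mask = rng.random(n) < mnar_prob

true_mean = df["income"].mean()
observed_means = {}
for name, mask in {"MCAR": mcar_mask, "MAR": mar_mask, "MNAR": mnar_mask}.items():
    # Series.mask replaces values with NaN wherever the condition is True.
    observed = df["income"].mask(mask)
    # Series.mean skips NaN by default, so this is the naive complete-case mean.
    observed_means[name] = observed.mean()
    print(f"{name}: missing={observed.isna().mean():.1%}, "
          f"mean of observed income={observed_means[name]:,.0f} "
          f"(true mean={true_mean:,.0f})")
```

In this setup the MCAR mean should stay close to the true mean, while the MNAR mean is pulled down because high incomes are dropped more often. The MAR mean is also shifted here, since income was built to depend on age, but that shift is in principle recoverable because age is fully observed.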
We then took a deep dive into an almost real-world example that uses the Python ecosystem: pandas, scikit-learn, and missingno.
My slides are available here and my code is here.
Three related posts I wrote about working with missing data in Python –