Other pie chart

This morning I read “20 ideas for better data visualization”. I liked it very much, and I found the 8th idea, “Limit the number of slices displayed in a pie chart”, especially relevant for me. So I jumped into the plotly express code and created a figure of type other_pie which, given a number (n) and a label (other_label), creates a pie chart with n sectors. n-1 of those sectors are the top values according to the `values` column, and the remaining sector is the sum of all the other rows.
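The grouping logic itself can be sketched without touching plotly. The snippet below is not the code from the gist, only a minimal illustration in plain Python; the helper name `top_n_with_other` and the sample data are made up for this example.

```python
def top_n_with_other(names, values, n, other_label="others"):
    """Keep the n-1 largest values and sum everything else into one extra sector.

    Hypothetical helper for illustration; it is not part of plotly.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Rank (name, value) pairs from largest to smallest value
    ranked = sorted(zip(names, values), key=lambda pair: pair[1], reverse=True)
    top, rest = ranked[:n - 1], ranked[n - 1:]
    if rest:
        # Everything outside the top n-1 is collapsed into a single sector
        top.append((other_label, sum(value for _, value in rest)))
    return [name for name, _ in top], [value for _, value in top]


# Made-up sample data: six categories reduced to four sectors
sample_names = ["A", "B", "C", "D", "E", "F"]
sample_values = [10, 50, 5, 40, 20, 30]
sector_names, sector_values = top_n_with_other(sample_names, sample_values, n=4)
```

The two returned lists can then be passed to a regular pie chart, for example `px.pie(names=sector_names, values=sector_values)`.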

A gist of the code can be found here (check here how to build plotly)

I used the following code to generate a standard pie chart and a pie chart with 5 sectors –

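import plotly.express as px

# European countries from the 2007 gapminder data
df = px.data.gapminder().query("year == 2007").query("continent == 'Europe'")
# Represent only large countries: group those under 2 million people into one label
df.loc[df['pop'] < 2.e6, 'country'] = 'Other countries'
# Standard pie chart with one sector per country
pie_fig = px.pie(df, values='pop', names='country', title='Population of European continent')
# other_pie is the custom figure type from the gist above: 4 top countries plus an "others" sector
otherpie_fig = px.other_pie(df, values='pop', names='country', title='Population of European continent', n=5, other_label="others")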

And this is how it looks –

Pie chart
Other pie chart


5 interesting things (21/10/21)

4 Things Tutorials Don’t Tell You About PyPI – this hands-on experience together with the explanations is priceless. Even if you don’t plan to upload a package to PyPI anytime soon those glimpses of how PyPI works are interesting.

https://blog.paoloamoroso.com/2021/09/4-things-tutorials-dont-tell-you-about.html

Responsible Tech Playbook – I totally agree with Martin Fowler’s statement that “Whether asked to or not, we have a duty to ensure our systems don’t degrade our society”. This post promotes the book about Responsible Tech published by Fowler and his colleagues from Thoughtworks. It also references additional resources such as Tarot Cards of Tech and Ethical Explorer.

https://martinfowler.com/articles/2021-responsible-tech-playbook.html

A Perfect Match – A Python 3.10 Brain Teaser – Python 3.10 was released earlier this month and the most talked about feature is Pattern Matching. Read this post to make sure you understand it correctly.

https://medium.com/pragmatic-programmers/a-perfect-match-ef552dd1c1b1

How I got my career back on track – a career is not a miracle. It’s totally OK if you don’t want one, but if you do and you have aspirations, you have to own it and manage your way there.

https://rinaarts.com/how-i-got-my-career-back-on-track

PyCatFlow – A big part of current data is time series data combined with categorical data, e.g., change in the mix of medical diagnoses \ shopping categories over time. PyCatFlow is a visualization tool which allows the representation of temporal developments based on categorical data. Check their Jupyter Notebook with interactive widgets that can be run online.

https://medium.com/@bumatic/pycatflow-visualizing-categorical-data-over-time-b344102bcce2

Growing A Python Developer (2021)

I recently ran into a team lead’s question about how to grow a backend Python developer in her team. Since I have also iterated on this topic with my team, I already had a few ideas in mind.

A few disclaimers before we start. First, I believe that the developer also has a share in the process and should express her interests and aspirations. The team lead or tech lead can give direction and highlight blind spots but does not hold all the responsibility. It is also OK to dive into an idea or a tool that is not required at the moment. They might come in handy in the future and they can inspire you. Second, my view is limited to the areas I work in. Different organizations or products have different needs and focus. Third, build habits to constantly learn and grow – read blogs and books, listen to podcasts, take online or offline courses, watch videos, whatever works for you as long as you keep moving.

Consider the links below as appetizers. Each subject below has many additional resources besides the ones that I posted. Most likely I’m just not familiar with them, so please feel free to add them and I’ll update the post. Some subjects, e.g. cloud, are so broad and product dependent that I didn’t add links at all. Additionally, when using a specific product \ service \ package, read the documentation and make it your superpower. Know the Python standard library well (e.g. itertools, functools, collections, pathlib, etc.); it can save you a lot of time, effort, and bugs.
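To give a taste of what the standard library offers, here is a small sketch that touches each of the four modules mentioned above. The data and the names (`fib`, `report_path`, and so on) are made up for illustration only.

```python
from collections import Counter
from functools import lru_cache
from itertools import count, islice
from pathlib import Path

# collections.Counter: count items without writing the loop yourself
word_counts = Counter("to be or not to be".split())
top_two = word_counts.most_common(2)

# functools.lru_cache: memoize a recursive function with one decorator
@lru_cache(maxsize=None)
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# itertools: take the first items of an infinite sequence lazily
first_three = list(islice(count(start=10, step=5), 3))

# pathlib: build and inspect paths without string concatenation
report_path = Path("reports") / "2021" / "summary.csv"
report_suffix = report_path.suffix
report_stem = report_path.stem
```

None of these lines needs a third-party package, which is part of the appeal.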

General ideas and concepts

  1. Clean code – book, book summary
  2. Design patterns – refactoring book, refactoring guru, python design patterns GitHub repo
  3. Distributed design patterns – Patterns of Distributed Systems
  4. SOLID principles – SOLID coding in Python
  5. Cloud
  6. Deployment – CI\CD, docker, Kubernetes
  7. Version control – git guide
  8. Databases – Using Databases with Python, databases tutorials
  9. Secure Development – Python cheat sheet by Snyk, OWASP

Python specific

  1. Webservices – flask, Django, FastAPI
  2. Testing – Unit Testing in Python — The Basics
  3. Packaging –  Python Packaging User Guide
  4. Data analysis – pandas, NumPy, scikit-learn
  5. Visualization – plotly, matplotlib
  6. Concurrency – Speed Up Your Python Program With Concurrency
  7. Debugging – debugging with PDB, Python debugging in VS Code
  8. Dependency management – Comparison of Pip, Pipenv and Poetry dependency management tools
  9. Type annotation – Type Annotations in Python
  10. Python 3.10 – What’s New in Python 3.10?, Why you can’t switch to Python 3.10 just yet

Additional resources

  1. Podcast.__init__ – The weekly podcast about Python and its use in machine learning and data science.
  2. The real python podcast
  3. Top 8 Python Podcasts You Should Be Listening to
  4. Python 3 module of the week
  5. Lazy programmer – courses on Udemy mainly AI and ML using Python
  6. cloudonaut – podcast and blog about AWS

5 tips to ace coding interview assignments

Nowadays, it is very common practice to give a take-home coding test as part of the interview process. Besides solving the task you are asked to solve, I believe there are a few additional things you can do to impress the reviewers and ace this step of the process.


1. Push the code to a private repository and share it with the reviewers – this gives a twofold advantage. First, it demonstrates to the reviewers that you are familiar with version control tools, and second, it shows your working process and that you keep track of your work. Don’t forget to write meaningful commit messages.

2. Write a README file – the README file gives context to the entire project and reflects the way you understand the assignment. One of the annoying things as a reviewer is having to guess how to run the code, what the requirements are, and so on. Besides packaging or building the code in a way that runs smoothly (e.g. in Python, if using pip, add a requirements.txt), a README file should help me find my way inside the project. In such assignments, where you don’t have direct communication with the reviewers, the README file also serves as a place to document your decisions and thoughts.

What should you include in the README file? A short introduction explaining the project’s purpose and scope. How to install and run or use it, preferably with a snippet that one can just copy-paste. How to run the tests (see next section :). Additional sections can include explanations of choices you made architecture-wise or implementation-wise, charts, performance evaluation, future ideas, dependencies, etc. This will help the reviewers get into your code and run it quickly, understand your thinking, and see that you are eager to share your knowledge with your peers.

For ease of use, check the template suggested here.

3. Write tests – usually unit tests are enough for this scope. This will help you debug your code and make sure it works properly. It will also signal to the reviewers that you care about the quality of your code and know your job.
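As a rough idea of the scale I mean, here is a minimal sketch using the standard library’s unittest. The function `word_count` is a made-up stand-in for whatever the assignment asks you to implement.

```python
import unittest


def word_count(text: str) -> int:
    """Count whitespace-separated words in `text` (hypothetical assignment function)."""
    return len(text.split())


class WordCountTests(unittest.TestCase):
    def test_counts_words_in_a_sentence(self):
        self.assertEqual(word_count("ace the coding interview"), 4)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_extra_whitespace_is_ignored(self):
        self.assertEqual(word_count("  two   words \n"), 2)
```

Saved in a file such as test_word_count.py (the name is just an example), these tests run with `python -m unittest`, which is also the command worth putting in the README.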

4. Run linters and a spell checker, and proofread everything – make sure your code follows the common conventions and style for the tools you are using (e.g. PEP-8 for Python). Bonus points if you add the linters as pre-commit hooks to your repository. This makes your code smoother and easier for the reviewers to read. The formatted code indicates that you are used to sharing your code with others, and the hooks signal that you are productive and lazy in the good sense, automating repetitive work.

5. Document everything – the idea behind this tip is to avoid annoying the reviewers by making them guess what you meant. That is, document what each module, function, and parameter does. For example, in Python, use type annotations and docstrings.
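For example, a documented function could look like the sketch below. The function `moving_average` and the docstring layout are only one possible illustration; use whatever convention fits the assignment.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Return the simple moving average of ``values``.

    Args:
        values: Numbers to average, in order.
        window: Number of consecutive items in each average.

    Returns:
        One average per full window, so the result has
        ``len(values) - window + 1`` items.

    Raises:
        ValueError: If ``window`` is smaller than 1 or larger than ``len(values)``.
    """
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    # Slide the window one position at a time and average each slice
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]
```

A reviewer can now tell what goes in, what comes out, and what can fail without reading the body.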