5 interesting things (04/09/2014)

C3.JS – I have previously written a post about the importance of visualization in a data scientist's skill set. C3.js is a JavaScript chart library based on d3.js which, at least at first glance, seems simple and intuitive. I would like to see a Python client for it, but that is for the future.

http://c3js.org/

nvd3 does something similar – charts based on d3.js – and it also has a Python client, which I have worked with a bit. Comparing the two, c3.js seems a little more mature than nvd3, ignoring the lack of a Python client, but I’m sure that gap will be filled soon.

http://nvd3.org/

Harvard Visualization course – I went through some of the slides and they were fascinating, but what is even more exciting is the great collection of links to visualization examples, theory and tools. Great work.

http://www.cs171.org/#!index.md

textract – I needed to support a wide range of input types in a project I am working on, and of course I wanted to handle them as transparently as possible, without digging into all the relevant packages myself or adjusting my code to each package's API. Fortunately somebody already did it – textract. The package is not perfect and there are some “glitches”, mostly in the underlying parsers (line splitting, non-ASCII characters, etc.) rather than in the unified API textract provides. However, it is a very good start.

http://textract.readthedocs.org/en/latest/
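To give a feel for the unified API, here is a minimal sketch of how I would wrap it. The only textract call used is `textract.process`, which takes a file path and returns the extracted text as bytes; the `extract_text` helper, its `extractor` parameter and the lenient decoding are my own assumptions for illustration, not part of the package.

```python
def extract_text(path, extractor=None, encoding="utf-8"):
    """Return the text content of `path` as a str.

    `extractor` is a hypothetical hook (not part of textract): any callable
    that takes a path and returns bytes or str. It defaults to
    textract.process, which picks a parser based on the file extension.
    """
    if extractor is None:
        # Imported lazily so the helper can be tried with a stand-in
        # extractor even when textract is not installed.
        import textract  # third-party: pip install textract
        extractor = textract.process

    raw = extractor(path)

    # Decode leniently: the parser "glitches" mentioned above often show up
    # as stray non-ASCII bytes, so replace them instead of raising.
    if isinstance(raw, bytes):
        return raw.decode(encoding, errors="replace")
    return raw


# Usage, assuming textract is installed and the file exists:
#   text = extract_text("report.pdf")
#   text = extract_text("notes.docx")
```

The same call works for every supported file type, which is exactly the point: the calling code does not change when the input format does.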

Visualizing Garbage Collection Algorithms – both a very cool visualization and good explanations. Design-wise I think the visualization should be larger, but the concept itself is very neat.

http://spin.atomicobject.com/2014/09/03/visualizing-garbage-collection-algorithms/

SmartCSV – makes the CSV reader more structured by defining a model and validating rows against it while reading. It supports skipping rows (and soon skipping columns too). It is an ongoing project, and feature requests and issues are currently addressed quickly.
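
To illustrate the idea of "define a model, validate while reading, skip bad rows", here is a rough sketch using only the standard library `csv` module. This is not SmartCSV's actual API; the model format, the `validate_row` and `read_validated` helpers and the sample data are all made up for the example.

```python
import csv
import io

# Hypothetical column model: each column has a name and optional rules.
MODEL = [
    {"name": "title", "required": True},
    {"name": "price", "required": True,
     "validator": lambda v: v.replace(".", "", 1).isdigit()},
    {"name": "currency", "required": False, "choices": {"USD", "EUR"}},
]

SAMPLE = (
    "title,price,currency\n"
    "Book,12.50,USD\n"
    "Pen,abc,EUR\n"   # invalid: price is not a number
    ",3,USD\n"        # invalid: title is required
    "Lamp,20,\n"      # valid: currency is optional
)


def validate_row(row, model):
    """Return a list of problems found in one row (empty if the row is valid)."""
    problems = []
    for col in model:
        name = col["name"]
        value = (row.get(name) or "").strip()
        if not value:
            if col.get("required"):
                problems.append(f"{name}: missing required value")
            continue  # empty optional values are accepted as-is
        choices = col.get("choices")
        if choices and value not in choices:
            problems.append(f"{name}: {value!r} is not an allowed choice")
        validator = col.get("validator")
        if validator and not validator(value):
            problems.append(f"{name}: {value!r} failed validation")
    return problems


def read_validated(fileobj, model, skip_invalid=True):
    """Read a CSV with a header row; return (valid_rows, errors).

    With skip_invalid=True, invalid rows are skipped and reported in
    `errors` as (line_number, problems); otherwise the first invalid
    row raises ValueError.
    """
    reader = csv.DictReader(fileobj)
    rows, errors = [], []
    for row in reader:
        problems = validate_row(row, model)
        if problems:
            if not skip_invalid:
                raise ValueError(f"line {reader.line_num}: {problems}")
            errors.append((reader.line_num, problems))
            continue
        rows.append(row)
    return rows, errors


if __name__ == "__main__":
    rows, errors = read_validated(io.StringIO(SAMPLE), MODEL)
    print([r["title"] for r in rows])
    print(errors)
```

Keeping the rules in a plain data structure means the same reader can be reused for different files by swapping the model, which is the part of the concept I find most appealing.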