Plotting number of file modifications per extension over time
We can already compute the most popular name of each content (overwhelmingly often, there is only one). That name easily gives us the content's extension, which is a proxy for its programming language (or a small set of programming languages).
As part of the provenance index, we compute the date of first occurrence of each content.
We could join both tables and compute, for each extension, the histogram of new contents that appear within each time interval, i.e. the rate of creation/update per extension.
Plotted over time, this could give some interesting data on how activity evolves for each extension.
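
To make the proposal concrete, here is a minimal sketch of the join and histogram in plain Python. It is only an illustration: the two dicts are hypothetical stand-ins for the real "most popular name" and "first occurrence" tables, the function names are made up, and the monthly bucket is an arbitrary choice. At archive scale this would more likely be a join in the database or a batch job than an in-memory loop.

```python
import os
from collections import Counter
from datetime import datetime, timezone


def extension_of(name: bytes) -> str:
    """Return the lowercased extension of a file name, or "" if it has none."""
    _, ext = os.path.splitext(name.decode("utf-8", errors="replace"))
    return ext.lower()


def histogram_per_extension(popular_names, first_seen):
    """Count new contents per (extension, month).

    popular_names: dict mapping content id -> most popular file name (bytes)
    first_seen:    dict mapping content id -> datetime of first occurrence
    Both layouts are assumptions made for this sketch, not the real schema.
    """
    counts = Counter()
    for content_id, name in popular_names.items():
        date = first_seen.get(content_id)
        if date is None:
            # Inner join: skip contents with no known first occurrence.
            continue
        bucket = date.strftime("%Y-%m")  # monthly interval (assumption)
        counts[(extension_of(name), bucket)] += 1
    return counts


# Made-up sample rows, for illustration only.
popular_names = {
    "c1": b"main.c",
    "c2": b"util.c",
    "c3": b"setup.py",
    "c4": b"README",
}
first_seen = {
    "c1": datetime(2015, 3, 1, tzinfo=timezone.utc),
    "c2": datetime(2015, 3, 20, tzinfo=timezone.utc),
    "c3": datetime(2016, 1, 5, tzinfo=timezone.utc),
}

hist = histogram_per_extension(popular_names, first_seen)
for (ext, month), n in sorted(hist.items()):
    print(ext, month, n)
```

The resulting counter maps each (extension, month) pair to a number of new contents, which can be fed directly to a plotting library, one series per extension.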
cc @rdicosmo