Time to look at some insights the Wikipedia API can offer. This view shows the most-viewed pages on Norwegian Wikipedia (no.wikipedia) via mobile web access for each week of 2020. Mobile web access was chosen over all access mainly to get more relevant data, since automated pinging and hammering of Wikipedia pages is likely to show up in the total data set.

To get the data, a URL of the following form was called once for every date in 2020, and the JSON responses were stored locally:
https://wikimedia.org/api/rest_v1/metrics/pageviews/top/no.wikipedia/mobile-web/2020/08/01
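The download step could look roughly like the sketch below. It is not the original script: the function names, the output folder, the pause between requests and the User-Agent string are all assumptions made for illustration. Only the URL pattern comes from the example above.

```python
import time
import urllib.request
from datetime import date, timedelta
from pathlib import Path

# URL pattern taken from the example above; the date is appended as YYYY/MM/DD.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/top/no.wikipedia/mobile-web"


def daily_urls(year=2020):
    """Yield (date, url) pairs for every day of the given year."""
    day = date(year, 1, 1)
    while day.year == year:
        yield day, f"{BASE}/{day:%Y/%m/%d}"
        day += timedelta(days=1)


def download_all(out_dir="pageviews", pause=0.1):
    """Store one JSON file per day (hypothetical folder name and pause)."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for day, url in daily_urls():
        target = out / f"{day.isoformat()}.json"
        if target.exists():
            continue  # already downloaded, so the script can be re-run safely
        # Placeholder User-Agent; replace it with your own contact details.
        req = urllib.request.Request(
            url, headers={"User-Agent": "weekly-top-pages-demo/0.1 (you@example.com)"}
        )
        with urllib.request.urlopen(req) as resp:
            target.write_bytes(resp.read())
        time.sleep(pause)  # be gentle with the API
```

Calling `download_all()` would then leave 366 files such as `pageviews/2020-08-01.json` on disk, one per day of the leap year 2020.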

From there, the next phase was to parse the data and put everything into a pandas DataFrame. The views were aggregated to a weekly level, and the image URLs for the most frequent pages were found through a separate API. Here is an example request for the “Norge” page, which returns the URL of its main image:
https://no.wikipedia.org/w/api.php?action=query&prop=pageimages&titles=Norge&pithumbsize=500&format=json
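A minimal sketch of the parsing, weekly aggregation and image lookup is shown below. It assumes the daily response holds an `items` list whose entries carry `year`, `month`, `day` and an `articles` list with `article` and `views`, and that the pageimages response nests the image under `query` → `pages` → page id → `thumbnail` → `source`. The function names and the choice of Monday-based weeks are assumptions, not necessarily what was used for the visual.

```python
import pandas as pd


def records_from_response(payload):
    """Flatten one daily 'top' response into a list of row dicts."""
    rows = []
    for item in payload["items"]:
        day = pd.Timestamp(f"{item['year']}-{item['month']}-{item['day']}")
        for art in item["articles"]:
            rows.append({"date": day, "article": art["article"], "views": art["views"]})
    return rows


def weekly_views(df):
    """Sum views per article per week; each week is labelled by its Monday."""
    week = df["date"].dt.to_period("W").dt.start_time
    return (
        df.assign(week=week)
        .groupby(["week", "article"], as_index=False)["views"]
        .sum()
    )


def thumbnail_url(payload):
    """Return the thumbnail URL from a pageimages response, or None if absent."""
    pages = payload.get("query", {}).get("pages", {})
    for page in pages.values():
        thumb = page.get("thumbnail")
        if thumb:
            return thumb["source"]
    return None
```

Loading every stored file, passing it through `records_from_response` and building one `pd.DataFrame` from the combined rows gives the input for `weekly_views`. `thumbnail_url` returns `None` for pages that have no main image.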

Finally, some DataFrame wrangling got the data into a format Flourish can read, and a few layout tweaks led to the visual above.
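The exact reshaping depends on the Flourish template. As one possible sketch, the function below assumes a wide layout with one row per page, one column per week and an extra image column. The input column names (`week`, `article`, `views`), the `images` dictionary and the function name are hypothetical.

```python
import pandas as pd


def to_wide(weekly, images):
    """Pivot long weekly data to one row per article and one column per week.

    weekly: DataFrame with 'week' (datetime), 'article' and 'views' columns.
    images: dict mapping article title to image URL (missing titles get '').
    """
    wide = weekly.pivot_table(
        index="article", columns="week", values="views", aggfunc="sum", fill_value=0
    )
    # Use plain date strings as column headers so they survive a CSV export.
    wide.columns = [pd.Timestamp(c).strftime("%Y-%m-%d") for c in wide.columns]
    wide.insert(0, "image", [images.get(a, "") for a in wide.index])
    return wide.reset_index()
```

The result can be written out with `to_wide(weekly, images).to_csv("flourish.csv", index=False)` and uploaded as a data sheet. Weeks in which a page did not appear in the top list are filled with 0.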

Pages shown without an image in the viz have no main image on their Wikipedia page either.
