YouTube Analysis

The first exploration with YouTube looked at how much data was available in 84 languages, using a simple YouTube search with the query "agroecology".
Focusing on just the 3 most important languages for us, we get:

  1. EN (620) : “agroecology OR agro-ecology”
  2. ES-PT (842) : “agroecologia OR agro-ecologia”
  3. FR (817) : “agroecologie OR agro-ecologie”

This exploration was presented in the first progress report of the project (you can find the presentation here). One of the recommendations was to run a deeper search: looking for videos by year or semester, changing the query as well, and focusing the analysis more on who the people making the videos are. This page posts the results of that search.

The tools used and how I did it

To extract data from YouTube I used pytheas, along with some scripts to format the data so it is easy to read and analyze, and to add the captions, which were also extracted with pytheas. I chose pytheas because it is easy to use; I wrote a tutorial on how to use it on Medium. If you want to explore it a little, it is free and open source.
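As an illustration of the formatting step, here is a minimal sketch of how captions could be attached to the video records. It is not the script actually used: the field names (video_id, title, text) are assumptions made for the example and are not taken from the pytheas output format.

```python
def merge_captions(videos, captions):
    """Attach caption text to each video record, matching on video id.

    `videos` and `captions` are lists of dicts. The keys used here
    ("video_id", "text") are hypothetical, chosen for illustration only.
    """
    # Group caption fragments by the video they belong to.
    text_by_id = {}
    for row in captions:
        text_by_id.setdefault(row["video_id"], []).append(row["text"])

    merged = []
    for video in videos:
        record = dict(video)  # copy, so the input list is left untouched
        record["captions"] = " ".join(text_by_id.get(video["video_id"], []))
        merged.append(record)
    return merged
```

Videos without captions get an empty string, so later steps can treat every record the same way.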

1. Data with English Query

Number of videos by year with the English query

There were 245 videos from 2006-2019. The following image plots all the videos found on YouTube between 2009 and 2019, by year. I also searched 2006-2008, but I didn't get any video that fit the query.

Number of videos by semester with the English query

Then I ran the same search by semester, from 2007 to 2019. As you can see, there are no videos from 2006-2008.
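The per-year and per-semester counts behind plots like these can be sketched as follows. This is only an illustration: it assumes each video has a publication date as an ISO-style string starting with YYYY-MM-DD, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def semester_of(published_at):
    """Return a label such as '2016-S1' for a date string starting with YYYY-MM-DD."""
    date = datetime.strptime(published_at[:10], "%Y-%m-%d")
    half = 1 if date.month <= 6 else 2  # January-June is S1, July-December is S2
    return f"{date.year}-S{half}"


def count_by_year(dates):
    """Count videos per calendar year."""
    return Counter(d[:4] for d in dates)


def count_by_semester(dates):
    """Count videos per half-year."""
    return Counter(semester_of(d) for d in dates)
```

Both functions return a Counter, which can be sorted by key and passed to any plotting tool.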

Network map of the terms and channels in English

You can see the network in detail here

You can see that some colors are isolated from the others. That means those terms and channels are connected to each other but not to the rest of the network, although they are mentioned in the data.

Geo mapping of the total data in English

These location terms are extracted from what people say in the title, description and captions of each video; they do not necessarily indicate the place where people are developing the projects.
You can see the map in detail here
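To make the idea of location terms concrete, here is a minimal sketch that counts place names mentioned in the text fields of a video. It is not the method used to build the map: the small PLACES set and the field names are assumptions for illustration, and a real gazetteer would be much larger and would handle multi-word names.

```python
import re
from collections import Counter

# Tiny illustrative gazetteer (hypothetical); single-word, lowercase names only.
PLACES = {"mexico", "brazil", "india", "france"}


def location_mentions(video):
    """Count gazetteer place names found in title, description and captions."""
    text = " ".join(
        video.get(field, "") for field in ("title", "description", "captions")
    ).lower()
    tokens = re.findall(r"\w+", text)
    return Counter(token for token in tokens if token in PLACES)
```

Because this only counts mentions, a video recorded in one country that talks about another is attributed to the country it talks about, which is the caveat described above.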

2. Data with Spanish Query

Number of videos by year with the Spanish query

There were 3141 non-duplicated videos from 2006-2019. This data is all filtered and cleaned.
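Removing duplicates can be sketched as below. This is an assumption about how the cleaning could be done, not the actual script: it treats two records as the same video when they share a video_id field, which is a hypothetical name used for the example.

```python
def drop_duplicates(videos):
    """Keep the first occurrence of each video id, preserving input order."""
    seen = set()
    unique = []
    for video in videos:
        video_id = video["video_id"]  # hypothetical field name
        if video_id not in seen:
            seen.add(video_id)
            unique.append(video)
    return unique
```

For example, if the same video appears in the results of two overlapping searches, only the first copy is kept.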

Number of videos by semester

There were 4121 non-duplicated videos from 2006-2019. As in English, activity starts to grow from 2016, but it is in 2019 that you get the most videos in Spanish.

Network map of the terms and channels in Spanish

You can see the map in detail here

In this one it is really funny how almost all the colors are related to the term 'agricultura ecologica', and how the channels connect to other terms, making a really compact network.

Geo mapping of the total data in Spanish

These location terms are extracted from what people say in the title, description and captions of each video; they do not necessarily indicate the place where people are developing the projects.
You can see the map in detail here


Searching by year returns less data than searching by semester. In the case of English I was expecting more videos, mainly because English is the universal language and because in the first exploration I got 4 times the amount of data with a search without dates; but apparently this is the only data YouTube wanted to give me.
In the network map, if we explore each node a little more to see who is talking, we can see more individual people uploading videos in the Spanish data, whereas in the English data it is mostly NGOs and institutions making and uploading videos.


Go deeper into the analysis and look at each year to see what activities were being developed; that could give us a better understanding of why there is more data in one year than in another.