As someone who cares about environmental issues, I was excited to come across a data set about forest fires in Brazil on Kaggle. The data set, reported by the Brazilian government, contains the number of forest fires in 23 states for each month from 1998 to 2017. I decided to analyze it to find the trend in the number of forest fires, to find when forest fires occur most frequently, and to create a Tableau dashboard that visualizes the change in the number of forest fires over time.
engine = 'python'
When reading the CSV file with the data set, I had to set the engine parameter to 'python' because the default parser could not decode some characters as UTF-8.
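For readers who hit the same decoding error, here is a minimal sketch of an alternative: passing an explicit encoding to read_csv. It assumes the file is Latin-1 encoded, which the data set does not state, and the sample bytes and column names below are hypothetical stand-ins for the real file.

```python
import io

import pandas as pd

# Hypothetical one-row sample standing in for the Kaggle file.
# The column names and the Latin-1 encoding are assumptions.
raw = "year,state,month,number\n1998,Acre,Março,10\n".encode("latin-1")

# In Latin-1, "ç" is the single byte 0xE7, which is not valid UTF-8 here,
# so a strict UTF-8 decode fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Telling pandas which encoding to use lets it decode the accented characters.
df = pd.read_csv(io.BytesIO(raw), encoding="latin-1")
print(utf8_ok, df.loc[0, "month"])
```

With a real file, the path would take the place of the io.BytesIO object.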
I realized that the month names were in Portuguese, so I decided to convert them to numbers. I created a new column in my dataframe that holds the numeric equivalent of each month name.
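One way to do this conversion is with a lookup dictionary and Series.map. This is only a sketch: the column names month and month_num are assumptions, and the spelling and capitalization of the month names in the real file may differ.

```python
import pandas as pd

# Portuguese month names mapped to month numbers.
MONTH_TO_NUMBER = {
    "Janeiro": 1, "Fevereiro": 2, "Março": 3, "Abril": 4,
    "Maio": 5, "Junho": 6, "Julho": 7, "Agosto": 8,
    "Setembro": 9, "Outubro": 10, "Novembro": 11, "Dezembro": 12,
}

# Small illustrative frame; "month" is an assumed column name.
df = pd.DataFrame({"month": ["Janeiro", "Março", "Dezembro"]})

# map() looks up each value in the dictionary; names it cannot find become NaN,
# which makes spelling mismatches easy to spot.
df["month_num"] = df["month"].map(MONTH_TO_NUMBER)
print(df)
```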
Geocoding
I wanted the latitude and longitude of each state in the data set so that I could plot the states on a map later. So, I used LocationIQ's API to look up the latitude and longitude of each state.
I then created a dictionary that maps each state to its latitude and longitude, and used two apply calls to fill in the correct latitude and longitude for each row based on its state.
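The dictionary-plus-apply step can be sketched roughly as follows. The geocoder is passed in as a function so the example runs without network access; fake_geocode and its coordinates are placeholders for illustration, not real LocationIQ output, and the column names are assumptions.

```python
import pandas as pd


def build_coords(states, geocode):
    """Return {state: (latitude, longitude)} using the supplied geocode function."""
    return {state: geocode(state) for state in states}


def fake_geocode(state):
    # Hypothetical stand-in for an API lookup; these values are placeholders.
    placeholder = {"Acre": (-9.0, -70.0), "Bahia": (-12.0, -42.0)}
    return placeholder[state]


df = pd.DataFrame({"state": ["Acre", "Bahia", "Acre"], "number": [1, 2, 3]})

# Geocode each distinct state once, then reuse the result for every row.
coords = build_coords(df["state"].unique(), fake_geocode)

# Two apply calls: one for latitude, one for longitude.
df["latitude"] = df["state"].apply(lambda s: coords[s][0])
df["longitude"] = df["state"].apply(lambda s: coords[s][1])
print(df)
```

Looking up only the unique states keeps the number of API requests at one per state instead of one per row.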
Average Number of Forest Fires Per Year in Each State
The visual I wanted to create is a map that marks each state in the data set with a circle showing its average number of forest fires per year. The larger and darker the circle, the higher the average. First, I needed a dataframe with the average number of forest fires per year in each state, which required a groupby.
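A groupby along these lines would produce that dataframe. This sketch assumes the average is taken over the monthly records within each state and year, and the column names state, year, and number are assumptions; the numbers are made up for illustration.

```python
import pandas as pd

# Made-up monthly records; column names are assumed.
df = pd.DataFrame({
    "state": ["Acre", "Acre", "Acre", "Bahia", "Bahia"],
    "year": [1998, 1998, 1999, 1998, 1998],
    "number": [10, 20, 30, 5, 15],
})

# Mean of the monthly counts for every (state, year) pair.
avg = (
    df.groupby(["state", "year"], as_index=False)["number"]
    .mean()
    .rename(columns={"number": "avg_fires"})
)
print(avg)
```

Keeping year in the grouping keys is what lets a dashboard filter the map to a single year later on.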
Total Number of Forest Fires Per Year
I also wanted to see the overall trend of forest fires over the years, so I used another groupby to find the total number of forest fires in each year. I found that there was an increasing trend.
I thought it would be nice to have a dashboard where the user can click a point on the line graph, which corresponds to a specific year and its number of forest fires, and have the map of Brazil filter to that year and show the average number of forest fires in each state. So, I built that.
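The yearly totals come from a groupby on the year alone. A minimal sketch, again with made-up numbers and assumed column names:

```python
import pandas as pd

# Made-up records; "year" and "number" are assumed column names.
df = pd.DataFrame({
    "state": ["Acre", "Bahia", "Acre", "Bahia"],
    "year": [1998, 1998, 1999, 1999],
    "number": [10, 20, 25, 30],
})

# Sum the counts across all states and months within each year.
total_per_year = df.groupby("year")["number"].sum()
print(total_per_year)
```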
Monthly Trends in Forest Fires
Before I show the Tableau dashboard, I would like to show a graph of the monthly trend of forest fires over the years. The graph shows that the number of forest fires is low at the beginning of the year, increases quickly in June, peaks in July, drops a bit in September, then spikes again in October.
At a more granular level, the monthly lines have shifted upward since 1998. This supports the positive trend seen in the yearly totals.
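The table behind a chart like this can be sketched with a pivot table that has one row per month and one column per year. The column names, including the numeric month_num column, are assumptions, and the figures are invented for illustration.

```python
import pandas as pd

# Invented records; column names are assumed.
df = pd.DataFrame({
    "year": [1998, 1998, 1999, 1999],
    "month_num": [1, 7, 1, 7],
    "number": [5, 50, 8, 80],
})

# Rows are months, columns are years, cells are total fires.
monthly = df.pivot_table(
    index="month_num", columns="year", values="number", aggfunc="sum"
)

# Month with the highest total across all years.
peak_month = monthly.sum(axis=1).idxmax()
print(monthly)
print(peak_month)
```

Calling monthly.plot() on a table shaped like this draws one line per year, which is the shape needed to compare the monthly curves across years.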
An Increasing Trend of Forest Fires in Brazil
Lastly, I would like to present the Tableau dashboard I created with a short video. You can see how the size and shade of the circles, which represent the average number of forest fires in each state, change over time. The trend line of the total number of forest fires each year is also shown. Sao Paulo consistently has a very large average number of forest fires each year.