Wednesday, March 20, 2019

Education vs. Earnings

I was curious about what Ivy League students do once they graduate. I can read anecdotes about what happened to students after they graduated from top universities, but I wanted to look at some data. I did find some more general data on education, specifically the National Household Education Survey. I don't think the data I am about to show is particularly new, but creating the visual still required some thinking. One disclaimer: I didn't really write the code. I googled how to make my ideas come to life and then tweaked what I found, like adding the labels for the x and y axes. Throughout the process, I did come up with some insights about how I wanted my data to look to others, and that guided me to googling the right questions. I think that is important because there is so much information out there that I could spend the whole day parsing through tutorials and never find what I need to create exactly what I want.

So, here is the finished product, created with matplotlib in Python.

It's a scatter plot that shows 3000 data points from a 2016 survey.

The actual data files are a lot bigger, but I chose just two columns to compare: the highest level of education obtained and the person's earnings in the past 12 months. From left to right on the x-axis, the level of education increases. From bottom to top on the y-axis, the person's earnings in the past 12 months increase.

There were also people who skipped this question on the survey, which is why there is a Skip row. 

The size of each dot shows the relative occurrence of that combination of education level and earnings in the survey.

Actually, choosing to display the data as a scatterplot was not straightforward. My first instinct was a scatterplot, but the first one I made drew every dot at the same size and opacity, and the dots covered basically the whole graph, so it didn't show anything significant. So I thought, perhaps I can use a bar graph? I could count how many people were in each category and plot the counts. It turns out there are too many categories. Perhaps I could have grouped the categories into intervals, but then I got an insight. The reason I could not find anything significant in the first scatterplot was that there was no way to tell how many people each dot represented. That magnitude was a dimension I wanted to show, so if I could use color or size to show the number of people in each category, then the scatterplot would make sense!

It turns out others had the same question I did: how can I change the size or color of the dots in a scatterplot to represent the relative amount of data each one stands for? The answer turned out to be a simple line of code that I still don't really understand, but it does the job.
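Here is a minimal sketch of how that kind of sizing can work. It is not the code behind my plot: the function names bubble_data and draw_bubbles, the scale factor, and the sample codes are all made up for illustration. The idea is to count how often each (education, earnings) pair occurs and pass those counts to the s argument of matplotlib's scatter, which sets each marker's area.

```python
from collections import Counter


def bubble_data(pairs, scale=40):
    """Collapse repeated (education, earnings) pairs into one dot each.

    Returns xs, ys, sizes, where each size is proportional to how
    often that combination occurred. `scale` is an arbitrary factor
    chosen so the dots are visible; it is an assumption, not a rule.
    """
    counts = Counter(pairs)
    xs = [edu for edu, _ in counts]
    ys = [earn for _, earn in counts]
    sizes = [scale * n for n in counts.values()]
    return xs, ys, sizes


def draw_bubbles(pairs):
    """Draw one dot per combination, sized by its count."""
    import matplotlib.pyplot as plt  # imported here so bubble_data works without it

    xs, ys, sizes = bubble_data(pairs)
    fig, ax = plt.subplots()
    # s is the marker area in points squared; alpha lets overlaps show through
    ax.scatter(xs, ys, s=sizes, alpha=0.5)
    ax.set_xlabel("Highest level of education")
    ax.set_ylabel("Earnings in the past 12 months")
    return fig


# Made-up category codes, not values from the survey
sample = [(1, 2), (1, 2), (1, 2), (3, 5), (3, 5), (4, 6)]
xs, ys, sizes = bubble_data(sample)
```

Calling draw_bubbles(sample) and then saving or showing the returned figure would give three dots, with the biggest one at the combination that appears three times.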

So, a little analysis on the scatterplot.

It seems that there are people with little education who make a fortune (to me at least) and others who make nothing, and people with a lot of education who make a fortune and others who make nothing.

Either there are just more people with high school diplomas than without, or the plot shows that getting a high school diploma does increase your chances of a higher salary, although a large number of high school graduates still make very little. But without a high school diploma, it seems there is little to no chance of making over $150k a year.
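One way to tell those two explanations apart would be to compare shares within each education level instead of raw counts. The sketch below is only an illustration: the function name share_within_education and the sample values are made up, and I did not run this on the survey data.

```python
from collections import Counter


def share_within_education(pairs):
    """For each (education, earnings) pair, return the fraction of
    people at that education level who fall in that earnings group."""
    pair_counts = Counter(pairs)
    edu_totals = Counter(edu for edu, _ in pairs)
    return {
        (edu, earn): n / edu_totals[edu]
        for (edu, earn), n in pair_counts.items()
    }


# Made-up sample: twice as many high school graduates as non-graduates
sample = (
    [("No HS", "low")] * 3
    + [("No HS", "high")] * 1
    + [("HS", "low")] * 6
    + [("HS", "high")] * 2
)
shares = share_within_education(sample)
```

In this made-up sample the "high" dot for high school graduates would be twice as big as the one for non-graduates, yet the share of high earners is the same in both groups. A bigger dot on its own does not prove better odds.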

Among high school graduates, more people still earn less than earn more (the dots get smaller from bottom to top), but the trend reverses for those with a college degree (BA). There the dots get bigger from bottom to top, showing that people with a college degree are more likely to earn more than less. In fact, there are a lot of people with a college degree and nothing higher who earn between $75k and $100k a year.

There is a similar trend for those with a Master's degree: more likely than not you are earning more money, with smaller dots on the bottom and bigger dots on the top. But the share of those earning in the $75k to $100k range is a bit smaller. Maybe it's because there are just fewer people with Master's degrees, though.

If you get more than a Master's degree, there is little chance that you will be making less than $20k, but it can still happen. There may be too few people with more than a Master's degree to say with confidence that going beyond a Master's will boost earning potential. But if you compare the relative sizes of the dots in the Doctorate or Professional Degree column, it seems you are more likely to be richer than poorer if you got that higher degree.

I remember reading articles about how getting more education does not necessarily mean a higher earning potential, and I can perhaps see why now. But the scatter plot does seem to show that the chances of being poorer diminish as you get more education.

This concludes my second project. And a quote from George Washington Carver: "Education is the key to unlock the golden door of freedom."

Is freedom money? 


