This post aims to tell the story of life after graduation. Using data I gathered from the start of my grad school and job search, I will show the outcome for a graduate with a degree from the college of liberal arts and sciences, in a major that is not known for its job prospects without further education (Master’s, PhD).
What data did I gather? Each day, from September 9th, 2015 to September 9th, 2016, I logged the time I spent on productive work. That includes: putting together grad school apps (GRE, personal statements, gathering letters, etc.), applying for jobs, creating and maintaining this site and its content, continuing to teach myself relevant skills with MOOCs/PDFs, educational reading (professional, science, politics, philosophy, technology, and business), working part time, and so on.
How did I log my activity? I just opened a notepad, decided on some sort of ‘entry’ convention, and typed in what I was doing and for how long. I also kept track of how many jobs I applied for on a given day. Here is what my text pad looked like:
Logging each activity wasn’t too much of a hassle; at most it took a few seconds to transcribe my activity. I only logged time spent fully engrossed in an activity, so eating, using the restroom, chatting with others, or browsing reddit didn’t inflate the numbers and skew the data. There is a lot of talk about “time at work” vs “time spent working”. Without hourly pay, I had less incentive to count unproductive activity as work. My time was as valuable as I made it. Recording only the time I spent strictly working let me be more honest with myself about how I was spending my time, and it made the days when I didn’t feel like doing anything or wasn’t particularly focused plain to see. Some weeks I struggled just to get 20 good hours in. This project was a way to create the structure in my routine that had been missing since graduating. Getting a job is a job.
As you can see, my productivity peaked around January, which is about when I sent in my last application (I applied to a few PhD and Master’s programs). By April, I was logging the fewest hours of the entire year. Each week after sending in applications was stressful; each month that went by without hearing anything was torture. Then began the recursive search for answers to “what did I do wrong?” or “what part of my application was unsatisfactory?” My mind kept returning to these questions whenever I wasn’t busy doing something. But these aren’t the right questions to ask, because hundreds of people apply to programs like these each year and only about 10% can be accepted (programs actually make more offers than they can support, knowing some will decline). The only thing I could do as the acceptance/rejection letter deadline drew nearer was figure out a viable career outside of academia, or at least find a way to pay off my loans. Fortunately, I had spent the last two years developing other skills as well.
I took an introduction to computer science course my senior year and am happy I pushed myself to do so. Computer science is an intense study of problem solving, and solving problems with code became extremely rewarding to me. Humans are notorious for flawed reasoning and cognitive biases, so it is almost therapeutic to sit down and approach a problem from an engineering perspective. How large is the problem? How can we break it down into smaller, simpler problems? Learning to program augmented my abilities as a researcher. I am now able to work with larger sets of data, as well as dissimilar sets of data, since I can write scripts to transform the data as needed. I parsed the log file I created with a Python script. With matplotlib, a Python library, I created some simple graphs to make the data easier for humans to digest. Along the way I learned more about myself, my peak hours of activity, and what I spend my time doing.
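A minimal sketch of what such a parsing step might look like. The entry format here (a date, a time range such as `11:00am-12:45pm`, then a free-text description) is an assumption for illustration, as are the function names; the original log convention may have differed.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def session_hours(time_range):
    """Return the length in hours of a range like '11:00am-12:45pm'."""
    start_text, end_text = time_range.split("-")
    start = datetime.strptime(start_text.strip(), "%I:%M%p")
    end = datetime.strptime(end_text.strip(), "%I:%M%p")
    if end < start:
        # The end time is earlier on the clock, so treat the
        # session as running past midnight into the next day.
        end += timedelta(days=1)
    return (end - start).total_seconds() / 3600


def hours_per_day(lines):
    """Sum session lengths per date for lines in the assumed format."""
    totals = defaultdict(float)
    for line in lines:
        parts = line.split(maxsplit=2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        date_text, time_range = parts[0], parts[1]
        totals[date_text] += session_hours(time_range)
    return dict(totals)
```

The dictionary returned by `hours_per_day` maps each date to its total hours. Its keys and values could then be handed to a matplotlib bar chart, for example with `plt.bar`.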
(I planned to have more graphics for this post, like comparing how long I tended to work on certain things, but there was surprisingly little variation in the average duration per sitting across categories. Whatever the category, the average sitting comes out to between 45 and 55 minutes. I suspect this is the middle ground of many 20-30 minute sessions and some longer 1.5-3 hour sessions. I also wanted to create a .gif of the graph above, with one frame for each month, to see if my hours shifted with the season. But it was difficult to queue up 12 graphs.)
I have 1,656 hours of activity recorded over the last year. This is roughly four-fifths of a full-time year of 40-hour work weeks (2,080 hours/year). I recorded applying to 123 jobs, though the actual number may be a little higher. What types of things did I do over the year? I logged my data a little differently at the start than at the end. In the beginning, I wrote out a descriptive sentence of what I did, with details. By the end, I was just entering keywords (python, blog, etc.). So the best thing I could do was look at keyword frequency. It’s not a complete breakdown, but here is some example output printed to the command prompt (hours logged):
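A keyword tally of this kind could be sketched roughly as follows. This is not the original script. The keyword list, the `(description, hours)` input shape, and the function name are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword list; the real set of keywords is not known.
KEYWORDS = ("python", "blog", "gre")


def hours_by_keyword(entries, keywords=KEYWORDS):
    """Total the hours of every entry whose description mentions a keyword.

    `entries` is an iterable of (description, hours) pairs.
    """
    totals = Counter()
    for description, hours in entries:
        # Lowercase and split on non-alphanumerics so that
        # "Python," and "python" count as the same word.
        words = set(re.findall(r"[a-z0-9]+", description.lower()))
        for keyword in keywords:
            if keyword in words:
                totals[keyword] += hours
    return totals
```

Because it matches whole words, the same function works on both the early descriptive sentences and the later keyword-only entries. An entry that mentions two keywords is counted under both, so the totals can add up to more than the hours actually logged.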
I was interviewed 6 times and had 3-4 phone screens. The 6th interview led to a job offer, which I took.
What I learned
This project involved lots of debugging my own code and spotting errors from the data entry portion of the project. I would’ve missed some of the errors in the data if I had not visualized it. For instance, it was only when I created the “hours spent working” graph (first one) that I was able to spot spikes where it said I worked for 22 hours in one day. I knew right away that was wrong, because I’ve only worked that much in a day once in my life.
I went to the date in the log and spotted an entry where I put 11:00am-12:45am instead of 11:00am-12:45pm, causing the script to count 12 hours too many. In hindsight, it may have been better to start with a project that would correctly and precisely record the date and time of each entry and write it to a text file. It would only need a simple GUI with a text box and a start/stop button.
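The core of such a tool, minus the GUI, might look something like the sketch below. The class name, the tab-separated file layout, and the injectable `clock` parameter are assumptions for illustration. The idea is that both timestamps come from the system clock, so an am/pm slip cannot happen.

```python
from datetime import datetime


class SessionLogger:
    """Record start/stop timestamps for an activity in a text file."""

    def __init__(self, path, clock=datetime.now):
        self.path = path
        self.clock = clock  # swappable so the logger can be tested
        self.started = None
        self.activity = None

    def start(self, activity):
        """Note the activity name and the current time."""
        self.activity = activity
        self.started = self.clock()

    def stop(self):
        """Append one line to the log and return the session length."""
        if self.started is None:
            raise RuntimeError("stop() called before start()")
        ended = self.clock()
        duration = ended - self.started
        line = "{}\t{}\t{}\n".format(
            self.started.isoformat(timespec="minutes"),
            ended.isoformat(timespec="minutes"),
            self.activity,
        )
        with open(self.path, "a", encoding="utf-8") as log_file:
            log_file.write(line)
        self.started = None
        return duration
```

A start/stop button in a GUI would only need to call `start()` with the contents of the text box and then `stop()`.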
Manual data entry is a problem in data science. Humans make mistakes, and they make them reliably. I was going through some public police data from Baltimore when I noticed that some cells were entered with different formatting and others were left completely blank. Some entries had spaces before or after the data, and other times there was a random special character injected somewhere, probably because the person typing fumbled with the shift key. When you have many different people contributing to a spreadsheet, and they each have their own brand of typos, you must be able to clean the data so your results aren’t tainted.
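As a rough illustration of that kind of cleaning, the sketch below normalizes a single cell. Which characters count as stray is an assumption that depends on the dataset (a `#` may be legitimate in an address, for example), and the function name is hypothetical.

```python
import re


def clean_cell(value):
    """Normalize one spreadsheet cell; return None for blank cells."""
    if value is None:
        return None
    # Drop anything that is not a letter, digit, underscore, whitespace,
    # or one of a few punctuation marks assumed to be legitimate.
    text = re.sub(r"[^\w\s.,:/'-]", "", value)
    # Collapse runs of whitespace and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    return text or None
```

Returning `None` for blank cells keeps missing values distinct from empty strings, so they can be counted or filtered explicitly instead of silently passing through.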
I, like many other students every year, was deeply saddened by not getting into a graduate program. It was a big change of direction, but I am doing my best to be adaptable. There is sort of a silver lining to this: with a year off, I had time to explore the world and my interests somewhat freely. I would be much more distraught if I hadn’t had this epiphany around when I graduated: the world is a big place, and there are many ways to go about getting what you want in it. And if you want it badly enough, there is probably a way to obtain it.