Exploring COVID-19’s Impact on Activity in New York City
Using taxi, subway and Google data to discover trends in Manhattan
As I reach the end of each year, I always like to take some time to reflect…and the final two days of 2020 will be no different. However, 2020 was different, of course, and most insights from this year will be about the impacts of the pandemic. And there are few places where the impacts are more stark than New York City, which suffered the worst outbreak when the virus arrived in the U.S.
I spent much of the last decade in the city and I emphasize last because while I was able to spend time in Manhattan for the first 2.5 months of this decade, I’ve spent a grand total of 39 hours there in the 9.5 months since. I was in NYC on March 11 and March 12, two days that felt like a month as the pandemic’s grip on the nation finally became real. As I walked through midtown on my way out of Manhattan on the afternoon of March 12, the streets still felt as crowded as usual. But norms, behaviors and regulations would change at an incredible pace in the days that followed.
Fortunately, New York City makes a lot of its data publicly available, so my goal is to use transportation data to examine behavioral changes through visualizations. I will focus on Manhattan because this is where some of the most striking changes took place due to its high density of tourists and offices. (And I’m most familiar with the neighborhoods in this borough. And frankly, when folks from Long Island refer to “the city,” they’re talking about Manhattan.)
Examining Yellow Taxi Pickups
The NYC Taxi and Limousine Commission gives access to its monthly trip records, currently updated through June 2020. I used data from yellow taxis because they can pick up passengers all throughout the city (while green taxis are restricted from picking up riders below W110th/E96th St in Manhattan… and thus I decided to also focus on pickups rather than drop-offs).
I am well aware that most people use ride sharing apps to get around nowadays and thus the sample may be biased (older, wealthier). In fact, I cannot actually envision a scenario in which I would hail a taxi for myself. However, I still believe this data can act as a signal for movement around the city.
I looked at the decline in yellow taxi rides in Manhattan and broke it down by month and by zone (which is essentially what the TLC uses to designate neighborhoods). I sorted these zones (using only those with at least 200,000 total pickups throughout the eight months in my dataset) from the highest to the smallest decline in year-over-year pickups.
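For readers who want to try a similar ranking, here is a minimal Python sketch of the calculation described above. It is not the code behind the charts in this post: the function name, the input layout (one tuple per zone and month) and the sample numbers are all made up for illustration, and only the 200,000-pickup threshold comes from the text.

```python
from collections import defaultdict

def yoy_decline_by_zone(pickups, min_total=200_000):
    """Rank zones by year-over-year decline in pickups.

    pickups: iterable of (zone, year, month, count) tuples (hypothetical layout).
    Returns a list of (zone, percent_decline), largest decline first.
    """
    totals = defaultdict(lambda: {2019: 0, 2020: 0})
    for zone, year, month, count in pickups:
        if year in (2019, 2020):
            totals[zone][year] += count

    results = []
    for zone, by_year in totals.items():
        # Keep only zones with enough combined pickups across both years.
        if by_year[2019] + by_year[2020] < min_total:
            continue
        # Skip zones with no 2019 baseline to avoid dividing by zero.
        if by_year[2019] == 0:
            continue
        decline = 100.0 * (by_year[2019] - by_year[2020]) / by_year[2019]
        results.append((zone, decline))

    results.sort(key=lambda item: item[1], reverse=True)
    return results

# Made-up counts, only to show the shape of the input and output.
sample = [
    ("SoHo", 2019, 3, 150_000), ("SoHo", 2020, 3, 60_000),
    ("Yorkville", 2019, 3, 160_000), ("Yorkville", 2020, 3, 80_000),
    ("Tiny Zone", 2019, 3, 1_000), ("Tiny Zone", 2020, 3, 100),
]
ranked = yoy_decline_by_zone(sample)
print(ranked)
```

The threshold is applied to the combined total for both years, which matches "total pickups throughout the eight months" as I read it; a different reading would only change the filter line.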
Clearly, the number of taxi rides cratered. You can barely even make out the bars for pickups in April or May 2020. In total, yellow taxi pickups in Manhattan were more than 86% lower from March to June 2020 than in the same period of 2019. The decline exceeds 94% if you exclude the first two weeks of March from those comparisons. The drop-off seems to be greatest in more touristy neighborhoods (such as SoHo, Little Italy and the Meatpacking District). More office-centric neighborhoods also generally appear in the top half of the list, while mostly residential neighborhoods (many of which are in upper Manhattan) tended to see the gentlest declines.
[I also want to mention that the locations of hospitals (largely because of commuting essential workers) probably contribute to these results as well. For example, this is probably a contributing factor to Lenox Hill’s below-average decline in taxi pickups.]
With travel suspended and offices closed, the observed trends make sense. To better illustrate these differences, I created two groups of neighborhoods:
Residential: East Harlem, Kips Bay, Lenox Hill, Manhattan Valley, Murray Hill, Stuy Town, Sutton Place/Turtle Bay, Upper East Side, Upper West Side, Washington Heights, Yorkville
Work/Tourist: Chinatown, Financial District, Garment District, Little Italy, Meatpacking District, Midtown, Penn Station, SoHo, Times Square/Theater District, World Trade Center
I was then able to more clearly illustrate the contrasting declines in yellow taxi pickups:
This chart also seems to indicate a slightly faster recovery rate in the residential neighborhoods. (Think of some wealthy Upper East Side residents who flocked to the Hamptons or young adults who moved back in with their parents returning earlier than tourists or corporate employees.) This seems like a decent proxy for the activity in these sets of neighborhoods, so businesses in the less-residential areas were probably even worse off.
Since the pandemic only began to alter behavior throughout the course of March, comparing that month with March 2019 becomes a bit more complex. So next I sought to visualize the way in which the pandemic gripped the city in March.
First, it appears that Manhattan pickups declined at a steeper rate than airport pickups. That makes sense since it is more difficult to cancel a flight on short notice than it is to just stay home. (And for those unaware, it is upsettingly inconvenient to travel from JFK or LaGuardia into the city via public transport.)
Meanwhile, it appears that my feeling about “March 12” being a turning point is indeed supported by the Manhattan data. It’s worth noting that the drop-off from the 12th to the 13th is from a Thursday to a Friday, which makes it even more astounding. For reference, a “State of Emergency” was declared in New York on March 7, schools were shut down on March 15, more regulations were announced on March 16 and a statewide stay-at-home order was declared on March 20. However, I’m going to stick with using March 12 as an indicator for NYC behavior change in this post.
Next, I looked at Yellow Taxi pickups by time of day a few months into the pandemic:
There did not appear to be an evening rush hour in June 2020. It is fascinating that taxi pickups steadily rose until a peak at 3pm. I suppose there is less of a need to rush out of the house in the morning without a commute and there is not much of a reason to travel at night when nightlife doesn’t exist. Since the MTA shut down the subways from 1–5am, I figured there may be a higher percentage of taxi pickups in the middle of the night (for essential workers), but the data did not support this. Perhaps some of this is negated by the lack of nightlife, but I would expect to see a rise in the percentage of Uber and Lyft rides at this time, even if it does not translate to Yellow Taxis.
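The time-of-day comparison boils down to the share of pickups that fall in each hour. Here is a small Python sketch of that idea, assuming pickup timestamps are available as strings like "2020-06-01 15:04:05"; the function name and sample timestamps are illustrative, not taken from the analysis above.

```python
from collections import Counter
from datetime import datetime

def hourly_share(pickup_times):
    """Return {hour: percent of all pickups} for hours 0 through 23.

    pickup_times: iterable of timestamp strings that
    datetime.fromisoformat can parse, e.g. "2020-06-01 15:04:05".
    """
    counts = Counter(datetime.fromisoformat(ts).hour for ts in pickup_times)
    total = sum(counts.values())
    if total == 0:
        return {hour: 0.0 for hour in range(24)}
    return {hour: 100.0 * counts.get(hour, 0) / total for hour in range(24)}

# Made-up timestamps, only to show the output shape.
sample_times = [
    "2020-06-01 15:04:05",
    "2020-06-01 15:30:00",
    "2020-06-01 09:10:00",
    "2020-06-02 02:00:00",
]
shares = hourly_share(sample_times)
print(shares[15], shares[9], shares[2])
```

Using shares rather than raw counts is what makes June 2019 and June 2020 comparable despite the huge gap in total rides.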
There were also a few bits of data I couldn’t pass up exploring. Were taxi passengers less likely to share rides due to Covid?
Yes. Nice social distancing, New York.
I would also conjecture that New Yorkers were using longer taxi rides as a replacement for the subway amidst the pandemic:
Maybe some people walked as a safer alternative to hailing a taxi for a short ride. (Would less drinking at bars alleviate the need for some short taxi rides as well?)
Finally, I was curious to find out whether New Yorkers were more generous with their tips during the pandemic (to support the taxi drivers) or stingier because of their own difficult circumstances.
It appears to be the latter. Yikes at those average tip percentages in spring 2020. Oh well.
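One caveat for anyone recomputing tip percentages: the TLC trip records only capture credit card tips, so cash trips show a tip of zero and would drag the average down if included. Here is a rough Python sketch of one way to handle that. It assumes each trip is a dict with the TLC field names fare_amount, tip_amount and payment_type (where 1 means credit card), and it averages per-trip percentages, which may differ from how the figures above were produced.

```python
def average_tip_pct(trips):
    """Mean tip as a percent of fare, over card-paid trips with a positive fare.

    trips: iterable of dicts with 'fare_amount', 'tip_amount', 'payment_type'.
    Returns None when no trip qualifies.
    """
    pcts = [
        100.0 * t["tip_amount"] / t["fare_amount"]
        for t in trips
        # payment_type 1 = credit card; cash tips are not recorded.
        if t["payment_type"] == 1 and t["fare_amount"] > 0
    ]
    return sum(pcts) / len(pcts) if pcts else None

# Made-up trips, only to illustrate the filter.
sample_trips = [
    {"fare_amount": 10.0, "tip_amount": 2.0, "payment_type": 1},
    {"fare_amount": 20.0, "tip_amount": 3.0, "payment_type": 1},
    {"fare_amount": 15.0, "tip_amount": 0.0, "payment_type": 2},  # cash, skipped
]
avg_tip = average_tip_pct(sample_trips)
print(avg_tip)
```

An alternative is total tips divided by total fares, which weights long rides more heavily; either choice should be applied the same way to both years.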
Subway Turnstile Entries
Since the subway is more widely and frequently used than any other form of transportation in New York City (or at least Manhattan), ridership should also tell a story about activity within the city. Unfortunately, parsing through this data can be a lot more daunting. The MTA posts its data in the form of turnstile records. There are six daily checks per turnstile, a widely varying number of turnstiles at each station, and 472 stations. So, I decided to compare one week in 2019 with one week in 2020.
I chose to look at a week in November in each year because I want to learn more about how New York City is faring in recent weeks (something I cannot yet use the taxi data to find). I chose the third week in November because it does not include a holiday and did not have a major event (like a snow storm) that would cause a disruption.
The dataset was also relatively messy. After using the Open Data Transit Toolkit to help me parse through the data in R, I created a data frame that listed daily subway entries per station. However, there definitely were some incorrect records. (Let’s just say no subway station has ever had 2 billion entries in a day nor has one had a day in November 2020 with more entries than a typical weekday at Penn Station.)
Because I could not figure out the correct entries for these days, I had to make educated guesses. Furthermore, some stations share the same name (there are four “96th St” stations in Manhattan), and many of these identically named stations could not be told apart in the data. Nonetheless, this issue did not affect my analysis.
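To give a sense of why the turnstile data gets messy: each record is a running counter reading, so daily entries come from the difference between consecutive readings, and a counter reset or glitch can produce absurd jumps. The Python sketch below shows one simple way to screen those out. It is not the R workflow used for this post, and it drops suspicious intervals rather than estimating them; the function name and the 10,000-per-interval cap are assumptions chosen for illustration.

```python
from collections import defaultdict

def daily_entries(readings, max_per_interval=10_000):
    """Turn cumulative counter readings for ONE turnstile into daily entries.

    readings: list of (date_str, cumulative_count), sorted by time.
    Differences that are negative or larger than max_per_interval
    (a hypothetical cap) are treated as resets or glitches and skipped.
    Each difference is credited to the date of the later reading.
    """
    totals = defaultdict(int)
    for (_, prev_count), (day, count) in zip(readings, readings[1:]):
        delta = count - prev_count
        if 0 <= delta <= max_per_interval:
            totals[day] += delta
    return dict(totals)

# Made-up readings, including one jump of roughly 2 billion.
sample_readings = [
    ("2020-11-14", 1_000),
    ("2020-11-14", 1_150),
    ("2020-11-14", 1_400),
    ("2020-11-15", 1_500),
    ("2020-11-15", 2_000_001_500),  # glitch: counter jumps
    ("2020-11-15", 2_000_001_620),
]
per_day = daily_entries(sample_readings)
print(per_day)
```

Skipping a bad interval undercounts that day slightly, which is why a manual estimate can be the better choice when only a handful of records are affected.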
A simple comparison of total subway entries shows about a 70% drop-off from November 16–22, 2019 to November 14–20, 2020. I also broke it down by day of the week:
No surprise that the biggest drop comes Monday-Friday when there is typically the most traffic due to work commutes.
I then aimed to see if the subway data would provide further evidence that supports a more severe decline in activity in areas frequented by white collar workers in Manhattan.
Indeed, each of the first five commuter-centric stations had a significantly steeper decline from a year ago than did the six stations (“96 St” includes three separate stations) in more residential areas. (At least 10% more in each case!)
I also threw in Columbia University’s station to see the impact of my alma mater being almost entirely remote. Well, that station saw the fourth-greatest year-over-year drop for that week in the entire subway system (86.37%) and the greatest drop among stations that averaged at least 15,000 entries per day from November 16–22, 2019!
Using Google Data as Signals
So far I’ve looked at data directly from the city to support claims about the relative reduction in movement throughout Manhattan during the pandemic. But what if I would like to examine specific activities? I sought to find out if I can use Google to help with these insights.
First I want to check if Google can also be a reliable indicator for movement throughout the city. Google has made a worldwide mobility dataset available during the pandemic and you can track movement in regions (and in some cities) throughout the U.S. I took a closer look at the change in activity at transit stations and at workplaces in NYC.
These trends do mirror both each other and the transportation data, which is ideal because they are all essentially measuring the same thing. But could I also measure activity through Google’s search data? I tried to see what would happen if I found the relative popularity of googling “subway time” via Google Trends. I then compared that to the transit station mobility graph:
The pattern was very similar, which helps support using Google Trends as a decent signal for activity. The Mobility and Trends data both had the week of April 12th as the minimum for activity (a decline of about 78% from the week of March 1 for Mobility and 85% between the same weeks for Trends). From the weeks of March 1 to June 7, they each had approximately a 65% decline in activity and from the weeks of March 1 to September 20, Mobility had a 50% decrease while search had a 58% drop.
There is no publicly-available mobility data from 2019, but Google Trends had about a 60% drop in search interest for “subway time” between the weeks I used for my November subway data. My calculated drop in subway entries was closer to 70%, but the search data still seems to be a useful signal.
Google Trends allows you to compare the relative popularity of search terms over a given time period. The [day, week or month] the term was most popular gets an Interest value of 100 and if there were no such searches, the value would be 0.
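To make the 0 to 100 scale concrete, here is a small Python sketch of that kind of peak-based indexing, plus the percent-decline calculation between two weeks. Google does not publish raw search counts, so the "raw" numbers here are invented and the function names are my own; this only illustrates the arithmetic, not Google's actual pipeline.

```python
def to_interest_index(raw):
    """Scale a series so its peak equals 100, rounded to whole numbers."""
    peak = max(raw) if raw else 0
    if peak == 0:
        return [0 for _ in raw]
    return [round(100 * value / peak) for value in raw]

def pct_decline(start, end):
    """Percent drop from start to end (start must be nonzero)."""
    return 100.0 * (start - end) / start

# Invented weekly volumes, only to show the scaling.
raw_weekly = [400, 200, 80, 140]
interest = to_interest_index(raw_weekly)
drop = pct_decline(interest[0], interest[2])
print(interest, drop)
```

Because every value is relative to the peak inside the chosen window, changing the date range changes the index values, but the percent decline between two given weeks stays roughly the same apart from rounding.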
You have to be very careful about how you utilize Google Trends though. I specifically used “subway time” because that is what someone might search if they want to know when the next (and nearest) train is arriving, whereas a search for “subway” may also reflect the newsworthiness of the subway system. Also, to be clear, many of the “Search Terms” I’m about to examine represent categories of similar searches that would lead to the same results. (For example: Grand Central Terminal also would include searches like Grand Central, Grand Central Station and GCT.)
So what about searches for specific locations in Manhattan that we have previously explored?
The subway stations or locations in more residential neighborhoods tended to have a smaller relative drop in search activity. Now, with search terms as a tool, I can explore trends in more specific types of activity in NYC. Let me begin with the most important activity, eating:
There are a couple of insights to be gleaned here. “Just Salad” and “Sweetgreen” are salad chains that are much more densely located near office buildings and thus they have probably been less frequented. But the appeal of salads may have declined during the pandemic. Most people would prefer to drown their sorrows in pizza or a chocolate chip walnut cookie from the local bakery. The “$15 salad” may have died in 2020.
I am a bit surprised that “Chipotle” interest in NYC has been generally even higher since April. Chipotle is ubiquitous enough that it is found all around the city and it offered free delivery at many points this year, so the lack of a strong decline may make sense. I would guess the spike during the week of Cinco de Mayo has to do with the special promotions that they ran. But this does indicate that a search term’s newsworthiness can partially distort how much of a signal its “Relative Interest” value may be for actual behavior.
But there is no denying that restaurants that relied on office workers suffered to a potentially unrecoverable extent.
Of course, it was a tough year all around the country for the restaurant business. But I’m still amused by the popularity of searching “how to cook:”
The onset of the pandemic was the only major spike in the last five years that did not coincide with Thanksgiving or Christmas. In fact, these holiday spikes were even larger in 2020. (The data comes from the week of January 3, 2016 through the week of December 20, 2020.)
But back to NYC, where the entertainment industry has also borne the brunt of the shutdown.
All Broadway shows will remain shuttered through at least May 30, 2021 and movie theaters are still closed as well. Theaters were allowed to reopen in the rest of New York state in late October. Maybe the spikes in Interest are slightly higher for movie tickets since NYC residents could travel to the suburbs if they really wanted to risk their health to watch a big screen.
So the big question becomes: what does this all mean for the future of New York City? And I can’t imagine anyone has a great answer to this. But it is not difficult to see that the city will look different in 2021 regardless of the trajectory of the pandemic.
I am most interested in the higher relative interest in “moving companies” in recent months compared to the end of 2019. While I’m sure NYC’s demise is greatly exaggerated (as it always is), there is no doubt that people are fleeing the city. (Hopefully that means rent prices will continue to go down but that is data to explore on another occasion.)
There are still so many questions that are impossible to answer as we close out 2020.
What percentage of workers will return to the office?
Will New York City’s population grow younger?
How many restaurants will be unable to survive and which neighborhoods will be most severely impacted?
Will subway ridership approach its pre-pandemic numbers?
Will New York City elect a mayor who can actually help find solutions to its pandemic-exacerbated issues?
The stories told by the data in 2021 are sure to be captivating.