Why I’m Leaving San Francisco (meta study)

we’ve all read at least 2 or 17 blog posts by someone who is leaving San Francisco for greener pastures.

sadly i failed to write my own retrospective when i left in 2016 after a year of working in venture capital, then a portfolio startup, and then founding my own company.

but recently these posts have become… overplayed? obligatory? even self-loathing at times, in that authors seem to feel guilty about writing them.

this weekend i decided to explore what’s happening through a Very Unofficial and Statistically Insignificant meta study. here goes.

methodology

first i found public blog posts written by someone either in the process of moving from San Francisco, or who had recently moved. no “my friend moved” hearsay or “i think i want to move” speculation.

to find these posts i used Google’s site: search operator to dig into 3 popular publishing platforms: Medium, WordPress.com, and Quora.

site:medium.com "leaving san francisco"

i used a free Chrome extension, MozBar, to download 24 pages of results for keywords including:

  • leaving san francisco
  • left silicon valley
  • goodbye san francisco
  • moving away from silicon valley
  • etc etc
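
a minimal sketch of how those queries could be generated rather than typed by hand. the keyword list is copied from above; the quora.com domain and the site_queries helper are assumptions for illustration, not something from the original workflow.

```ruby
# platforms named in the post; the quora.com domain is an assumption
PLATFORMS = %w[medium.com wordpress.com quora.com].freeze

KEYWORDS = [
  "leaving san francisco",
  "left silicon valley",
  "goodbye san francisco",
  "moving away from silicon valley"
].freeze

# build one exact-phrase site: query per platform/keyword pair
def site_queries(platforms, keywords)
  platforms.product(keywords).map do |site, phrase|
    %(site:#{site} "#{phrase}")
  end
end

queries = site_queries(PLATFORMS, KEYWORDS)
puts queries
```

3 platforms x 4 keywords gives 12 queries, each in the same shape as the example above.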

for good measure i swam into the deep end of search results, aka Pages 5-10, for self-hosted posts like tech-bro.com/ciao-sf.

sample size

after merging ~250 results into a Google Sheet i confirmed which stories were relevant by manually clicking each link. some posts were just vacation recaps or students reflecting on a semester abroad.

after scrubbing we hit n=137. for a copy of all the data go here.
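
since the same post can surface under several keywords, a merge step probably wants to dedupe by url. a rough sketch, assuming rows of [keyword, url] pairs; the example.com urls and the normalize helper are hypothetical, and the relevance check itself was still done by hand.

```ruby
require "uri"

# hypothetical rows: [keyword, url] pairs as exported from each search
rows = [
  ["leaving san francisco", "https://example.com/ciao-sf?source=search"],
  ["goodbye san francisco", "https://example.com/ciao-sf"],
  ["left silicon valley",   "https://example.com/so-long-valley/"]
]

# normalize a url so trivially different copies compare equal:
# drop the scheme and query string, lowercase the host, trim a trailing slash
def normalize(url)
  uri = URI.parse(url)
  "#{uri.host.downcase}#{uri.path.chomp('/')}"
end

# keep the first row seen for each normalized url
merged = rows.uniq { |_keyword, url| normalize(url) }
puts merged.length
```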

leaving silicon valley metadata

before we can analyze “why i’m leaving San Francisco” stories for interesting patterns, we need more context.

i decided to grab a few things from each post:

  • title
  • meta description
  • language
  • published at (timestamp)
  • word count

oh, and every single word, minus “5.3k claps” or “182 comments” type junk.
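
a small sketch of that cleanup plus the word count, assuming the junk looks like a number followed by “claps” or “comments.” the JUNK pattern and both helpers are hypothetical; real pages will need more patterns than this.

```ruby
# hypothetical pattern for engagement junk like "5.3k claps" or "182 comments"
JUNK = /\b\d+(?:\.\d+)?k?\s+(?:claps?|comments?)\b/i

# remove the junk, then collapse the leftover runs of spaces
def clean_text(raw)
  raw.gsub(JUNK, " ").squeeze(" ").strip
end

# count whitespace-separated words
def word_count(text)
  text.split.length
end

sample = "goodbye san francisco 5.3k claps it was fun 182 comments"
cleaned = clean_text(sample)
puts cleaned
puts word_count(cleaned)
```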

to do this i first considered HTML fetching and parsing libraries like Mechanize and Nokogiri, but Quora and increasingly Medium use JavaScript to plant elements, e.g. article timestamps, after DOM ready.

Ruby Watir FTW. i wrote a ScraperOfLove utility that spun up a headless browser, visited each url, extracted the attributes above, and spit them out into a new CSV (“stories_clean” tab in data dump).

assumptions

for White Claws and giggles i made 3 predictions before crunching the numbers:

  1. people leave San Francisco because it’s expensive, dirty, and dangerous
  2. people leave because they get burned out at work
  3. people leave because they fail to build a big company

take a moment and make your own predictions before moving on, or just keep reading because this is not that serious.

stats about leaving San Francisco

i don’t know anything about math. let’s start with a few basics.

“custom” == custom website, e.g. a self-hosted WordPress. i tried more CMS site searches like “ghost.io” but it appears they don’t index non-vanity URLs.

*2020 is a projection; n=6 stories published before February 3

this is unfair (correlation vs causation) because blogging, particularly on Medium, has exploded since 2016. still makes me snicker.

“keyword” == search query by which i found the post on Google. there is obviously some overlap here.

trend analysis

the basics suggest two things:

  1. people are increasingly leaving San Francisco
  2. Medium.com, an unofficial non-profit, is the best place to write “your truth” for maximum exposure

but what else can we infer? i want to know why people are leaving.

i combined all 127,252 words (raw here) and with another line of Ruby created a word frequency map:

{"the"=>5162, "to"=>3880, "and"=>3657, ...}
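
for the curious, a frequency map like that takes only a few lines. this is a sketch, not the original one-liner: the word_frequencies name and the tokenizing regex (lowercase letters and straight apostrophes only) are assumptions.

```ruby
# count how often each word appears, most frequent first
def word_frequencies(text)
  counts = Hash.new(0)
  # treat any run of letters or straight apostrophes as one word
  text.downcase.scan(/[a-z']+/) { |word| counts[word] += 1 }
  counts.sort_by { |_word, count| -count }.to_h
end

freq = word_frequencies("The city and the fog and the rent")
p freq
```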

to remove filler words i eyeballed a sorted version and extracted only those words with 5-125 occurrences. i plugged them into MagicCloud and created this totally useless illustration:

a better solution would be some ML or sentiment analysis that scans entire phrases. or at least some regex. but i ain’t got time fo’ dat.

so i threw the frequency map in a Google Sheet and looked for patterns the old fashioned way: with my bare hands (+ =SUMIF()).

i grouped my queries by theme:

  • cities – where are people going to, or coming from?
  • states – same as cities
  • regions – (Northwest, East Coast, etc)
  • titles – what roles did these people have?
  • topics – lifestyle preferences, current issues
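
the same two steps (keep words with 5-125 occurrences, then sum mentions per theme) can be sketched outside a spreadsheet. the THEMES keyword lists and every count except “the” are made up for illustration; the real groupings live in the sheet.

```ruby
# hypothetical theme => keyword lists, for illustration only
THEMES = {
  "cities" => %w[nyc portland austin],
  "topics" => %w[rent homeless commute]
}.freeze

# keep only words in the 5-125 occurrence band, dropping filler and one-offs
def mid_frequency(freq, range = 5..125)
  freq.select { |_word, count| range.cover?(count) }
end

# sum mentions per theme, the same idea as one =SUMIF() per keyword
def theme_totals(freq, themes)
  themes.transform_values do |words|
    words.sum { |word| freq.fetch(word, 0) }
  end
end

# made-up counts, except "the" which comes from the map above
freq = { "the" => 5162, "nyc" => 40, "portland" => 25,
         "rent" => 60, "commute" => 9, "unicorn" => 2 }
totals = theme_totals(mid_frequency(freq), THEMES)
p totals
```

words missing from the map simply count as zero, so a theme list can be longer than what actually shows up in the posts.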

here’s what that looks like… word on the left, # mentions on the right.

cities / states / regions
without Real Data Science it’s impossible to understand where people are moving to, but Portland’s disproportionate mention count && smaller population suggest San Franciscans have a soft spot for the “city where young people go to retire.”

NYC most popular mention by far
anecdotally the SFO > NYC move is quite popular, and exactly what i did in 2016. for the bored, my entire career: ATL > NYC > SFO > NYC > ATX.

jobs and roles
i was surprised at how seldom “junior” was written, e.g. “jr developer.” either employees stick around awhile before calling it quits on San Francisco, or they lie about their job title.

reasons people leave San Francisco

here we look at just the “topics” results from our word frequency map. after reading at least 20 of these posts as a normal human, i observed they usually go something like this:

  1. moved here to do Y
  2. Z happened which changed my Y
  3. tried A and then B but now i’m C
  4. going to try and do Y somewhere else

by consolidating the topics into categories i think we get a peek into narratives #2-3. here i’ve done just that.

check how i’ve grouped each topic into 1 of 4 categories here (visuals tab).

now, leaving SF on account of it being expensive is obviously a sound reason. but according to Investopedia, NYC is still the #1 most expensive city in the USA. yet with NYC as the leading “city” mention in our story collection, it sort of makes you wonder…

if you’re leaving San Francisco because it’s expensive, why would you move to NYC, where it’s even more expensive?

this didn’t sit right with me.

caveats to “goodbye San Francisco” stories

my personal opinion, backed by absolutely no data, is that social issues like homelessness, poor sanitation, harassment, and lack of safety are the real culprits behind San Francisco’s mass exodus. but when i tried to find this in the content (searching “piss” or “dirty”, for example) i got near nil.

maybe techies don’t want to speak poorly of a city that made their career. or maybe we’re in denial. you decide.

another insight from category aggregation is how often “politics,” and namely Trump, figured in someone’s written decision to leave San Francisco.

by no means am i claiming anyone needs to like the president, but California and San Francisco in particular are run by some of the most progressive democrats in our nation’s history. thus i’m curious how conservatives or Trump have anything to do with one’s departure.

finally, failure. i grouped “funding” and “unicorn” and “equity” (among others) into the success at work category. given 90% of startups fail, i find just 15% of leave-reasons being related to professional success a little low, slash unbelievable.

that said, i’ve also met engineers (more so than marketers) who’ve lived in the valley 10+ years, jumping from startup to startup, each running out of money on a near perfect 18 month schedule.

summary

if there’s one thing magical about San Francisco it is this: you can move there, fail over and over again, never “make it,” and never leave either.

if you are smarter than me and want to do something with the data, please go ahead.

to add an article i missed to the index, tweet at me. to complain about my arithmetic, leave a comment.