I read TLDR, a fun, technically focused daily email newsletter that you can sign up for over at TLDR Signup. Today’s newsletter included an article by @MosquitoCapital titled I’ve seen a lot of people asking “why does everyone think Twitter is doomed?”. In it, this self-described “Semi-retired SRE. Crypto skeptic. I enjoy making things work.” speaks about the technical doom that is forthcoming for Twitter.
It was a very long list of 56 things that could technically go wrong in a big way for a tech company. However, it is simply sensational fodder from someone who is trying to be anonymous (no prior posts before this one), and many of the items read as if they came from someone who has worked at a scale vastly smaller than Twitter’s and who lacks a basic understanding of this type of scalable tech.
I used to talk about systems engineers and the fact that they fall into basically three categories. The first has a sense of calm when things are perilous because they’re simply blind and ignorant to the threat. The second is stressed and on edge (which is what this article reminds me of) because they are very aware of the threat, and it puts them into firefight mode. The third is the veteran, the one who has a deeper understanding of things, who stands back, says “hold my beer,” and then goes to work.
I spent some time in Wildland Firefighting and saw this occur there as well. You have the rookie who is just thrilled to be a firefighter, but can stumble into a very deadly situation completely unknowingly. There is the captain or supervisor who knows the risk and is often far more edgy. And then there is the overhead who sees this 1 million acre fire and is like, “okay, another day at the office.”
It seems like this Mosquito Capital is perhaps a worker bee who does care and wants to work hard, but sees the boogeyman around every corner. Perhaps they are accustomed to having to justify their department or their budget to upper management.
Here are some examples of their ignorance:
Physical issue with the network takes down a DC. I gather Twitter is primarily on-prem, and I’ve seen what happens when a tree knocks out a critical fiber line during a big news event. Twitter Post (link)
Clearly, they have never seen the scale and setup of the data centers that Twitter uses. A decade ago I was inside a Twitter data center, which they outsourced. I imagine they’re either still using the same DC (likely) or have more robust ones, but in no way would it be less than what it was a decade ago. That data center was fully N+2: two different utility companies provided power, backed by N+2 generators and UPS systems, and all of that came into each rack from completely redundant power sources. For data (internet), they had 4 different providers, all coming into the building over fiber, and two of those were complete fiber loops running north-south. Even their AC system was N+2. For the record, N+1 is generally considered the gold standard, so this had extra redundancy. Physical security was far better than anything you’ve seen in any movie. Nothing short of a major physical event (a plane going into the building, a fire, a massive earthquake) would really impact this venue. So a “tree” into a fiber line is such an elementary statement that it is laughable.
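To give a feel for why the extra spare matters, here is a rough sketch of how availability changes from N to N+1 to N+2. The numbers are made up for illustration (4 units needed, each up 99% of the time), they are not Twitter's or that data center's figures, and the math assumes units fail independently, which real power and cooling gear does not always do.

```python
from math import comb


def prob_at_least(needed: int, installed: int, unit_availability: float) -> float:
    """Probability that at least `needed` of `installed` units are up.

    Assumes every unit fails independently with the same availability,
    which is a simplification of how real equipment behaves.
    """
    p = unit_availability
    return sum(
        comb(installed, k) * p**k * (1 - p) ** (installed - k)
        for k in range(needed, installed + 1)
    )


# Hypothetical figures, chosen only for illustration.
NEEDED = 4                 # the "N": units required to carry the full load
UNIT_AVAILABILITY = 0.99   # assumed availability of a single unit

for spares in (0, 1, 2):
    installed = NEEDED + spares
    availability = prob_at_least(NEEDED, installed, UNIT_AVAILABILITY)
    print(f"N+{spares}: {availability:.6f}")
```

Under these assumptions each added spare cuts the chance of falling below N working units by a large factor, which is the whole point of paying for N+2 instead of N+1.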
Data loss again. Do you have read-only backups? A bad actor could absolutely try to wipe all your disks. Can they damage your tape drives from software? How long will it take to restore the entire site if everything gets wiped? (Hint: Probably weeks. Best case.) Twitter Post (link)
Again, this shows a basic lack of understanding of real IT; stuff like this makes me think of a few of my colleagues back in the day. Even then, my medium-sized clients who used tape drives used off-site storage (Datavault, I believe, was the company). They’d send a courier daily to pick up our encrypted backup tapes and take them off-site for storage. For the codebase and essential business data (not the Twitter posts themselves), I would expect that they’re still doing this, along with distributed storage of the codebase, probably some form of private Git repository cloned across multiple DCs.
When we combine the thoughts from both of these examples, and take “entire site” at face value as he said, I am confident that if you physically destroyed the actual DC I was speaking about, the following is what would actually take place:
- Only users in that physical geographical area would be affected.
- A fractional amount of tweets for that day (0.0000001%) would be irrecoverably lost: basically, anything that was posted from that geographical area and had not yet been sent off-site (we’re talking about the number of posts sent within a handful of milliseconds).
- Active users, if they noticed at all, would think they had lost their internet connection and would need to force a reload.
- Access for those users might be a bit slower because now they need to contact a datacenter that is 20ms further away.
In short — nothing meaningful. Certainly not weeks.
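For the lost-tweets bullet, a back-of-the-envelope sketch shows how the "handful of milliseconds" reasoning turns into a tiny fraction of a day's posts. The region share and replication lag below are hypothetical placeholders, not known Twitter figures, and the sketch assumes posts arrive at a uniform rate across the day.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def fraction_lost(region_share: float, replication_lag_ms: float) -> float:
    """Fraction of one day's posts written in the lost region during the lag window.

    Assumes posts arrive at a uniform rate over the day, so the share of
    posts inside a time window equals the window's share of the day.
    """
    return region_share * replication_lag_ms / MS_PER_DAY


# Hypothetical inputs, chosen only for illustration.
REGION_SHARE = 0.10        # assumed share of all posts served by the lost DC
REPLICATION_LAG_MS = 5.0   # assumed delay before a post is copied off-site

lost = fraction_lost(REGION_SHARE, REPLICATION_LAG_MS)
print(f"fraction of the day's posts at risk: {lost:.2e} ({lost * 100:.2e} %)")
```

The exact figure depends entirely on the assumed lag and region share, but any lag measured in milliseconds keeps the result many orders of magnitude below a single percent.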
Now, I chose those two because they stood out to me, but I used a random number generator to choose the following few predictions.
3) Bad code push takes the site down. Preventing this was my day job, and I can tell you that it’s one of the scariest scenarios for any SRE team, much less a completely understaffed and burnt-out one.
Yes, code review and the implications of such. While I do not know what safeguards Twitter has in place, I have seen all sorts of variants of this. I do know that there are good ways to safeguard against this, as well as lazy ways. I also know that a big company that “ought to know better” doesn’t always do so. I was coding my own authentication systems for the web back in the late 90s, and even then I knew about the perils of storing credentials in plain text. Yet in 2012 (over a decade later), Yahoo (then a mega company) had a security breach, and it was discovered that they stored user passwords in plain text. They should never have done that, but they did. Other big companies have done things against industry standards, because “people do people things,” and laziness is often one of the biggest culprits. Even code reviewers can be the weak link. This is essentially what happened with the SolarWinds hack that came to light in late 2020.
However, I believe the emphasis here was on the “understaffed and burnt-out” aspect. Even if Twitter were to cut indiscriminately across the board, everything would downsize together, from developers to code review: if you’ve got 50% fewer coders, then there is less to review. But the reality is that any organization will choose to make logical downsizing choices. They’ll cut more non-critical roles than critical ones.
48) Employee burnout. I know this seems silly in a vacuum, but trust me, this shit is real. Look up the HBR series on it. You will lose critical employees. People you can’t just buy back. No amount of money. People you can’t afford to lose. Not lazy people. Good ones. Great ones.
This is one that I can agree with, but any business supervisor worth their salt should already be anticipating it. Now more than ever, leadership knows that there is employee turnover, and it has been greater than ever in the last few years. Yes, the uncertainty of Twitter’s future may accelerate it and have a compounding effect. Hard workers and knowledgeable people will be lost. BUT for a company the size and scale of Twitter, the only reason this would have disastrous consequences is that the prior leadership was asleep at the wheel. At this scale, there should be nobody who isn’t replaceable. Everything should be well documented and standardized, with systematic approaches to prevent this from being a big deal.
Again, this speaks to the author perhaps having the mindset of “they’ll be sorry when I leave.” But the reality is that they’ll be glad you left with your toxic mindset. If you are not ensuring that you’re replaceable, then every additional day you work for the company hurts the company rather than helping it. They’d be better off with someone who is ensuring the long-term success of the company, not just their own “information is power” mindset. If you think that someone else couldn’t do your job, you are part of the problem, and the faster you’re gone, the better.
Even in extremely skilled labor this holds. For example, there is a handmade custom boot company in Washington called Whites Boots. After many, many years, their master bootmaker decided to leave and start his own bootmaking company, called Nicks (I own a pair of their boots and love them!). Today both Whites and Nicks are going VERY strong. Even losing their most talented master bootmaker, who had incredible knowledge and skills, did not result in Whites going out of business, because they ensured that he was training up his replacement. And as fate would have it, many years later, Nicks handed the reins over to a new master bootmaker, Frank, who, you guessed it, went on to start up his own custom boot company. Today neither Nick nor Frank is making boots for the company, but all three companies are thriving, because they all knew that they could not rely on a single craftsman.
38) Oops! You didn’t hire a content moderation team. Your site is full of very nasty stuff. Everyone leaves because it’s so unpleasant, or (worse for you personally) you get dragged into court for breaking all kinds of decency, piracy, and privacy/harassment laws.
My best guess is that the author is presuming that Musk would fire all of the moderators; otherwise, he is just throwing out a bunch of fear-mongering, because Twitter does have a moderation team. This sort of concern isn’t novel, but rather routine. And it does not take a multi-million/billion dollar company to solve these sorts of things: somehow Craigslist resolved this back in 2006 without massive company bloat. In fact, the practice that Craigslist was sued over under the Equal Housing Protections (violation of fair housing law) continues to take place every day on Facebook, although Facebook did end up in trouble with the DOJ over targeting ads in a discriminatory way. Legal moderation is very difficult and nuanced, and I don’t know of a single tech company that is doing it well across the board. A big part of that is because our laws regarding such things are outdated and do not take into account the differences between traditional means of communication (face-to-face, in print, etc.) and online platforms, so there are many open questions about what might be considered illegal discrimination, slander, harassment, privacy, etc. In the end, Twitter will have to do what every other tech company is doing, which is basically to continue to play whack-a-mole as issues come up. The number of employees working on this issue will likely continue to grow over time. There is some very good information about how Twitter handles Rules Enforcement (link).
Now, I do believe that there is a very real threat to employees who are involved in moderation beyond the scope of illegal activity, looking specifically (but not exclusively) at “false news” type moderation. There is no legal precedent for this, nor for the discretionary banning of users. For example, I’ve yet to see a legitimate argument for the banning of Trump. NOW, on one hand, I do believe some of his tweets were troublesome and definitely controversial. However, any violations of law or the terms of service that people point to as “legitimate reasons” (of which there are several!) fall flat, because those exact same terms or laws are broken by others on the same platform who also have massive followings, yet are permitted to keep their accounts. For those employees, if still at Twitter, their days at the bird are likely numbered, and they should start looking for work, because I believe that under the current leadership there will be a return to parity and equality under the law. Either it is a censored platform or a free speech platform. Discussions of how “hate speech” is handled will likely center on equal protection, not biased protection, which, unfortunately, is far more difficult than one presumes.
For further reading here is a very interesting article about fake news from 2018: Huge MIT Study of Fake News.
17) Another country is telling you that they want all of your data on their users stored on servers in their country. Do you have policy experts for that country? Do you have a lot of *very* motivated lawyers? Do you have an infra eng who knows how to partition your data just so?
One thing is sure: Musk has a love/hate relationship with lawyers, but it’s clear that he knows he needs them. Again, this is a situation that presumes (1) mass firings of such people; and (2) that the infrastructure to handle such requests does not already exist. I believe that these types of requests are commonplace, and this is not new. In the second half of 2018 alone, “Twitter received 6,904 requests for information on 11,112 accounts from authorities across the globe”. (source). In the most recent figures as of today, for the second half of 2021, that number is over 11k requests for the 6-month period. (source). For that same period, here is a screenshot from Twitter regarding the info requests from India alone.
All of this shows that this is something Twitter has been doing and is likely very adept at handling. They were not this good to begin with, but evolved to be so. This will not change, because even if Musk unilaterally fired all the lawyers, they’d end up hiring a bunch of new ones. This isn’t a problem that goes away, but it also isn’t an unknown risk. Yes, there are often lawsuits, with both wins and losses (mostly losses) for tech companies, but again, this isn’t news or unique. It has been something of a losing battle for tech companies all along.
One thing that I cannot get away from is the article’s overwhelming sense of catastrophic failure. If this person had any real experience, they would know that the biggest threat is the subtle, undetected, small things that turn big. The biggest threat isn’t a data center going down because of a tree branch, nor is it a bad code push that causes an immediate break in the system. Rather, it is the employee who leaves a backdoor open, the code push that seems benign but really has a hidden payload for future use, the undocumented API with a vulnerability that leaks information. It is the legacy code that hasn’t been refactored or gone through a modern code review (e.g., why Yahoo was still storing clear-text passwords). Those threats have always existed, and they are a threat to any tech company; only now there are perhaps more people who are tempted to do improper things because of their fear of job loss. Our focus should be equally on those employees who are the actual threat, the potential perpetrators. Yes, you might be the victim of termination, but retaliation is not therefore justified.
In conclusion, I look at this list of 56 concerns, and the issues they bring up are things that should be considered, but in most cases they are no more a threat under new leadership than under past leadership. If you leave comments below, I’d be happy to go over any of the other 50 things that I didn’t already discuss. Overall, this simply seems to be a well-meaning but extremely uninformed individual who does not have sufficient experience working with technology or businesses at this sort of scale, and certainly no insider knowledge of business acquisitions and mergers. It is either a jaded ex-Twitter employee who was very junior, or perhaps someone else from the industry who is concerned but lacks the actual experience to understand that everything isn’t on fire. In other words, this person is hardly an authority on this subject, but is just bloviating on fear instead of facts and reality.
This article reeks of the sentiments of a person who is trying to justify their own job by saying “this place would fall apart if it wasn’t for me.” The reality is that it will certainly not be the same without you, but the real question is: will it be different in a better or worse way? In every situation where I’ve encountered people with this attitude, it was always a LONG TERM WIN for them to leave. The short term might have been difficult, but overall it was a benefit that they left, and sooner would have been better.

Jason Olson