A new and innovative way for Google to kill your SaaS startup

If you are here in a panic because Google Safe Browsing has blacklisted your website or SaaS, skip ahead to the section describing how to handle the situation. There are also a lot of very interesting comments on the Hacker News comments page.

In the old days, when Google (or any poorly tuned AI that Google unleashed) decided it wanted to kill your business, it would usually resort to denying access to one of its multiple walled gardens, and that was that. You've probably heard the horror stories:

I swear I have already checked the FAQ!

They all fit the same mold. First, a business, by choice, uses Google services in a way that makes its survival entirely dependent on them. Second, Google, being the automated behemoth that Google is, does its thing: it ever so slightly adjusts the position of its own butt on its planet-sized leather armchair, and, without really noticing, crushes a myriad of (relatively) ant-sized businesses in the process. Third, and finally, the ant-sized businesses desperately try to inform Google that they are being crushed, but they can only reach an automated suggestions box.

Sometimes, the ant-sized CEO knows a higher-up at Google because they were college buddies, or the CTO writes an ant-sized Medium post that somehow makes it to the front page of the Hacker News mound. Then Google notices the ant-sized problem and sometimes deems it worthy of solving, usually for fear of regulatory repercussions that the ant revolution might entail.

For this reason, conventional ant-sized wisdom dictates that if possible, you should not build your business to be overly reliant on Google's services. And if you manage to avoid depending on Google's multiple walled gardens to survive, you will probably be OK.

All this flat blue surface with a cool red roof thing! So convenient!

In today's episode of "the Internet is not what it used to be", let's talk about a fresh new avenue for Google to inadvertently crush your startup that does not require you to use Google services in any (deliberate) way.

Did you know that it's possible for your site's domains to be blacklisted by Google for no particular reason, and that this blacklist is not only enforced directly in Google Chrome, but also by several other software and hardware vendors? Did you know that these other vendors synchronize this list with wildly variable timings and interpretations, in a way that can make fixing any issues extremely stressful and unpredictable? Did you know that Google's ETA for reviewing a blacklist report, no matter how invalid, is measured in weeks?

This is now your website or SaaS application

This blacklist "feature" is called Google Safe Browsing, and the image here depicts the subtle message your users will see if one of your domains happens to be flagged in the Safe Browsing database. Warning texts range from "deceptive site ahead" to "the site ahead contains malware" (see here for a full list), but they all share an equally scary red background design, and a borderline impossible UI for people to skip the warning and use the site anyway.

The first time we experienced this issue, we learned about it from a surge of customer reports that said that they were seeing the red warning page when trying to use our SaaS. The second time, we were better prepared and therefore had some free time to write this post.

For context, InvGate (our company) is a SaaS platform for IT departments that runs on AWS with over 1000 SME and enterprise customers, serving millions of end users. This means our product is used by IT teams to manage issues and requests from their own users. You can imagine the pleasant reaction of IT Managers when suddenly their IT ticketing system starts displaying such ominous security warnings to their end users.

When we first bumped into this problem, we frantically tried to understand what was going on and learn how Google Safe Browsing (GSB from now on) worked, while our technical support team tried to keep up with customers reporting the issue. We quickly realized that an Amazon CloudFront CDN URL that we used to serve static assets (CSS, JavaScript and other media) had been flagged, and this was causing our entire application to fail for the customer instances that were using that particular CDN. A quick review of the allegedly affected system showed that everything appeared normal.

While our DevOps team was working in full emergency mode to get a new CDN set up and preparing to move customers over onto a new domain, I found that Google's documentation claims that GSB provides additional explanations about why a site has been flagged in the Google Search Console (GSC from now on) of the offending site. I won't bore you with the details, but in order to access this information, you have to claim ownership of the site in GSC, which requires you to set up a custom DNS record or upload some files onto the root of the offending domain. We scrambled to do exactly that and after 20 minutes, managed to find the report about our site.

The report looked something like this:

That's… not particularly useful.

The report also contained a "Request Review" button that I promptly clicked without actually taking any action on the site, since there was no information whatsoever about the alleged problem. I filed for a review with a message noting that there were no offending URLs listed, despite documentation indicating that example URLs are always provided by Google to assist webmasters in identifying issues.

Great! Requesting a review of an invalid report can cause my future reviews to be even slower.

Around an hour later, and before we had finished moving customers out of that CDN, our site was cleared from the GSB database. I received an automated email confirming that the review had been successful around two hours after the fact. No clarification was given about what caused the problem in the first place.

Over the week that followed this incident, and despite having had our URL cleared from the Safe Browsing blacklist, we continued to receive sporadic reports of companies having trouble accessing our systems.

Google Safe Browsing provides two different APIs for both commercial and non-commercial software developers to use the blacklist in their products. In particular, we identified that at least some customers using Firefox were also running into issues, and both antivirus/antimalware software and network-wide security appliances from customers were also flagging our site and preventing users from accessing it many days after the issue had been resolved.
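To make the API side of this a little more concrete, here is a minimal sketch of how you might ask one of those APIs about your own URLs. It assumes the public Safe Browsing Lookup API (v4) and its `threatMatches:find` method as I understand them from Google's documentation; the client ID, the list of threat types and the example URL are illustrative choices of mine, not something taken from our production setup. Keep in mind that a clean answer here does not guarantee that every browser or security appliance agrees at that moment, since each vendor syncs the list on its own schedule.

```python
import json
import urllib.request

# Endpoint of the Safe Browsing Lookup API (v4).
LOOKUP_ENDPOINT = "https://safebrowsing.googleapis.com/v4/threatMatches:find"


def build_lookup_request(urls, client_id="example-monitor", client_version="0.1"):
    """Build the JSON body for a threatMatches:find call.

    client_id and client_version are arbitrary labels chosen for this sketch.
    """
    return {
        "client": {"clientId": client_id, "clientVersion": client_version},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": u} for u in urls],
        },
    }


def flagged_urls(response_body):
    """Return the sorted, de-duplicated URLs listed under "matches".

    An empty JSON object means the API reported no matches.
    """
    matches = response_body.get("matches", [])
    return sorted({m["threat"]["url"] for m in matches})


def check_urls(urls, api_key):
    """Send the request over the network (needs a valid API key)."""
    body = json.dumps(build_lookup_request(urls)).encode("utf-8")
    req = urllib.request.Request(
        f"{LOOKUP_ENDPOINT}?key={api_key}",
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return flagged_urls(json.loads(resp.read().decode("utf-8")))
```

Running something like `check_urls(["https://eucdn.company.net/"], api_key)` on a schedule for every production domain would give you one more early-warning signal, at the cost of yet another dependency on Google.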

We continued to move all the customers off the formerly blacklisted CDN and onto a new one, and the issue was therefore resolved for good. We never properly established the cause of the issue, but we chalked it up to some AI tripping on acid at Google's HQ.

My 2 cents: If you run a SaaS business with an availability SLA, getting flagged by Google Safe Browsing for no particular reason represents a very real risk to business continuity.

Sadly, given the oh-so-Googly opacity of the mechanism for flagging and reviewing sites, I don't think there is a way you can fully prevent this from happening to you. But you can certainly architect your app and processes to minimize the chances of it happening, lower the impact of actually being flagged, and minimize the time needed to circumvent the issue if it arises.

Here are the steps we are taking, and which I therefore recommend:

  • Don't keep all your eggs in one basket, domain-wise. GSB appears to flag entire domains or subdomains. For that reason, it's a good idea to spread your applications over multiple domains, as that will reduce the impact of any single domain getting flagged. For example: company.com for your website, app.company.net for your application, eucdn.company.net for customers in Europe, useastcdn.company.net for customers on the US East Coast, etc.
  • Don't host any customer generated data in your main domains. A lot of the cases of blacklisting that I found while researching this issue were caused by SaaS customers unknowingly uploading malicious files onto servers. Those files are harmless to the systems themselves, but their very existence can cause the whole domain to be blacklisted. Anything that your users upload onto your apps should be hosted outside your main domains. For example: use companyusercontent.com to store files uploaded by customers.
  • Proactively claim ownership of all your production domains in Google Search Console. Doing so won't prevent your site from being blacklisted, but you will get an email when it happens, which will allow you to react quickly. Claiming ownership takes a little while, and that is precious time when you are dealing with an incident of this sort that is impacting your customers.
  • Be ready to jump domains if you need to. This is the hardest thing to do, but it's the only effective tool against being blacklisted: engineer your systems so that their referenced service domain names can easily be modified (by having scripts or orchestration tools available to perform this change), and possibly even have alternative names available and standing by. For example, have eucdn.company2.net be a CNAME for eucdn.company.net, and if the primary domain is blocked, use a tool to update the configuration of your app so that it loads its assets from the alternate domain.
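The last recommendation is the most abstract one, so here is a minimal sketch of what "being ready to jump domains" could look like inside an application. It is not our actual implementation: the `AssetDomains` class and its method names are hypothetical, and the hostnames are the example ones from the list above. The idea is simply that asset URLs are never hard-coded, but built from an ordered list of candidate hostnames that a script or an operator can update.

```python
from dataclasses import dataclass, field
from urllib.parse import urlunsplit


@dataclass
class AssetDomains:
    """Ordered candidate hostnames for one logical service (e.g. the EU CDN)."""

    candidates: list
    blocked: set = field(default_factory=set)

    def active(self):
        """Return the first candidate that has not been marked as blocked."""
        for host in self.candidates:
            if host not in self.blocked:
                return host
        raise RuntimeError("no unblocked domain left for this service")

    def mark_blocked(self, host):
        """Record that a hostname was flagged, so it is skipped from now on."""
        self.blocked.add(host)

    def asset_url(self, path):
        """Build an HTTPS URL for an asset path on the active hostname."""
        return urlunsplit(("https", self.active(), path, "", ""))


# Primary hostname first, standby alias second.
eu_cdn = AssetDomains(["eucdn.company.net", "eucdn.company2.net"])
print(eu_cdn.asset_url("/static/app.css"))

# During an incident, a script or an operator flips the switch.
eu_cdn.mark_blocked("eucdn.company.net")
print(eu_cdn.asset_url("/static/app.css"))
```

In a real deployment the blocked set would probably live in shared configuration rather than in memory, so that every application instance switches at the same time.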

If one of your domains has already been flagged, here's what I would recommend:

  • If you can easily and quickly switch your app to a different domain name, that is the only thing that will reliably, quickly and pseudo-definitively resolve the incident. If possible, do that. You're done.
  • Failing that, once you identify the blocked domain, review the reports that appear on Google Search Console. If you had not claimed ownership of the domain before this point, you will have to do it right now, which will take a while.
  • If your site has actually been hacked, fix the issue (i.e. delete offending content or hacked pages) and then request a security review. If your site has not been hacked or the Safe Browsing report is nonsensical, request a security review anyway and state that the report is incomplete.
  • Then, instead of waiting in agony, assuming that downtime is critical for your system or business, get to work on moving to a new domain name anyway. The review might take weeks.

The second time around, months after the first incident, we received an email from the Search Console warning us that one of our domains had been flagged. A few hours after this initial email report, being a G Suite domain administrator, I received another interesting email, which you can read below.

The "sc" in sc-noreply@google.com stands for "Search Console"

Let me summarize what that is, because it’s quite mind blowing. This email refers to the Search Console blacklist alert emails. What this second e-mail says is that G Suite’s automated phishing e-mail filter thinks Google Search Console’s email about our domain being blacklisted is fake. It most certainly is not, since our domain was indeed blacklisted when we received the email. So Google can’t even decide whether its own email alerts about phishing are phishing. (LOL? 🤔)

It's very clear to anyone working in tech that large corporate technology behemoths are, to a great extent, gatekeepers of the Internet. But I tend to interpret that in a loose, metaphorical way. The Safe Browsing incident described in this post made it very clear that Google literally controls who can access your website, no matter where and how you operate it. With Chrome having around 70% market share, and both Firefox and Safari using the GSB database to some extent, Google can, with the flick of a bit, single-handedly make any site virtually inaccessible on the Internet.

This is an extraordinary amount of power, and one that is not suitable for Google's "an AI will review your problem when and if it finds it convenient to do so" approach.
