When it comes to jumping on trends, I think the sweet spot is that point where you get the benefits of being a relatively early adopter, but the assurance that comes from learning from those who have gone before you. That’s pretty much where we are with the trend of moving to an HTTPS-only world. At some point HTTPS might become so standard that it’s simply the price of admission for doing business on the web, but in the short term, I see it as a positive differentiator for your brand that might even give you a little extra SEO mojo.
Migrating your entire website from HTTP to HTTPS really isn’t that hard when it comes down to it. The redirect itself can be as simple as placing a couple lines of code. But getting it wrong can seriously hurt your search engine traffic, and even a well-conceived plan can hit a few snags as it rolls out. Migrating to HTTPS is definitely not something you want to rush into without careful planning.
Whether you’re ready to pull the trigger on redirecting your site from HTTP to HTTPS, or you’re just considering it and want to understand what’s involved, this guide will serve as a valuable resource.
HTTPS vs HTTP: What’s the Difference?
HTTP is the protocol used for communication on the web. HTTPS is HTTP over TLS, or Transport Layer Security. So HTTPS is the same basic protocol as HTTP, but with a secure, encrypted layer. TLS is the successor to SSL, or Secure Sockets Layer, so you may still hear a lot of people referring to SSL.
When a site is using HTTPS, you will see it right in the address bar of your browser. In addition to the URI beginning with “https,” you’ll often see a lock icon and perhaps even a colored bar (depending on the certificate type) indicating your connection is secure. Here’s what an HTTPS connection looks like in Chrome when I visit PayPal:
That green area with the lock icon tells you the site’s certificate is valid and its identity has been verified by a trusted third party. If you click on it you can find more detail in the Security Overview:
Are We Really Headed Toward “HTTPS Everywhere”?
A lot of people think HTTPS is overkill for informational sites that don’t handle transactions or transmit sensitive data, plus it can add additional cost and technical complexity, challenges that are hard to absorb for small businesses in particular. I think over time those concerns will be alleviated as HTTPS becomes more standard. In fact, here are some of the main reasons I think we’re going to see HTTPS pretty much everywhere within the next few years.
Google Is Pushing Hard For It
Google called for “HTTPS Everywhere” at Google I/O at the end of June, 2014, just a couple weeks before the announcement that HTTPS would be used as a ranking signal. There’s some good information in the video if you haven’t seen it, and Googlers Pierre Far and Ilya Grigorik clearly lay out Google’s case for going HTTPS:
Regarding the notion that HTTPS is only relevant for transactional sites or sites that otherwise transmit sensitive data, Grigorik, a developer advocate for the Chrome team, makes this point: “while it seems like, individually, the metadata that is available that you can gather by looking at these unencrypted sites is benign, when you actually put it all together it reveals a lot about my intent and it can actually compromise my privacy.” If you have any doubt about what a vivid picture can be painted by your seemingly innocuous web activity, try checking out Google Dashboard and My Activity and see for yourself how much it can reveal about your life.
Google has been beating the drum for moving to “HTTPS Everywhere” consistently since making the announcement at Google I/O 2014. I’ve recently even noticed Google going so far as to lend their credibility and resources to other product vendors in an effort to create an army of HTTPS Everywhere evangelists. The push has been effective, too. Dr. Pete Meyers at Moz suggests that Google is “fighting, and winning, a long war,” with more than 30% of page-1 search results now using HTTPS.
Secure Certificates Are Easy to Get…and Free
Let’s Encrypt is a new Certificate Authority (CA) that offers free secure certificates. They’re pushing to get the web to 100% HTTPS, and they’re making some pretty good progress. They’ve issued more than 5 million certificates since they launched in December 2015, helped in large part by large-scale deployments from companies like WordPress.com, Akamai, Shopify, DreamHost, and Bitly. The chart below shows their impressive growth during the first half of 2016.
Chrome Will Start Shaming Websites Into Migrating
Future versions of Chrome will likely draw attention to unencrypted websites by placing a red “x” over the little padlock icon in the address bar. Although there’s been no official announcement on this from Google, a Google employee (who wished to remain anonymous) told Motherboard that this feature would soon become the default in Chrome.
This is already a feature in Chrome, although it’s not set by default. If you want to see what that looks like, you can turn it on now by typing “chrome://flags” in your Chrome browser, navigating to “mark non-secure as,” and selecting “mark non-secure origins as non-secure.” I have this running in my browser now and I can tell you, small as it may be, it really catches my eye every time I go to a non-secure site. If this becomes the default, it will have an impact.
Dark Traffic Will Rapidly Increase for Non-HTTPS Sites
If a site is using HTTPS and it refers traffic to a site that’s not using HTTPS, the HTTP Referer header is lost (assuming the HTTPS site hasn’t taken specific measures to pass the info) and the site that receives the traffic has no idea where that traffic came from. This is called Dark Traffic, and it’s a growing problem for large publishers, in particular those who rely on content syndication for a significant portion of their traffic. It’s critically important for large publishers to identify the sources of that Dark Traffic. So much so that the Guardian notably created a sophisticated in-house analytics tool called Ophan to better attribute dark traffic to known sources.
The more sites that move to HTTPS, the darker the referrer data will become for non-HTTPS sites, which will create a domino effect that increases momentum toward HTTPS-enabled sites. Anecdotally, some of our publisher clients are seeing about 25% to 30% Dark Traffic, although recently we’ve seen large sites like Yahoo and LinkedIn take measures to open up their referrer data after moving to HTTPS themselves.
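To make the mechanics concrete, here’s a minimal Python sketch of the referrer behavior described above. It assumes the “no-referrer-when-downgrade” behavior, where the referrer is dropped only when moving from HTTPS to HTTP, and it models two of the policies a referring site can set to pass the info anyway. The function name and example URLs are my own, and real browsers also strip fragments and credentials, which this sketch ignores.

```python
from urllib.parse import urlsplit


def referer_seen_by_target(source_url, target_url,
                           policy="no-referrer-when-downgrade"):
    """Return the Referer value the destination receives, or None if dropped.

    Simplified model: only three referrer policies are handled.
    """
    src = urlsplit(source_url)
    dst = urlsplit(target_url)
    downgrade = src.scheme == "https" and dst.scheme == "http"

    if policy == "no-referrer-when-downgrade":
        # HTTPS -> HTTP loses the referrer: this is the "dark traffic" case.
        return None if downgrade else source_url
    if policy == "origin":
        # The referring site chooses to pass only its origin, even on downgrade.
        return f"{src.scheme}://{src.netloc}/"
    if policy == "unsafe-url":
        # The referring site passes the full URL regardless of protocol.
        return source_url
    raise ValueError(f"policy not modeled in this sketch: {policy}")
```

With this model, a link from an HTTPS page to an HTTP page returns None, while the same link to an HTTPS page returns the full referring URL, which is why moving to HTTPS recaptures that data.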
Even the U.S. Government is Going All-in with HTTPS
When I think of the U.S. Federal government, terms like “efficiency” and “early adopter” don’t immediately come to mind. But in June of 2015, the White House Office of Management and Budget ordered all government agencies to conform to the HTTPS-Only Standard directive, requiring that all publicly accessible Federal websites and web services only provide service through a secure HTTPS connection by the end of 2016. They even set up a website to provide complete transparency on their progress. At the time of this writing, it appears that about 45% of government websites have moved to HTTPS-only, so they still have a long way to go:
Heavyweights in the private sector are taking the HTTPS Everywhere trend seriously, too. Mozilla is phasing out non-secure HTTP, Apple is pushing all app developers to use HTTPS, and a growing list of companies are supporting initiatives like Encrypt All The Things, which have sprung up due to increasing concerns about government mass surveillance.
The Business Case for Moving to HTTPS
Trends that have support of the government and some big, influential companies are interesting, but you can’t take action on a resource-intensive initiative without having concrete business reasons for jumping on board. Here are some of the biggest reasons you should consider migrating to HTTPS:
- Security: The biggest reason to switch to HTTPS is to prevent malicious attackers from compromising sensitive information or performing other malicious acts. HTTPS ensures the servers involved in a communication are who they say they are, so it’s a good defense against “man-in-the-middle” attacks, ad injection malware, or traffic diversion code, for example.
- Be a Good Web Citizen: Building on the security point above, by making your own website more secure and removing easy targets for would-be hackers, you make malicious activities less lucrative. Over time, that makes the web a safer place for everyone.
- Privacy: By piecing together the various metadata generated by your users’ web activity, hackers (or perhaps government agencies) can develop a revealing profile of those users’ behaviors, intent, and habits. HTTPS makes that metadata much harder to collect.
- Recapture Referral Data: As more of the web moves to HTTPS, more of your referrer data will be lost and counted as direct traffic by default. Moving to HTTPS will allow you to recapture lost referral data from sites that have already moved to HTTPS.
- Brand Image: If trust is an attribute you want associated with your brand, displaying that little green bar in the address bar like PayPal could be important. If you’re a company selling an email encryption service, for example, demonstrating that you take security seriously on your website, as well as your products, will make a better impression with your current and prospective clients.
- Ranking Boost: Notice this is the last item I mention. Frankly, I’m not seeing a ton of data, either firsthand or through case studies, that shows evidence of a meaningful ranking boost as a direct result of moving to HTTPS. SearchMetrics did a study that showed a correlation between moving to HTTPS and positive rankings increases for 30,000 keywords they monitor, so I do believe the signal is there, lightweight as it may be. However, I’ve also seen a number of cases where sites are taking at least a temporary hit in organic search traffic post-migration. Nonetheless, Google has the ability to crank up the dial on this particular ranking signal as adoption increases, and they’re pushing pretty hard to get site owners to make the change. Bottom line: if you’re moving to HTTPS to get a ranking boost, just understand that the benefit may not be realized right away.
HTTPS Migration Checklist
The steps below should give you a good idea of the scope involved with moving your site to HTTPS, not to mention some peace of mind since it’s been used to plan HTTPS migrations on some large sites. I based the overall outline on the helpful document written by Chris Palmer from the Google Chrome team, with some additional sections and detail to ensure an SEO-friendly migration. You should also reference Google’s site move documentation, as much of that will apply here as well.
1. Get and Install Certificates
☑ Buy a 2048-bit TLS/SSL SHA-2 secure certificate from a Certificate Authority (CA)
☑ Generate a key pair and a certificate signing request (CSR) so that the CA can issue a signed certificate
☑ Send the CA what they need (your public key and certificate signing request)
☑ Install certificates on your servers
2. Enable HTTPS on Your Servers
☑ Configure your server for HTTPS. Check out these configuration tips for popular servers.
☑ Test that HTTPS is functioning properly using an external testing tool. Here’s a good one.
☑ Set a reminder to update your secure certificate before it expires.
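For the renewal reminder, here’s a small Python sketch that works out how many days are left on a certificate. It assumes you already have the certificate’s “notAfter” string in the format Python’s ssl module reports it (for example, from `getpeercert()` on an open TLS connection); the function name and the 30-day threshold in the usage note are just illustrations.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Days remaining before a certificate's notAfter date.

    not_after uses the ssl module's format, e.g. "Jan 15 00:00:00 2017 GMT".
    """
    # cert_time_to_seconds converts the certificate date to a Unix timestamp.
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).days
```

You could run something like this on a schedule and send yourself an alert when the result drops below, say, 30 days.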
3. Code & Configuration Changes
☑ Update site content to request https resources
☑ Update internal links to point to https pages or consider making internal links relative
☑ Use protocol relative URIs. Example: <script src="//example.com/script.js"></script> (see note below)
☑ Add self-referencing rel canonical tag to every page, pointing to your HTTPS URIs
☑ Change all Ad calls to work with HTTPS
☑ Update any internal tools, such as Optimizely or CrazyEgg, to work with HTTPS
☑ Update legacy redirects to eliminate chained redirects (see note below)
☑ Update OpenGraph, Schema, Semantic markup etc. to point to HTTPS
☑ Update social sharing buttons to preserve share counts
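To find content that still requests http resources, a rough Python sketch like the one below can scan a page’s HTML for hard-coded http:// references. It only looks at the src, href, and action attributes, which is my own simplification, so it won’t catch URLs inside CSS, inline scripts, or markup such as OpenGraph tags. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Attributes checked in this sketch; extend as needed for your templates.
URL_ATTRS = {"src", "href", "action"}


class InsecureReferenceFinder(HTMLParser):
    """Collects (tag, attribute, url) for every hard-coded http:// reference."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in URL_ATTRS and value and value.lower().startswith("http://"):
                self.found.append((tag, name, value))


def find_insecure_references(html):
    finder = InsecureReferenceFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

Note that protocol relative (//example.com/...) and https:// references are left alone, so the output is just the list of items that still need updating.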
4. Robots.txt, XML Sitemaps, Search Console and Analytics
☑ Create and verify a new property for the HTTPS site in Google Search Console
☑ Create a new XML sitemap file that points to your HTTPS URLs and upload it to the new property in Search Console
☑ Create a new robots.txt file for the HTTPS site and copy over all existing rules. Include a Sitemap link to the new HTTPS XML sitemap.
☑ Remove all rules from the HTTP robots.txt file, except for the Sitemap link, and leave it in place. This is to encourage bots to crawl and follow all redirects.
☑ Copy any existing disavow file and upload it to the new HTTPS property in Search Console
☑ If you’re a Google News publisher and use a News XML sitemap, I recommend updating your existing sitemap with the HTTPS URLs and notifying the Google News team of the change.
Note: Don’t use the “Change of Address” feature in Google Search Console. That’s used for migrations to new domains.
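The sitemap and robots.txt steps above can be sketched in Python as follows. This is only an illustration under a few assumptions: the sitemap contains nothing but URL entries, the robots.txt files are simple line-based text, and all function names and example URLs are made up for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def to_https(url):
    """Swap the scheme to https, leaving host, path, and query untouched."""
    return urlsplit(url)._replace(scheme="https").geturl()


def build_https_sitemap(urls):
    """Build a minimal XML sitemap whose <loc> entries all use HTTPS."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = to_https(url)
    return ET.tostring(urlset, encoding="unicode")


def _is_sitemap_line(line):
    return line.strip().lower().startswith("sitemap:")


def https_robots_txt(old_robots_txt, https_sitemap_url):
    """Copy all existing rules and point the Sitemap line at the HTTPS sitemap."""
    rules = [l for l in old_robots_txt.splitlines() if not _is_sitemap_line(l)]
    return "\n".join(rules + [f"Sitemap: {https_sitemap_url}"]) + "\n"


def http_robots_txt(old_robots_txt):
    """Drop every rule and keep only the Sitemap link(s) for the HTTP site."""
    kept = [l for l in old_robots_txt.splitlines() if _is_sitemap_line(l)]
    return "\n".join(kept) + "\n"
```

The HTTP version ends up with no blocking rules at all, which is the point: bots are free to crawl every old URL and follow its redirect.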
5. Redirect HTTP to HTTPS
☑ Deploy the redirect code
Redirect HTTP to HTTPS on IIS (7.x and higher)
Redirect HTTP to HTTPS on Apache
Redirect HTTP to HTTPS on Nginx
☑ Include exceptions to any global redirect directives for your existing robots.txt and XML sitemap files
6. Follow-Up (after the release)
☑ Use a tool, like SSL Check, to scan your site for non-secure content
☑ Check HTTPS redirects and legacy redirects to ensure they work correctly. Check for long redirect chains using a tool that captures the header responses (I like Redirect Path by Ayima). Check for proper redirect functionality from both www and non-www, with and without trailing slashes, etc.
☑ Use “Fetch as Google” tool and submit your Home page and other key pages to speed up the indexing process. I use the “Crawl this URL and its direct links” option.
☑ Monitor the Index Status report in Search Console. The HTTP property should eventually go to zero, and the HTTPS should increase. Take this a step further by calculating the indexation rates of each XML sitemap and monitor them over time.
☑ Monitor the Crawl Errors report in Search Console and address errors, as appropriate
☑ When most new (HTTPS) URLs are already indexed, remove the legacy sitemap link from robots.txt
☑ Update incoming links that are within your control to point to HTTPS (e.g., links to your site from social media profiles)
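For the redirect checks above, it helps to test every variant of a URL rather than just the one you expect visitors to use. Here’s a small Python sketch that generates the combinations of protocol, www/non-www, and trailing slash for a given page. The function name and the example host are hypothetical; you’d feed the resulting list into whatever header-checking tool you prefer.

```python
from itertools import product


def redirect_test_urls(host, path):
    """List the http/https, www/non-www, and trailing-slash variants of a URL."""
    bare = host[4:] if host.startswith("www.") else host
    hosts = (bare, "www." + bare)

    stripped = path.rstrip("/")
    # The root path has no "without trailing slash" variant.
    paths = (stripped, stripped + "/") if stripped else ("/",)

    return [
        f"{scheme}://{h}{p}"
        for scheme, h, p in product(("http", "https"), hosts, paths)
    ]
```

For a page like /blog/post this yields eight URLs, and ideally all of them resolve to the same single HTTPS URL in as few hops as possible.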
7. Turn on Strict Transport Security (HSTS)
☑ Once you’re absolutely sure the entire site is working with HTTPS, use HSTS to improve performance by ensuring the browser “remembers” to send all requests to your site to https based on a policy you set. Keep in mind that this means your site will only use HTTPS, so make sure it works! (see note below)
Consider Moving in Sections
There’s no reason you have to redirect your entire site to HTTPS all at once. For very large sites, it may be wise to redirect specific sections of your site and assess the impact before redirecting all pages. There will be slight differences in terms of your redirect rules, but it may be well worth taking the extra time to get a sense of how your search referrals will be impacted.
Protocol Relative URLs
I included Google’s recommendation to use protocol relative URLs above and I get why they recommend it, but I actually have a different recommendation for you to consider. Although relative URLs make web development easier by greatly reducing those mixed content messages and make it much easier to move your HTTPS staging site into production, it goes against the guidance I’ve given for a long time, which is to use full, absolute URLs. Why do I recommend that? Because so often I’ve encountered situations where a canonicalization issue on a given URL produces a domino effect for the links on the corresponding page since relative links can inherit whatever canonicalization issues appear in the address bar.
Additionally, Paul Irish, a front-end developer on the Google Chrome team, had this to say about protocol relative URLs:
Now that SSL is encouraged for everyone and doesn’t have performance concerns, this technique is now an anti-pattern. If the asset you need is available on SSL, then always use the https:// asset.
You’ll have to decide what’s the best approach for your situation, but I still feel there’s a strong argument to be made for using full, absolute URLs throughout your site.
In an ideal world, all redirects would only have a single hop. For sites with lots of legacy redirects, this might not be feasible. My recommendation is to handle the HTTPS redirect first since that affects all URLs. Then evaluate your legacy 301 redirects. Update legacy 301 redirects to point directly to HTTPS targets as resources allow. The highest priority redirects are any that would result in more than 3 hops as Google may stop following them. From there, I’d consider updating legacy redirects that were most recently added.
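As a rough illustration of that cleanup, the Python sketch below takes a map of legacy redirects (old path to new path) and rewrites each one to point straight at its final HTTPS destination, so no request has to pass through intermediate hops. It also counts hops so you can prioritize the longest chains. The data format, function names, and example origin are assumptions made for the example, not a description of any particular server’s configuration.

```python
def count_hops(redirects, source):
    """Number of legacy redirects followed before reaching a final path."""
    hops, seen, current = 0, {source}, source
    while current in redirects:
        current = redirects[current]
        if current in seen:
            raise ValueError(f"redirect loop involving {current}")
        seen.add(current)
        hops += 1
    return hops


def flatten_redirects(redirects, https_origin):
    """Point every legacy source directly at its final HTTPS URL (one hop)."""
    flat = {}
    for source in redirects:
        final = source
        for _ in range(count_hops(redirects, source)):
            final = redirects[final]
        flat[source] = https_origin + final
    return flat
```

Keep in mind that on the live site the blanket HTTP to HTTPS redirect adds one more hop on top of whatever `count_hops` reports, which is why the longest legacy chains are the ones to fix first.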
And here’s a very important point about implementing blanket redirects to HTTPS: If you choose to maintain separate robots.txt files for the HTTP and HTTPS versions of your site, or if you’re going to maintain your old XML sitemaps as described in the guidance above, you need to take those into account when creating your redirect rules. If you put a blanket redirect from HTTP to HTTPS at the server or CDN level, it would affect your robots.txt and XML sitemaps too, unless you take measures to exclude them. Here’s a good example of how to exclude these files from a global redirect via the htaccess file in Apache. Use that same basic principle to exclude these files using other redirect methods.
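Whatever redirect method you use, the logic boils down to something like this Python sketch: redirect everything on HTTP to the same URL on HTTPS, except for the files you’ve chosen to keep serving over HTTP. The excluded paths, function name, and example host here are placeholders; substitute your own robots.txt and sitemap locations.

```python
# Hypothetical paths to leave reachable over plain HTTP.
EXCLUDED_PATHS = {"/robots.txt", "/sitemap.xml"}


def https_redirect(scheme, host, path, query=""):
    """Return (301, location) if the request should be redirected, else None."""
    if scheme == "https":
        return None  # already secure
    if path in EXCLUDED_PATHS:
        return None  # serve the HTTP robots.txt and legacy sitemap as-is
    location = f"https://{host}{path}"
    if query:
        location += "?" + query
    return 301, location
```

For example, a request for http://www.example.com/robots.txt is served normally, while http://www.example.com/pricing?plan=pro gets a single 301 to the HTTPS version with its query string intact.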
HSTS can significantly improve performance of your HTTPS site by ensuring that browsers always use HTTPS for future visits. It can really cause problems, however, if you also need to be able to support regular HTTP requests. As Matt Cutts tweeted, if you turn on HSTS and “your website doesn’t serve only HTTPS, you’re going to have a bad time.” Also, if you have HSTS set to include subdomains, make sure your subdomains are actually using a secure certificate or Chrome will not let you access them.
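For reference, HSTS is turned on by sending a Strict-Transport-Security response header over HTTPS. Here’s a minimal Python sketch that builds the header value. The function name is my own, and the suggestion to start with a short max-age and raise it once you’re confident is a cautious approach I’d lean toward, not a requirement.

```python
def hsts_header(max_age_seconds, include_subdomains=False):
    """Build the Strict-Transport-Security header as a (name, value) pair."""
    directives = [f"max-age={int(max_age_seconds)}"]
    if include_subdomains:
        # Only add this once every subdomain has a valid certificate.
        directives.append("includeSubDomains")
    return "Strict-Transport-Security", "; ".join(directives)
```

Something like `hsts_header(300)` gives you a five-minute policy to test with, while `hsts_header(31536000, include_subdomains=True)` commits browsers to HTTPS for a year across all subdomains.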
Are You Ready to Move to HTTPS?
If you’re running a small WordPress site, most of the above considerations are probably overkill; you might be able to redirect your entire site to HTTPS in a matter of hours. But if you’re running a major website that’s critical to your organization’s mission, transitioning to HTTPS requires careful planning and resource allocation. I hope the above has given you a sense of what’s involved in the migration process, and I hope the HTTPS migration checklist above proves actionable and comprehensive. If you have any questions on any of the items I’ve included, or you feel I’ve missed an important step, please share in the comments below.
6 thoughts on “SEO-Friendly HTTPS Migration Guide (Includes Checklist)”
Hi Jim, congratulations and thank you for such a well structured and comprehensive guide!
It has become my personal checklist to properly migrate all my clients sites from HTTP to HTTPS. Now I can be sure I am doing things the right way from both the technical and SEO perspectives.
Hi Diego. Thanks for the positive feedback and I’m glad you found it helpful.
I just switched over to SSL from [removed]. I didn’t do the robots.txt file though. How do I create a new one and what do I add inside it? “Create a new robots.txt file for the HTTPS site and copy over all existing rules. Include a Sitemap link to the new HTTPS XML sitemap.”
Great article Jim!
Especially loving the inclusion of the detailed migration checklist.
Keep up the awesome work!
I’ve cut over my Drupal site from http to https only. However, I am having problems with removing remnants of the indexed http links in Google’s search results.
For the time being, I want to create a custom robots.txt for http-only content, whereby Google Search will only see the paths I want (that I want removed from the index), and that https-only content will include the paths I want indexed (including the sitemap). How should I write my .htaccess file so that Google Search will see two different robots.txt files? Does enabling HSTS have any bearing on redirection in the .htaccess file?
Hi Tom. There are a number of ways to dynamically serve robots.txt depending on protocol. One way is to have your htaccess file treat requests for robots.txt as robots.php so you can add some conditional logic. Here’s a good example: https://gist.github.com/mspivak/4165764. Once you do that you can serve two separate versions.
I generally recommend not blocking anything in the HTTP version so bots will find and follow all redirects to the HTTPS version. Strategically blocking paths so Google can focus on the URLs that haven’t switched over yet might work, but I’ve never considered taking things that far. I figure as long as pages are properly redirecting to HTTPS, they’ll switch over in the index eventually.
Using HSTS will ensure that browsers will no longer request HTTP versions for whatever max-age setting you enter. Since the redirect has to be followed in order to receive the HSTS response header, I don’t think it has any bearing.