Google has just added a “JavaScript SEO basics” page to its Search developer’s guide. This covers useful general descriptions of how Google processes JavaScript (JS) as well as a range of best practices.
The addition comes at a time of broad consensus that JS has become so central to the web platform that digital content providers need a reliable way to make their JS-powered web applications discoverable in Google Search.
Gone are the days when static webpages ruled the web – today, everyone wants the power of JS to make timely content updates and a raft of other attention-grabbing features such as interactive maps, animated 2D/3D graphics, scrolling video jukeboxes, and so on.
The new page explains how Google processes JS web apps in three phases: crawling, rendering and indexing.
Essentially, Googlebot works through a queue of URLs to crawl. Before it makes an HTTP request for a URL from this queue, it reads the robots.txt file to check whether you allow crawling; if the URL is disallowed, Googlebot skips it. Every page it does fetch is then queued for rendering unless it includes a robots meta tag or header that instructs Googlebot not to index the page.
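The robots.txt check can be illustrated with a minimal sketch using Python's standard urllib.robotparser module. It is not Googlebot's actual implementation: the robots.txt content, the example.com URLs and the is_crawl_allowed helper are all hypothetical, chosen only to show how a crawler might decide whether a URL may be fetched.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site (an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/blog/post", "https://example.com/private/page"):
        print(url, is_crawl_allowed(ROBOTS_TXT, url))
```

In this sketch a disallowed URL is simply reported as not fetchable; a real crawler would skip the HTTP request for it altogether.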
To spare developers any bafflement when trying to use JS to change the robots meta tag (it sometimes doesn’t work the way you expect it to), the guide states:
“Googlebot skips rendering and JavaScript execution if the meta robots tag initially contains ‘noindex’. If you want to use JavaScript to change the content of the robots meta tag, do not set the meta tag’s value to ‘noindex’.”
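To see why the initial value matters, here is a rough sketch that inspects the HTML as first served, before any JS runs, and reports whether a robots meta tag already contains "noindex". The RobotsMetaParser class and would_skip_rendering function are hypothetical names, and the logic is a simplification of the behaviour the guide describes, not Google's code.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def would_skip_rendering(initial_html: str) -> bool:
    """True if the initial HTML already says 'noindex' (simplified assumption)."""
    parser = RobotsMetaParser()
    parser.feed(initial_html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(would_skip_rendering(page))
```

If this check returns True for the HTML your server sends, any script that later rewrites the tag would never get the chance to run under the behaviour quoted above.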
As soon as Googlebot’s resources allow, a headless Chromium renders the page and executes the JS. Googlebot then parses the rendered HTML for links again and queues the URLs it discovers for crawling. The guide adds that server-side rendering or pre-rendering is still worth doing, because it makes your website faster for both crawlers and human users, and not all bots can run JS.
It’s crucial, however, that developers write code that is compatible with Googlebot, which has limits on the APIs and JS features it supports, so Google includes a link on the page taking you to its “Guidelines for troubleshooting JavaScript problems” section.
HTTP status codes, the guide explains, let Googlebot know that a page has moved to a new URL, or tell it not to crawl or index a particular page, for example a 404 for a page that could not be found or a 401 for a page behind a login.
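As a rough illustration of how a crawler might act on these signals, the sketch below maps a few status codes to coarse outcomes. The interpret_status function, its category labels and the choice of codes in each group are assumptions made for this example; Googlebot's real handling is more detailed.

```python
from http import HTTPStatus

# Simplified, assumed groupings; not an exhaustive or official mapping.
MOVED = (HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.PERMANENT_REDIRECT)
DO_NOT_INDEX = (
    HTTPStatus.UNAUTHORIZED,
    HTTPStatus.FORBIDDEN,
    HTTPStatus.NOT_FOUND,
    HTTPStatus.GONE,
)


def interpret_status(code: int) -> str:
    """Map an HTTP status code to a coarse, hypothetical crawler decision."""
    if code in MOVED:
        return "moved"          # follow the redirect and update the stored URL
    if code in DO_NOT_INDEX:
        return "do-not-index"   # page is missing or not publicly accessible
    if 200 <= code < 300:
        return "ok"             # content can be processed further
    return "other"              # anything else is outside this sketch


if __name__ == "__main__":
    for code in (200, 301, 401, 404):
        print(code, interpret_status(code))
```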
A very welcome addition to Google’s repertoire.