Offline support with service worker #978
Conversation
Why not use sw-toolbox? ... which provides a declarative way of handling routes
var swOptions = {
  debug: true
}
It'll be good to use a versioned cache name.
const VERSION = 1;
toolbox.cache.name = "Babel-Cache-" + VERSION;
Also,
toolbox.cache.maxEntries = 20 // or 15
Yes, I will add cache versioning.
What do you think about a max age? Like 7 days?
I don't know if limiting entries is a good idea. I already pre-cache all pages.
TL for sw-toolbox and sw-precache here. Great to see you using it :)
If you're going to set maxEntries to 20, it means that after the 21st entry is cached, the least-recently used entry would be automatically deleted. Unless you're storing a number of really large assets, I'd consider whether this is really that valuable. I don't think the Babel site would benefit from setting this all that much.
I do think there's value in imposing a maxAgeSeconds. @boopathi can probably comment on the length of time, but a week doesn't sound like a bad ballpark.
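To put numbers on that, here's a sketch of a week-long max age, in the flat option shape used later in this PR. The versioned cache name is illustrative; whether sw-toolbox expects these keys flat or nested under a cache property depends on the toolbox version, so check its docs before copying:

```javascript
// Week-long expiry for cached pages, matching the ballpark discussed above.
var WEEK_IN_SECONDS = 60 * 60 * 24 * 7;

// Per-route options object; 'Babel-Cache-v1' is a hypothetical versioned name.
var contentCacheOptions = {
  name: 'Babel-Cache-v1',
  maxAgeSeconds: WEEK_IN_SECONDS
};

// In the worker this would be passed as the route's third argument, e.g.
//   toolbox.router.get('/(.*)', toolbox.cacheFirst, contentCacheOptions);
console.log(contentCacheOptions.maxAgeSeconds); // 604800
```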
toolbox.precache(preCachedRessources);

toolbox.router.get('/(.*)', toolbox.cacheFirst, swOptions);
I see requests to these domains as well,
toolbox.router.get("/*", toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
toolbox.router.get("/*", toolbox.cacheFirst, { origin: "cdn.jsdelivr.net" });
toolbox.router.get("/*", toolbox.cacheFirst, { origin: "unpkg.com" }); // for repl
Ok, I will add these origins.
I'm currently looking for a way to respond to Algolia queries; it does not seem to support offline mode. For pre-caching I already use a list of all pages, so it should be easy to search in it.
Algolia queries are POST queries - toolbox.router.post(...)
Depending on how you want to handle post requests, you could do toolbox.router.post('/endpoint', toolbox.networkOnly). This means you're not just returning a cached response but are instead stating that requests made to endpoint need to be network only.
We could do networkOnly, but @xtuc wanted that the user be able to search even when offline - as mostly it is just static data updated via commits. So, it could be networkFirst with a separate cache name and maxEntries.
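That networkFirst behaviour boils down to: try the network, refresh the cache on success, fall back to the cache on failure. A minimal sketch with an injected fetch function and a plain Map standing in for the Cache API — a simplification for illustration, not sw-toolbox's actual implementation:

```javascript
// networkFirst, reduced to its core logic. fetchFn and cache are injected so
// the function can run outside a service worker.
async function networkFirst(request, fetchFn, cache) {
  try {
    const response = await fetchFn(request);
    cache.set(request, response); // refresh the cache on every successful hit
    return response;
  } catch (err) {
    if (cache.has(request)) {
      return cache.get(request); // offline: serve the stale cached response
    }
    throw err; // nothing cached either: surface the network error
  }
}

// Usage with stubs simulating a warm cache and a dead network:
const searchCache = new Map([['/search?q=plugin', 'stale results']]);
const offlineFetch = () => Promise.reject(new Error('network unavailable'));

networkFirst('/search?q=plugin', offlineFetch, searchCache)
  .then(result => console.log(result)); // logs "stale results"
```

In sw-toolbox itself this wiring would just be a route registered with toolbox.networkFirst and a dedicated cache name, as suggested above.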
'{{ "/css/main.css" | prepend: site.baseurl }}?t={{ site.time | date_to_xmlschema }}"',
{% for page in site.pages %}
'{{ page.url }}',
{% endfor %}
I'm not sure if the following is valid; I see it as valid at least for page transitions to and from the REPL.
The service worker shouldn't change from pageA to pageB if both pages are under the scope of the same service worker. This is because the service worker updates based on a byte diff: during a page change, a new service worker will install only if the resources are different (i.e. a new service-worker.js).
Since the strategy is cacheFirst, I removed the main.css entry, whose URL includes a timestamp.
This is all really awesome to see come together. FWIW, we're seeing a few sites choose the grayed-out pattern for indicating offline (coupled with a top or bottom bar with the 'offline' labelling). Some alternatives we've suggested elsewhere include showing a toast when switching online state, or just having something prominent in the UI that can be toggled depending on when the user has a network connection again.
window.addEventListener('offline', function() {
  offlineIndicator.removeClass("hidden");
});
Mostly leaving here as a note in case helpful later :)
The navigator.{online/offline} events may not accurately indicate that you can or can't access the network. There are well-documented gotchas where you might need additional means to check that you're really online. One is checking connection loss by making failed XHR requests: retry a request a few times; if it doesn't go through, you're definitely offline.
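A sketch of that retry idea, with the request function injected so the logic is testable; in a browser, fetchFn might be something like () => fetch('/favicon.ico', { cache: 'no-store' }) (the probe URL is just an example, not from this PR):

```javascript
// Retry a probe request a few times; only after every attempt fails do we
// conclude that the user is really offline.
async function isReallyOnline(fetchFn, retries) {
  for (let i = 0; i < retries; i++) {
    try {
      await fetchFn();
      return true; // any successful round-trip means the network is reachable
    } catch (err) {
      // swallow the failure and try again
    }
  }
  return false; // every attempt failed
}

isReallyOnline(() => Promise.reject(new Error('down')), 3)
  .then(online => console.log(online)); // logs false
```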
@addyosmani Flipkart is awesome!
@xtuc Awesome, great work you did there. I quite like the idea of a little indicator at the top that shows whether the user is offline.
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdn.jsdelivr.net" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "unpkg.com" }); // for repl
toolbox.router.post('/*', toolbox.cacheFirst, { origin: "algolia.net" }); // Cache Algolia search response
I mentioned this in your earlier commits, but is the goal here to always return a cached post response from Algolia or to only have this work when the network is available?
I wanted to keep search working even when offline. The goal here is to cache some responses from Algolia because it does not support offline searching. I'm not sure about this, since the user will only be able to find what they already searched for while online.
@addyosmani I guess not. it should be networkFirst.
@xtuc cacheFirst, networkFirst, fastest - all these ultimately hit the cache in some way.
An 'offline fallback' for the last point would do the same. We have to make an offline page for it, and add a service worker 'fetch' event handler in index.js. Any idea what the offline page should look like?
Apologies for the drive-by feedback. I work on … I'll leave some inline feedback on this PR, and I've started a series of posts laying out some best practices for this scenario, with part 1 at https://jeffy.info/2016/11/02/offline-first-for-your-templated-site-part-1.html
importScripts('/scripts/sw-toolbox.js');

const VERSION = 1;
How will you make sure that VERSION is incremented whenever any existing content anywhere on the site is updated? If VERSION doesn't get incremented, then the previously cached content for a given URL will never get refreshed, since there's a cacheFirst policy being used.
Sure. We could use site.date from Jekyll, but this way the entire cache will be invalidated even if the content hasn't changed.
toolbox.precache(preCachedRessources);

toolbox.router.get('/*', toolbox.cacheFirst);
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
You'd probably want to configure a different cache name here for the runtime CDN caches, so that they're reused indefinitely and don't become irrelevant whenever you change the default toolbox.cache.name value above.
If you're using cacheFirst strategy for them, please make sure that all of the URLs that they refer to are versioned. E.g. https://unpkg.com/react@15.3.1/dist/react.min.js includes 15.3.1 in the URL, so it's fine to use cacheFirst. But if you request URLs like https://unpkg.com/react/dist/react.min.js, once it's cached for the first time, the initial version will be reused indefinitely (at least until the cache name changes).
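A quick way to audit this, not part of sw-toolbox: flag runtime-cached CDN URLs that don't pin a version (the regex is a rough heuristic I'm assuming for illustration):

```javascript
// Heuristic: a URL is "versioned" if it contains a semver-looking segment,
// so it is safe to serve cacheFirst indefinitely.
function looksVersioned(url) {
  return /\d+\.\d+\.\d+/.test(url);
}

console.log(looksVersioned('https://unpkg.com/react@15.3.1/dist/react.min.js')); // true
console.log(looksVersioned('https://unpkg.com/react/dist/react.min.js'));        // false
```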
Yes, good point. This is part of good practice, I guess.
I checked, resources from CDN (Cloudflare and jsdelivr) have a version in the URL.
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdn.jsdelivr.net" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "unpkg.com" }); // for repl
toolbox.router.post('/*', toolbox.cacheFirst, { origin: "algolia.net" }); // Cache Algolia search response
I don't know exactly what algolia.net is used for specifically, but if it's search-related, then cacheFirst doesn't sound like a good strategy to use. networkFirst would normally be more appropriate, since you'd want fresh search results, but could fall back to stale results if the network is unavailable.
Yes, I'm now thinking that idea wasn't that good. (#978 (comment))
toolbox.precache(preCachedRessources);

toolbox.router.get('/*', toolbox.cacheFirst);
This is super broad. Are you sure you're okay with every URL under your origin being served cacheFirst? And aside from the caching strategy, are you sure that any time there's any request made for a resource on your site, you want that added to the cache? If there are large images or other large media, then that's not necessarily a good approach.
I wanted offline documentation. This caching strategy might indeed not be the best approach.
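One way to narrow it, assuming the documentation lives under a dedicated path (the pattern below is illustrative, not taken from this PR):

```javascript
// Serve only documentation pages cache-first; large images and other one-off
// assets fall through to the default (network) behaviour.
toolbox.router.get('/docs/(.*)', toolbox.cacheFirst);
```

Anything else that genuinely needs to work offline can still be added to the precache list explicitly.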
const VERSION = 1;

toolbox.cache.name = "Babel-Cache-" + VERSION;
I don't see anything here that will delete old caches once this cache name changes. That means you'll end up with many copies of your full site in the service worker's caches, one for each time you change the name.
Normally, you'd want to clean up unneeded caches inside an activate handler.
👍 to Jeff's comment. You'll want to iterate through your cache entries and cache.delete the stale ones, doing something along these lines:
// Where urlsToCacheKeys is a map of your items to cache
self.addEventListener('activate', function(event) {
  let setOfExpectedUrls = new Set(urlsToCacheKeys.values());
  event.waitUntil(
    caches.open(cacheName).then(function(cache) {
      return cache.keys().then(function(existingRequests) {
        return Promise.all(
          existingRequests.map(function(existingRequest) {
            if (!setOfExpectedUrls.has(existingRequest.url)) {
              return cache.delete(existingRequest);
            }
          })
        );
      });
    }).then(function() {
      return self.clients.claim();
    })
  );
});
Rather than cleaning the cache manually, I think we should limit it using cache.maxEntries or cache.maxAgeSeconds.
The old cached pages will be cleaned up by these settings.
var preCachedRessources = [
  {% for page in site.pages %}
  '{{ page.url }}',
If you're precaching entire HTML documents, then you're going to end up populating users' caches with a lot of duplicated data. I.e. if your HTML shares common headers/footers (because they're all using some underlying layouts), then the structure of those headers and footers will take up space on your users' disks without actually adding any extra value.
The ideal implementation would just cache the layout elements once, then cache the underlying content once, and then perform the templating logic in the service worker to combine them at runtime.
I've got a very experimental example of this implemented at https://github.com/jeffposnick/jeffposnick.github.io/tree/work/src, but I don't know that it's stable enough to recommend using for a high-traffic site.
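Reduced to its essence, the shell-plus-content approach stores the shared layout once and joins it with per-page content at respond time. A toy sketch of just the joining step (the placeholder marker and names are invented for illustration; the linked repo does considerably more, e.g. streaming):

```javascript
// Combine a cached layout shell with a cached content partial. In a real
// service worker both strings would come from the Cache API.
function renderPage(shell, content) {
  return shell.replace('<!-- page-content -->', content);
}

var shell = '<header>Babel</header><!-- page-content --><footer></footer>';
console.log(renderPage(shell, '<main>docs</main>'));
// logs: <header>Babel</header><main>docs</main><footer></footer>
```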
toolbox.router.get('/*', toolbox.cacheFirst);
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdnjs.cloudflare.com" });
toolbox.router.get('/*', toolbox.cacheFirst, { origin: "cdn.jsdelivr.net" });
toolbox.router.get('/*', toolbox.cacheFirst, contentCacheOptions);
networkFirst or fastest - cacheFirst is never going to hit the network after the first time.
var offlineIndicator = $("#offline-indicator");

window.addEventListener('online', function() {
  offlineIndicator.addClass("hidden");
nit: quotes are inconsistent (single quotes are used elsewhere)
Yes, thanks. I updated both files with single quotes.
var VERSION = '{{ site.time }}';

var contentCacheOptions = {
  name: "Babel-Cache-" + VERSION,
@boopathi @marcobiedermann @mathiasbynens @jeffposnick what are your thoughts about this?
The current implementation, with … What I'd recommend is that you try doing some sanity checks in a local deployment by keeping an eye on Chrome DevTools' Cache Storage display in the Application panel, and on the Network panel. The Cache Storage display will give you a list of URLs that are being cached, and you can verify that you're not accidentally caching anything that you don't intend to. (Your …
Thanks a lot @jeffposnick for your explanations.
Deploy preview ready! Built with commit 18d2580
@jeffposnick @xtuc Hello, I am new here and trying to contribute. I was wondering if we can have different named caches for different kinds of assets. For example, images could have a separate cache, so that we can manage it differently, and it may even improve performance when matching requests from the cache. Also, I think it would be safer to go with networkFirst for the website assets, with a cache maxAge. And do we need maxEntries? Can't we leave it unbounded, since entries will expire by age anyway, so the caching of assets won't be limited?
xtuc left a comment
We can remove scripts/sw-toolbox.js from the sources now.
, cdnCacheOptions = {
    name: 'cdn',
    maxAgeSeconds: 60 * 60 * 6
  };
Could you please make multiple var declarations instead here?









First naive implementation to support offline mode.
Refers to issue #647
Some points:
Note that it is necessary to know the files to be cached in advance. That's why I need Jekyll templating in service-worker.js.

@thejameskyle @marcobiedermann What do you think?