Swift iOS Cache WKWebView content for offline view

Cristi Ghinea · Mar 27, 2016 · Viewed 21.4k times

We're trying to save the content (HTML) of a WKWebView in persistent storage (NSUserDefaults, Core Data, or a file on disk) so that the user can see the same content when re-entering the application without an internet connection. Unlike UIWebView, WKWebView doesn't use NSURLProtocol (see post here).

Although I have seen posts stating that "The offline application cache is not enabled in WKWebView." (Apple dev forums), I know that a solution exists.

I've learned of two possibilities, but I couldn't make them work:

1) If I open a website in Safari for Mac and select File >> Save As, the option shown in the image below appears. For Mac apps there is [[[webView mainFrame] dataSource] webArchive], but UIWebView and WKWebView have no such API. However, if I load a .webarchive file (like the one I obtained from Mac Safari) in a WKWebView in Xcode, the content is displayed correctly (HTML, external images, video previews) even without an internet connection. The .webarchive file is actually a plist (property list). I tried to use a Mac framework that creates a .webarchive file, but it was incomplete.

[Image: the option shown by Safari's File >> Save As dialog]

2) I obtained the HTML in webView:didFinishNavigation, but this doesn't save external images, CSS, or JavaScript:

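    func webView(_ webView: WKWebView, didFinish navigation: WKNavigation!) {
        // Serializes the current DOM only; external images, CSS and scripts are not included.
        webView.evaluateJavaScript("document.documentElement.outerHTML") { html, error in
            if let html = html as? String {
                print(html)
            } else if let error = error {
                print("Could not read the page HTML: \(error)")
            }
        }
    }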

We've been struggling with this for over a week, and it is a key feature for us. Any ideas are much appreciated.

Thank you!

Answer

Ernesto Elsäßer · Nov 11, 2018

I know I'm late, but I have recently been looking for a way to store web pages for offline reading, and still couldn't find any reliable solution that wouldn't depend on the page itself and wouldn't use the deprecated UIWebView. A lot of people write that one should use the existing HTTP caching, but WebKit seems to do a lot of its work out-of-process, making it virtually impossible to enforce complete caching (see here or here). However, this question pointed me in the right direction. Tinkering with the web archive approach, I found that it's actually quite easy to write your own web archive exporter.

As written in the question, web archives are just plist files, so all it takes is a crawler that extracts the required resources from the HTML page, downloads them all, and stores them in one big plist file. This archive file can later be loaded into the WKWebView via loadFileURL(_:allowingReadAccessTo:).
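To make the plist layout more concrete, here is a minimal sketch of how such an archive could be assembled with Foundation's PropertyListEncoder. The type and function names (WebArchiveResource, WebArchive, encodeArchive, restore) are hypothetical and chosen for illustration only. The plist key names are an assumption based on what Safari-generated .webarchive files contain, and real archives may carry additional fields, so whether WebKit accepts this reduced set has not been verified here.

```swift
import Foundation

// One archived resource: either the main HTML document or a subresource
// (image, stylesheet, script). Key names are assumed from Safari-generated
// .webarchive files, not taken from a published specification.
struct WebArchiveResource: Codable {
    let url: String
    let data: Data
    let mimeType: String
    var textEncodingName: String? = nil

    enum CodingKeys: String, CodingKey {
        case url = "WebResourceURL"
        case data = "WebResourceData"
        case mimeType = "WebResourceMIMEType"
        case textEncodingName = "WebResourceTextEncodingName"
    }
}

// The archive itself: one main resource plus the downloaded subresources.
struct WebArchive: Codable {
    let mainResource: WebArchiveResource
    var subresources: [WebArchiveResource] = []

    enum CodingKeys: String, CodingKey {
        case mainResource = "WebMainResource"
        case subresources = "WebSubresources"
    }
}

// Serializes the archive as a binary property list.
func encodeArchive(_ archive: WebArchive) throws -> Data {
    let encoder = PropertyListEncoder()
    encoder.outputFormat = .binary
    return try encoder.encode(archive)
}

#if canImport(WebKit)
import WebKit

// Loads a previously written .webarchive file into the web view,
// granting read access to the directory that contains it.
@MainActor
func restore(archiveAt fileURL: URL, into webView: WKWebView) {
    _ = webView.loadFileURL(fileURL,
                            allowingReadAccessTo: fileURL.deletingLastPathComponent())
}
#endif
```

The data returned by encodeArchive would be written to a file with the .webarchive extension (for example via Data.write(to:)) and later passed to restore while offline.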

I created a demo app that allows archiving from and restoring to a WKWebView using this approach: https://github.com/ernesto-elsaesser/OfflineWebView

EDIT: The archive generation code is now available as a standalone Swift package: https://github.com/ernesto-elsaesser/WebArchiver

The implementation only depends on Fuzi for HTML parsing.
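For readers who want a feel for the resource-extraction step without pulling in a parser, the following is a rough sketch that uses a regular expression instead of Fuzi. This is a deliberately simpler, swapped-in technique and not how the package works: a regex will miss resources referenced from CSS, srcset attributes, or markup it does not anticipate, and it picks up every link href, not only stylesheets. The function name subresourceURLs is hypothetical.

```swift
import Foundation

// Hypothetical helper: collects the absolute URLs referenced by
// <img src>, <script src> and <link href> in an HTML string.
// Duplicates are dropped; document order is preserved.
func subresourceURLs(inHTML html: String, baseURL: URL) -> [URL] {
    // Group 1 captures src on <img>/<script>, group 2 captures href on <link>.
    let pattern = #"<(?:img|script)\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']|<link\b[^>]*?\bhref\s*=\s*["']([^"']+)["']"#
    guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
        return []
    }

    var seen = Set<URL>()
    var result: [URL] = []
    let fullRange = NSRange(html.startIndex..., in: html)

    for match in regex.matches(in: html, options: [], range: fullRange) {
        for group in 1...2 {
            // Only one of the two groups participates in any given match.
            guard let range = Range(match.range(at: group), in: html),
                  let url = URL(string: String(html[range]), relativeTo: baseURL)?.absoluteURL
            else { continue }
            if seen.insert(url).inserted {
                result.append(url)
            }
        }
    }
    return result
}
```

Each returned URL would then be downloaded and stored as one subresource entry in the archive plist, alongside the main HTML document.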