How To Download Multiple Files Sequentially using NSURLSession downloadTask in Swift

CraigH · Sep 1, 2015

I have an app that has to download multiple large files. I want it to download the files sequentially, one at a time, rather than concurrently. When they download concurrently, the app gets overloaded and crashes.

So I'm trying to wrap a downloadTaskWithURL inside an NSBlockOperation and then set maxConcurrentOperationCount = 1 on the queue. I wrote the code below, but it didn't work: both files still get downloaded concurrently.

import UIKit

class ViewController: UIViewController, NSURLSessionDelegate, NSURLSessionDownloadDelegate {

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view, typically from a nib.
        processURLs()        
    }

    func download(url: NSURL){
        let sessionConfiguration = NSURLSessionConfiguration.defaultSessionConfiguration()
        let session = NSURLSession(configuration: sessionConfiguration, delegate: self, delegateQueue: nil)
        let downloadTask = session.downloadTaskWithURL(url)
        downloadTask.resume()
    }

    func processURLs(){

        // set up the queue and limit it to one concurrent operation
        var queue = NSOperationQueue()
        queue.name = "Download queue"
        queue.maxConcurrentOperationCount = 1

        let url = NSURL(string: "http://azspeastus.blob.core.windows.net/azurespeed/100MB.bin?sv=2014-02-14&sr=b&sig=%2FZNzdvvzwYO%2BQUbrLBQTalz%2F8zByvrUWD%2BDfLmkpZuQ%3D&se=2015-09-01T01%3A48%3A51Z&sp=r")
        let url2 = NSURL(string: "http://azspwestus.blob.core.windows.net/azurespeed/100MB.bin?sv=2014-02-14&sr=b&sig=ufnzd4x9h1FKmLsODfnbiszXd4EyMDUJgWhj48QfQ9A%3D&se=2015-09-01T01%3A48%3A51Z&sp=r")

        let urls = [url, url2]
        for url in urls {
            let operation = NSBlockOperation { () -> Void in
                println("starting download")
                self.download(url!)
            }

            queue.addOperation(operation)            
        }
    }
    override func didReceiveMemoryWarning() {
        super.didReceiveMemoryWarning()
        // Dispose of any resources that can be recreated.
    }

    func URLSession(session: NSURLSession, downloadTask: NSURLSessionDownloadTask, didFinishDownloadingToURL location: NSURL) {
        //code
    }

    func URLSession(session: NSURLSession, downloadTask: NSURLSessionDownloadTask, didResumeAtOffset fileOffset: Int64, expectedTotalBytes: Int64) {
        //
    }

    func URLSession(session: NSURLSession, downloadTask: NSURLSessionDownloadTask, didWriteData bytesWritten: Int64, totalBytesWritten: Int64, totalBytesExpectedToWrite: Int64) {
        var progress = Double(totalBytesWritten) / Double(totalBytesExpectedToWrite)
        println(progress)
    }

}

How can I write this properly to achieve my goal of downloading only one file at a time?

Answer

Rob · Sep 1, 2015

Your code won't work because URLSessionDownloadTask runs asynchronously. The BlockOperation therefore completes as soon as it has started the task, not when the download is done. So while the operations fire off sequentially, the download tasks continue asynchronously and in parallel.
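To make that concrete, here is a minimal sketch that swaps the network request for a simulated one. The names `runBlockOperationDemo` and `startSimulatedDownload` are hypothetical, and the half-second delay is an arbitrary stand-in for a real download. Both block operations are finished while both "downloads" are still in flight:

```swift
import Foundation

func runBlockOperationDemo() -> (afterOperations: [String], afterDownloads: [String]) {
    let lock = NSLock()
    var events: [String] = []
    func record(_ event: String) { lock.lock(); defer { lock.unlock() }; events.append(event) }
    func snapshot() -> [String] { lock.lock(); defer { lock.unlock() }; return events }

    let downloads = DispatchGroup()   // only here so the demo can wait for the simulated downloads

    // Hypothetical stand-in for `downloadTask.resume()`: returns at once, completes later.
    func startSimulatedDownload(_ name: String) {
        downloads.enter()
        DispatchQueue.global().asyncAfter(deadline: .now() + 0.5) {
            record("download \(name) finished")
            downloads.leave()
        }
    }

    let queue = OperationQueue()
    queue.maxConcurrentOperationCount = 1
    for name in ["A", "B"] {
        queue.addOperation {
            startSimulatedDownload(name)          // returns immediately
            record("operation \(name) finished")  // the block, and so the operation, ends here
        }
    }

    queue.waitUntilAllOperationsAreFinished()
    let afterOperations = snapshot()              // taken while both downloads are still running
    downloads.wait()
    return (afterOperations, snapshot())
}

let demo = runBlockOperationDemo()
print(demo.afterOperations)   // only the two "operation ... finished" entries
print(demo.afterDownloads)    // now also contains the two "download ... finished" entries
```

The serial queue did its job (operation A finished before operation B), but that ordering says nothing about when the asynchronous work they started will end.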

While there are work-arounds one can contemplate (e.g., recursive patterns that initiate one request after the prior one finishes, a non-zero semaphore pattern on a background thread, etc.), the more elegant solution is to use one of the proven asynchronous frameworks. Historically, if you wanted to control the degree of concurrency of a series of asynchronous tasks, you would reach for an asynchronous Operation subclass. Nowadays, in iOS 13 and later, you might consider Combine. (There are also third-party asynchronous programming frameworks, but I will restrict myself to Apple-provided approaches.)
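For reference, the recursive work-around mentioned above might look something like the following sketch. `processSequentially` is a hypothetical helper, not an Apple API, and the demo uses a short delay in place of a real request. In an app, the `work` closure would start a `downloadTask(with:completionHandler:)` and call `done()` from its completion handler:

```swift
import Foundation

/// Runs `work` for each item in order, starting the next item only after the
/// previous one has called its `done` callback. (Hypothetical helper.)
func processSequentially<Item>(
    _ items: [Item],
    work: @escaping (Item, @escaping () -> Void) -> Void,
    completion: @escaping () -> Void
) {
    guard let first = items.first else {
        completion()                              // nothing left: report that everything is done
        return
    }
    work(first) {
        // The "recursive" step: only now is the remainder of the list started.
        processSequentially(Array(items.dropFirst()), work: work, completion: completion)
    }
}

func runSequentialDemo() -> [String] {
    let lock = NSLock()
    var events: [String] = []
    func record(_ event: String) { lock.lock(); defer { lock.unlock() }; events.append(event) }

    let finished = DispatchSemaphore(value: 0)
    processSequentially(["A", "B", "C"], work: { name, done in
        record("start \(name)")
        // Simulated asynchronous request.
        DispatchQueue.global().asyncAfter(deadline: .now() + 0.05) {
            record("finish \(name)")
            done()
        }
    }, completion: { finished.signal() })

    finished.wait()                               // block the caller until the whole chain is done
    lock.lock(); defer { lock.unlock() }
    return events
}

print(runSequentialDemo())
// ["start A", "finish A", "start B", "finish B", "start C", "finish C"]
```

This works, but cancellation, dependencies and changing the degree of concurrency all have to be hand-rolled, which is why the approaches below are generally preferable.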


Operation

To address this, you can wrap the requests in an asynchronous Operation subclass. See Configuring Operations for Concurrent Execution in the Concurrency Programming Guide for more information.

But before I illustrate how to do this in your situation (the delegate-based URLSession), let me first show you the simpler solution when using the completion handler rendition. We'll later build upon this for your more complicated question. So, in Swift 3 and later:

class DownloadOperation : AsynchronousOperation {
    var task: URLSessionTask!
    
    init(session: URLSession, url: URL) {
        super.init()
        
        task = session.downloadTask(with: url) { temporaryURL, response, error in
            defer { self.finish() }
            
            // check for a transport-level error first; when one occurs there is no response to inspect
            guard let temporaryURL = temporaryURL, error == nil else {
                print(error ?? "Unknown error")
                return
            }

            guard
                let httpResponse = response as? HTTPURLResponse,
                200..<300 ~= httpResponse.statusCode
            else {
                // handle invalid return codes however you'd like
                return
            }
            
            do {
                let manager = FileManager.default
                let destinationURL = try manager.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: false)
                    .appendingPathComponent(url.lastPathComponent)
                try? manager.removeItem(at: destinationURL)                   // remove the old one, if any
                try manager.moveItem(at: temporaryURL, to: destinationURL)    // move new one there
            } catch let moveError {
                print("\(moveError)")
            }
        }
    }
    
    override func cancel() {
        task.cancel()
        super.cancel()
    }
    
    override func main() {
        task.resume()
    }
    
}

Where

/// Asynchronous operation base class
///
/// This abstract class emits all of the necessary KVO notifications of `isFinished`
/// and `isExecuting` for a concurrent `Operation` subclass. You can subclass this and
/// implement asynchronous operations. All you must do is:
///
/// - override `main()` with the tasks that initiate the asynchronous task;
///
/// - call `finish()` when the asynchronous task is done;
///
/// - optionally, periodically check the `isCancelled` status, performing any clean-up
///   necessary and then ensuring that `finish()` is called; or
///   override the `cancel` method, calling `super.cancel()` and then cleaning up
///   and ensuring `finish()` is called.

class AsynchronousOperation: Operation {
    
    /// State for this operation.
    
    @objc private enum OperationState: Int {
        case ready
        case executing
        case finished
    }
    
    /// Concurrent queue for synchronizing access to `state`.
    
    private let stateQueue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".rw.state", attributes: .concurrent)
    
    /// Private backing stored property for `state`.
    
    private var rawState: OperationState = .ready
    
    /// The state of the operation
    
    @objc private dynamic var state: OperationState {
        get { return stateQueue.sync { rawState } }
        set { stateQueue.sync(flags: .barrier) { rawState = newValue } }
    }
    
    // MARK: - Various `Operation` properties
    
    open         override var isReady:        Bool { return state == .ready && super.isReady }
    public final override var isExecuting:    Bool { return state == .executing }
    public final override var isFinished:     Bool { return state == .finished }
    
    // KVO for dependent properties
    
    open override class func keyPathsForValuesAffectingValue(forKey key: String) -> Set<String> {
        if ["isReady", "isFinished", "isExecuting"].contains(key) {
            return [#keyPath(state)]
        }
        
        return super.keyPathsForValuesAffectingValue(forKey: key)
    }
    
    // Start
    
    public final override func start() {
        if isCancelled {
            finish()
            return
        }
        
        state = .executing
        
        main()
    }
    
    /// Subclasses must implement this to perform their work and they must not call `super`. The default implementation of this function throws an exception.
    
    open override func main() {
        fatalError("Subclasses must implement `main`.")
    }
    
    /// Call this function to finish an operation that is currently executing
    
    public final func finish() {
        if !isFinished { state = .finished }
    }
}

Then you can do:

for url in urls {
    queue.addOperation(DownloadOperation(session: session, url: url))
}

So that's one very easy way to wrap asynchronous URLSession/NSURLSession requests in an asynchronous Operation/NSOperation subclass. More generally, this is a useful pattern: use AsynchronousOperation to wrap up some asynchronous task in an Operation/NSOperation object.
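To see the pattern without any networking, here is a self-contained sketch. `SimpleAsyncOperation` is a deliberately simplified stand-in for the base class above (it posts the KVO notifications manually rather than through a `state` property), and `SimulatedTaskOperation` is a hypothetical subclass whose "asynchronous task" is just a delay:

```swift
import Foundation

/// Simplified stand-in for an asynchronous base class: the operation stays
/// "executing" until `finish()` is called.
class SimpleAsyncOperation: Operation {
    private let lock = NSLock()
    private var executingFlag = false
    private var finishedFlag = false

    override var isAsynchronous: Bool { return true }
    override var isExecuting: Bool { lock.lock(); defer { lock.unlock() }; return executingFlag }
    override var isFinished: Bool { lock.lock(); defer { lock.unlock() }; return finishedFlag }

    override func start() {
        guard !isCancelled else { finish(); return }
        willChangeValue(forKey: "isExecuting")
        lock.lock(); executingFlag = true; lock.unlock()
        didChangeValue(forKey: "isExecuting")
        main()
    }

    /// Marks the operation as finished so that the queue can start the next one.
    func finish() {
        guard !isFinished else { return }
        willChangeValue(forKey: "isExecuting")
        willChangeValue(forKey: "isFinished")
        lock.lock(); executingFlag = false; finishedFlag = true; lock.unlock()
        didChangeValue(forKey: "isFinished")
        didChangeValue(forKey: "isExecuting")
    }
}

/// Hypothetical operation that simulates an asynchronous task with a short delay.
final class SimulatedTaskOperation: SimpleAsyncOperation {
    private let name: String
    private let record: (String) -> Void

    init(name: String, record: @escaping (String) -> Void) {
        self.name = name
        self.record = record
        super.init()
    }

    override func main() {
        record("start \(name)")
        DispatchQueue.global().asyncAfter(deadline: .now() + 0.05) {
            self.record("finish \(self.name)")
            self.finish()                         // only now does the operation complete
        }
    }
}

func runSerialDemo() -> [String] {
    let lock = NSLock()
    var events: [String] = []
    let record: (String) -> Void = { event in
        lock.lock(); defer { lock.unlock() }
        events.append(event)
    }

    let queue = OperationQueue()
    queue.maxConcurrentOperationCount = 1
    for name in ["A", "B"] {
        queue.addOperation(SimulatedTaskOperation(name: name, record: record))
    }
    queue.waitUntilAllOperationsAreFinished()

    lock.lock(); defer { lock.unlock() }
    return events
}

print(runSerialDemo())
// ["start A", "finish A", "start B", "finish B"]
```

Because each operation does not report `isFinished` until its asynchronous work calls `finish()`, the serial queue now really does run the tasks one after another.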

Unfortunately, in your question, you wanted to use delegate-based URLSession/NSURLSession so you could monitor the progress of the downloads. This is more complicated.

This is because the "task complete" NSURLSession delegate methods are called on the session object's delegate. This is an infuriating design feature of NSURLSession (Apple did it to simplify background sessions, which isn't relevant here, but we're stuck with that design limitation).

But we have to asynchronously complete the operations as the tasks finish. So we need some way for the session to figure out which operation to complete when didCompleteWithError is called. Now you could have each operation have its own NSURLSession object, but it turns out that this is pretty inefficient.

So, to handle that, I maintain a dictionary, keyed by the task's taskIdentifier, which identifies the appropriate operation. That way, when the download finishes, you can "complete" the correct asynchronous operation. Thus:

/// Manager of asynchronous download `Operation` objects

class DownloadManager: NSObject {
    
    /// Dictionary of operations, keyed by the `taskIdentifier` of the `URLSessionTask`
    
    fileprivate var operations = [Int: DownloadOperation]()
    
    /// Serial OperationQueue for downloads
    
    private let queue: OperationQueue = {
        let _queue = OperationQueue()
        _queue.name = "download"
        _queue.maxConcurrentOperationCount = 1    // I'd usually use values like 3 or 4 for performance reasons, but OP asked about downloading one at a time
        
        return _queue
    }()
    
    /// Delegate-based `URLSession` for DownloadManager
    
    lazy var session: URLSession = {
        let configuration = URLSessionConfiguration.default
        return URLSession(configuration: configuration, delegate: self, delegateQueue: nil)
    }()
    
    /// Add download
    ///
    /// - parameter url:  The URL of the file to be downloaded
    ///
    /// - returns:        The DownloadOperation of the operation that was queued
    
    @discardableResult
    func queueDownload(_ url: URL) -> DownloadOperation {
        let operation = DownloadOperation(session: session, url: url)
        operations[operation.task.taskIdentifier] = operation
        queue.addOperation(operation)
        return operation
    }
    
    /// Cancel all queued operations
    
    func cancelAll() {
        queue.cancelAllOperations()
    }
    
}

// MARK: URLSessionDownloadDelegate methods

extension DownloadManager: URLSessionDownloadDelegate {
    
    func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didFinishDownloadingTo location: URL) {
        operations[downloadTask.taskIdentifier]?.urlSession(session, downloadTask: downloadTask, didFinishDownloadingTo: location)
    }
    
    func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didWriteData bytesWritten: Int64, totalBytesWritten: Int64, totalBytesExpectedToWrite: Int64) {
        operations[downloadTask.taskIdentifier]?.urlSession(session, downloadTask: downloadTask, didWriteData: bytesWritten, totalBytesWritten: totalBytesWritten, totalBytesExpectedToWrite: totalBytesExpectedToWrite)
    }
}

// MARK: URLSessionTaskDelegate methods

extension DownloadManager: URLSessionTaskDelegate {
    
    func urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?)  {
        let key = task.taskIdentifier
        operations[key]?.urlSession(session, task: task, didCompleteWithError: error)
        operations.removeValue(forKey: key)
    }
    
}

/// Asynchronous Operation subclass for downloading

class DownloadOperation : AsynchronousOperation {
    let task: URLSessionTask
    
    init(session: URLSession, url: URL) {
        task = session.downloadTask(with: url)
        super.init()
    }
    
    override func cancel() {
        task.cancel()
        super.cancel()
    }
    
    override func main() {
        task.resume()
    }
}

// MARK: URLSessionDownloadDelegate methods

extension DownloadOperation: URLSessionDownloadDelegate {
    
    func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didFinishDownloadingTo location: URL) {
        guard
            let httpResponse = downloadTask.response as? HTTPURLResponse,
            200..<300 ~= httpResponse.statusCode
        else {
            // handle invalid return codes however you'd like
            return
        }

        do {
            let manager = FileManager.default
            let destinationURL = try manager
                .url(for: .applicationSupportDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
                .appendingPathComponent(downloadTask.originalRequest!.url!.lastPathComponent)
            try? manager.removeItem(at: destinationURL)
            try manager.moveItem(at: location, to: destinationURL)
        } catch {
            print(error)
        }
    }
    
    func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didWriteData bytesWritten: Int64, totalBytesWritten: Int64, totalBytesExpectedToWrite: Int64) {
        let progress = Double(totalBytesWritten) / Double(totalBytesExpectedToWrite)
        print("\(downloadTask.originalRequest!.url!.absoluteString) \(progress)")
    }
}

// MARK: URLSessionTaskDelegate methods

extension DownloadOperation: URLSessionTaskDelegate {
    
    func urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?)  {
        defer { finish() }
        
        if let error = error {
            print(error)
            return
        }
        
        // do whatever you want upon success
    }
    
}
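
One caveat about the progress calculation in `didWriteData`: when the server does not report a content length, `totalBytesExpectedToWrite` is `NSURLSessionTransferSizeUnknown` (that is, -1), so the simple division yields a meaningless negative number. A small helper along these lines could guard against that. The name `fractionCompleted` is hypothetical, and returning `nil` for "unknown" is just one possible choice:

```swift
import Foundation

/// Returns the completed fraction in 0...1, or `nil` when the total size is unknown.
/// (Hypothetical helper, not part of URLSession.)
func fractionCompleted(totalBytesWritten: Int64, totalBytesExpectedToWrite: Int64) -> Double? {
    // A non-positive expected size covers NSURLSessionTransferSizeUnknown (-1).
    guard totalBytesExpectedToWrite > 0 else { return nil }
    return min(1, Double(totalBytesWritten) / Double(totalBytesExpectedToWrite))
}

if let fraction = fractionCompleted(totalBytesWritten: 50, totalBytesExpectedToWrite: 200) {
    print(fraction)   // 0.25
}
print(fractionCompleted(totalBytesWritten: 50, totalBytesExpectedToWrite: -1) as Any)   // nil
```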

And then use it like so:

let downloadManager = DownloadManager()

override func viewDidLoad() {
    super.viewDidLoad()
    
    let urlStrings = [
        "http://spaceflight.nasa.gov/gallery/images/apollo/apollo17/hires/s72-55482.jpg",
        "http://spaceflight.nasa.gov/gallery/images/apollo/apollo10/hires/as10-34-5162.jpg",
        "http://spaceflight.nasa.gov/gallery/images/apollo-soyuz/apollo-soyuz/hires/s75-33375.jpg",
        "http://spaceflight.nasa.gov/gallery/images/apollo/apollo17/hires/as17-134-20380.jpg",
        "http://spaceflight.nasa.gov/gallery/images/apollo/apollo17/hires/as17-140-21497.jpg",
        "http://spaceflight.nasa.gov/gallery/images/apollo/apollo17/hires/as17-148-22727.jpg"
    ]
    let urls = urlStrings.compactMap { URL(string: $0) }
    
    let completion = BlockOperation {
        print("all done")
    }
    
    for url in urls {
        let operation = downloadManager.queueDownload(url)
        completion.addDependency(operation)
    }

    OperationQueue.main.addOperation(completion)
}

See revision history for Swift 2 implementation.


Combine

For Combine, the idea would be to create a Publisher for URLSessionDownloadTask. Then you can do something like:

var downloadRequests: AnyCancellable?

/// Download a series of assets

func downloadAssets() {
    downloadRequests = downloadsPublisher(for: urls, maxConcurrent: 1).sink { completion in
        switch completion {
        case .finished:
            print("done")

        case .failure(let error):
            print("failed", error)
        }
    } receiveValue: { destinationUrl in
        print(destinationUrl)
    }
}

/// Publisher for single download
///
/// Copy downloaded resource to caches folder.
///
/// - Parameter url: `URL` being downloaded.
/// - Returns: Publisher for the URL with final destination of the downloaded asset.

func downloadPublisher(for url: URL) -> AnyPublisher<URL, Error> {
    URLSession.shared.downloadTaskPublisher(for: url)
        .tryCompactMap {
            let destination = try FileManager.default
                .url(for: .cachesDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
                .appendingPathComponent(url.lastPathComponent)
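            try? FileManager.default.removeItem(at: destination)               // remove the old one, if any
            try FileManager.default.moveItem(at: $0.location, to: destination) // move new one there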
            return destination
        }
        .receive(on: RunLoop.main)
        .eraseToAnyPublisher()
}

/// Publisher for a series of downloads
///
/// This downloads not more than `maxConcurrent` assets at a given time.
///
/// - Parameters:
///   - urls: Array of `URL`s of assets to be downloaded.
///   - maxConcurrent: The maximum number of downloads to run at any given time (default 4).
/// - Returns: Publisher for the URLs with final destination of the downloaded assets.

func downloadsPublisher(for urls: [URL], maxConcurrent: Int = 4) -> AnyPublisher<URL, Error> {
    Publishers.Sequence(sequence: urls.map { downloadPublisher(for: $0) })
        .flatMap(maxPublishers: .max(maxConcurrent)) { $0 }
        .eraseToAnyPublisher()
}
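
The throttling in `downloadsPublisher` comes entirely from `flatMap(maxPublishers:)`. To see that in isolation, here is a sketch that uses simulated publishers instead of real downloads. `simulatedDownload` and `runThrottledDemo` are hypothetical names, and `.publisher` on the array is used as shorthand for `Publishers.Sequence`:

```swift
import Combine
import Foundation

/// Hypothetical stand-in for a download publisher: records when it starts and finishes.
func simulatedDownload(_ id: Int, record: @escaping (String) -> Void) -> AnyPublisher<Int, Never> {
    // `Deferred` postpones the work until subscription; a bare `Future` would start immediately.
    Deferred {
        Future<Int, Never> { promise in
            record("start \(id)")
            DispatchQueue.global().asyncAfter(deadline: .now() + 0.05) {
                record("finish \(id)")
                promise(.success(id))
            }
        }
    }
    .eraseToAnyPublisher()
}

func runThrottledDemo(maxConcurrent: Int) -> [String] {
    let lock = NSLock()
    var events: [String] = []
    let record: (String) -> Void = { event in
        lock.lock(); defer { lock.unlock() }
        events.append(event)
    }

    let done = DispatchSemaphore(value: 0)
    let cancellable = (1...3).map { simulatedDownload($0, record: record) }
        .publisher
        .flatMap(maxPublishers: .max(maxConcurrent)) { $0 }   // at most `maxConcurrent` inner subscriptions
        .sink(receiveCompletion: { _ in done.signal() }, receiveValue: { _ in })

    done.wait()                   // block until every simulated download has completed
    cancellable.cancel()

    lock.lock(); defer { lock.unlock() }
    return events
}

print(runThrottledDemo(maxConcurrent: 1))
// ["start 1", "finish 1", "start 2", "finish 2", "start 3", "finish 3"]
```

With `maxConcurrent: 1`, each inner publisher is subscribed to only after the previous one completes. Larger values let that many run at once. Note that this relies on the inner publishers not doing any work until they are subscribed to.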

Now, unfortunately, Apple supplies only a DataTaskPublisher, which loads the full asset into memory and so is not an acceptable solution for large assets. But one can refer to its source code and adapt it to create a DownloadTaskPublisher:

//  DownloadTaskPublisher.swift
//
//  Created by Robert Ryan on 9/28/20.
//
//  Adapted from Apple's `DataTaskPublisher` at:
//  https://github.com/apple/swift/blob/88b093e9d77d6201935a2c2fb13f27d961836777/stdlib/public/Darwin/Foundation/Publishers%2BURLSession.swift

import Foundation
import Combine

// MARK: Download Tasks

@available(macOS 10.15, iOS 13.0, tvOS 13.0, watchOS 6.0, *)
extension URLSession {
    /// Returns a publisher that wraps a URL session download task for a given URL.
    ///
    /// The publisher publishes the temporary file location and the response when the task completes, or terminates if the task fails with an error.
    ///
    /// - Parameter url: The URL for which to create a download task.
    /// - Returns: A publisher that wraps a download task for the URL.

    public func downloadTaskPublisher(for url: URL) -> DownloadTaskPublisher {
        let request = URLRequest(url: url)
        return DownloadTaskPublisher(request: request, session: self)
    }

    /// Returns a publisher that wraps a URL session download task for a given URL request.
    ///
    /// The publisher publishes the temporary file location and the response when the task completes, or terminates if the task fails with an error.
    ///
    /// - Parameter request: The URL request for which to create a download task.
    /// - Returns: A publisher that wraps a download task for the URL request.

    public func downloadTaskPublisher(for request: URLRequest) -> DownloadTaskPublisher {
        return DownloadTaskPublisher(request: request, session: self)
    }

    public struct DownloadTaskPublisher: Publisher {
        public typealias Output = (location: URL, response: URLResponse)
        public typealias Failure = URLError

        public let request: URLRequest
        public let session: URLSession

        public init(request: URLRequest, session: URLSession) {
            self.request = request
            self.session = session
        }

        public func receive<S: Subscriber>(subscriber: S) where Failure == S.Failure, Output == S.Input {
            subscriber.receive(subscription: Inner(self, subscriber))
        }

        private typealias Parent = DownloadTaskPublisher
        private final class Inner<Downstream: Subscriber>: Subscription, CustomStringConvertible, CustomReflectable, CustomPlaygroundDisplayConvertible
        where
            Downstream.Input == Parent.Output,
            Downstream.Failure == Parent.Failure
        {
            typealias Input = Downstream.Input
            typealias Failure = Downstream.Failure

            private let lock: NSLocking
            private var parent: Parent?               // GuardedBy(lock)
            private var downstream: Downstream?       // GuardedBy(lock)
            private var demand: Subscribers.Demand    // GuardedBy(lock)
            private var task: URLSessionDownloadTask! // GuardedBy(lock)
            var description: String { return "DownloadTaskPublisher" }
            var customMirror: Mirror {
                lock.lock()
                defer { lock.unlock() }
                return Mirror(self, children: [
                    "task": task as Any,
                    "downstream": downstream as Any,
                    "parent": parent as Any,
                    "demand": demand,
                ])
            }
            var playgroundDescription: Any { return description }

            init(_ parent: Parent, _ downstream: Downstream) {
                self.lock = NSLock()
                self.parent = parent
                self.downstream = downstream
                self.demand = .max(0)
            }

            // MARK: - Upward Signals
            func request(_ d: Subscribers.Demand) {
                precondition(d > 0, "Invalid request of zero demand")

                lock.lock()
                guard let p = parent else {
                    // We've already been cancelled so bail
                    lock.unlock()
                    return
                }

                // Avoid issues around `self` before init by setting up only once here
                if self.task == nil {
                    let task = p.session.downloadTask(
                        with: p.request,
                        completionHandler: handleResponse(location:response:error:)
                    )
                    self.task = task
                }

                self.demand += d
                let task = self.task!
                lock.unlock()

                task.resume()
            }

            private func handleResponse(location: URL?, response: URLResponse?, error: Error?) {
                lock.lock()
                guard demand > 0,
                      parent != nil,
                      let ds = downstream
                else {
                    lock.unlock()
                    return
                }

                parent = nil
                downstream = nil

                // We clear demand since this is a single shot shape
                demand = .max(0)
                task = nil
                lock.unlock()

                if let location = location, let response = response, error == nil {
                    _ = ds.receive((location, response))
                    ds.receive(completion: .finished)
                } else {
                    let urlError = error as? URLError ?? URLError(.unknown)
                    ds.receive(completion: .failure(urlError))
                }
            }

            func cancel() {
                lock.lock()
                guard parent != nil else {
                    lock.unlock()
                    return
                }
                parent = nil
                downstream = nil
                demand = .max(0)
                let task = self.task
                self.task = nil
                lock.unlock()
                task?.cancel()
            }
        }
    }
}

Now, unfortunately, that isn't using the URLSession delegate pattern, but rather the completion handler rendition. But one could conceivably adapt it for the delegate pattern.

Also, this will stop the downloads when one fails. If you don't want it to stop just because one fails, you could conceivably define it to Never fail, and instead use replaceError to emit nil for a failed download:

/// Publisher for single download
///
/// Copy downloaded resource to caches folder.
///
/// - Parameter url: `URL` being downloaded.
/// - Returns: Publisher for the URL with final destination of the downloaded asset. Returns `nil` if request failed.

func downloadPublisher(for url: URL) -> AnyPublisher<URL?, Never> {
    URLSession.shared.downloadTaskPublisher(for: url)
        .tryCompactMap {
            let destination = try FileManager.default
                .url(for: .cachesDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
                .appendingPathComponent(url.lastPathComponent)
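            try? FileManager.default.removeItem(at: destination)               // remove the old one, if any
            try FileManager.default.moveItem(at: $0.location, to: destination) // move new one there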
            return destination
        }
        .replaceError(with: nil)
        .receive(on: RunLoop.main)
        .eraseToAnyPublisher()
}

/// Publisher for a series of downloads
///
/// This downloads not more than `maxConcurrent` assets at a given time.
///
/// - Parameters:
///   - urls: Array of `URL`s of assets to be downloaded.
///   - maxConcurrent: The maximum number of downloads to run at any given time (default 4).
/// - Returns: Publisher for the URLs with final destination of the downloaded assets (`nil` for any download that failed).

func downloadsPublisher(for urls: [URL], maxConcurrent: Int = 4) -> AnyPublisher<URL?, Never> {
    Publishers.Sequence(sequence: urls.map { downloadPublisher(for: $0) })
        .flatMap(maxPublishers: .max(maxConcurrent)) { $0 }
        .eraseToAnyPublisher()
}
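
To illustrate just the error-handling part, here is a sketch with simulated results rather than real downloads. `SimulatedError`, `simulatedResult(for:)` and `runReplaceErrorDemo` are hypothetical names, and the second item is made to fail on purpose:

```swift
import Combine

enum SimulatedError: Error { case failed }

/// Hypothetical stand-in for a download: id 2 fails, everything else succeeds.
func simulatedResult(for id: Int) -> AnyPublisher<Int?, Never> {
    let attempt: AnyPublisher<Int, SimulatedError> = id == 2
        ? Fail<Int, SimulatedError>(error: .failed).eraseToAnyPublisher()
        : Just(id).setFailureType(to: SimulatedError.self).eraseToAnyPublisher()

    return attempt
        .map { Optional($0) }          // widen the output to Int? so that nil can stand for a failure
        .replaceError(with: nil)       // a failure becomes a nil value; the failure type becomes Never
        .eraseToAnyPublisher()
}

func runReplaceErrorDemo() -> [Int?] {
    var received: [Int?] = []
    let cancellable = [1, 2, 3].map(simulatedResult(for:))
        .publisher
        .flatMap(maxPublishers: .max(1)) { $0 }
        .sink { received.append($0) }  // these publishers emit synchronously, so `received` is complete here
    cancellable.cancel()
    return received
}

print(runReplaceErrorDemo())
// [Optional(1), nil, Optional(3)]
```

The failure of the second item shows up as a `nil` in the output, and the third item is still processed.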