
AVPlayer Local Cache Implementation|Master AVAssetResourceLoaderDelegate for Smooth Playback

Discover how to implement local caching with AVPlayer and AVQueuePlayer using AVURLAsset and AVAssetResourceLoaderDelegate to reduce buffering and ensure seamless video playback on iOS devices.


This post was translated with AI assistance — let me know if anything sounds off!


Complete Guide to Implementing Local Cache with AVPlayer

AVPlayer/AVQueuePlayer with AVURLAsset Implementation of AVAssetResourceLoaderDelegate

Photo by [Tyler Lastovich](https://unsplash.com/@lastly?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText)

[2023/03/12] Update

I have open-sourced my earlier implementation; feel free to use it if you need it.

  • The cache strategy is customizable; you can use PINCache or another store.

  • From the outside, you only call the AVAsset factory with a URL, and the returned AVAsset supports caching.

  • The data flow is implemented with Combine.

  • It includes some tests.

Introduction

More than half a year has passed since the previous article, “iOS HLS Cache Implementation Exploration Journey”, and the team still wants a streaming cache feature because it has a large impact on cost. We are a music streaming platform: if the same song is downloaded in full every time it plays, that burns a lot of data both for us and for users without unlimited plans. Music files are only a few MB each, but small amounts add up to significant costs!

In addition, Android already caches while streaming, and the comparison was clear: data usage dropped noticeably after the Android launch. iOS has a larger user base, so it should save even more.

Based on the experience from the previous article, continuing with HLS (.m3u8/.ts) would make this very complicated or even impossible; instead, we fall back to mp3 files, which can be handled directly with AVAssetResourceLoaderDelegate.

Goal

  • Played music will generate a local cache backup.

  • Check if there is a local cache before playing music; if available, do not request the file from the server again.

  • Cache strategy can be set; when the total capacity limit is exceeded, the oldest cache files will be deleted.

  • Do not interfere with the original AVPlayer playback mechanism.
    (The quickest approach would be to download the mp3 with URLSession first and then feed it to AVPlayer, but that loses the original play-while-downloading behavior: users wait longer and consume more data.)

Background Knowledge (1) — HTTP/1.1 Range Requests and Connection Keep-Alive

HTTP/1.1 Range Requests

First, we need to understand how data is requested from the server when playing videos or music. Generally, video and audio files are large, so it’s impossible to wait until the entire file is downloaded before starting playback. Commonly, data is fetched as playback progresses, and as long as the data for the currently playing segment is available, playback can continue.

This is done with HTTP/1.1 Range requests, which return only the specified byte range. Both ends of a range are inclusive, so specifying 0-100 returns the 101 bytes from byte 0 through byte 100. With this method, data can be fetched segment by segment and then combined into the complete file. The same mechanism also powers resumable downloads.

How to Apply?

We will first use HEAD to check the Response Header to understand whether the server supports Range requests, the total length of the resource, and the file type:

curl -I http://zhgchg.li/music.mp3

Using HEAD, we can obtain the following information from the Response Header:

  • Accept-Ranges: bytes means the server supports range requests.
    If the response does not include this value or shows Accept-Ranges: none, it means range requests are not supported.

  • Content-Length: The total length of the resource; we need to know the total length to segment the data.

  • Content-Type: File type information required by AVPlayer during playback.

Sometimes we instead send a GET with Range: bytes=0-1. We ask for bytes 0-1 but do not actually care about their content; we only want the Response Header. The native AVPlayer checks with GET in this way, so this article does the same.

Still, HEAD is the recommended way to check: it is more accurate, and if the server does not support Range, a GET will download the entire file.

curl -i -X GET http://zhgchg.li/music.mp3 -H "Range: bytes=0-1"

Using GET, we can obtain the following information from the Response Header:

  • Accept-Ranges: bytes means the server supports Range requests.
    If the response lacks this value or shows Accept-Ranges: none, it means it is not supported.

  • Content-Range: bytes 0-1/total resource length. The number after “/” is the total length of the resource; we need it to request data in segments.

  • Content-Type: File type information required by AVPlayer during playback.

Once you know the server supports Range requests, you can send segmented range requests:

curl -i -X GET http://zhgchg.li/music.mp3 -H "Range: bytes=0-100"

The server will return 206 Partial Content:

Content-Range: bytes 0-100/total length
Content-Length: 101
...
(binary content)

At this point, we have the data for bytes 0-100 (101 bytes, since both ends of a range are inclusive) and can keep sending requests for 101-200, 201-300, and so on until the end.

If the requested range starts beyond the end of the resource, the server returns 416 Range Not Satisfiable.
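As a minimal sketch of the segmentation described above, the following splits a resource into Range header values. The helper name `byteRanges` and the chunk size are assumptions made for illustration, not part of any API. Keep in mind that both ends of an HTTP byte range are inclusive.

```swift
import Foundation

/// Hypothetical helper: splits a resource of `totalLength` bytes into
/// inclusive byte ranges of at most `chunkSize` bytes each, formatted as
/// HTTP Range header values.
func byteRanges(totalLength: Int64, chunkSize: Int64) -> [String] {
    guard totalLength > 0, chunkSize > 0 else { return [] }
    var headers: [String] = []
    var start: Int64 = 0
    while start < totalLength {
        // Range ends are inclusive, so the last valid byte index is totalLength - 1.
        let end = min(start + chunkSize - 1, totalLength - 1)
        headers.append("bytes=\(start)-\(end)")
        start = end + 1
    }
    return headers
}

let ranges = byteRanges(totalLength: 250, chunkSize: 101)
print(ranges) // ["bytes=0-100", "bytes=101-201", "bytes=202-249"]
```

Each value can be set on a URLRequest with setValue(_:forHTTPHeaderField:) using the "Range" field.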

Additionally, to get the complete file, you can either request the range from 0 to the last byte or simply use 0-:

curl -i -X GET http://zhgchg.li/music.mp3 -H "Range: bytes=0-"

You can also request multiple Range data and set conditions in the same request, but we don’t need to use them. For details, please refer to this.

Connection Keep-Alive

In HTTP/1.1, persistent connections are enabled by default. This lets us access data as it downloads: for example, a 5 MB file can be received in chunks of 16 KB, 16 KB, 16 KB… without waiting for all 5 MB to finish downloading.

Connection: Keep-Alive

What if the server does not support Range or Keep-Alive?

No need to do so much; just use URLSession to download the mp3 file and pass it to the player directly… but this is not the result we want. We can ask the backend to help modify the server settings.

Background Knowledge (2) — How Does AVPlayer Natively Handle AVURLAsset Resources?

When we initialize an AVURLAsset with a URL and hand it to AVPlayer/AVQueuePlayer to start playback, it first sends the GET with Range 0-1 described above to learn whether Range requests are supported, the total resource length, and the file type.

After obtaining the file information, it sends a second request for the data from 0 to the total length.

⚠️ AVPlayer requests data from 0 to the total length, consumes the chunks as they arrive (16 KB, 16 KB, 16 KB…), and once it decides it has enough data it sends a Cancel to stop the network request (so it usually does not download the whole file unless the file is very small).

Only as playback continues does it request the following data with a new Range request.

(This differs from what I expected; I had anticipated requests like 0–100, 100–200, and so on.)

AVPlayer Request Example:

1. GET Range 0-1 => Response: Total length 150000 / public.mp3 / true
2. GET 0-150000...
3. 16 kb receive
4. 16 kb receive...
5. cancel() // current offset is 700
6. Continue playing
7. GET 700-150000...
8. 16 kb receive
9. 16 kb receive...
10. cancel() // current offset is 1500
11. Continue playing
12. GET 1500-150000...
13. 16 kb receive
14. 16 kb receive...
16. If seek to...5000
17. cancel(12.) // current offset is 2000
18. GET 5000-150000...
19. 16 kb receive
20. 16 kb receive...
...

⚠️ On iOS ≤ 12, it first sends a few shorter requests as a probe (?) and then requests the full length; on iOS ≥ 13 it requests the full length directly.

One more pitfall: while observing how resources were fetched, I used mitmproxy to sniff the traffic. Its display misled me, because it shows a response only after the entire body has been received rather than showing the chunks as they arrive over the persistent connection. That gave me a scare! I thought iOS was so inefficient that it downloaded the whole file every time. Next time I use a tool, I need to stay a bit skeptical. Orz

When to Initiate Cancel

  1. The second request mentioned earlier requests resources from 0 to the total length. After receiving enough data, it will send a Cancel to stop the request.

  2. Seek will first send a Cancel request to abort the previous one.

⚠️ Switching to the next item in AVQueuePlayer or changing the playback resource in AVPlayer does not trigger a Cancel request for the previous item.

AVQueuePlayer Pre-buffering

Pre-buffering the next item also goes through the Resource Loader, but the data range it requests is smaller.

Implementation

With the above background knowledge, let’s look at the principle and method of implementing AVPlayer local cache functionality.

This is the previously mentioned AVAssetResourceLoaderDelegate interface, which allows us to implement a Resource Loader for an Asset ourselves.

The Resource Loader is basically a worker: it tells us whether the player needs file information or file data, and for which range. We simply follow its instructions.

I saw an example where one Resource Loader serves all AVURLAssets, which I think is wrong. Each Resource Loader should serve one AVURLAsset and follow the AVURLAsset’s lifecycle since it inherently belongs to the AVURLAsset.

One Resource Loader servicing all AVURLAssets on an AVQueuePlayer becomes very complex and hard to manage.

Timing to Enter the Custom Resource Loader

Note that implementing your own Resource Loader does not guarantee it will be used. It will only be invoked when the system cannot recognize or handle the resource.

So before providing the URL resource to AVURLAsset, we must first change the scheme to our custom scheme, not http/https or other system-handled schemes.

http://zhgchg.li/music.mp3 => cacheable://zhgchg.li/music.mp3
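A minimal sketch of this scheme swap, using only Foundation. The helper name `replacingScheme` is hypothetical and chosen for illustration; the same function can map the custom URL back to a real one when it is time to issue the network request.

```swift
import Foundation

/// Hypothetical helper: returns a copy of `url` with its scheme replaced,
/// or nil if the URL cannot be decomposed or rebuilt.
func replacingScheme(of url: URL, with scheme: String) -> URL? {
    guard var components = URLComponents(url: url, resolvingAgainstBaseURL: false) else {
        return nil
    }
    components.scheme = scheme
    return components.url
}

let original = URL(string: "http://zhgchg.li/music.mp3")!
// Custom scheme so that AVURLAsset hands the resource to our Resource Loader.
let custom = replacingScheme(of: original, with: "cacheable")
// Back to the real scheme for the actual network request.
let restored = custom.flatMap { replacingScheme(of: $0, with: "http") }
```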

AVAssetResourceLoaderDelegate

Only two methods need to be implemented:

  • func resourceLoader(_ resourceLoader: AVAssetResourceLoader, shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool:

This method asks if we can handle this resource. Return true if yes, return false if we do not handle it (unsupported URL).

We can extract from loadingRequest what is being requested (whether it’s the first request for file info or a data request, and if it’s a data request, the range requested). After knowing the request, we initiate the request ourselves to fetch the data. At this point, we can decide whether to start a URLSession or return Data from local storage.

You can also perform data encryption and decryption here to protect the original data.

  • func resourceLoader(_ resourceLoader: AVAssetResourceLoader, didCancel loadingRequest: AVAssetResourceLoadingRequest):

This method is called at the Cancel timings described earlier, when the player cancels a loading request.

We can cancel the ongoing URLSession request here.

Local Cache Implementation Methods

For caching, I use PINCache directly, so we don't have to deal with issues like cache read/write deadlocks or implement LRU cache eviction ourselves.

⚠️ OOM Warning!

Because this caches music files of around 10 MB, PINCache is good enough as the local cache tool; for videos this approach would not work (it could mean loading several GB of data into memory at once).

For this requirement, you can refer to the expert’s approach, using FileHandle’s seek and read/write features for handling.
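A minimal sketch of the FileHandle idea, assuming iOS 13.4 or later for the throwing FileHandle APIs. The helper names `writeChunk` and `readChunk` and the temporary file location are assumptions for illustration and are not taken from the referenced approach. The point is that each chunk is written to disk at its own offset, so the whole file never has to sit in memory.

```swift
import Foundation

/// Hypothetical helper: writes `data` into the file at `fileURL`, starting at byte `offset`.
func writeChunk(_ data: Data, at offset: UInt64, to fileURL: URL) throws {
    if !FileManager.default.fileExists(atPath: fileURL.path) {
        _ = FileManager.default.createFile(atPath: fileURL.path, contents: nil)
    }
    let handle = try FileHandle(forWritingTo: fileURL)
    defer { try? handle.close() }
    try handle.seek(toOffset: offset)
    try handle.write(contentsOf: data)
}

/// Hypothetical helper: reads up to `length` bytes from `fileURL`, starting at byte `offset`.
func readChunk(from fileURL: URL, offset: UInt64, length: Int) throws -> Data {
    let handle = try FileHandle(forReadingFrom: fileURL)
    defer { try? handle.close() }
    try handle.seek(toOffset: offset)
    return try handle.read(upToCount: length) ?? Data()
}

// Usage: two chunks written at their own offsets, then a partial read.
let fileURL = FileManager.default.temporaryDirectory
    .appendingPathComponent(UUID().uuidString + ".cache")
try writeChunk(Data("hello".utf8), at: 0, to: fileURL)
try writeChunk(Data("world".utf8), at: 5, to: fileURL)
let tail = try readChunk(from: fileURL, offset: 5, length: 5)
let whole = try readChunk(from: fileURL, offset: 0, length: 10)
```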

Start Working!

No fuss, here is the complete project:

AssetData

Local cache data objects implement NSCoding because PINCache relies on archivedData methods for encoding and decoding.

import Foundation
import CryptoKit

class AssetDataContentInformation: NSObject, NSCoding {
    @objc var contentLength: Int64 = 0
    @objc var contentType: String = ""
    @objc var isByteRangeAccessSupported: Bool = false
    
    func encode(with coder: NSCoder) {
        coder.encode(self.contentLength, forKey: #keyPath(AssetDataContentInformation.contentLength))
        coder.encode(self.contentType, forKey: #keyPath(AssetDataContentInformation.contentType))
        coder.encode(self.isByteRangeAccessSupported, forKey: #keyPath(AssetDataContentInformation.isByteRangeAccessSupported))
    }
    
    override init() {
        super.init()
    }
    
    required init?(coder: NSCoder) {
        super.init()
        self.contentLength = coder.decodeInt64(forKey: #keyPath(AssetDataContentInformation.contentLength))
        self.contentType = coder.decodeObject(forKey: #keyPath(AssetDataContentInformation.contentType)) as? String ?? ""
        // Encoded with encode(_: Bool, forKey:), so read it back with decodeBool(forKey:).
        self.isByteRangeAccessSupported = coder.decodeBool(forKey: #keyPath(AssetDataContentInformation.isByteRangeAccessSupported))
    }
}

class AssetData: NSObject, NSCoding {
    @objc var contentInformation: AssetDataContentInformation = AssetDataContentInformation()
    @objc var mediaData: Data = Data()
    
    override init() {
        super.init()
    }

    func encode(with coder: NSCoder) {
        coder.encode(self.contentInformation, forKey: #keyPath(AssetData.contentInformation))
        coder.encode(self.mediaData, forKey: #keyPath(AssetData.mediaData))
    }
    
    required init?(coder: NSCoder) {
        super.init()
        self.contentInformation = coder.decodeObject(forKey: #keyPath(AssetData.contentInformation)) as? AssetDataContentInformation ?? AssetDataContentInformation()
        self.mediaData = coder.decodeObject(forKey: #keyPath(AssetData.mediaData)) as? Data ?? Data()
    }
}

AssetData stores:

  • contentInformation : AssetDataContentInformation
    AssetDataContentInformation :
    Stores whether Range requests are supported (isByteRangeAccessSupported), total resource length (contentLength), and file type (contentType)

  • mediaData : Raw audio Data (Large files here may cause OOM)

PINCacheAssetDataManager

Encapsulate the logic for storing and retrieving data in PINCache.

import PINCache
import Foundation

protocol AssetDataManager: NSObject {
    func retrieveAssetData() -> AssetData?
    func saveContentInformation(_ contentInformation: AssetDataContentInformation)
    func saveDownloadedData(_ data: Data, offset: Int)
    func mergeDownloadedDataIfIsContinuted(from: Data, with: Data, offset: Int) -> Data?
}

extension AssetDataManager {
    func mergeDownloadedDataIfIsContinuted(from: Data, with: Data, offset: Int) -> Data? {
        if offset <= from.count && (offset + with.count) > from.count {
            let start = from.count - offset
            var data = from
            data.append(with.subdata(in: start..<with.count))
            return data
        }
        return nil
    }
}

//

class PINCacheAssetDataManager: NSObject, AssetDataManager {
    
    static let Cache: PINCache = PINCache(name: "ResourceLoader")
    let cacheKey: String
    
    init(cacheKey: String) {
        self.cacheKey = cacheKey
        super.init()
    }
    
    func saveContentInformation(_ contentInformation: AssetDataContentInformation) {
        let assetData = AssetData()
        assetData.contentInformation = contentInformation
        PINCacheAssetDataManager.Cache.setObjectAsync(assetData, forKey: cacheKey, completion: nil)
    }
    
    func saveDownloadedData(_ data: Data, offset: Int) {
        guard let assetData = self.retrieveAssetData() else {
            return
        }
        
        if let mediaData = self.mergeDownloadedDataIfIsContinuted(from: assetData.mediaData, with: data, offset: offset) {
            assetData.mediaData = mediaData
            
            PINCacheAssetDataManager.Cache.setObjectAsync(assetData, forKey: cacheKey, completion: nil)
        }
    }
    
    func retrieveAssetData() -> AssetData? {
        guard let assetData = PINCacheAssetDataManager.Cache.object(forKey: cacheKey) as? AssetData else {
            return nil
        }
        return assetData
    }
}

The protocol is extracted separately because another storage method may replace PINCache in the future; callers therefore depend on the protocol rather than on the concrete class.

⚠️ The mergeDownloadedDataIfIsContinuted method is extremely important.

For linear playback, you can simply keep appending new data to the cached data. However, real scenarios are more complex. A user might play the range 0–100, then directly seek to the range 200–500. How to merge the existing 0–100 data with the new 200–500 data becomes a major issue.

⚠️ Data merging mistakes can cause terrible playback glitches.

The answer here is that we do not handle non-continuous data. Our project only deals with audio files of a few MB (≤ 10 MB), so we skipped this to save development cost. I only merge continuous data: for example, if we currently have 0~100 and the new data is 75~200, the merged result is 0~200; if the new data is 150~200, I ignore it and do not merge.
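To make the rule concrete, here is a standalone sketch of the same merge logic applied to the two cases above. The function name `mergeIfContinuous` is chosen for illustration, and offsets are treated as byte counts, so "0~100" is modeled as 100 cached bytes.

```swift
import Foundation

/// Standalone illustration of the continuous-merge rule.
/// Returns nil when the new chunk leaves a gap or adds nothing new.
func mergeIfContinuous(from cached: Data, with new: Data, offset: Int) -> Data? {
    // The new chunk must start inside (or exactly at the end of) the cached
    // data and extend beyond it.
    guard offset >= 0, offset <= cached.count, offset + new.count > cached.count else {
        return nil
    }
    let overlap = cached.count - offset
    var merged = cached
    // Skip the bytes we already have and append only the new tail.
    merged.append(new.dropFirst(overlap))
    return merged
}

let cached = Data(repeating: 0xAA, count: 100)       // bytes 0..<100
let overlapping = Data(repeating: 0xBB, count: 125)  // bytes 75..<200
let detached = Data(repeating: 0xCC, count: 50)      // bytes 150..<200

let merged = mergeIfContinuous(from: cached, with: overlapping, offset: 75)
let skipped = mergeIfContinuous(from: cached, with: detached, offset: 150)
```

The overlapping chunk extends the cache to 200 bytes, while the detached chunk is ignored because bytes 100..<150 are missing.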

Supporting non-continuous merging would require a different storage method (one that can tell which parts are missing), and each request would also need to work out which segments must come from the network and which can be read locally. Implementing that is very complex.
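This article does not implement that, but as a rough sketch of the bookkeeping it would involve, the following computes which parts of a requested range are not yet cached. The function `missingRanges` and the use of half-open `Range<Int64>` values are assumptions for illustration only; a real implementation would also have to persist the cached ranges and stitch local and remote data together in order.

```swift
import Foundation

/// Hypothetical helper: given the half-open byte ranges already cached,
/// returns the parts of `requested` that still need a network request.
func missingRanges(in requested: Range<Int64>, cached: [Range<Int64>]) -> [Range<Int64>] {
    var missing: [Range<Int64>] = []
    var cursor = requested.lowerBound
    for range in cached.sorted(by: { $0.lowerBound < $1.lowerBound }) {
        if range.upperBound <= cursor { continue }              // entirely before the cursor
        if range.lowerBound >= requested.upperBound { break }   // entirely after the request
        if range.lowerBound > cursor {
            missing.append(cursor..<range.lowerBound)           // gap before this cached range
        }
        cursor = max(cursor, range.upperBound)
        if cursor >= requested.upperBound { break }
    }
    if cursor < requested.upperBound {
        missing.append(cursor..<requested.upperBound)           // tail after the last cached range
    }
    return missing
}

// Cached 0..<100 and 200..<500, as in the seek example above.
let cachedRanges: [Range<Int64>] = [0..<100, 200..<500]
let gaps = missingRanges(in: 50..<600, cached: cachedRanges)
print(gaps) // [Range(100..<200), Range(500..<600)]
```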

Image from: [iOS AVPlayer Video Cache Design and Implementation](http://chuquan.me/2019/12/03/ios-avplayer-support-cache/)

CachingAVURLAsset

AVURLAsset weakly holds the ResourceLoader Delegate, so it is recommended to create a custom AVURLAsset class that inherits from AVURLAsset. Inside, create, assign, and hold the ResourceLoader to tie it to the AVURLAsset’s lifecycle. You can also store the original URL, CacheKey, and other information.

import AVFoundation
import Foundation

class CachingAVURLAsset: AVURLAsset {
    static let customScheme = "cacheable"
    let originalURL: URL
    private var _resourceLoader: ResourceLoader?
    
    var cacheKey: String {
        return self.url.lastPathComponent
    }
    
    static func isSchemeSupport(_ url: URL) -> Bool {
        guard let components = URLComponents(url: url, resolvingAgainstBaseURL: false) else {
            return false
        }
        
        return ["http", "https"].contains(components.scheme)
    }
    
    override init(url URL: URL, options: [String: Any]? = nil) {
        self.originalURL = URL
        
        guard var components = URLComponents(url: URL, resolvingAgainstBaseURL: false) else {
            super.init(url: URL, options: options)
            return
        }
        
        components.scheme = CachingAVURLAsset.customScheme
        guard let url = components.url else {
            super.init(url: URL, options: options)
            return
        }
        
        super.init(url: url, options: options)
        
        let resourceLoader = ResourceLoader(asset: self)
        self.resourceLoader.setDelegate(resourceLoader, queue: resourceLoader.loaderQueue)
        self._resourceLoader = resourceLoader
    }
}

Usage:

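if CachingAVURLAsset.isSchemeSupport(url) {
  let asset = CachingAVURLAsset(url: url)
  // AVPlayer has no initializer that takes an asset directly; wrap it in an AVPlayerItem first.
  let playerItem = AVPlayerItem(asset: asset)
  let avplayer = AVPlayer(playerItem: playerItem)
  avplayer.play()
}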

The function isSchemeSupport() is used to determine whether the URL supports attaching our Resource Loader (excluding file://).

originalURL stores the original resource URL.

cacheKey is the cache key for this resource; here we simply use the file name.

Adjust cacheKey to your real scenario. If file names are not already hashed and may collide, it is recommended to hash first and use the hash as the key. If you hash the entire URL, check whether the URL can change (e.g., when using a CDN).

For hashing you can use MD5, SHA, and so on; on iOS ≥ 13 you can use Apple’s CryptoKit directly, and for older versions just look on GitHub!
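As one possible way to do this on iOS 13 and later, here is a minimal sketch that derives the key with CryptoKit's SHA256. The function name `hashedCacheKey` and the choice to hash the whole absolute URL string are assumptions for illustration; if your URLs vary per CDN, you would hash only the stable part instead.

```swift
import Foundation
import CryptoKit

/// Hypothetical helper: derives a fixed-length, collision-resistant cache key
/// from the URL's absolute string.
func hashedCacheKey(for url: URL) -> String {
    let digest = SHA256.hash(data: Data(url.absoluteString.utf8))
    // Hex-encode the 32 digest bytes into a 64-character string.
    return digest.map { String(format: "%02x", $0) }.joined()
}

let keyA = hashedCacheKey(for: URL(string: "http://zhgchg.li/music.mp3")!)
let keyB = hashedCacheKey(for: URL(string: "http://zhgchg.li/other/music.mp3")!)
```

Both URLs share the file name music.mp3, yet they produce different keys, which is the collision the file-name approach cannot avoid.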

ResourceLoaderRequest

import Foundation
import CoreServices

protocol ResourceLoaderRequestDelegate: AnyObject {
    func dataRequestDidReceive(_ resourceLoaderRequest: ResourceLoaderRequest, _ data: Data)
    func dataRequestDidComplete(_ resourceLoaderRequest: ResourceLoaderRequest, _ error: Error?, _ downloadedData: Data)
    func contentInformationDidComplete(_ resourceLoaderRequest: ResourceLoaderRequest, _ result: Result<AssetDataContentInformation, Error>)
}

class ResourceLoaderRequest: NSObject, URLSessionDataDelegate {
    struct RequestRange {
        var start: Int64
        var end: RequestRangeEnd
        
        enum RequestRangeEnd {
            case requestTo(Int64)
            case requestToEnd
        }
    }
    
    enum RequestType {
        case contentInformation
        case dataRequest
    }
    
    struct ResponseUnExpectedError: Error { }
    
    private let loaderQueue: DispatchQueue
    
    let originalURL: URL
    let type: RequestType
    
    private var session: URLSession?
    private var dataTask: URLSessionDataTask?
    private var assetDataManager: AssetDataManager?
    
    private(set) var requestRange: RequestRange?
    private(set) var response: URLResponse?
    private(set) var downloadedData: Data = Data()
    
    private(set) var isCancelled: Bool = false {
        didSet {
            if isCancelled {
                self.dataTask?.cancel()
                self.session?.invalidateAndCancel()
            }
        }
    }
    private(set) var isFinished: Bool = false {
        didSet {
            if isFinished {
                self.session?.finishTasksAndInvalidate()
            }
        }
    }
    
    weak var delegate: ResourceLoaderRequestDelegate?
    
    init(originalURL: URL, type: RequestType, loaderQueue: DispatchQueue, assetDataManager: AssetDataManager?) {
        self.originalURL = originalURL
        self.type = type
        self.loaderQueue = loaderQueue
        self.assetDataManager = assetDataManager
        super.init()
    }
    
    func start(requestRange: RequestRange) {
        guard isCancelled == false, isFinished == false else {
            return
        }
        
        self.loaderQueue.async { [weak self] in
            guard let self = self else {
                return
            }
            
            var request = URLRequest(url: self.originalURL)
            self.requestRange = requestRange
            let start = String(requestRange.start)
            let end: String
            switch requestRange.end {
            case .requestTo(let rangeEnd):
                end = String(rangeEnd)
            case .requestToEnd:
                end = ""
            }
            
            let rangeHeader = "bytes=\(start)-\(end)"
            request.setValue(rangeHeader, forHTTPHeaderField: "Range")
            
            let session = URLSession(configuration: .default, delegate: self, delegateQueue: nil)
            self.session = session
            let dataTask = session.dataTask(with: request)
            self.dataTask = dataTask
            dataTask.resume()
        }
    }
    
    func cancel() {
        self.isCancelled = true
    }
    
    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data) {
        guard self.type == .dataRequest else {
            return
        }
        
        self.loaderQueue.async {
            self.delegate?.dataRequestDidReceive(self, data)
            self.downloadedData.append(data)
        }
    }
    
    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive response: URLResponse, completionHandler: @escaping (URLSession.ResponseDisposition) -> Void) {
        self.response = response
        completionHandler(.allow)
    }
    
    func urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?) {
        self.isFinished = true
        self.loaderQueue.async {
            if self.type == .contentInformation {
                guard error == nil,
                      let response = self.response as? HTTPURLResponse else {
                    let responseError = error ?? ResponseUnExpectedError()
                    self.delegate?.contentInformationDidComplete(self, .failure(responseError))
                    return
                }
                
                let contentInformation = AssetDataContentInformation()
                
                if let rangeString = response.allHeaderFields["Content-Range"] as? String,
                   let bytesString = rangeString.split(separator: "/").map({String($0)}).last,
                   let bytes = Int64(bytesString) {
                    contentInformation.contentLength = bytes
                }
                
                if let mimeType = response.mimeType,
                   let contentType = UTTypeCreatePreferredIdentifierForTag(kUTTagClassMIMEType, mimeType as CFString, nil)?.takeRetainedValue() {
                    contentInformation.contentType = contentType as String
                }
                
                if let value = response.allHeaderFields["Accept-Ranges"] as? String,
                   value == "bytes" {
                    contentInformation.isByteRangeAccessSupported = true
                } else {
                    contentInformation.isByteRangeAccessSupported = false
                }
                
                self.assetDataManager?.saveContentInformation(contentInformation)
                self.delegate?.contentInformationDidComplete(self, .success(contentInformation))
            } else {
                if let offset = self.requestRange?.start, self.downloadedData.count > 0 {
                    self.assetDataManager?.saveDownloadedData(self.downloadedData, offset: Int(offset))
                }
                self.delegate?.dataRequestDidComplete(self, error, self.downloadedData)
            }
        }
    }
}

This class wraps the remote request and mainly serves the data requests initiated by ResourceLoader.

RequestType: distinguishes whether the request is the initial request for file information (contentInformation) or a request for data (dataRequest).

RequestRange: the requested range; the end can be a specific offset (requestTo(Int64)) or the end of the resource (requestToEnd).

File information can be obtained from:

func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive response: URLResponse, completionHandler: @escaping (URLSession.ResponseDisposition) -> Void)

Obtain the Response Header here. Also, note that if you want to use HEAD requests, this method won’t work; you need to use other approaches.

  • isByteRangeAccessSupported: Check if the Response Header contains Accept-Ranges == bytes

  • contentType: The file type the player needs. It must be a Uniform Type Identifier, so it is written as public.mp3 rather than audio/mpeg.

  • contentLength: Read from Content-Range in the Response Header: bytes 0-1/total resource length

⚠️ Note that header casing varies between servers; it is not always Accept-Ranges/Content-Range. Some servers send lowercase accept-ranges, Accept-ranges, and so on.

Supplement: to handle these casing differences, you can write an HTTPURLResponse extension.

import CoreServices

extension HTTPURLResponse {
    func parseContentLengthFromContentRange() -> Int64? {
        let contentRangeKeys: [String] = [
            "Content-Range",
            "content-range",
            "Content-range",
            "content-Range"
        ]
        
        var rangeString: String?
        for key in contentRangeKeys {
            if let value = self.allHeaderFields[key] as? String {
                rangeString = value
                break
            }
        }
        
        guard let rangeString = rangeString,
              let contentLengthString = rangeString.split(separator: "/").map({String($0)}).last,
              let contentLength = Int64(contentLengthString) else {
            return nil
        }
        
        return contentLength
    }
    
    func parseAcceptRanges() -> Bool? {
        let contentRangeKeys: [String] = [
            "Accept-Ranges",
            "accept-ranges",
            "Accept-ranges",
            "accept-Ranges"
        ]
        
        var rangeString: String?
        for key in contentRangeKeys {
            if let value = self.allHeaderFields[key] as? String {
                rangeString = value
                break
            }
        }
        
        guard let rangeString = rangeString else {
            return nil
        }
        
        return rangeString == "bytes" || rangeString == "Bytes"
    }
    
    func mimeTypeUTI() -> String? {
        guard let mimeType = self.mimeType,
           let contentType = UTTypeCreatePreferredIdentifierForTag(kUTTagClassMIMEType, mimeType as CFString, nil)?.takeRetainedValue() else {
            return nil
        }
        
        return contentType as String
    }
}

Usage:

  • contentLength = response.parseContentLengthFromContentRange() # Parse the content length from the Content-Range header

  • isByteRangeAccessSupported = response.parseAcceptRanges() # Check if byte-range access is supported

  • contentType = response.mimeTypeUTI()
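Instead of listing every casing by hand, the lookup can compare header names case-insensitively. The sketch below is one way to do that over the allHeaderFields dictionary; the helper name `headerValue(named:in:)` is hypothetical. On iOS 13 and later, HTTPURLResponse also offers value(forHTTPHeaderField:), which performs a case-insensitive lookup for you.

```swift
import Foundation

/// Hypothetical helper: finds a header value regardless of how the server
/// cased the header name.
func headerValue(named name: String, in headers: [AnyHashable: Any]) -> String? {
    for (key, value) in headers {
        if let key = key as? String, key.caseInsensitiveCompare(name) == .orderedSame {
            return value as? String
        }
    }
    return nil
}

// Headers as an unusual server might send them.
let headers: [AnyHashable: Any] = [
    "accept-ranges": "bytes",
    "Content-range": "bytes 0-1/150000"
]
let acceptRanges = headerValue(named: "Accept-Ranges", in: headers)
let contentRange = headerValue(named: "Content-Range", in: headers)
```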

func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data)

As mentioned in the background section, downloaded data arrives in real time, so this method is called repeatedly with fragments of data; we append each fragment to downloadedData.

func urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?)

This method is called when a task is cancelled or finishes, and this is where we save the downloaded data.

As described in the background section on the Cancel mechanism, the player cancels the request once it has received enough data, so the error arriving here will usually be NSURLErrorCancelled. That is why, regardless of the error, we try to save whatever data we have received.
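If you want to tell this expected cancellation apart from a real failure (for logging, say), a small check like the following can help. It is only a sketch and not part of the implementation above; the helper name `isCancellation` is an assumption.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Hypothetical helper: true when `error` is URLSession's "cancelled" error.
func isCancellation(_ error: Error?) -> Bool {
    guard let error = error else { return false }
    let nsError = error as NSError
    return nsError.domain == NSURLErrorDomain && nsError.code == NSURLErrorCancelled
}

let cancelled = isCancellation(URLError(.cancelled))
let timedOut = isCancellation(URLError(.timedOut))
let noError = isCancellation(nil)
```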

⚠️ Since URLSession requests run concurrently, keep all operations on one DispatchQueue to avoid data corruption (corrupted data can cause severe playback glitches).

⚠️ URLSession strongly retains its delegate, which leaks memory if neither finishTasksAndInvalidate nor invalidateAndCancel is called; therefore, whether the task is cancelled or completes, we must call one of them to release the request.

⚠️ If you are worried about downloadedData causing OOM, you can write it to local storage inside didReceive data instead.

ResourceLoader

import AVFoundation
import Foundation

class ResourceLoader: NSObject {
    
    let loaderQueue = DispatchQueue(label: "li.zhgchg.resourceLoader.queue")
    
    private var requests: [AVAssetResourceLoadingRequest: ResourceLoaderRequest] = [:]
    private let cacheKey: String
    private let originalURL: URL
    
    init(asset: CachingAVURLAsset) {
        self.cacheKey = asset.cacheKey
        self.originalURL = asset.originalURL
        super.init()
    }

    deinit {
        self.requests.forEach { (request) in
            request.value.cancel()
        }
    }
}

extension ResourceLoader: AVAssetResourceLoaderDelegate {
    func resourceLoader(_ resourceLoader: AVAssetResourceLoader, shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool {
        
        let type = ResourceLoader.resourceLoaderRequestType(loadingRequest)
        let assetDataManager = PINCacheAssetDataManager(cacheKey: self.cacheKey)

        if let assetData = assetDataManager.retrieveAssetData() {
            if type == .contentInformation {
                loadingRequest.contentInformationRequest?.contentLength = assetData.contentInformation.contentLength
                loadingRequest.contentInformationRequest?.contentType = assetData.contentInformation.contentType
                loadingRequest.contentInformationRequest?.isByteRangeAccessSupported = assetData.contentInformation.isByteRangeAccessSupported
                loadingRequest.finishLoading()
                return true
            } else {
                let range = ResourceLoader.resourceLoaderRequestRange(type, loadingRequest)
                if assetData.mediaData.count > 0 {
                    let end: Int64
                    switch range.end {
                    case .requestTo(let rangeEnd):
                        end = rangeEnd
                    case .requestToEnd:
                        end = assetData.contentInformation.contentLength
                    }
                    
                    if assetData.mediaData.count >= end {
                        let subData = assetData.mediaData.subdata(in: Int(range.start)..<Int(end))
                        loadingRequest.dataRequest?.respond(with: subData)
                        loadingRequest.finishLoading()
                        return true
                    } else if range.start <= assetData.mediaData.count {
                        // has cache data...but not enough
                        let subEnd = (assetData.mediaData.count > end) ? Int((end)) : (assetData.mediaData.count)
                        let subData = assetData.mediaData.subdata(in: Int(range.start)..<subEnd)
                        loadingRequest.dataRequest?.respond(with: subData)
                    }
                }
            }
        }
        
        let range = ResourceLoader.resourceLoaderRequestRange(type, loadingRequest)
        let resourceLoaderRequest = ResourceLoaderRequest(originalURL: self.originalURL, type: type, loaderQueue: self.loaderQueue, assetDataManager: assetDataManager)
        resourceLoaderRequest.delegate = self
        self.requests[loadingRequest]?.cancel()
        self.requests[loadingRequest] = resourceLoaderRequest
        resourceLoaderRequest.start(requestRange: range)
        
        return true
    }
    
    func resourceLoader(_ resourceLoader: AVAssetResourceLoader, didCancel loadingRequest: AVAssetResourceLoadingRequest) {
        guard let resourceLoaderRequest = self.requests[loadingRequest] else {
            return
        }
        
        resourceLoaderRequest.cancel()
        requests.removeValue(forKey: loadingRequest)
    }
}

extension ResourceLoader: ResourceLoaderRequestDelegate {
    func contentInformationDidComplete(_ resourceLoaderRequest: ResourceLoaderRequest, _ result: Result<AssetDataContentInformation, Error>) {
        guard let loadingRequest = self.requests.first(where: { $0.value == resourceLoaderRequest })?.key else {
            return
        }
        
        switch result {
        case .success(let contentInformation):
            loadingRequest.contentInformationRequest?.contentType = contentInformation.contentType
            loadingRequest.contentInformationRequest?.contentLength = contentInformation.contentLength
            loadingRequest.contentInformationRequest?.isByteRangeAccessSupported = contentInformation.isByteRangeAccessSupported
            loadingRequest.finishLoading()
        case .failure(let error):
            loadingRequest.finishLoading(with: error)
        }
    }
    
    func dataRequestDidReceive(_ resourceLoaderRequest: ResourceLoaderRequest, _ data: Data) {
        guard let loadingRequest = self.requests.first(where: { $0.value == resourceLoaderRequest })?.key else {
            return
        }
        
        loadingRequest.dataRequest?.respond(with: data)
    }
    
    func dataRequestDidComplete(_ resourceLoaderRequest: ResourceLoaderRequest, _ error: Error?, _ downloadedData: Data) {
        guard let loadingRequest = self.requests.first(where: { $0.value == resourceLoaderRequest })?.key else {
            return
        }
        
        loadingRequest.finishLoading(with: error)
        requests.removeValue(forKey: loadingRequest)
    }
}

extension ResourceLoader {
    static func resourceLoaderRequestType(_ loadingRequest: AVAssetResourceLoadingRequest) -> ResourceLoaderRequest.RequestType {
        if let _ = loadingRequest.contentInformationRequest {
            return .contentInformation
        } else {
            return .dataRequest
        }
    }
    
    static func resourceLoaderRequestRange(_ type: ResourceLoaderRequest.RequestType, _ loadingRequest: AVAssetResourceLoadingRequest) -> ResourceLoaderRequest.RequestRange {
        if type == .contentInformation {
            return ResourceLoaderRequest.RequestRange(start: 0, end: .requestTo(1))
        } else {
            if loadingRequest.dataRequest?.requestsAllDataToEndOfResource == true {
                let lowerBound = loadingRequest.dataRequest?.currentOffset ?? 0
                return ResourceLoaderRequest.RequestRange(start: lowerBound, end: .requestToEnd)
            } else {
                let lowerBound = loadingRequest.dataRequest?.currentOffset ?? 0
                let length = Int64(loadingRequest.dataRequest?.requestedLength ?? 1)
                let upperBound = lowerBound + length
                return ResourceLoaderRequest.RequestRange(start: lowerBound, end: .requestTo(upperBound))
            }
        }
    }
}

loadingRequest.contentInformationRequest != nil means it is the first request, and the player is asking for the file information first.

When requesting file information, we need to provide these three pieces of information:

  • loadingRequest.contentInformationRequest?.isByteRangeAccessSupported : Whether range data access is supported

  • loadingRequest.contentInformationRequest?.contentType : Uniform Type Identifier

  • loadingRequest.contentInformationRequest?.contentLength : Total file length Int64
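As a rough illustration of where these three values come from, the sketch below derives them from plain HTTP response headers. `ContentInfo`, `makeContentInfo`, and the one-entry MIME-to-UTI table are hypothetical and only for demonstration; real code would use the system's type lookup as in mimeTypeUTI().

```swift
import Foundation

/// Hypothetical value type mirroring the three fields the player asks for.
struct ContentInfo: Equatable {
    let contentType: String              // a UTI such as "public.mp3"
    let contentLength: Int64             // total file length in bytes
    let isByteRangeAccessSupported: Bool
}

/// Assumption: a tiny hand-written MIME -> UTI table, for illustration only.
let utiByMIMEType = ["audio/mpeg": "public.mp3"]

func makeContentInfo(headers: [String: String]) -> ContentInfo? {
    // "Content-Range: bytes 0-1/3145728" -> the total length follows the "/".
    guard let range = headers["Content-Range"],
          let total = range.split(separator: "/").last,
          let length = Int64(total),
          let mime = headers["Content-Type"],
          let uti = utiByMIMEType[mime] else {
        return nil
    }
    let acceptsBytes = headers["Accept-Ranges"]?.lowercased() == "bytes"
    return ContentInfo(contentType: uti,
                       contentLength: length,
                       isByteRangeAccessSupported: acceptsBytes)
}
```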

loadingRequest.dataRequest?.requestedOffset can get the starting offset of the requested Range.

loadingRequest.dataRequest?.requestedLength can get the length of the requested Range.

If loadingRequest.dataRequest?.requestsAllDataToEndOfResource == true, then ignore the requested Range length and fetch to the end directly.

loadingRequest.dataRequest?.respond(with: Data) returns the loaded Data to the player.

loadingRequest.dataRequest?.currentOffset can get the current data offset. After dataRequest?.respond(with: Data), the currentOffset will advance accordingly.

loadingRequest.finishLoading() notifies the player that all requested data has been loaded.

func resourceLoader(_ resourceLoader: AVAssetResourceLoader, shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool

The player requests data, and we first check if the local cache has the data. If it does, we return it; if only part of the data is available, we return that part. For example, if the local cache has 0–100 and the player requests 0–200, we return 0–100 first.

If there is no local cache or the returned data is insufficient, a ResourceLoaderRequest will be initiated to fetch data from the network.
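The cache check described here can be reduced to a small pure function. This is a simplified sketch with hypothetical names (`CacheLookup`, `lookUp`), assuming the cached bytes always start at file offset 0 and the requested range has an exclusive end.

```swift
import Foundation

/// Hypothetical result of checking the cache for one data request.
struct CacheLookup: Equatable {
    let served: Data        // bytes that can be handed to respond(with:)
    let nextOffset: Int     // where the network request has to continue
    let isComplete: Bool    // true when no network request is needed
}

/// `cached` holds the file's bytes from offset 0; `start..<end` is the requested range.
func lookUp(cached: Data, start: Int, end: Int) -> CacheLookup {
    // Nothing usable in the cache: the network request starts at `start`.
    guard start < cached.count, start < end else {
        return CacheLookup(served: Data(), nextOffset: start, isComplete: false)
    }
    // Serve as much of the range as the cache holds.
    let upper = min(end, cached.count)
    let served = cached.subdata(in: start..<upper)
    return CacheLookup(served: served, nextOffset: upper, isComplete: upper == end)
}
```

With 100 cached bytes and a request for 0–200, this serves the first 100 bytes and reports 100 as the offset where the network request should continue.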

func resourceLoader(_ resourceLoader: AVAssetResourceLoader, didCancel loadingRequest: AVAssetResourceLoadingRequest)

When the player cancels a request, we cancel the corresponding ResourceLoaderRequest.

You may have noticed resourceLoaderRequestRange offset is based on currentOffset, because we first respond with locally downloaded data using dataRequest?.respond(with: Data); therefore, we directly use the adjusted offset.

private var requests: [AVAssetResourceLoadingRequest: ResourceLoaderRequest] = [:]

⚠️ Some examples use only currentRequest: ResourceLoaderRequest to store requests. This causes an issue because if the current request is being processed and the user seeks again, the old request will be canceled and a new one started. However, since these actions may not happen in order—sometimes starting a new request before canceling the old one—using a Dictionary to manage requests is safer!
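To make the ordering problem concrete, here is a small sketch with stand-in types. `FakeRequest` and `RequestRegistry` are hypothetical and only model the bookkeeping; the key plays the role of AVAssetResourceLoadingRequest.

```swift
/// Hypothetical stand-in for ResourceLoaderRequest.
final class FakeRequest {
    private(set) var isCancelled = false
    func cancel() { isCancelled = true }
}

/// Keeps one in-flight request per loading-request key.
struct RequestRegistry<Key: Hashable> {
    private var requests: [Key: FakeRequest] = [:]

    @discardableResult
    mutating func start(_ key: Key) -> FakeRequest {
        requests[key]?.cancel()          // replace any request already stored for this key
        let request = FakeRequest()
        requests[key] = request
        return request
    }

    mutating func cancel(_ key: Key) {
        // Only the request stored under this key is touched.
        requests.removeValue(forKey: key)?.cancel()
    }
}

var registry = RequestRegistry<String>()
let oldRequest = registry.start("seek-1")
let newRequest = registry.start("seek-2")   // new request starts before the old cancel arrives
registry.cancel("seek-1")                   // late cancel affects only the old request
```

With a single currentRequest slot, the late cancel for "seek-1" would have hit whichever request was stored last, which is the new one.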

⚠️Ensure all operations run on the same DispatchQueue to prevent data glitches.

Cancel all ongoing requests during deinit
Resource Loader Deinit means AVURLAsset Deinit, indicating the player no longer needs this resource; therefore, we can cancel any ongoing data requests. Data already loaded will still be written to the cache.

Supplement and Acknowledgments

Thanks to Lex 汤 for the great guidance.

Thanks to granddaughter for providing development advice and support.

This article only focuses on small music files

Large video files may cause Out Of Memory issues in downloadedData and AssetData/PINCacheAssetDataManager.

As mentioned earlier, to solve this, read and write the local cache through FileHandle seek/read/write operations (replacing AssetData/PINCacheAssetDataManager), or look on GitHub for projects that handle reading and writing large data to files.
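A minimal sketch of the FileHandle idea follows. `FileRangeStore` is a hypothetical name, the sketch does no locking or bookkeeping of which ranges are present, and it uses the older non-throwing FileHandle calls so it also compiles for deployment targets below iOS 13.4.

```swift
import Foundation

/// Hypothetical file-backed store: bytes are written and read at explicit offsets,
/// so the whole media file never has to sit in memory.
final class FileRangeStore {
    private let url: URL

    init(url: URL) {
        self.url = url
        if !FileManager.default.fileExists(atPath: url.path) {
            _ = FileManager.default.createFile(atPath: url.path, contents: nil)
        }
    }

    /// Writes a downloaded fragment at its offset within the media file.
    func write(_ data: Data, at offset: UInt64) throws {
        let handle = try FileHandle(forWritingTo: url)
        defer { handle.closeFile() }
        handle.seek(toFileOffset: offset)
        handle.write(data)
    }

    /// Reads up to `length` bytes starting at `offset`.
    func read(offset: UInt64, length: Int) throws -> Data {
        let handle = try FileHandle(forReadingFrom: url)
        defer { handle.closeFile() }
        handle.seek(toFileOffset: offset)
        return handle.readData(ofLength: length)
    }
}
```

A real implementation would also need to record which byte ranges have been written, since a seek past the end of the file leaves a zero-filled gap.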

Cancel Downloading Items When AVQueuePlayer Switches Playback Items

As mentioned in the background section, switching the playback target does not trigger a cancel. With AVPlayer, the AVURLAsset is deinitialized, so the download is interrupted as well. With AVQueuePlayer it is not, because the items remain in the queue and only the playback target moves to the next item.

The only approach here is to listen for playback target change notifications, then cancel the previous AVURLAsset loading upon receiving the notification.

asset.cancelLoading()
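One possible way to wire this up is key-value observation of currentItem. This is a sketch under that assumption; `PreviousItemCanceller` is a hypothetical helper, and the article does not say which notification mechanism the original project uses.

```swift
import AVFoundation

/// Hypothetical helper: cancels loading of the item that AVQueuePlayer just left.
final class PreviousItemCanceller {
    private var observation: NSKeyValueObservation?
    private var lastItem: AVPlayerItem?

    init(player: AVQueuePlayer) {
        // .initial records the item that is current when observation starts.
        observation = player.observe(\.currentItem, options: [.initial]) { [weak self] player, _ in
            guard let self = self else { return }
            if let previous = self.lastItem, previous !== player.currentItem {
                // The playback target moved on, so stop loading the previous asset.
                previous.asset.cancelLoading()
            }
            self.lastItem = player.currentItem
        }
    }
}
```

Keep the helper alive for as long as the player is in use, for example as a property next to the AVQueuePlayer.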

Audio Data Encryption and Decryption

Audio encryption and decryption can be performed on the Data obtained from ResourceLoaderRequest, and during storage, encryption and decryption can be applied to the locally stored Data in AssetData’s encode/decode methods.

CryptoKit (ChaChaPoly) Usage Example:

class AssetData: NSObject, NSCoding {
    static let encryptionKeyString = "encryptionKeyExzhgchgli"
    ...
    func encode(with coder: NSCoder) {
        coder.encode(self.contentInformation, forKey: #keyPath(AssetData.contentInformation))
        
        if #available(iOS 13.0, *),
           let encryptionData = try? ChaChaPoly.seal(self.mediaData, using: AssetData.encryptionKey).combined {
            coder.encode(encryptionData, forKey: #keyPath(AssetData.mediaData))
        } else {
          //
        }
    }
    
    required init?(coder: NSCoder) {
        super.init()
        ...
        if let mediaData = coder.decodeObject(forKey: #keyPath(AssetData.mediaData)) as? Data {
            if #available(iOS 13.0, *),
               let sealedBox = try? ChaChaPoly.SealedBox(combined: mediaData),
               let decryptedData = try? ChaChaPoly.open(sealedBox, using: AssetData.encryptionKey) {
                self.mediaData = decryptedData
            } else {
              //
            }
        } else {
            //
        }
    }
}
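The snippet above elides how encryptionKey is built. The round trip below is a self-contained sketch that assumes the key is the SHA-256 digest of a passphrase; `MediaCipher` and the passphrase are hypothetical, and the article's actual key derivation may differ.

```swift
import CryptoKit
import Foundation

@available(iOS 13.0, macOS 10.15, tvOS 13.0, watchOS 6.0, *)
enum MediaCipher {
    /// Assumption: the key is the SHA-256 digest of a passphrase (32 bytes, as ChaChaPoly needs).
    static func makeKey(passphrase: String) -> SymmetricKey {
        SymmetricKey(data: Data(SHA256.hash(data: Data(passphrase.utf8))))
    }

    /// Returns nonce + ciphertext + tag as one blob that can be stored in the cache.
    static func encrypt(_ plain: Data, using key: SymmetricKey) throws -> Data {
        try ChaChaPoly.seal(plain, using: key).combined
    }

    /// Rebuilds the sealed box from the stored blob and authenticates it while decrypting.
    static func decrypt(_ combined: Data, using key: SymmetricKey) throws -> Data {
        let box = try ChaChaPoly.SealedBox(combined: combined)
        return try ChaChaPoly.open(box, using: key)
    }
}
```

Decrypting with a different key throws instead of returning garbage, so a failed decode can be treated as a cache miss.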

PINCache includes PINMemoryCache and PINDiskCache. PINCache handles reading from files to memory and writing from memory to files. We only need to operate on PINCache.

Find Cache File Location in the Simulator:

Using NSHomeDirectory() to Get Simulator File Path

Finder -> Go -> Paste Path

In Library -> Caches -> com.pinterest.PINDiskCache.ResourceLoader is the Resource Loader Cache directory we created.

PINCache(name: "ResourceLoader") The name here refers to the directory name.
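For convenience, the directory can also be printed from code. A small sketch, assuming the default root path under Library/Caches and the directory prefix shown above:

```swift
import Foundation

// The directory name is the PINDiskCache prefix followed by the cache name.
let cacheName = "ResourceLoader"
let cacheDirectory = URL(fileURLWithPath: NSHomeDirectory())
    .appendingPathComponent("Library/Caches")
    .appendingPathComponent("com.pinterest.PINDiskCache.\(cacheName)")

// Paste the printed path into Finder -> Go -> Go to Folder.
print(cacheDirectory.path)
```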

You can also specify rootPath to move the directory under Documents, where the system will not clear it.

Set the maximum limit for PINCache:

PINCacheAssetDataManager.Cache.diskCache.byteLimit = 300 * 1024 * 1024 // max: 300 MB
PINCacheAssetDataManager.Cache.diskCache.ageLimit = 90 * 60 * 60 * 24 // 90 days

System Default Limit

Setting it to 0 will prevent the file from being deleted automatically.

Postscript

I initially underestimated the difficulty of this feature, thinking it would be done in no time; however, I struggled a lot and spent about two more weeks dealing with data storage issues. On the bright side, I thoroughly understood the entire Resource Loader mechanism, GCD, and Data.

References

Finally, here are the materials I referenced while researching this implementation.

  1. iOS AVPlayer Video Cache Design and Implementation (principles only)

  2. Implement audio and video playback and caching based on AVPlayer, supporting synchronized video output [ SZAVPlayer ] includes code (very complete but complex)

  3. CachingPlayerItem (simple implementation, easier to understand but incomplete)

  4. Possibly the Best AVPlayer Audio and Video Caching Solution AVAssetResourceLoaderDelegate

  5. Douyin (TikTok) Swift Version [ Github ] (An interesting project that replicates the Douyin app; it also uses Resource Loader)

  6. iOS HLS Cache Implementation Exploration Journey

Extension

If you have any questions or feedback, feel free to contact me.


This post was originally published on Medium (View original post), and automatically converted and synced by ZMediumToMarkdown.

This post is licensed under CC BY 4.0 by the author.