iOS NSAttributedString HTML Rendering|Efficient Alternative to DocumentType.html
Discover a streamlined method to render HTML in iOS using NSAttributedString without relying on DocumentType.html, solving common rendering issues and boosting app performance.
Implementing iOS NSAttributedString HTML Render by Yourself
iOS NSAttributedString DocumentType.html Alternative
Photo by Florian Olivo
[TL;DR] 2023/03/12
Re-developed using a different approach: “ZMarkupParser HTML String to NSAttributedString Tool”. For technical details and the development story, see “The Story of Manually Building an HTML Parser”.
Origin
Since the release of iOS 15 last year, our app has been plagued by a persistent crash. According to data from the past 90 days (2022/03/11~2022/06/08), it caused over 2.4K crashes and affected more than 1.4K users.
Judging by the downward trend in the data, Apple appears to have fixed this crash (or reduced its frequency) in iOS versions ≥ 15.2.
Most affected versions: iOS 15.0.X ~ iOS 15.X.X
It was also found that iOS 12 and iOS 13 had occasional crash reports, so this issue has likely existed for a long time. However, the crash rate was nearly 100% in the early versions of iOS 15.
Crash Reasons:
<compiler-generated> line 2147483647 specialized @nonobjc NSAttributedString.init(data:options:documentAttributes:)
NSAttributedString crashes during init with: Crashed: com.apple.main-thread EXC_BREAKPOINT 0x00000001de9d4e44.
It may also happen when the call is not made on the main thread.
Reproduction Steps:
When this issue suddenly appeared in large numbers, the development team was baffled; re-testing the points in the Crash Log showed no problems, and it was unclear under what conditions users encountered it. Until one time, by coincidence, I switched to “Low Power Mode” and the problem was triggered!! WTF!!!
Answer
After some searching, I found many similar cases online and located the earliest question about the identical crash on the Apple Developer Forums, along with an official response:
- This is a known iOS Foundation bug that has existed since iOS 12.
- To render complex HTML with no restrictions on usage, use WKWebView.
- With rendering constraints, you can write your own HTML parser and renderer.
- With Markdown as the rendering constraint: on iOS 15 and later, NSAttributedString can render Markdown text directly.
"Rendering constraints" means limiting the formats the app supports, for example to bold, italics, and hyperlinks only.
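As a rough illustration of the Markdown option, here is a minimal sketch that wraps the system initializer `NSAttributedString(markdown:)`. The helper name `attributedString(fromMarkdown:)` is made up for this example, and the code assumes an Apple platform at iOS 15 / macOS 12 or later.

```swift
import Foundation

/// Hypothetical helper (not from the original article): parses Markdown
/// with the system parser and returns nil if parsing fails.
@available(iOS 15, macOS 12, tvOS 15, watchOS 8, *)
func attributedString(fromMarkdown markdown: String) -> NSAttributedString? {
    try? NSAttributedString(markdown: markdown)
}

if #available(iOS 15, macOS 12, tvOS 15, watchOS 8, *) {
    let source = "Plain, **bold**, *italic* and a [link](https://zhgchg.li)"
    let attributed = attributedString(fromMarkdown: source)
    // The Markdown markers are consumed by the parser; only visible text remains in `.string`.
    print(attributed?.string ?? "parse failed")
}
```

Because the supported syntax is fixed by Markdown itself, the "rendering constraint" comes for free: anything the format cannot express simply cannot be sent to the app.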
Supplement. Rendering Complex HTML — Creating Text with Rich Visual Effects
You can agree on an interface with the backend, for example:
{
"content":[
{"type":"text","value":"Paragraph 1 plain text"},
{"type":"text","value":"Paragraph 2 plain text"},
{"type":"text","value":"Paragraph 3 plain text"},
{"type":"text","value":"Paragraph 4 plain text"},
{"type":"image","src":"https://zhgchg.li/logo.png","title":"ZhgChgLi"},
{"type":"text","value":"Paragraph 5 plain text"}
]
}
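On the app side, a payload like this maps naturally onto `Codable`. The following is a minimal sketch only: the type names `ContentBlock` and `Article` are hypothetical, and the field names are taken from the JSON above.

```swift
import Foundation

// Hypothetical model for the "content" payload shown above.
enum ContentBlock: Decodable, Equatable {
    case text(String)
    case image(src: URL, title: String?)
    case unsupported(type: String)

    private enum CodingKeys: String, CodingKey {
        case type, value, src, title
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let type = try container.decode(String.self, forKey: .type)
        switch type {
        case "text":
            self = .text(try container.decode(String.self, forKey: .value))
        case "image":
            self = .image(src: try container.decode(URL.self, forKey: .src),
                          title: try container.decodeIfPresent(String.self, forKey: .title))
        default:
            // Unknown block types are preserved so the UI can skip them explicitly.
            self = .unsupported(type: type)
        }
    }
}

struct Article: Decodable {
    let content: [ContentBlock]
}

let json = """
{"content":[
  {"type":"text","value":"Paragraph 1 plain text"},
  {"type":"image","src":"https://zhgchg.li/logo.png","title":"ZhgChgLi"}
]}
"""
let article = try JSONDecoder().decode(Article.self, from: Data(json.utf8))
print(article.content.count)
```

Each block can then be rendered by a native view (a label for `.text`, an image view for `.image`), and an older app version degrades gracefully when the backend introduces a new block type.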
This can be combined with Markdown for text rendering, or you can follow Medium’s approach:
"Paragraph": {
"text": "code in text, and link in text, and ZhgChgLi, and bold, and I, only i",
"markups": [
{
"type": "CODE",
"start": 5,
"end": 7
},
{
"start": 18,
"end": 22,
"href": "http://zhgchg.li",
"type": "LINK"
},
{
"type": "STRONG",
"start": 50,
"end": 63
},
{
"type": "EM",
"start": 55,
"end": 69
}
]
}
For the text "code in text, and link in text, and ZhgChgLi, and bold, and I, only i", these markups mean:
- Characters 5 to 7 should be marked as code (wrapped with `Text`)
- Characters 18 to 22 should be marked as a link (using [Text](URL) format)
- Characters 50 to 63 should be marked as bold (**Text**)
- Characters 55 to 69 should be marked as italic (_Text_)
With a standardized and describable structure, the app can render natively on its own, achieving optimal performance and user experience.
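To make the idea concrete, here is a minimal sketch of applying such markups natively. The `Markup` type and the `render(text:markups:baseFont:)` function are hypothetical, and the sketch assumes that `start` is inclusive, `end` is exclusive, and both count Swift characters; the real payload may define offsets differently.

```swift
import UIKit

// Hypothetical model mirroring the "markups" entries above.
struct Markup {
    enum Kind { case code, link(URL), strong, em }
    let kind: Kind
    let start: Int   // inclusive character offset (assumption)
    let end: Int     // exclusive character offset (assumption)
}

func render(text: String, markups: [Markup], baseFont: UIFont = .systemFont(ofSize: 14)) -> NSAttributedString {
    let result = NSMutableAttributedString(string: text, attributes: [.font: baseFont])
    for markup in markups {
        // Skip ranges that fall outside the text instead of crashing.
        guard markup.start >= 0, markup.start < markup.end, markup.end <= text.count else { continue }
        let lower = text.index(text.startIndex, offsetBy: markup.start)
        let upper = text.index(text.startIndex, offsetBy: markup.end)
        let range = NSRange(lower..<upper, in: text)
        switch markup.kind {
        case .code:
            result.addAttribute(.font, value: UIFont.monospacedSystemFont(ofSize: baseFont.pointSize, weight: .regular), range: range)
        case .link(let url):
            result.addAttribute(.link, value: url, range: range)
        case .strong:
            result.addAttribute(.font, value: UIFont.boldSystemFont(ofSize: baseFont.pointSize), range: range)
        case .em:
            // Slanting via .obliqueness keeps a bold font intact where STRONG and EM overlap.
            result.addAttribute(.obliqueness, value: 0.2, range: range)
        }
    }
    return result
}

let text = "code in text, and link in text, and ZhgChgLi, and bold, and I, only i"
let rendered = render(text: text, markups: [
    Markup(kind: .code, start: 5, end: 7),
    Markup(kind: .link(URL(string: "http://zhgchg.li")!), start: 18, end: 22),
    Markup(kind: .strong, start: 50, end: 63),
    Markup(kind: .em, start: 55, end: 69),
])
```

Because every markup is just a range plus a type, overlapping styles need no special handling: each one adds its own attribute to its own range.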
For the pitfalls of wrapping text around images in UITextView, see my previous article: iOS UITextView Text Wrapping Editor (Swift)
Why?
Before providing a practical solution, let’s first return to the root of the problem. I personally believe the main cause of this issue does not come from Apple; the official bug is merely the trigger for the problem.
The main issue comes from treating the app like a web page for rendering. The advantages are fast development on the web side, the same API endpoint delivering HTML regardless of client, and the flexibility to render any content. The drawbacks are that HTML is not a common interface for apps, app engineers cannot be expected to understand HTML, performance is very poor, rendering can only happen on the main thread, results are unpredictable during development, and the supported specifications cannot be confirmed.
Looking further up, the problem mostly lies in unclear original requirements, uncertainty about which specifications the app needs to support, and rushing that leads to using HTML directly as the interface between the app and the web.
Very Poor Performance
Regarding performance, actual tests show a 5 to 20 times speed difference between using NSAttributedString DocumentType.html directly and implementing rendering manually.
Better
Since the content is meant for an app, a better approach is to start from the app's side. For apps, the cost of adjusting requirements is much higher than for the web. Effective app development should iterate on defined specifications: confirm what needs to be supported now, and if changes are needed later, schedule time to expand the specifications. Quick, on-the-fly changes are not feasible. This reduces communication costs and increases work efficiency.
- Confirm the scope of requirements
- Confirm the supported specifications
- Confirm the interface specification (Markdown, BBCode, …; continuing with HTML is also fine, but only with constraints, such as allowing only <b>/<i>/<a>/<u>, and those constraints must be made explicit to developers in the code)
- Implement the rendering mechanism yourself
- Maintain and iterate on the supported specifications
[2023/02/27 Updated] [TL;DR]:
Updated approach: XMLParser is no longer used because it has zero fault tolerance. Any of these three inputs makes XMLParser fail with an error, which results in a blank display:
- <br>
- <Congratulation!>
- <b>Bold<i>Bold+Italic</b>Italic</i>
With XMLParser, the HTML string must fully comply with XML rules; unlike browsers or NSAttributedString.DocumentType.html, which still display malformed content normally, it tolerates no errors.
Rewrite using pure Swift, parsing HTML tags with Regex and tokenizing them. Analyze and correct tag accuracy (fix missing end tags & misplaced tags), then convert into an abstract syntax tree. Finally, use the Visitor Pattern to map HTML tags to abstract styles, producing the final NSAttributedString result; no parser libraries are used.
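The first stage of that pipeline, tokenizing with a regex, might look roughly like the sketch below. This is an illustration only and not ZMarkupParser's actual API: the `HTMLToken` type, the `tokenize` function, and the pattern are assumptions made for this example, and tag attributes are discarded for brevity.

```swift
import Foundation

// Hypothetical token type for illustration.
enum HTMLToken: Equatable {
    case openTag(String)
    case closeTag(String)
    case selfClosingTag(String)
    case text(String)
}

func tokenize(_ html: String) -> [HTMLToken] {
    // Group 1: "/" of a closing tag, group 2: tag name, group 3: trailing "/" of a self-closing tag.
    let pattern = "<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [.text(html)] }
    let source = html as NSString
    var tokens: [HTMLToken] = []
    var cursor = 0
    let fullRange = NSRange(location: 0, length: source.length)
    for match in regex.matches(in: html, options: [], range: fullRange) {
        // Everything between the previous tag and this one is plain text.
        if match.range.location > cursor {
            let textRange = NSRange(location: cursor, length: match.range.location - cursor)
            tokens.append(.text(source.substring(with: textRange)))
        }
        let name = source.substring(with: match.range(at: 2)).lowercased()
        if match.range(at: 1).length > 0 {
            tokens.append(.closeTag(name))
        } else if match.range(at: 3).length > 0 {
            tokens.append(.selfClosingTag(name))
        } else {
            tokens.append(.openTag(name))
        }
        cursor = match.range.location + match.range.length
    }
    if cursor < source.length {
        tokens.append(.text(source.substring(from: cursor)))
    }
    return tokens
}

print(tokenize("<b>Bold<i>Bold+Italic</b>Italic</i><br>"))
```

Unlike XMLParser, the tokenizer never fails on mismatched tags; it only reports what it sees. Repairing the order of `</b>` and `</i>` or treating `<br>` as a void element is left to the later correction stage described above.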
— —
How?
The die is cast. Back to the main topic: since we are already using HTML to render NSAttributedString, how can we resolve the crashes and performance issues described above?
Inspired by
Strip HTML
Before discussing HTML rendering, let’s first talk about stripping HTML. As mentioned in the Why? section, where the app gets HTML and what kind of HTML it receives should be clearly defined in the specifications; the app should not have to strip HTML just because it "might" receive some.
To quote a former supervisor: This is just too crazy, right?
Option 1. NSAttributedString
import UIKit
// Encode the data as UTF-8 so that it matches the declared .characterEncoding
let data = "<div>Text</div>".data(using: .utf8)!
let attributed = try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
let string = attributed.string
Rendering the HTML with NSAttributedString and then reading its string property yields a clean String.
The problems are the same ones this article is about: frequent crashes on iOS 15, poor performance, and main-thread-only operation.
Option 2. Regex
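import Foundation
let htmlString = "<div>Test</div>"
// Remove everything that looks like a tag; the result is kept instead of being discarded
let stripped = htmlString.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
// stripped == "Test"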
The simplest and most effective way
Regex cannot guarantee complete accuracy. For example, <p foo=">now what?">Paragraph</p> is valid HTML but will be stripped incorrectly.
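A small sketch makes the failure visible. The helper name `stripTags` is made up for this example; the pattern is the same one used in Option 2.

```swift
import Foundation

/// Hypothetical helper wrapping the Option 2 regex.
func stripTags(_ html: String) -> String {
    html.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression)
}

print(stripTags("<div>Test</div>"))                       // Test
print(stripTags("<p foo=\">now what?\">Paragraph</p>"))  // now what?">Paragraph
```

The pattern stops at the first `>` it meets, so a `>` inside a quoted attribute value ends the "tag" too early and the rest of the attribute leaks into the output.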
Option 3. XMLParser
Referencing the approach of SwiftRichString, use XMLParser from Foundation to parse HTML as XML and implement your own HTML Parser & Strip functionality.
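import UIKit
// Ref: https://github.com/malcommac/SwiftRichString
// Minimal stand-ins (assumption) for the error types thrown below; SwiftRichString defines its own versions.
struct XMLParserInitError: Error {
let message: String
init(_ message: String) { self.message = message }
}
struct XMLParserError: Error {
let parserError: Error?
let line: Int
let column: Int
}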
final class HTMLStripper: NSObject, XMLParserDelegate {
private static let topTag = "source"
private var xmlParser: XMLParser
private(set) var storedString: String
// The XML parser sometimes splits strings, which can break localization-sensitive
// string transforms. Work around this by using the currentString variable to
// accumulate partial strings, and then reading them back out as a single string
// when the current element ends, or when a new one is started.
private var currentString: String?
// MARK: - Initialization
init(string: String) throws {
let xmlString = HTMLStripper.escapeWithUnicodeEntities(string)
let xml = "<\(HTMLStripper.topTag)>\(xmlString)</\(HTMLStripper.topTag)>"
guard let data = xml.data(using: String.Encoding.utf8) else {
throw XMLParserInitError("Unable to convert to UTF8")
}
self.xmlParser = XMLParser(data: data)
self.storedString = ""
super.init()
xmlParser.shouldProcessNamespaces = false
xmlParser.shouldReportNamespacePrefixes = false
xmlParser.shouldResolveExternalEntities = false
xmlParser.delegate = self
}
/// Parse and generate attributed string.
func parse() throws -> String {
guard xmlParser.parse() else {
let line = xmlParser.lineNumber
let shiftColumn = (line == 1)
let shiftSize = HTMLStripper.topTag.lengthOfBytes(using: String.Encoding.utf8) + 2
let column = xmlParser.columnNumber - (shiftColumn ? shiftSize : 0)
throw XMLParserError(parserError: xmlParser.parserError, line: line, column: column)
}
return storedString
}
// MARK: XMLParserDelegate
@objc func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String: String]) {
foundNewString()
}
@objc func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
foundNewString()
}
@objc func parser(_ parser: XMLParser, foundCharacters string: String) {
currentString = (currentString ?? "").appending(string)
}
// MARK: Support Private Methods
func foundNewString() {
if let currentString = currentString {
storedString.append(currentString)
self.currentString = nil
}
}
// handle html entity / html hex
// Perform string escaping to replace all characters which are not supported by NSXMLParser
// into the specified encoding with decimal entity.
// For example, if your string contains '&' character parser will break the style.
// This option is active by default.
// ref: https://github.com/malcommac/SwiftRichString/blob/e0b72d5c96968d7802856d2be096202c9798e8d1/Sources/SwiftRichString/Support/XMLStringBuilder.swift
static func escapeWithUnicodeEntities(_ string: String) -> String {
// "&" is escaped unless it already starts an entity such as "&amp;" or "&#103;".
guard let escapeAmpRegExp = try? NSRegularExpression(pattern: "&(?!(#[0-9]{2,4}|[A-Za-z]{2,6});)", options: NSRegularExpression.Options(rawValue: 0)) else {
return string
}
// NSRange counts UTF-16 code units, so use utf16.count rather than count.
let range = NSRange(location: 0, length: string.utf16.count)
return escapeAmpRegExp.stringByReplacingMatches(in: string,
options: NSRegularExpression.MatchingOptions(rawValue: 0),
range: range,
withTemplate: "&amp;")
}
}
let test = "我<br/><a href=\"http://google.com\">同意</a>提供<b><i>個</i>人</b>身分證字號/護照/居留<span style=\"color:#FF0000;font-size:20px;word-spacing:10px;line-height:10px\">證號碼</span>,以供<i>跨境物流</i>方通關<span style=\"background-color:#00FF00;\">使用</span>,並已<img src=\"g.png\"/>了解跨境<br/>商品之物<p>流需</p>求"
let stripper = try HTMLStripper(string: test)
print(try! stripper.parse())
// 我同意提供個人身分證字號/護照/居留證號碼,以供跨境物流方通關使用,並已了解跨境商品之物流需求
// (English: I agree to provide my personal ID number / passport / residence permit number for customs clearance by the cross-border logistics provider, and I understand the logistics requirements for cross-border goods.)
Use Foundation's XMLParser to process the String and implement XMLParserDelegate, with currentString accumulating the text. Because the parser may deliver a string in several pieces, foundCharacters can be called repeatedly. Use didStartElement and didEndElement to detect where a string starts and ends, save the current result, and then clear currentString.
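Before parsing, the stripper also escapes bare ampersands so that XMLParser does not choke on them. Here is that step in isolation, as a minimal sketch; the function name `escapeBareAmpersands` is made up for this example, and the pattern follows the one in the listing above.

```swift
import Foundation

/// Hypothetical standalone version of the ampersand-escaping step.
func escapeBareAmpersands(_ string: String) -> String {
    // Negative lookahead: leave "&" alone when it already starts an entity like "&amp;" or "&#103;".
    let pattern = "&(?!(#[0-9]{2,4}|[A-Za-z]{2,6});)"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return string }
    // NSRange is measured in UTF-16 code units.
    let range = NSRange(location: 0, length: string.utf16.count)
    return regex.stringByReplacingMatches(in: string, options: [], range: range, withTemplate: "&amp;")
}

print(escapeBareAmpersands("Tom & Jerry &amp; friends &#103;"))
// Tom &amp; Jerry &amp; friends &#103;
```

Only the lone `&` is rewritten; entities that are already well-formed pass through untouched, so the parser can later decode them to real characters.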
One advantage is that HTML entities are also converted to the actual characters, e.g. &#103; -> g.
The disadvantages are the implementation complexity and that XMLParser fails if the HTML is not well-formed XML, e.g. <br> instead of <br/>.
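The strictness is easy to check with a few lines. This is a minimal sketch; `isWellFormedXML` is a hypothetical helper that wraps the fragment in a root element the same way the stripper does.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML   // XMLParser lives in this module on non-Apple platforms
#endif

/// Hypothetical helper: true if the fragment parses as XML once wrapped in a root element.
func isWellFormedXML(_ fragment: String) -> Bool {
    let xml = "<source>\(fragment)</source>"
    return XMLParser(data: Data(xml.utf8)).parse()
}

print(isWellFormedXML("Line 1<br/>Line 2"))  // true
print(isWellFormedXML("Line 1<br>Line 2"))   // false
```

A browser would render both inputs identically, which is why HTML coming from a web backend tends to contain the second form.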
I personally believe that for simply stripping HTML, Option 2 is the better method. I introduce this one because rendering HTML works on the same principle, so it serves as a simple example :)
HTML Render w/XMLParser
Using XMLParser to implement it ourselves, similar to the Strip principle, we can add specific rendering methods depending on which Tag is being parsed.
Requirement Specifications:
- Support extending the set of tags that can be parsed
- Support setting a default style per tag, e.g. applying a link style to the <a> tag
- Support parsing the style attribute, because HTML specifies display styles explicitly, as in style="color:red"
- Styles support changing font weight, size, underline, line spacing, letter spacing, background color, and text color
- Complex tags such as the image tag and table tag are not supported
Everyone can reduce features according to their own specifications. For example, if background color adjustment is not needed, then the option to set the background color does not need to be included.
This article is just a conceptual implementation, not a Best Practice in architecture; if there are clear specifications and usage methods, consider applying some Design Patterns to achieve better maintainability and scalability.
⚠️⚠️⚠️ Attention ⚠️⚠️⚠️
A reminder: if your app is brand new or can be fully converted to Markdown, the Markdown approach mentioned above is recommended. Writing your own renderer as in this article is too complex and won’t perform better than Markdown.
Even if your iOS version is below 15 and does not support native Markdown, you can still find an excellent Markdown Parser solution on GitHub.
HTMLTagParser
protocol HTMLTagParser {
static var tag: String { get } // Declare the Tag Name to parse, e.g. a
var storedHTMLAttributes: [String: String]? { get set } // Parsed attributes will be stored here, e.g. href, style
var style: AttributedStringStyle? { get } // Style to apply for this Tag
func render(attributedString: inout NSMutableAttributedString) // Implement the logic to render HTML to attributedString
}
Declare parsable HTML tag entities for easy extension and management.
AttributedStringStyle
protocol AttributedStringStyle {
var font: UIFont? { get set }
var color: UIColor? { get set }
var backgroundColor: UIColor? { get set }
var wordSpacing: CGFloat? { get set }
var paragraphStyle: NSParagraphStyle? { get set }
var customs: [NSAttributedString.Key: Any]? { get set } // Universal setting port, it is recommended to abstract this after confirming supported specifications and close this port
func render(attributedString: inout NSMutableAttributedString)
}
// abstract implement
extension AttributedStringStyle {
func render(attributedString: inout NSMutableAttributedString) {
let range = NSMakeRange(0, attributedString.length)
if let font = font {
attributedString.addAttribute(NSAttributedString.Key.font, value: font, range: range)
}
if let color = color {
attributedString.addAttribute(NSAttributedString.Key.foregroundColor, value: color, range: range)
}
if let backgroundColor = backgroundColor {
attributedString.addAttribute(NSAttributedString.Key.backgroundColor, value: backgroundColor, range: range)
}
if let wordSpacing = wordSpacing {
attributedString.addAttribute(NSAttributedString.Key.kern, value: wordSpacing as Any, range: range)
}
if let paragraphStyle = paragraphStyle {
attributedString.addAttribute(NSAttributedString.Key.paragraphStyle, value: paragraphStyle, range: range)
}
if let customAttributes = customs {
attributedString.addAttributes(customAttributes, range: range)
}
}
}
Declare styles available for setting tags.
HTMLStyleAttributedParser
// only support tag attributed down below
// can set color, font size, line height, word spacing, background color
enum HTMLStyleAttributedParser: String {
case color = "color"
case fontSize = "font-size"
case lineHeight = "line-height"
case wordSpacing = "word-spacing"
case backgroundColor = "background-color"
func render(attributedString: inout NSMutableAttributedString, value: String) -> Bool {
let range = NSMakeRange(0, attributedString.length)
switch self {
case .color:
if let color = convertToiOSColor(value) {
attributedString.addAttribute(NSAttributedString.Key.foregroundColor, value: color, range: range)
return true
}
case .backgroundColor:
if let color = convertToiOSColor(value) {
attributedString.addAttribute(NSAttributedString.Key.backgroundColor, value: color, range: range)
return true
}
case .fontSize:
if let size = convertToiOSSize(value) {
attributedString.addAttribute(NSAttributedString.Key.font, value: UIFont.systemFont(ofSize: CGFloat(size)), range: range)
return true
}
case .lineHeight:
if let size = convertToiOSSize(value) {
let paragraphStyle = NSMutableParagraphStyle()
paragraphStyle.lineSpacing = size
attributedString.addAttribute(NSAttributedString.Key.paragraphStyle, value: paragraphStyle, range: range)
return true
}
case .wordSpacing:
if let size = convertToiOSSize(value) {
attributedString.addAttribute(NSAttributedString.Key.kern, value: size, range: range)
return true
}
}
return false
}
// convert 36px -> 36
private func convertToiOSSize(_ string: String) -> CGFloat? {
guard let regex = try? NSRegularExpression(pattern: "^([0-9]+)"),
let firstMatch = regex.firstMatch(in: string, options: [], range: NSRange(location: 0, length: string.utf16.count)),
let range = Range(firstMatch.range, in: string),
let size = Float(String(string[range])) else {
return nil
}
return CGFloat(size)
}
// convert html hex color #ffffff to UIKit Color
private func convertToiOSColor(_ hexString: String) -> UIColor? {
var cString: String = hexString.trimmingCharacters(in: .whitespacesAndNewlines).uppercased()
if cString.hasPrefix("#") {
cString.remove(at: cString.startIndex)
}
if (cString.count) != 6 {
return nil
}
var rgbValue: UInt64 = 0
Scanner(string: cString).scanHexInt64(&rgbValue)
return UIColor(
red: CGFloat((rgbValue & 0xFF0000) >> 16) / 255.0,
green: CGFloat((rgbValue & 0x00FF00) >> 8) / 255.0,
blue: CGFloat(rgbValue & 0x0000FF) / 255.0,
alpha: CGFloat(1.0)
)
}
}
Implement a Style Attributed Parser to parse style="color:red;font-size:16px". Since CSS has many possible properties, you need to enumerate the ones you support.
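The splitting step on its own can be sketched as a small pure function. This is a minimal illustration, not the article's implementation: `parseInlineStyle` is a hypothetical helper that turns an inline style string into trimmed key/value pairs, which could then be matched against the supported cases.

```swift
import Foundation

/// Hypothetical helper: splits an inline style string into declarations
/// with lowercased keys and whitespace-trimmed values.
func parseInlineStyle(_ style: String) -> [String: String] {
    var declarations: [String: String] = [:]
    for declaration in style.split(separator: ";") {
        // Split on the first ":" only, so values that contain ":" stay intact.
        let parts = declaration.split(separator: ":", maxSplits: 1)
        guard parts.count == 2 else { continue }
        let key = parts[0].trimmingCharacters(in: .whitespaces).lowercased()
        let value = parts[1].trimmingCharacters(in: .whitespaces)
        guard !key.isEmpty, !value.isEmpty else { continue }
        declarations[key] = value
    }
    return declarations
}

let parsed = parseInlineStyle("color: #FF0000; font-size:20px; ;broken")
print(parsed["color"] ?? "nil", parsed["font-size"] ?? "nil")  // #FF0000 20px
```

Trimming matters in practice because hand-written or CMS-generated HTML often puts spaces after `:` and `;`, and an untrimmed key such as " font-size" would not match any supported case.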
extension HTMLTagParser {
func render(attributedString: inout NSMutableAttributedString) {
defaultStyleRender(attributedString: &attributedString)
}
func defaultStyleRender(attributedString: inout NSMutableAttributedString) {
// setup default style to NSMutableAttributedString
style?.render(attributedString: &attributedString)
// setup & override HTML style (style="color:red;background-color:black") to NSMutableAttributedString if it exists
// any html tag can have style attribute
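if let style = storedHTMLAttributes?["style"] {
let styles = style.split(separator: ";").map { $0.split(separator: ":") }.filter { $0.count == 2 }
for style in styles {
// trim so that "color: #FF0000" and " font-size:16px" are recognized
let key = String(style[0]).trimmingCharacters(in: .whitespaces)
let value = String(style[1]).trimmingCharacters(in: .whitespaces)
// log only when the attribute is unknown or its value could not be applied
guard let styleAttributed = HTMLStyleAttributedParser(rawValue: key), styleAttributed.render(attributedString: &attributedString, value: value) else {
print("Unsupported style attribute or value[\(key):\(value)]")
continue
}
}
}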
}
}
Apply HTMLStyleAttributedParser in the default (extension) implementation of HTMLTagParser.
Some Implementation Examples of Tag Parser & AttributedStringStyle
struct LinkStyle: AttributedStringStyle {
var font: UIFont? = UIFont.systemFont(ofSize: 14)
var color: UIColor? = UIColor.blue
var backgroundColor: UIColor? = nil
var wordSpacing: CGFloat? = nil
var paragraphStyle: NSParagraphStyle?
var customs: [NSAttributedString.Key: Any]? = [.underlineStyle: NSUnderlineStyle.single.rawValue]
}
struct ATagParser: HTMLTagParser {
// <a></a>
static let tag: String = "a"
var storedHTMLAttributes: [String: String]? = nil
let style: AttributedStringStyle? = LinkStyle()
func render(attributedString: inout NSMutableAttributedString) {
defaultStyleRender(attributedString: &attributedString)
if let href = storedHTMLAttributes?["href"], let url = URL(string: href) {
let range = NSMakeRange(0, attributedString.length)
attributedString.addAttribute(NSAttributedString.Key.link, value: url, range: range)
}
}
}
struct BoldStyle: AttributedStringStyle {
var font: UIFont? = UIFont.systemFont(ofSize: 14, weight: .bold)
var color: UIColor? = UIColor.black
var backgroundColor: UIColor? = nil
var wordSpacing: CGFloat? = nil
var paragraphStyle: NSParagraphStyle?
var customs: [NSAttributedString.Key: Any]? = [.underlineStyle: NSUnderlineStyle.single.rawValue]
}
struct BoldTagParser: HTMLTagParser {
// <b></b>
static let tag: String = "b"
var storedHTMLAttributes: [String: String]? = nil
let style: AttributedStringStyle? = BoldStyle()
}
HTMLToAttributedStringParser: Core Implementation of XMLParserDelegate
// Ref: https://github.com/malcommac/SwiftRichString
final class HTMLToAttributedStringParser: NSObject {
private static let topTag = "source"
private var xmlParser: XMLParser?
private(set) var attributedString: NSMutableAttributedString = NSMutableAttributedString()
private(set) var supportedTagRenders: [HTMLTagParser] = []
private let defaultStyle: AttributedStringStyle
/// Styles applied at each fragment.
private var renderingTagRenders: [HTMLTagParser] = []
// The XML parser sometimes splits strings, which can break localization-sensitive
// string transforms. Work around this by using the currentString variable to
// accumulate partial strings, and then reading them back out as a single string
// when the current element ends, or when a new one is started.
private var currentString: String?
// MARK: - Initialization
init(defaultStyle: AttributedStringStyle) {
self.defaultStyle = defaultStyle
super.init()
}
func register(_ tagRender: HTMLTagParser) {
if let index = supportedTagRenders.firstIndex(where: { type(of: $0).tag == type(of: tagRender).tag }) {
supportedTagRenders.remove(at: index)
}
supportedTagRenders.append(tagRender)
}
/// Parse and generate attributed string.
func parse(string: String) throws -> NSAttributedString {
var xmlString = HTMLToAttributedStringParser.escapeWithUnicodeEntities(string)
// make sure <br/> format is correct XML
// because Web may use <br> to present <br/>, but <br> is not a valid XML
xmlString = xmlString.replacingOccurrences(of: "<br>", with: "<br/>")
let xml = "<\(HTMLToAttributedStringParser.topTag)>\(xmlString)</\(HTMLToAttributedStringParser.topTag)>"
guard let data = xml.data(using: String.Encoding.utf8) else {
throw XMLParserInitError("Unable to convert to UTF8")
}
let xmlParser = XMLParser(data: data)
xmlParser.shouldProcessNamespaces = false
xmlParser.shouldReportNamespacePrefixes = false
xmlParser.shouldResolveExternalEntities = false
xmlParser.delegate = self
self.xmlParser = xmlParser
attributedString = NSMutableAttributedString()
guard xmlParser.parse() else {
let line = xmlParser.lineNumber
let shiftColumn = (line == 1)
let shiftSize = HTMLToAttributedStringParser.topTag.lengthOfBytes(using: String.Encoding.utf8) + 2
let column = xmlParser.columnNumber - (shiftColumn ? shiftSize : 0)
throw XMLParserError(parserError: xmlParser.parserError, line: line, column: column)
}
return attributedString
}
}
// MARK: Private Method
private extension HTMLToAttributedStringParser {
func enter(element elementName: String, attributes: [String: String]) {
// elementName = tagName, EX: a,span,div...
guard elementName != HTMLToAttributedStringParser.topTag else {
return
}
if let index = supportedTagRenders.firstIndex(where: { type(of: $0).tag == elementName }) {
var tagRender = supportedTagRenders[index]
tagRender.storedHTMLAttributes = attributes
renderingTagRenders.append(tagRender)
}
}
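func exit(element elementName: String) {
// Only pop when the closing tag matches the renderer on top of the stack;
// closing an unregistered tag (e.g. <i> inside <b>) must not remove its parent's renderer.
if let last = renderingTagRenders.last, type(of: last).tag == elementName {
renderingTagRenders.removeLast()
}
}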
func foundNewString() {
if let currentString = currentString {
// currentString != nil ,ex: <i>currentString</i>
var newAttributedString = NSMutableAttributedString(string: currentString)
if !renderingTagRenders.isEmpty {
for (key, tagRender) in renderingTagRenders.enumerated() {
// Render Style
tagRender.render(attributedString: &newAttributedString)
renderingTagRenders[key].storedHTMLAttributes = nil
}
} else {
defaultStyle.render(attributedString: &newAttributedString)
}
attributedString.append(newAttributedString)
self.currentString = nil
} else {
// currentString == nil ,ex: <br/>
var newAttributedString = NSMutableAttributedString()
for (key, tagRender) in renderingTagRenders.enumerated() {
// Render Style
tagRender.render(attributedString: &newAttributedString)
renderingTagRenders[key].storedHTMLAttributes = nil
}
attributedString.append(newAttributedString)
}
}
}
// MARK: Helper
extension HTMLToAttributedStringParser {
// handle html entity / html hex
// Perform string escaping to replace all characters which are not supported by NSXMLParser
// into the specified encoding with decimal entity.
// For example if your string contains '&' character parser will break the style.
// This option is active by default.
// ref: https://github.com/malcommac/SwiftRichString/blob/e0b72d5c96968d7802856d2be096202c9798e8d1/Sources/SwiftRichString/Support/XMLStringBuilder.swift
static func escapeWithUnicodeEntities(_ string: String) -> String {
// "&" is escaped unless it already starts an entity such as "&amp;" or "&#103;".
guard let escapeAmpRegExp = try? NSRegularExpression(pattern: "&(?!(#[0-9]{2,4}|[A-Za-z]{2,6});)", options: NSRegularExpression.Options(rawValue: 0)) else {
return string
}
// NSRange counts UTF-16 code units, so use utf16.count rather than count.
let range = NSRange(location: 0, length: string.utf16.count)
return escapeAmpRegExp.stringByReplacingMatches(in: string,
options: NSRegularExpression.MatchingOptions(rawValue: 0),
range: range,
withTemplate: "&amp;")
}
}
// MARK: XMLParserDelegate
extension HTMLToAttributedStringParser: XMLParserDelegate {
func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String: String]) {
foundNewString()
enter(element: elementName, attributes: attributeDict)
}
func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
foundNewString()
guard elementName != HTMLToAttributedStringParser.topTag else {
return
}
exit(element: elementName)
}
func parser(_ parser: XMLParser, foundCharacters string: String) {
currentString = (currentString ?? "").appending(string)
}
}
Applying the same logic as Strip, we assemble the parsed structure: elementName identifies the current tag, and the corresponding Tag Parser and its predefined Style are then applied.
Test Result
let test = "I<br/><a href=\"http://google.com\">agree</a> to provide <b><i>personal</i></b> ID card number/passport/residence <span style=\"color:#FF0000;font-size:20px;word-spacing:10px;line-height:10px\">certificate number</span> for <i>cross-border logistics</i> clearance <span style=\"background-color:#00FF00;\">use</span>, and have <img src=\"g.png\"/>understood the cross-border<br/>product <p>logistics requirements</p>"
let render = HTMLToAttributedStringParser(defaultStyle: DefaultTextStyle())
render.register(ATagParser())
render.register(BoldTagParser())
render.register(SpanTagParser())
//...
print(try! render.parse(string: test))
// Result:
// I{
// NSColor = "UIExtendedGrayColorSpace 0 1";
// NSFont = "\".SFNS-Regular 14.00 pt. P [] (0x13a012970) fobj=0x13a012970, spc=3.79\"";
// NSParagraphStyle = "Alignment 4, LineSpacing 3, ParagraphSpacing 0, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n 28L,\n 56L,\n 84L,\n 112L,\n 140L,\n 168L,\n 196L,\n 224L,\n 252L,\n 280L,\n 308L,\n 336L\n), DefaultTabInterval 0, Blocks (\n), Lists (\n), BaseWritingDirection -1, HyphenationFactor 0, TighteningForTruncation NO, HeaderLevel 0 LineBreakStrategy 0 PresentationIntents (\n) ListIntentOrdinal 0 CodeBlockIntentLanguageHint ''";
// }agree{
// NSColor = "UIExtendedSRGBColorSpace 0 0 1 1";
// NSFont = "\".SFNS-Regular 14.00 pt. P [] (0x13a012970) fobj=0x13a012970, spc=3.79\"";
// NSLink = "http://google.com";
// NSUnderline = 1;
// }to provide{
// NSColor = "UIExtendedGrayColorSpace 0 1";
// NSFont = "\".SFNS-Regular 14.00 pt. P [] (0x13a012970) fobj=0x13a012970, spc=3.79\"";
// NSParagraphStyle = "Alignment 4, LineSpacing 3, ParagraphSpacing 0, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n 28L,\n 56L,\n 84L,\n 112L,\n 140L,\n 168L,\n 196L,\n 224L,\n 252L,\n 280L,\n 308L,\n 336L\n), DefaultTabInterval 0, Blocks (\n), Lists (\n), BaseWritingDirection -1, HyphenationFactor 0, TighteningForTruncation NO, HeaderLevel 0 LineBreakStrategy 0 PresentationIntents (\n) ListIntentOrdinal 0 CodeBlockIntentLanguageHint ''";
// }personal{
// NSColor = "UIExtendedGrayColorSpace 0 1";
// NSFont = "\".SFNS-Bold 14.00 pt. P [] (0x13a013870) fobj=0x13a013870, spc=3.46\"";
// NSUnderline = 1;
// }ID card number/passport/residence{
// NSColor = "UIExtendedGrayColorSpace 0 1";
// NSFont = "\".SFNS-Regular 14.00 pt. P [] (0x13a012970) fobj=0x13a012970, spc=3.79\"";
// NSParagraphStyle = "Alignment 4, LineSpacing 3, ParagraphSpacing 0, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n 28L,\n 56L,\n 84L,\n 112L,\n 140L,\n 168L,\n 196L,\n 224L,\n 252L,\n 280L,\n 308L,\n 336L\n), DefaultTabInterval 0, Blocks (\n), Lists (\n), BaseWritingDirection -1, HyphenationFactor 0, TighteningForTruncation NO, HeaderLevel 0 LineBreakStrategy 0 PresentationIntents (\n) ListIntentOrdinal 0 CodeBlockIntentLanguageHint ''";
// }certificate number{
// NSColor = "UIExtendedSRGBColorSpace 1 0 0 1";
// NSFont = "\".SFNS-Regular 20.00 pt. P [] (0x13a015fa0) fobj=0x13a015fa0, spc=4.82\"";
// NSKern = 10;
// NSParagraphStyle = "Alignment 4, LineSpacing 10, ParagraphSpacing 0, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n 28L,\n 56L,\n 84L,\n 112L,\n 140L,\n 168L,\n 196L,\n 224L,\n 252L,\n 280L,\n 308L,\n 336L\n), DefaultTabInterval 0, Blocks (\n), Lists (\n), BaseWritingDirection -1, HyphenationFactor 0, TighteningForTruncation NO, HeaderLevel 0 LineBreakStrategy 0 PresentationIntents (\n) ListIntentOrdinal 0 CodeBlockIntentLanguageHint ''";
// }, for cross-border logistics clearance{
// NSColor = "UIExtendedGrayColorSpace 0 1";
// NSFont = "\".SFNS-Regular 14.00 pt. P [] (0x13a012970) fobj=0x13a012970, spc=3.79\"";
// NSParagraphStyle = "Alignment 4, LineSpacing 3, ParagraphSpacing 0, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n 28L,\n 56L,\n 84L,\n 112L,\n 140L,\n 168L,\n 196L,\n 224L,\n 252L,\n 280L,\n 308L,\n 336L\n), DefaultTabInterval 0, Blocks (\n), Lists (\n), BaseWritingDirection -1, HyphenationFactor 0, TighteningForTruncation NO, HeaderLevel 0 LineBreakStrategy 0 PresentationIntents (\n) ListIntentOrdinal 0 CodeBlockIntentLanguageHint ''";
// }use{
// NSBackgroundColor = "UIExtendedSRGBColorSpace 0 1 0 1";
// NSColor = "UIExtendedGrayColorSpace 0 1";
// NSFont = "\".SFNS-Regular 14.00 pt. P [] (0x13a012970) fobj=0x13a012970, spc=3.79\"";
// NSParagraphStyle = "Alignment 4, LineSpacing 3, ParagraphSpacing 0, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n 28L,\n 56L,\n 84L,\n 112L,\n 140L,\n 168L,\n 196L,\n 224L,\n 252L,\n 280L,\n 308L,\n 336L\n), DefaultTabInterval 0, Blocks (\n), Lists (\n), BaseWritingDirection -1, HyphenationFactor 0, TighteningForTruncation NO, HeaderLevel 0 LineBreakStrategy 0 PresentationIntents (\n) ListIntentOrdinal 0 CodeBlockIntentLanguageHint ''";
// }, and have understood the cross-border product logistics requirements{
// NSColor = "UIExtendedGrayColorSpace 0 1";
// NSFont = "\".SFNS-Regular 14.00 pt. P [] (0x13a012970) fobj=0x13a012970, spc=3.79\"";
// NSParagraphStyle = "Alignment 4, LineSpacing 3, ParagraphSpacing 0, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n 28L,\n 56L,\n 84L,\n 112L,\n 140L,\n 168L,\n 196L,\n 224L,\n 252L,\n 280L,\n 308L,\n 336L\n), DefaultTabInterval 0, Blocks (\n), Lists (\n), BaseWritingDirection -1, HyphenationFactor 0, TighteningForTruncation NO, HeaderLevel 0 LineBreakStrategy 0 PresentationIntents (\n) ListIntentOrdinal 0 CodeBlockIntentLanguageHint ''";
// }
Display Result:
Done!
This completes our implementation of HTML rendering using XMLParser, maintaining extensibility and specifications. We can manage and understand the supported string rendering types in the app through the code.
Complete Github Repo as follows
This article is simultaneously published on my personal blog: [Click here].
If you have any questions or feedback, feel free to contact me.
This post was originally published on Medium (View original post), and automatically converted and synced by ZMediumToMarkdown.