Implementing iOS NSAttributedString HTML Rendering on Your Own
An alternative to iOS NSAttributedString DocumentType.html
ℹ️ℹ️ℹ️ The following content is translated by OpenAI.
Photo by Florian Olivo
[TL;DR] 2023/03/12
I have developed a new tool called “ZMarkupParser HTML String to NSAttributedString Converter” using a different approach. For technical details and the development story, please visit “The Journey of Building an HTML Parser”.
Background
Since the release of iOS 15 last year, our app has been plagued by a persistent crash issue. Data shows that in the last 90 days (from 2022/03/11 to 2022/06/08), there were over 2,400 crashes affecting more than 1,400 users.
This crash appears to have been fixed (or made less frequent) in iOS 15.2 and later, as the data shows a downward trend.
Most affected versions: iOS 15.0.X to iOS 15.X.X
We also noticed sporadic crashes on iOS 12 and iOS 13, indicating that this issue has likely existed for a long time, but the probability of occurrence in the early versions of iOS 15 was nearly 100%.
Cause of the Crashes:
<compiler-generated> line 2147483647 specialized @nonobjc NSAttributedString.init(data:options:documentAttributes:)
The crash occurs when initializing NSAttributedString, and is reported as: Crashed: com.apple.main-thread EXC_BREAKPOINT 0x00000001de9d4e44.
It’s also possible that the operation is not happening on the Main Thread.
How to Reproduce:
When this issue first emerged, our development team was puzzled; we couldn’t find any problems in the crash logs and were unclear about the conditions under which users experienced the crashes. Until one day, by chance, I switched to “Low Power Mode” and triggered the issue! WTF!!!
Solution
After some searching, I found many similar cases online and discovered the earliest related crash inquiry on the Apple Developer Forums, which received a response from Apple:
- This is a known iOS Foundation bug that has existed since iOS 12.
- For rendering complex HTML without usage constraints, please use WKWebView.
- If there are rendering constraints, you can write your own HTML Parser & Renderer.
- If there are rendering constraints, you can also use Markdown directly: on iOS ≥ 15, NSAttributedString can render Markdown-formatted text.
Rendering constraints refer to the formats that the app can support, such as only supporting bold, italic, and hyperlinks.
Additional Note: Rendering Complex HTML — Creating a Rich Text Effect
You can coordinate with the backend to create an interface:
{
"content":[
{"type":"text","value":"First paragraph of plain text"},
{"type":"text","value":"Second paragraph of plain text"},
{"type":"text","value":"Third paragraph of plain text"},
{"type":"text","value":"Fourth paragraph of plain text"},
{"type":"image","src":"https://zhgchg.li/logo.png","title":"ZhgChgLi"},
{"type":"text","value":"Fifth paragraph of plain text"}
]
}
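On the app side, a payload like the one above could be decoded into typed blocks before rendering. The following is a minimal sketch, not part of the original project: the names ContentBlock, Article, and decodeArticle are hypothetical, and the field names are assumed to match the sample JSON.

```swift
import Foundation

// Hypothetical model for the JSON interface above; field names follow the sample payload.
enum ContentBlock: Decodable, Equatable {
    case text(String)
    case image(src: URL, title: String?)
    case unsupported(type: String)   // lets older app versions skip block types they do not know

    private enum CodingKeys: String, CodingKey { case type, value, src, title }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let type = try container.decode(String.self, forKey: .type)
        switch type {
        case "text":
            self = .text(try container.decode(String.self, forKey: .value))
        case "image":
            self = .image(src: try container.decode(URL.self, forKey: .src),
                          title: try container.decodeIfPresent(String.self, forKey: .title))
        default:
            self = .unsupported(type: type)
        }
    }
}

struct Article: Decodable {
    let content: [ContentBlock]
}

// Decodes the whole payload; throws if the JSON does not match the assumed shape.
func decodeArticle(from json: String) throws -> Article {
    try JSONDecoder().decode(Article.self, from: Data(json.utf8))
}
```

Each decoded block can then be mapped to a native view (a label for text, an image view for image), and unknown types can be ignored instead of breaking the screen.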
This interface can be combined with Markdown for text rendering, or you can follow Medium’s approach:
"Paragraph": {
"text": "code in text, and link in text, and ZhgChgLi, and bold, and I, only i",
"markups": [
{
"type": "CODE",
"start": 5,
"end": 7
},
{
"start": 18,
"end": 22,
"href": "http://zhgchg.li",
"type": "LINK"
},
{
"type": "STRONG",
"start": 50,
"end": 63
},
{
"type": "EM",
"start": 55,
"end": 69
}
]
}
This means that in the text “code in text, and link in text, and ZhgChgLi, and bold, and I, only i”:
- Characters 5 to 7 should be marked as code (wrapped in `Text` format).
- Characters 18 to 22 should be marked as a link (wrapped in [Text](URL) format).
- Characters 50 to 63 should be marked as bold (wrapped in *Text* format).
- Characters 55 to 69 should be marked as italic (wrapped in _Text_ format).
With a defined and describable structure, the app can render natively, optimizing performance and user experience.
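As a rough illustration of how an app might consume such a structure, the sketch below decodes the paragraph and looks up the text each markup covers. The type names Markup and Paragraph and the method coveredText(by:) are hypothetical, and it assumes the offsets are UTF-16 based with an exclusive end, which the article does not state.

```swift
import Foundation

// Hypothetical models for the payload above.
// Assumption: offsets are UTF-16 code unit offsets and `end` is exclusive.
struct Markup: Decodable {
    let type: String
    let start: Int
    let end: Int
    let href: String?   // only present for LINK markups
}

struct Paragraph: Decodable {
    let text: String
    let markups: [Markup]

    /// Returns the substring a markup covers, or nil when its range falls outside the text.
    func coveredText(by markup: Markup) -> String? {
        let source = text as NSString
        guard markup.start >= 0, markup.start <= markup.end, markup.end <= source.length else {
            return nil
        }
        let range = NSRange(location: markup.start, length: markup.end - markup.start)
        return source.substring(with: range)
    }
}
```

Under that assumption, the LINK markup (18 to 22) covers the word "link" in the sample text. The same NSRange could be passed straight to addAttribute(_:value:range:) on an NSMutableAttributedString to apply the style natively.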
For pitfalls in using UITextView for rich text, refer to my previous article: iOS UITextView Rich Text Editor (Swift).
Why?
Before diving into the solution, let’s explore the root of the problem. I believe the main issue does not stem from Apple; the official bug is merely the trigger for this problem.
The core issue arises from the app being treated as a web renderer. The advantages are rapid web development, where the same API endpoint can serve HTML without distinguishing between clients, allowing flexible rendering of any desired content. The downsides are that HTML is not a common interface for apps, we cannot expect app engineers to understand HTML, performance is extremely poor, it can only operate on the Main Thread, and results cannot be anticipated during development, making it impossible to confirm supported specifications.
Digging deeper, the original requirements were often unclear, and it was uncertain which specifications the app needed to support. In the rush to deliver, HTML was directly used as the interface between the app and the web.
Performance Issues
As for performance, tests show that using NSAttributedString DocumentType.html directly is 5 to 20 times slower than a self-implemented rendering method.
Better Approach
Since this is an app, a better approach should start from app development principles. For apps, the cost of adjusting requirements is much higher than for web development. Effective app development should be based on iterative adjustments with clear specifications. We need to confirm the specifications we can support, and if changes are needed later, we can schedule time to expand the specifications. This reduces communication costs and increases work efficiency.
- Confirm the scope of requirements.
- Confirm supported specifications.
- Confirm interface specifications (Markdown/BBCode/… If HTML is still used, it should be constrained, e.g., only using <b>/<i>/<a>/<u>, and developers should be clearly informed in the code).
- Implement a custom rendering mechanism.
- Maintain and iterate on supported specifications.
[2023/02/27 Updated] [TL;DR]:
The method has been updated to avoid using XMLParser, because it has zero fault tolerance: <br> / <Congratulation!> / <b>Bold<i>Bold+Italic</b>Italic</i>
Any of these three inputs causes XMLParser to report an error, and the result is displayed as blank. When using XMLParser, the HTML string must fully comply with XML rules, unlike browsers or NSAttributedString.DocumentType.html, which tolerate such errors and still display the content.
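The strictness described above is easy to check in isolation. The following is a minimal sketch: the helper name isWellFormedXML is hypothetical, and the "source" wrapper element mirrors the one used by the parsers later in this article.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML   // on Linux, XMLParser lives in a separate module
#endif

/// Returns true when `html`, wrapped in a single root element, parses as well-formed XML.
func isWellFormedXML(_ html: String) -> Bool {
    let parser = XMLParser(data: Data("<source>\(html)</source>".utf8))
    // parse() returns false as soon as the parser hits a well-formedness error.
    return parser.parse()
}
```

Calling it with "a<br/>b" should succeed, while "a<br>b", "<Congratulation!>", and the mis-nested bold/italic example should all fail, since none of them is well-formed XML.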
I switched to pure Swift development, using Regex to parse HTML tags and perform tokenization, analyzing and correcting tag validity (fixing unclosed tags and misplaced tags), then converting to an abstract syntax tree. Finally, I used the Visitor Pattern to map HTML tags to abstract styles, resulting in the final NSAttributedString without relying on any parser libraries.
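To make the tokenization step more concrete, here is a simplified sketch of regex-based tokenization. It is not ZMarkupParser's actual code: the HTMLToken type, the tokenize function, and the pattern are assumptions for illustration, and tag attributes are discarded. Fixing unclosed or misplaced tags and building the syntax tree would be separate steps that work on the resulting token list.

```swift
import Foundation

// Hypothetical token type for illustration only.
enum HTMLToken: Equatable {
    case open(String)    // <b>, <br/>, <a href="...">
    case close(String)   // </b>
    case text(String)    // everything between tags
}

func tokenize(_ html: String) -> [HTMLToken] {
    // Group 1: "/" for closing tags. Group 2: tag name.
    // The name must be followed by whitespace, "/" or ">", so "<Congratulation!>" stays plain text.
    let pattern = "<(/?)([A-Za-z][A-Za-z0-9]*)(\\s[^>]*)?/?>"
    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return [.text(html)]
    }
    let source = html as NSString
    var tokens: [HTMLToken] = []
    var cursor = 0
    let fullRange = NSRange(location: 0, length: source.length)
    for match in regex.matches(in: html, options: [], range: fullRange) {
        // Emit any text that sits between the previous tag and this one.
        if match.range.location > cursor {
            let textRange = NSRange(location: cursor, length: match.range.location - cursor)
            tokens.append(.text(source.substring(with: textRange)))
        }
        let name = source.substring(with: match.range(at: 2)).lowercased()
        let isClosing = match.range(at: 1).length > 0
        tokens.append(isClosing ? .close(name) : .open(name))
        cursor = match.range.location + match.range.length
    }
    // Emit trailing text after the last tag.
    if cursor < source.length {
        tokens.append(.text(source.substring(from: cursor)))
    }
    return tokens
}
```

Unlike XMLParser, this step never fails: mis-nested input such as <b>Bold<i>Bold+Italic</b>Italic</i> still produces a token list, which a later pass can repair.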
How?
Given that we still need to render HTML into an NSAttributedString, how can we solve the aforementioned crash and performance issues?
Inspired by
Stripping HTML
Before discussing HTML rendering, let’s first talk about stripping HTML. As mentioned in the previous section “Why?”, the app should know where it will receive HTML and what HTML it should receive, rather than assuming that the app “might” receive HTML that needs to be stripped.
As my former supervisor once said: “Isn’t that just crazy?”
Option 1: NSAttributedString
// Encode as UTF-8 so the data matches the .characterEncoding option below
let data = Data("<div>Text</div>".utf8)
let attributed = try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
let string = attributed.string
- Using NSAttributedString to render the HTML and then reading its string property yields a clean String.
- It has the same problems discussed in this article: it is prone to crashing on iOS 15, performs poorly, and can only run on the Main Thread.
Option 2: Regex
let htmlString = "<div>Test</div>"
let stripped = htmlString.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression, range: nil)
// stripped == "Test"
- The simplest and most effective method.
- Regex cannot guarantee complete accuracy, e.g., <p foo=">now what?">Paragraph</p> is valid HTML but will be stripped incorrectly.
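The failure case above can be reproduced with a few lines. This is a small sketch that wraps the same pattern in a hypothetical helper named stripTags:

```swift
import Foundation

/// Removes anything that looks like a tag, using the same pattern as Option 2.
func stripTags(_ html: String) -> String {
    html.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression)
}

let simple = stripTags("<div>Test</div>")
// simple == "Test"

let tricky = stripTags("<p foo=\">now what?\">Paragraph</p>")
// tricky == "now what?\">Paragraph"
// The ">" inside the attribute value ends the match early, so part of the attribute leaks into the output.
```

If the incoming HTML is constrained to a known set of simple tags, as recommended earlier, this edge case should not occur in practice.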
Option 3: XMLParser
Referencing SwiftRichString, we can use Foundation’s XMLParser to treat HTML as XML and implement our own HTML parser and strip functionality.
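import UIKit
// Ref: https://github.com/malcommac/SwiftRichString
// XMLParserInitError and XMLParserError are not defined in this article;
// the minimal definitions below are assumptions so that the example compiles.
struct XMLParserInitError: Error {
let message: String
init(_ message: String) { self.message = message }
}
struct XMLParserError: Error {
let parserError: Error?
let line: Int
let column: Int
}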
final class HTMLStripper: NSObject, XMLParserDelegate {
private static let topTag = "source"
private var xmlParser: XMLParser
private(set) var storedString: String
// The XML parser sometimes splits strings, which can break localization-sensitive
// string transforms. Work around this by using the currentString variable to
// accumulate partial strings, and then reading them back out as a single string
// when the current element ends, or when a new one is started.
private var currentString: String?
// MARK: - Initialization
init(string: String) throws {
let xmlString = HTMLStripper.escapeWithUnicodeEntities(string)
let xml = "<\(HTMLStripper.topTag)>\(xmlString)</\(HTMLStripper.topTag)>"
guard let data = xml.data(using: String.Encoding.utf8) else {
throw XMLParserInitError("Unable to convert to UTF8")
}
self.xmlParser = XMLParser(data: data)
self.storedString = ""
super.init()
xmlParser.shouldProcessNamespaces = false
xmlParser.shouldReportNamespacePrefixes = false
xmlParser.shouldResolveExternalEntities = false
xmlParser.delegate = self
}
/// Parse and generate attributed string.
func parse() throws -> String {
guard xmlParser.parse() else {
let line = xmlParser.lineNumber
let shiftColumn = (line == 1)
let shiftSize = HTMLStripper.topTag.lengthOfBytes(using: String.Encoding.utf8) + 2
let column = xmlParser.columnNumber - (shiftColumn ? shiftSize : 0)
throw XMLParserError(parserError: xmlParser.parserError, line: line, column: column)
}
return storedString
}
// MARK: XMLParserDelegate
@objc func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String: String]) {
foundNewString()
}
@objc func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
foundNewString()
}
@objc func parser(_ parser: XMLParser, foundCharacters string: String) {
currentString = (currentString ?? "").appending(string)
}
// MARK: Support Private Methods
func foundNewString() {
if let currentString = currentString {
storedString.append(currentString)
self.currentString = nil
}
}
// handle html entity / html hex
// Perform string escaping to replace all characters which is not supported by NSXMLParser
// into the specified encoding with decimal entity.
// For example if your string contains '&' character parser will break the style.
// This option is active by default.
// ref: https://github.com/malcommac/SwiftRichString/blob/e0b72d5c96968d7802856d2be096202c9798e8d1/Sources/SwiftRichString/Support/XMLStringBuilder.swift
static func escapeWithUnicodeEntities(_ string: String) -> String {
// Match every "&" that does not start an entity such as &#38; or &amp;
guard let escapeAmpRegExp = try? NSRegularExpression(pattern: "&(?!(#[0-9]{2,4}|[A-Za-z]{2,6});)", options: NSRegularExpression.Options(rawValue: 0)) else {
return string
}
// NSRange counts UTF-16 code units, so use utf16.count rather than count
let range = NSRange(location: 0, length: string.utf16.count)
return escapeAmpRegExp.stringByReplacingMatches(in: string,
options: NSRegularExpression.MatchingOptions(rawValue: 0),
range: range,
withTemplate: "&amp;")
}
}
let test = "我<br/><a href=\"http://google.com\">同意</a>提供<b><i>個</i>人</b>身分證字號/護照/居留<span style=\"color:#FF0000;font-size:20px;word-spacing:10px;line-height:10px\">證號碼</span>,以供<i>跨境物流</i>方通關<span style=\"background-color:#00FF00;\">使用</span>,並已<img src=\"g.png\"/>了解跨境<br/>商品之物<p>流需</p>求"
let stripper = try HTMLStripper(string: test)
print(try! stripper.parse())
// 我同意提供個人身分證字號/護照/居留證號碼,以供跨境物流方通關使用,並已了解跨境商品之物流需求
Using Foundation’s XMLParser to process the string, we implement XMLParserDelegate and accumulate text in currentString, because a single string can be delivered in several pieces: the foundCharacters method may be called multiple times. When an element starts or ends, we append the accumulated result to storedString and clear currentString.
- An advantage is that it also converts HTML entities to actual characters, e.g., &#103; -> g.
- It can handle complex HTML, but if it encounters HTML that is not well-formed XML, XMLParser fails, e.g., writing <br> instead of the self-closing <br/>.
Personally, I believe that if you simply want to strip HTML, Option 2 is the better method. I am introducing this method because rendering HTML uses the same principle, and this serves as a simple example. :)
HTML Rendering with XMLParser
Using XMLParser to implement rendering, similar to the stripping process, we can add parsing logic for what rendering method to apply for each tag.
Requirements:
- Support for extending the tags to be parsed.
- Support for setting default styles for tags, e.g., applying link styles to the <a> tag.
- Support for parsing style attributes, as HTML may specify styles like style="color:red".
- Styles should support changing font weight, size, underline, line spacing, letter spacing, background color, and text color.
- Do not support image tags, table tags, or other more complex tags.
You can adjust the functionality according to your specific requirements. For example, if you don’t need to support background color adjustments, you can omit that option.
The following text is a conceptual implementation and not a best practice in architecture. If there are clear specifications or usage methods, consider applying some design patterns to achieve better maintainability and extensibility.
⚠️⚠️⚠️ Attention ⚠️⚠️⚠️
Just a reminder, if your app is brand new or has the opportunity to be completely converted to Markdown format, it is still recommended to use the above method. Writing your own renderer is too complex and won’t perform better than Markdown.
Even if you are using iOS < 15, which does not support native Markdown, you can still find a great Markdown Parser solution on GitHub.
HTMLTagParser
protocol HTMLTagParser {
static var tag: String { get } // Declare the tag name to be parsed, e.g., a
var storedHTMLAttributes: [String: String]? { get set } // Parsed tag attributes are stored here, e.g., href, style
var style: AttributedStringStyle? { get } // The style to be applied to this tag
func render(attributedString: inout NSMutableAttributedString) // Implement the logic to render HTML to attributedString
}
This declares the HTML tag entities that can be parsed, making it easier to manage and extend.
AttributedStringStyle
protocol AttributedStringStyle {
var font: UIFont? { get set }
var color: UIColor? { get set }
var backgroundColor: UIColor? { get set }
var wordSpacing: CGFloat? { get set }
var paragraphStyle: NSParagraphStyle? { get set }
var customs: [NSAttributedString.Key: Any]? { get set } // A universal setting interface; it is recommended to abstract this out after confirming supported specifications and close this interface
func render(attributedString: inout NSMutableAttributedString)
}
// Abstract implementation
extension AttributedStringStyle {
func render(attributedString: inout NSMutableAttributedString) {
let range = NSMakeRange(0, attributedString.length)
if let font = font {
attributedString.addAttribute(NSAttributedString.Key.font, value: font, range: range)
}
if let color = color {
attributedString.addAttribute(NSAttributedString.Key.foregroundColor, value: color, range: range)
}
if let backgroundColor = backgroundColor {
attributedString.addAttribute(NSAttributedString.Key.backgroundColor, value: backgroundColor, range: range)
}
if let wordSpacing = wordSpacing {
attributedString.addAttribute(NSAttributedString.Key.kern, value: wordSpacing as Any, range: range)
}
if let paragraphStyle = paragraphStyle {
attributedString.addAttribute(NSAttributedString.Key.paragraphStyle, value: paragraphStyle, range: range)
}
if let customAttributes = customs {
attributedString.addAttributes(customAttributes, range: range)
}
}
}
This declares the styles that can be set for the tags.
HTMLStyleAttributedParser
// Only supports the tag attributes listed below
// Can set color, font size, line height, word spacing, background color
enum HTMLStyleAttributedParser: String {
case color = "color"
case fontSize = "font-size"
case lineHeight = "line-height"
case wordSpacing = "word-spacing"
case backgroundColor = "background-color"
func render(attributedString: inout NSMutableAttributedString, value: String) -> Bool {
let range = NSMakeRange(0, attributedString.length)
switch self {
case .color:
if let color = convertToiOSColor(value) {
attributedString.addAttribute(NSAttributedString.Key.foregroundColor, value: color, range: range)
return true
}
case .backgroundColor:
if let color = convertToiOSColor(value) {
attributedString.addAttribute(NSAttributedString.Key.backgroundColor, value: color, range: range)
return true
}
case .fontSize:
if let size = convertToiOSSize(value) {
attributedString.addAttribute(NSAttributedString.Key.font, value: UIFont.systemFont(ofSize: CGFloat(size)), range: range)
return true
}
case .lineHeight:
if let size = convertToiOSSize(value) {
let paragraphStyle = NSMutableParagraphStyle()
paragraphStyle.lineSpacing = size
attributedString.addAttribute(NSAttributedString.Key.paragraphStyle, value: paragraphStyle, range: range)
return true
}
case .wordSpacing:
if let size = convertToiOSSize(value) {
attributedString.addAttribute(NSAttributedString.Key.kern, value: size, range: range)
return true
}
}
return false
}
// Convert 36px -> 36
private func convertToiOSSize(_ string: String) -> CGFloat? {
guard let regex = try? NSRegularExpression(pattern: "^([0-9]+)"),
let firstMatch = regex.firstMatch(in: string, options: [], range: NSRange(location: 0, length: string.utf16.count)),
let range = Range(firstMatch.range, in: string),
let size = Float(String(string[range])) else {
return nil
}
return CGFloat(size)
}
// Convert HTML hex color #ffffff to UIKit Color
private func convertToiOSColor(_ hexString: String) -> UIColor? {
var cString: String = hexString.trimmingCharacters(in: .whitespacesAndNewlines).uppercased()
if cString.hasPrefix("#") {
cString.remove(at: cString.startIndex)
}
if (cString.count) != 6 {
return nil
}
var rgbValue: UInt64 = 0
Scanner(string: cString).scanHexInt64(&rgbValue)
return UIColor(
red: CGFloat((rgbValue & 0xFF0000) >> 16) / 255.0,
green: CGFloat((rgbValue & 0x00FF00) >> 8) / 255.0,
blue: CGFloat(rgbValue & 0x0000FF) / 255.0,
alpha: CGFloat(1.0)
)
}
}
This implements the Style Attributed Parser to parse style="color:red;font-size:16px", but since CSS has many configurable styles, it is necessary to enumerate the supported range.
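The splitting of a style attribute into supported key/value pairs can be sketched without UIKit. The names supportedStyleKeys and parseInlineStyle below are hypothetical, and the whitelist is assumed to mirror the enum cases above.

```swift
import Foundation

// Hypothetical whitelist mirroring the HTMLStyleAttributedParser cases.
let supportedStyleKeys: Set<String> = [
    "color", "font-size", "line-height", "word-spacing", "background-color"
]

/// Splits an inline style string into supported declarations, e.g. "color:red;font-size:16px".
func parseInlineStyle(_ style: String) -> [String: String] {
    var result: [String: String] = [:]
    for declaration in style.split(separator: ";") {
        // Split on the first ":" only, so values that contain ":" stay intact.
        let parts = declaration.split(separator: ":", maxSplits: 1)
        guard parts.count == 2 else { continue }
        let key = parts[0].trimmingCharacters(in: .whitespaces).lowercased()
        let value = parts[1].trimmingCharacters(in: .whitespaces)
        // Silently skip anything outside the supported range.
        if supportedStyleKeys.contains(key) {
            result[key] = value
        }
    }
    return result
}
```

Trimming whitespace matters here because hand-written HTML often contains "color: red" rather than "color:red"; without trimming, the key or value would not match.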
extension HTMLTagParser {
func render(attributedString: inout NSMutableAttributedString) {
defaultStyleRender(attributedString: &attributedString)
}
func defaultStyleRender(attributedString: inout NSMutableAttributedString) {
// Setup default style to NSMutableAttributedString
style?.render(attributedString: &attributedString)
// Setup & override HTML style (style="color:red;background-color:black") to NSMutableAttributedString if it exists
// Any HTML tag can have a style attribute
if let style = storedHTMLAttributes?["style"] {
let styles = style.split(separator: ";").map { $0.split(separator: ":") }.filter { $0.count == 2 }
for style in styles {
let key = String(style[0]).trimmingCharacters(in: .whitespaces)
let value = String(style[1]).trimmingCharacters(in: .whitespaces)
// Log only when the attribute is unknown or its value could not be applied
let rendered = HTMLStyleAttributedParser(rawValue: key)?.render(attributedString: &attributedString, value: value) ?? false
if !rendered {
print("Unsupported style attribute or value[\(key):\(value)]")
}
}
}
}
}
This is the default implementation of HTMLTagParser, which applies the tag's AttributedStringStyle first and then HTMLStyleAttributedParser for any inline style attribute.
Examples of Some Tag Parsers & AttributedStringStyle Implementations
struct LinkStyle: AttributedStringStyle {
var font: UIFont? = UIFont.systemFont(ofSize: 14)
var color: UIColor? = UIColor.blue
var backgroundColor: UIColor? = nil
var wordSpacing: CGFloat? = nil
var paragraphStyle: NSParagraphStyle?
var customs: [NSAttributedString.Key: Any]? = [.underlineStyle: NSUnderlineStyle.single.rawValue]
}
struct ATagParser: HTMLTagParser {
// <a></a>
static let tag: String = "a"
var storedHTMLAttributes: [String: String]? = nil
let style: AttributedStringStyle? = LinkStyle()
func render(attributedString: inout NSMutableAttributedString) {
defaultStyleRender(attributedString: &attributedString)
if let href = storedHTMLAttributes?["href"], let url = URL(string: href) {
let range = NSMakeRange(0, attributedString.length)
attributedString.addAttribute(NSAttributedString.Key.link, value: url, range: range)
}
}
}
struct BoldStyle: AttributedStringStyle {
var font: UIFont? = UIFont.systemFont(ofSize: 14, weight: .bold)
var color: UIColor? = UIColor.black
var backgroundColor: UIColor? = nil
var wordSpacing: CGFloat? = nil
var paragraphStyle: NSParagraphStyle?
var customs: [NSAttributedString.Key: Any]? = [.underlineStyle: NSUnderlineStyle.single.rawValue]
}
struct BoldTagParser: HTMLTagParser {
// <b></b>
static let tag: String = "b"
var storedHTMLAttributes: [String: String]? = nil
let style: AttributedStringStyle? = BoldStyle()
}
HTMLToAttributedStringParser: Core Implementation of XMLParserDelegate
// Ref: https://github.com/malcommac/SwiftRichString
final class HTMLToAttributedStringParser: NSObject {
private static let topTag = "source"
private var xmlParser: XMLParser?
private(set) var attributedString: NSMutableAttributedString = NSMutableAttributedString()
private(set) var supportedTagRenders: [HTMLTagParser] = []
private let defaultStyle: AttributedStringStyle
/// Styles applied at each fragment.
private var renderingTagRenders: [HTMLTagParser] = []
// The XML parser sometimes splits strings, which can break localization-sensitive
// string transforms. Work around this by using the currentString variable to
// accumulate partial strings, and then reading them back out as a single string
// when the current element ends, or when a new one is started.
private var currentString: String?
// MARK: - Initialization
init(defaultStyle: AttributedStringStyle) {
self.defaultStyle = defaultStyle
super.init()
}
func register(_ tagRender: HTMLTagParser) {
if let index = supportedTagRenders.firstIndex(where: { type(of: $0).tag == type(of: tagRender).tag }) {
supportedTagRenders.remove(at: index)
}
supportedTagRenders.append(tagRender)
}
/// Parse and generate attributed string.
func parse(string: String) throws -> NSAttributedString {
var xmlString = HTMLToAttributedStringParser.escapeWithUnicodeEntities(string)
// Make sure <br/> format is correct XML
// Because web may use <br> to present <br/>, but <br> is not a valid XML
xmlString = xmlString.replacingOccurrences(of: "<br>", with: "<br/>")
let xml = "<\(HTMLToAttributedStringParser.topTag)>\(xmlString)</\(HTMLToAttributedStringParser.topTag)>"
guard let data = xml.data(using: String.Encoding.utf8) else {
throw XMLParserInitError("Unable to convert to UTF8")
}
let xmlParser = XMLParser(data: data)
xmlParser.shouldProcessNamespaces = false
xmlParser.shouldReportNamespacePrefixes = false
xmlParser.shouldResolveExternalEntities = false
xmlParser.delegate = self
self.xmlParser = xmlParser
attributedString = NSMutableAttributedString()
guard xmlParser.parse() else {
let line = xmlParser.lineNumber
let shiftColumn = (line == 1)
let shiftSize = HTMLToAttributedStringParser.topTag.lengthOfBytes(using: String.Encoding.utf8) + 2
let column = xmlParser.columnNumber - (shiftColumn ? shiftSize : 0)
throw XMLParserError(parserError: xmlParser.parserError, line: line, column: column)
}
return attributedString
}
}
// MARK: Private Method
private extension HTMLToAttributedStringParser {
func enter(element elementName: String, attributes: [String: String]) {
// elementName = tagName, EX: a, span, div...
guard elementName != HTMLToAttributedStringParser.topTag else {
return
}
if let index = supportedTagRenders.firstIndex(where: { type(of: $0).tag == elementName }) {
var tagRender = supportedTagRenders[index]
tagRender.storedHTMLAttributes = attributes
renderingTagRenders.append(tagRender)
}
}
func exit(element elementName: String) {
if !renderingTagRenders.isEmpty {
renderingTagRenders.removeLast()
}
}
func foundNewString() {
if let currentString = currentString {
// currentString != nil, ex: <i>currentString</i>
var newAttributedString = NSMutableAttributedString(string: currentString)
if !renderingTagRenders.isEmpty {
for (key, tagRender) in renderingTagRenders.enumerated() {
// Render Style
tagRender.render(attributedString: &newAttributedString)
renderingTagRenders[key].storedHTMLAttributes = nil
}
} else {
defaultStyle.render(attributedString: &newAttributedString)
}
attributedString.append(newAttributedString)
self.currentString = nil
} else {
// currentString == nil, ex: <br/>
var newAttributedString = NSMutableAttributedString()
for (key, tagRender) in renderingTagRenders.enumerated() {
// Render Style
tagRender.render(attributedString: &newAttributedString)
renderingTagRenders[key].storedHTMLAttributes = nil
}
attributedString.append(newAttributedString)
}
}
}
// MARK: Helper
extension HTMLToAttributedStringParser {
// Handle HTML entity / HTML hex
// Perform string escaping to replace all characters which are not supported by NSXMLParser
// into the specified encoding with decimal entity.
// For example, if your string contains '&' character, the parser will break the style.
// This option is active by default.
// Ref: https://github.com/malcommac/SwiftRichString/blob/e0b72d5c96968d7802856d2be096202c9798e8d1/Sources/SwiftRichString/Support/XMLStringBuilder.swift
static func escapeWithUnicodeEntities(_ string: String) -> String {
// Match every "&" that does not start an entity such as &#38; or &amp;
guard let escapeAmpRegExp = try? NSRegularExpression(pattern: "&(?!(#[0-9]{2,4}|[A-Za-z]{2,6});)", options: NSRegularExpression.Options(rawValue: 0)) else {
return string
}
// NSRange counts UTF-16 code units, so use utf16.count rather than count
let range = NSRange(location: 0, length: string.utf16.count)
return escapeAmpRegExp.stringByReplacingMatches(in: string,
options: NSRegularExpression.MatchingOptions(rawValue: 0),
range: range,
withTemplate: "&amp;")
}
}
// MARK: XMLParserDelegate
extension HTMLToAttributedStringParser: XMLParserDelegate {
func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String: String]) {
foundNewString()
enter(element: elementName, attributes: attributeDict)
}
func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
foundNewString()
guard elementName != HTMLToAttributedStringParser.topTag else {
return
}
exit(element: elementName)
}
func parser(_ parser: XMLParser, foundCharacters string: String) {
currentString = (currentString ?? "").appending(string)
}
}
By reusing the stripping logic and combining it with the structured components above, we use elementName to identify the current tag and apply the corresponding Tag Parser along with its defined styles.
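The enter/exit bookkeeping is easier to see without any styling involved. The sketch below is a stripped-down illustration, not the parser above: TagStackTracer and traceFragments are hypothetical names, and instead of applying styles it simply records which tags are open for each text fragment.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML   // on Linux, XMLParser lives in a separate module
#endif

final class TagStackTracer: NSObject, XMLParserDelegate {
    private var stack: [String] = []   // currently open tags, root wrapper first
    private var buffer = ""            // accumulates split foundCharacters calls
    private(set) var fragments: [(text: String, tags: [String])] = []

    // Stores the buffered text together with the tags that were open around it.
    private func flush() {
        guard !buffer.isEmpty else { return }
        // Drop the root wrapper so only real HTML tags are reported.
        fragments.append((text: buffer, tags: Array(stack.dropFirst())))
        buffer = ""
    }

    func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String: String]) {
        flush()
        stack.append(elementName)
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
        flush()
        stack.removeLast()
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        buffer += string
    }
}

/// Returns each text fragment with its enclosing tags, or an empty array if the input is not well-formed.
func traceFragments(_ html: String) -> [(text: String, tags: [String])] {
    let tracer = TagStackTracer()
    let parser = XMLParser(data: Data("<source>\(html)</source>".utf8))
    parser.delegate = tracer
    guard parser.parse() else { return [] }
    return tracer.fragments
}
```

For an input like a<b>b<i>c</i></b>, the fragment "c" is reported inside both b and i, which is the point where a renderer would apply the styles of every tag on the stack.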
Test Result
let test = "我<br/><a href=\"http://google.com\">同意</a>提供<b><i>個</i>人</b>身分證字號/護照/居留<span style=\"color:#FF0000;font-size:20px;word-spacing:10px;line-height:10px\">證號碼</span>,以供<i>跨境物流</i>方通關<span style=\"background-color:#00FF00;\">使用</span>,並已<img src=\"g.png\"/>了解跨境<br/>商品之物<p>流需</p>求"
let render = HTMLToAttributedStringParser(defaultStyle: DefaultTextStyle())
render.register(ATagParser())
render.register(BoldTagParser())
render.register(SpanTagParser())
//...
print(try! render.parse(string: test))
**Display Result:**

### Done!
We have successfully implemented HTML rendering functionality using XMLParser, while maintaining extensibility and compliance with specifications. This allows us to manage and understand the types of string rendering currently supported by the app through the code.
### Complete GitHub Repository:
[zhgchgli0718/HTMLToAttributedStringRednerExample](https://github.com/zhgchgli0718/HTMLToAttributedStringRednerExample)
> _This article is also published on my personal blog: [**\[Click here to visit\]**](../a8c2d26cc734/)._
> _If you have any questions or feedback, feel free to [contact me](https://www.zhgchg.li/contact)._
This article was first published on Medium ➡️ Click Here