
HTML Parsing Techniques|Build Custom Parsers for NSAttributedString Rendering

Discover how to transform HTML into NSAttributedString with a custom-built ZMarkupParser engine, solving rendering challenges and enhancing text display precision for iOS developers.


This post was translated with AI assistance — let me know if anything sounds off!


The Story of Building a Handmade HTML Parser

ZMarkupParser HTML to NSAttributedString Rendering Engine Development Diary

Tokenization conversion of HTML strings, normalization processing, generation of abstract syntax trees, application of visitor and builder patterns, and some miscellaneous discussions…

Continuation

Last year, I published an article titled “[TL;DR] Implementing iOS NSAttributedString HTML Render by Yourself,” which briefly introduced using XMLParser to parse HTML and map it to the corresponding NSAttributedString.Key attributes. The code structure and ideas in that article were quite messy: it was just a quick record of issues I had run into, and I didn’t spend much time researching the topic back then.

Convert HTML String to NSAttributedString

Revisiting this topic, we need to convert the HTML string provided by the API into an NSAttributedString and apply the corresponding styles to display it in a UITextView/UILabel.

e.g. <b>Test<a>Link</a></b> should display as Test Link

  • Note 1
    It is not recommended to use HTML as the communication and rendering medium between the app and data because the HTML specification is too flexible. Apps cannot support all HTML styles and there is no official HTML conversion rendering engine.

  • Note 2
    Starting from iOS 15, you can use the native AttributedString to parse Markdown, or include the apple/swift-markdown Swift Package to parse Markdown.

  • Note 3
    Due to the large scale of our project and the long-term use of HTML as the medium, we are currently unable to fully switch to Markdown or other markup languages.

  • Note 4
    The HTML here is not meant to display a full HTML webpage; it is only used as a Markdown-like markup for styling strings.
    (To render a full page with complex content including images and tables, you still need to use WebView loadHTML)
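
For reference, the native Markdown route mentioned in Note 2 can be sketched as follows. This is a minimal sketch that assumes an Apple platform where AttributedString(markdown:) is available (iOS 15 / macOS 12 or later); plainText(fromMarkdown:) is a hypothetical helper name used only for illustration.

```swift
import Foundation

// Hypothetical helper: parses inline Markdown with the native AttributedString
// and returns only the visible characters (the styling itself lives in the runs).
@available(iOS 15, macOS 12, tvOS 15, watchOS 8, *)
func plainText(fromMarkdown markdown: String) throws -> String {
    let attributed = try AttributedString(markdown: markdown)
    return String(attributed.characters)
}
```

The returned AttributedString keeps bold and link information as attributes on its runs, which is what makes this route attractive when the content can be delivered as Markdown.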

It is highly recommended to use Markdown as the markup language for styled strings. If your project faces the same issue as mine, and you have no choice but to use HTML without an elegant tool for converting it to NSAttributedString, then give this tool a try.

Friends who read the previous article can directly skip to the ZhgChgLi / ZMarkupParser section.

NSAttributedString.DocumentType.html

The methods for HTML to NSAttributedString found online usually require us to directly use NSAttributedString’s built-in options to render HTML. An example is shown below:

let htmlString = "<b>Test<a>Link</a></b>"
let data = htmlString.data(using: String.Encoding.utf8)!
let attributedOptions: [NSAttributedString.DocumentReadingOptionKey: Any] = [
  .documentType: NSAttributedString.DocumentType.html,
  .characterEncoding: String.Encoding.utf8.rawValue
]
let attributedString = try! NSAttributedString(data: data, options: attributedOptions, documentAttributes: nil)

Problems with this approach:

  • Poor performance: This method renders styles through the WebView Core, then switches back to the Main Thread for UI display; rendering over 300 characters takes 0.03 seconds.

  • Word swallowing: For example, marketing copy might use <Congratulation!>, which will be treated as an HTML tag and removed.

  • Cannot customize: For example, you cannot specify the exact boldness level of HTML bold text in NSAttributedString.

  • Intermittent crashes on iOS ≥ 12, with no official fix.

  • In iOS 15, a frequent crash issue occurred. Testing showed a 100% crash rate under low battery conditions (fixed in iOS ≥ 15.2).

  • Strings that are too long will cause a crash. Testing shows that inputting strings longer than 54,600 characters will 100% cause a crash (EXC_BAD_ACCESS).

The most painful issue for us remains the crash problem. From the release of iOS 15 until the fix in 15.2, this issue consistently dominated the app’s crashes. According to our data, from 2022/03/11 to 2022/06/08, it caused over 2.4K crashes and affected more than 1.4K users.

This crash issue has existed since iOS 12; iOS 15 just made it much worse. My guess is that the fix in iOS 15.2 is only a patch and that Apple cannot completely eliminate it.

The second issue is performance. As the markup language for string styles, HTML is used heavily in the app’s UILabel/UITextView instances. As mentioned earlier, one label takes about 0.03 seconds, and multiplying this across multiple UILabel/UITextView instances can cause noticeable lag in user interaction.

XMLParser

The second method is introduced in the previous article, using XMLParser to parse into corresponding NSAttributedString keys and apply styles.

You can refer to the implementation of SwiftRichString and the previous article.

The previous article only explored using XMLParser to parse HTML and perform corresponding conversions, completing an experimental implementation. However, it did not design it as a well-structured and extensible “tool.”

Problems with this approach:

  • Zero error tolerance: <br> / <Congratulation!> / <b>Bold<i>Bold+Italic</b>Italic</i>
    All three of these HTML cases can occur in practice; XMLParser throws an error on each of them, and the result is a blank display.

  • With XMLParser, the HTML string must fully comply with XML rules; it cannot be displayed tolerantly the way a browser or NSAttributedString.DocumentType.html would display it.
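
The strictness described above is easy to reproduce. The following is a minimal sketch, not code from the previous article: parsesAsXML(_:) is a hypothetical helper that wraps a fragment in a root element and asks XMLParser whether it is well-formed.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML // XMLParser lives in this module on Linux
#endif

// Wraps the fragment in a root element (XML requires exactly one root)
// and reports whether XMLParser accepts it as well-formed XML.
func parsesAsXML(_ html: String) -> Bool {
    let data = Data("<root>\(html)</root>".utf8)
    return XMLParser(data: data).parse()
}

print(parsesAsXML("<b>Test<a>Link</a></b>"))              // properly nested
print(parsesAsXML("Line<br>break"))                        // <br> is never closed
print(parsesAsXML("<Congratulation!>"))                    // not a valid tag name
print(parsesAsXML("<b>Bold<i>Bold+Italic</b>Italic</i>"))  // misaligned tags
```

Only the first call succeeds; the other three fail outright, and XMLParser offers no way to recover the parts of the string that were fine.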

Standing on the Shoulders of Giants

Neither of the above two solutions can perfectly and elegantly solve the HTML problem, so I started searching for existing solutions.

I searched everywhere, but every result was similar to the projects above. Orz, there were no giants’ shoulders to stand on.

ZhgChgLi/ZMarkupParser

Without the shoulders of giants, I had to become the giant myself, so I developed an HTML String to NSAttributedString tool.

Developed purely in Swift, it uses Regex to parse HTML tags and performs tokenization. It analyzes and corrects tag accuracy (fixing unclosed tags & misaligned tags), then converts them into an abstract syntax tree. Finally, it uses the Visitor Pattern to map HTML tags to abstract styles, resulting in the final NSAttributedString output; no parser libraries are used.

Features

  • Support HTML Render (to NSAttributedString) / Stripper (remove HTML tags) / Selector functions

  • Higher Performance than NSAttributedString.DocumentType.html

  • Automatically analyze and correct tag accuracy (fix tags without end tags & misplaced tags)

  • Support dynamic styling from style="color:red..."

  • Supports custom style specification, for example, making bold text bolder

  • Supports flexible expandable tags or custom tags and attributes

For a detailed introduction, installation, and usage, please refer to this article: 「ZMarkupParser HTML String to NSAttributedString Tool」

You can directly git clone the project, then open ZMarkupParser.xcworkspace. Select the ZMarkupParser-Demo target and build & run it to try it out.

[ZMarkupParser](https://github.com/ZhgChgLi/ZMarkupParser)

Technical Details

Next is the main focus of this article, sharing the technical details about developing this tool.

Operation Flow Overview

Overview of Operation Process

The above image shows the general workflow. The following article will explain each step in detail and include the code.

⚠️ This article simplifies the demo code by reducing abstraction and performance considerations, focusing mainly on explaining the operating principles. For the final results, please refer to the project Source Code.

Tokenization

a.k.a parser, parsing

When it comes to HTML rendering, the most important part is parsing. Previously, HTML was parsed as XML using XMLParser; however, this cannot handle the fact that everyday HTML is not 100% XML, causing parser errors and lacking dynamic correction.

After ruling out the use of XMLParser, the only option left for us in Swift is to use Regex for matching and parsing.

At first, I didn’t think much and planned to use regex to extract “paired” HTML tags directly, then recursively search layer by layer inside until the end. However, this approach couldn’t handle nested HTML tags or support error tolerance for misaligned tags. Therefore, we changed the strategy to extract “single” HTML tags, record whether they are Start Tags, Close Tags, or Self-Closing Tags, and combine other strings into a parsed result array.

Tokenization structure is as follows:

enum HTMLParsedResult {
    case start(StartItem) // <a>
    case close(CloseItem) // </a>
    case selfClosing(SelfClosingItem) // <br/>
    case rawString(NSAttributedString)
}

extension HTMLParsedResult {
    class SelfClosingItem {
        let tagName: String
        let tagAttributedString: NSAttributedString
        let attributes: [String: String]?
        
        init(tagName: String, tagAttributedString: NSAttributedString, attributes: [String : String]?) {
            self.tagName = tagName
            self.tagAttributedString = tagAttributedString
            self.attributes = attributes
        }
    }
    
    class StartItem {
        let tagName: String
        let tagAttributedString: NSAttributedString
        let attributes: [String: String]?

        // Start Tag might be an abnormal HTML tag or normal text e.g. <Congratulation!>. After normalization, if found to be an isolated Start Tag, mark as True.
        var isIsolated: Bool = false
        
        init(tagName: String, tagAttributedString: NSAttributedString, attributes: [String : String]?) {
            self.tagName = tagName
            self.tagAttributedString = tagAttributedString
            self.attributes = attributes
        }
        
        // Used for automatic filling correction during normalization
        func convertToCloseParsedItem() -> CloseItem {
            return CloseItem(tagName: self.tagName)
        }
        
        // Used for automatic filling correction during normalization
        func convertToSelfClosingParsedItem() -> SelfClosingItem {
            return SelfClosingItem(tagName: self.tagName, tagAttributedString: self.tagAttributedString, attributes: self.attributes)
        }
    }
    
    class CloseItem {
        let tagName: String
        init(tagName: String) {
            self.tagName = tagName
        }
    }
}

The regex used is as follows:

<(?:(?<closeTag>\/)?(?<tagName>[A-Za-z0-9]+)(?<tagAttributes>(?:\s*(\w+)\s*=\s*(["\|']).*?\5)*)\s*(?<selfClosingTag>\/)?>)

-> Online Regex101 Playground

  • closeTag: matches the / in </a>

  • tagName: matches the a in <a> or </a>

  • tagAttributes: matches href="https://zhgchg.li" style="color:red" in <a href="https://zhgchg.li" style="color:red">

  • selfClosingTag: matches the / in <br/>

*This regex can still be optimized, will do it later

The latter part of the article offers more details on regular expressions for those interested.
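
To see the named groups in action, here is a minimal sketch that runs the regex above through NSRegularExpression and classifies each match. describeTags(in:) and its output strings are illustrative names, not part of ZMarkupParser.

```swift
import Foundation

// The regex from above, written as a raw string so no extra escaping is needed
let pattern = #"<(?:(?<closeTag>\/)?(?<tagName>[A-Za-z0-9]+)(?<tagAttributes>(?:\s*(\w+)\s*=\s*(["\|']).*?\5)*)\s*(?<selfClosingTag>\/)?>)"#
let regex = try! NSRegularExpression(pattern: pattern)

// Hypothetical helper: classifies every tag the regex finds in `html`
func describeTags(in html: String) -> [String] {
    let nsHTML = NSString(string: html)
    let fullRange = NSRange(location: 0, length: nsHTML.length)
    return regex.matches(in: html, range: fullRange).map { match -> String in
        // A named group that did not take part in the match has location NSNotFound
        func group(_ name: String) -> String? {
            let range = match.range(withName: name)
            return range.location == NSNotFound ? nil : nsHTML.substring(with: range)
        }
        let name = group("tagName") ?? "?"
        if group("closeTag") != nil { return "close(\(name))" }
        if group("selfClosingTag") != nil { return "selfClosing(\(name))" }
        return "start(\(name))"
    }
}

print(describeTags(in: "<a href=\"https://zhgchg.li\">Link</a><br/>"))
// ["start(a)", "close(a)", "selfClosing(br)"]
```

Checking for NSNotFound matters because optional groups such as closeTag simply do not participate in most matches.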

Putting it all together:

var tokenizationResult: [HTMLParsedResult] = []

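// `pattern` is the regex shown above; `expressionOptions` is assumed to be defined elsewhere
let expression = try? NSRegularExpression(pattern: pattern, options: expressionOptions)
let attributedString = NSAttributedString(string: "<a href=\"https://zhgchg.li\">Li<b>nk</a>Bold</b>")
// NSRange counts UTF-16 code units, so use utf16.count to handle emoji correctly
let totalLength = attributedString.string.utf16.count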
var lastMatch: NSTextCheckingResult?

// Start Tags Stack, First In Last Out (FILO)
// Check if the HTML string needs further normalization to fix misalignment or add Self-Closing Tags
var stackStartItems: [HTMLParsedResult.StartItem] = []
var needFormatter: Bool = false

expression?.enumerateMatches(in: attributedString.string, range: NSMakeRange(0, totalLength)) { match, _, _ in
  if let match = match {
    // Check the string between tags or before the first tag
    // e.g. Test<a>Link</a>zzz<b>bold</b>Test2 -> Test, zzz
    let lastMatchEnd = lastMatch?.range.upperBound ?? 0
    let currentMatchStart = match.range.lowerBound
    if currentMatchStart > lastMatchEnd {
      let rawStringBetweenTag = attributedString.attributedSubstring(from: NSMakeRange(lastMatchEnd, (currentMatchStart - lastMatchEnd)))
      tokenizationResult.append(.rawString(rawStringBetweenTag))
    }

    // Returns nil when the range is NSNotFound, i.e. the capture group did not take part in the match
    func substring(from range: NSRange) -> NSAttributedString? {
      guard range.location != NSNotFound else { return nil }
      return attributedString.attributedSubstring(from: range)
    }

    // <a href="https://zhgchg.li">, </a>
    let matchAttributedString = substring(from: match.range)
    // a, a
    let matchTag = substring(from: match.range(withName: "tagName"))?.string.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
    // false, true
    let matchIsEndTag = substring(from: match.range(withName: "closeTag"))?.string.trimmingCharacters(in: .whitespacesAndNewlines) == "/"
    // href="https://zhgchg.li", nil
    // Use regex to parse HTML Attributes to [String: String], see Source Code
    let matchTagAttributes = parseAttributes(substring(from: match.range(withName: "tagAttributes")))
    // false, false
    let matchIsSelfClosingTag = substring(from: match.range(withName: "selfClosingTag"))?.string.trimmingCharacters(in: .whitespacesAndNewlines) == "/"

    if let matchAttributedString = matchAttributedString,
       let matchTag = matchTag {
        if matchIsSelfClosingTag {
          // e.g. <br/>
          tokenizationResult.append(.selfClosing(.init(tagName: matchTag, tagAttributedString: matchAttributedString, attributes: matchTagAttributes)))
        } else {
          // e.g. <a> or </a>
          if matchIsEndTag {
            // e.g. </a>
            // Find the last occurrence of the same TagName in the stack
            if let index = stackStartItems.lastIndex(where: { $0.tagName == matchTag }) {
              // If not the last one, it means misalignment or missing closing tag
              if index != stackStartItems.count - 1 {
                  needFormatter = true
              }
              tokenizationResult.append(.close(.init(tagName: matchTag)))
              stackStartItems.remove(at: index)
            } else {
              // Extra close tag e.g </a>
              // Ignore as it won't affect further processing
            }
          } else {
            // e.g. <a>
            let startItem: HTMLParsedResult.StartItem = HTMLParsedResult.StartItem(tagName: matchTag, tagAttributedString: matchAttributedString, attributes: matchTagAttributes)
            tokenizationResult.append(.start(startItem))
            // Push to stack
            stackStartItems.append(startItem)
          }
        }
     }

    lastMatch = match
  }
}

// Check trailing RawString
// e.g. Test<a>Link</a>Test2 -> Test2
if let lastMatch = lastMatch {
  let currentIndex = lastMatch.range.upperBound
  if totalLength > currentIndex {
    // Remaining string exists
    let restString = attributedString.attributedSubstring(from: NSMakeRange(currentIndex, (totalLength - currentIndex)))
    tokenizationResult.append(.rawString(restString))
  }
} else {
  // lastMatch = nil means no tags found, all plain text
  let restString = attributedString.attributedSubstring(from: NSMakeRange(0, totalLength))
  tokenizationResult.append(.rawString(restString))
}

// Check if stack is empty; if not, mark unmatched Start Tags as isolated
for stackStartItem in stackStartItems {
  stackStartItem.isIsolated = true
  needFormatter = true
}

print(tokenizationResult)
// [
//    .start("a",["href":"https://zhgchg.li"])
//    .rawString("Li")
//    .start("b",nil)
//    .rawString("nk")
//    .close("a")
//    .rawString("Bold")
//    .close("b")
// ]

The operation process is shown in the above diagram.

The final result will be an array of tokenization results.

The corresponding implementation in the source code is HTMLStringToParsedResultProcessor.swift.

Normalization

a.k.a Formatter, normalization

The previous step yields a preliminary tokenization result. If that step flagged that normalization is needed (needFormatter is true), this step automatically fixes the HTML tag issues.

There are three types of HTML tag issues:

  • HTML Tag but Missing Close Tag: for example <br>

  • Plain text treated as HTML Tag: for example <Congratulation!>

  • HTML Tag misalignment issue: For example, <a>Li<b>nk</a>Bold</b>

The fix is simple. We need to iterate through the elements of the Tokenization result and try to fill in the missing parts.

The workflow is shown in the above diagram.

var normalizationResult = tokenizationResult

// Start Tags Stack, First In Last Out (FILO)
var stackExpectedStartItems: [HTMLParsedResult.StartItem] = []
var itemIndex = 0
while itemIndex < normalizationResult.count {
    switch normalizationResult[itemIndex] {
    case .start(let item):
        if item.isIsolated {
            // If it is an isolated Start Tag
            if WC3HTMLTagName(rawValue: item.tagName) == nil && (item.attributes?.isEmpty ?? true) {
                // If not a WC3 defined HTML Tag & has no HTML Attributes
                // See Source Code for WC3HTMLTagName Enum
                // Considered as normal text mistaken as HTML Tag
                // Change to raw string type
                normalizationResult[itemIndex] = .rawString(item.tagAttributedString)
            } else {
                // Otherwise, convert to self-closing tag, e.g. <br> -> <br/>
                normalizationResult[itemIndex] = .selfClosing(item.convertToSelfClosingParsedItem())
            }
            itemIndex += 1
        } else {
            // Normal Start Tag, push to Stack
            stackExpectedStartItems.append(item)
            itemIndex += 1
        }
    case .close(let item):
        // Encounter Close Tag
        // Get Tags between Start Stack Tag and this Close Tag
        // e.g <a><u><b>[CurrentIndex]</b></u></a> -> gap 0
        // e.g <a><u><b>[CurrentIndex]</a></u></b> -> gap b,u

        let reversedStackExpectedStartItems = Array(stackExpectedStartItems.reversed())
        guard let reversedStackExpectedStartItemsOccurredIndex = reversedStackExpectedStartItems.firstIndex(where: { $0.tagName == item.tagName }) else {
            itemIndex += 1
            continue
        }
        
        let reversedStackExpectedStartItemsOccurred = Array(reversedStackExpectedStartItems.prefix(upTo: reversedStackExpectedStartItemsOccurredIndex))
        
        // gap 0 means tags are correctly nested
        guard reversedStackExpectedStartItemsOccurred.count != 0 else {
            // is pair, pop
            stackExpectedStartItems.removeLast()
            itemIndex += 1
            continue
        }
        
        // There are intermediate tags, auto-insert missing tags before and after
        // e.g <a><u><b>[CurrentIndex]</a></u></b> ->
        // e.g <a><u><b>[CurrentIndex]</b></u></a><u><b></u></b>
        let stackExpectedStartItemsOccurred = Array(reversedStackExpectedStartItemsOccurred.reversed())
        let afterItems = stackExpectedStartItemsOccurred.map({ HTMLParsedResult.start($0) })
        let beforeItems = reversedStackExpectedStartItemsOccurred.map({ HTMLParsedResult.close($0.convertToCloseParsedItem()) })
        normalizationResult.insert(contentsOf: afterItems, at: normalizationResult.index(after: itemIndex))
        normalizationResult.insert(contentsOf: beforeItems, at: itemIndex)
        
        itemIndex = normalizationResult.index(after: itemIndex) + stackExpectedStartItemsOccurred.count
        
        // Update Start Stack Tags
        // e.g. -> b,u
        stackExpectedStartItems.removeAll { startItem in
            return reversedStackExpectedStartItems.prefix(through: reversedStackExpectedStartItemsOccurredIndex).contains(where: { $0 === startItem })
        }
    case .selfClosing, .rawString:
        itemIndex += 1
    }
}

print(normalizationResult)
// [
//    .start("a",["href":"https://zhgchg.li"])
//    .rawString("Li")
//    .start("b",nil)
//    .rawString("nk")
//    .close("b")
//    .close("a")
//    .start("b",nil)
//    .rawString("Bold")
//    .close("b")
// ]

The corresponding implementation in the source code is HTMLParsedResultFormatterProcessor.swift.
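
The core idea of fixing misaligned tags can also be shown in isolation. The sketch below is a simplified single-pass variant, not the project's implementation: Token and normalize(_:) are hypothetical names, tags are plain strings, stray close tags are dropped, and isolated start tags such as <br> are not handled.

```swift
// Simplified token type: tag names only, no attributes
enum Token: Equatable {
    case start(String)
    case close(String)
    case text(String)
}

// Closes and reopens tags so that every close tag matches the innermost open tag
func normalize(_ tokens: [Token]) -> [Token] {
    var result: [Token] = []
    var stack: [String] = [] // names of the currently open tags

    for token in tokens {
        switch token {
        case .start(let name):
            stack.append(name)
            result.append(token)
        case .text:
            result.append(token)
        case .close(let name):
            // A close tag without a matching open tag is dropped
            guard let index = stack.lastIndex(of: name) else { continue }
            // Tags opened after `name` that are still open (the "gap")
            let gap = Array(stack[(index + 1)...])
            // Close the gap tags innermost first, then the tag itself
            for open in gap.reversed() { result.append(.close(open)) }
            result.append(.close(name))
            // Reopen the gap tags in their original order
            for open in gap { result.append(.start(open)) }
            // `name` is closed; the gap tags stay on the stack because they were reopened
            stack.remove(at: index)
        }
    }
    return result
}

// <a>Li<b>nk</a>Bold</b>
let misaligned: [Token] = [
    .start("a"), .text("Li"), .start("b"), .text("nk"),
    .close("a"), .text("Bold"), .close("b")
]
print(normalize(misaligned))
```

For this input the result corresponds to <a>Li<b>nk</b></a><b>Bold</b>, the same shape as the normalizationResult printed above.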

Abstract Syntax Tree

a.k.a AST, Abstract Syntax Tree

After completing data preprocessing with Tokenization & Normalization, the next step is to convert the results into an abstract tree 🌲.

As shown in the image above

Converting to an abstract syntax tree makes future operations and expansions easier, such as implementing Selector functionality or other conversions like HTML to Markdown. Similarly, if we want to add Markdown to NSAttributedString later, we only need to implement Markdown tokenization and normalization to achieve it.

First, we define a Markup Protocol with Child & Parent properties to record leaf and branch information:

protocol Markup: AnyObject {
    var parentMarkup: Markup? { get set }
    var childMarkups: [Markup] { get set }
    
    func appendChild(markup: Markup)
    func prependChild(markup: Markup)
    func accept<V: MarkupVisitor>(_ visitor: V) -> V.Result
}

extension Markup {
    func appendChild(markup: Markup) {
        markup.parentMarkup = self
        childMarkups.append(markup)
    }
    
    func prependChild(markup: Markup) {
        markup.parentMarkup = self
        childMarkups.insert(markup, at: 0)
    }
}

Additionally, use the Visitor Pattern to define each style attribute as an Element object, then apply different Visit strategies to obtain individual application results.

protocol MarkupVisitor {
    associatedtype Result
        
    func visit(markup: Markup) -> Result
    
    func visit(_ markup: RootMarkup) -> Result
    func visit(_ markup: RawStringMarkup) -> Result
    
    func visit(_ markup: BoldMarkup) -> Result
    func visit(_ markup: LinkMarkup) -> Result
    //...
}

extension MarkupVisitor {
    func visit(markup: Markup) -> Result {
        return markup.accept(self)
    }
}

Basic Markup Elements:

// Root node
final class RootMarkup: Markup {
    weak var parentMarkup: Markup? = nil
    var childMarkups: [Markup] = []
    
    func accept<V>(_ visitor: V) -> V.Result where V : MarkupVisitor {
        return visitor.visit(self)
    }
}

// Leaf node
final class RawStringMarkup: Markup {
    let attributedString: NSAttributedString
    
    init(attributedString: NSAttributedString) {
        self.attributedString = attributedString
    }
    
    weak var parentMarkup: Markup? = nil
    var childMarkups: [Markup] = []
    
    func accept<V>(_ visitor: V) -> V.Result where V : MarkupVisitor {
        return visitor.visit(self)
    }
}

Define Markup Style Nodes:

// Branch nodes:

// Link style
final class LinkMarkup: Markup {
    weak var parentMarkup: Markup? = nil
    var childMarkups: [Markup] = []
    
    func accept<V>(_ visitor: V) -> V.Result where V : MarkupVisitor {
        return visitor.visit(self)
    }
}

// Bold style
final class BoldMarkup: Markup {
    weak var parentMarkup: Markup? = nil
    var childMarkups: [Markup] = []
    
    func accept<V>(_ visitor: V) -> V.Result where V : MarkupVisitor {
        return visitor.visit(self)
    }
}

The corresponding implementation in the source code is Markup.
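
To illustrate why the Visitor Pattern pays off here, the following self-contained sketch walks one tree with two different visitors. It mirrors the structure of the protocols above but uses hypothetical, shortened names (Node, NodeVisitor, PlainTextVisitor, MarkdownVisitor) and is not the project's code.

```swift
protocol Node: AnyObject {
    var parent: Node? { get set }
    var children: [Node] { get set }
    func accept<V: NodeVisitor>(_ visitor: V) -> V.Result
}

extension Node {
    func appendChild(_ node: Node) {
        node.parent = self
        children.append(node)
    }
}

protocol NodeVisitor {
    associatedtype Result
    func visit(_ node: RootNode) -> Result
    func visit(_ node: TextNode) -> Result
    func visit(_ node: BoldNode) -> Result
}

final class RootNode: Node {
    weak var parent: Node?
    var children: [Node] = []
    func accept<V: NodeVisitor>(_ visitor: V) -> V.Result { visitor.visit(self) }
}

final class BoldNode: Node {
    weak var parent: Node?
    var children: [Node] = []
    func accept<V: NodeVisitor>(_ visitor: V) -> V.Result { visitor.visit(self) }
}

final class TextNode: Node {
    let text: String
    init(text: String) { self.text = text }
    weak var parent: Node?
    var children: [Node] = []
    func accept<V: NodeVisitor>(_ visitor: V) -> V.Result { visitor.visit(self) }
}

// Strategy 1: strip all markup and keep only the text
struct PlainTextVisitor: NodeVisitor {
    typealias Result = String
    func visit(_ node: RootNode) -> String { joinChildren(of: node) }
    func visit(_ node: TextNode) -> String { node.text }
    func visit(_ node: BoldNode) -> String { joinChildren(of: node) }
    private func joinChildren(of node: Node) -> String {
        node.children.map { $0.accept(self) }.joined()
    }
}

// Strategy 2: render the same tree as Markdown
struct MarkdownVisitor: NodeVisitor {
    typealias Result = String
    func visit(_ node: RootNode) -> String { joinChildren(of: node) }
    func visit(_ node: TextNode) -> String { node.text }
    func visit(_ node: BoldNode) -> String { "**" + joinChildren(of: node) + "**" }
    private func joinChildren(of node: Node) -> String {
        node.children.map { $0.accept(self) }.joined()
    }
}

// Tree for <b>Test</b> Link
let root = RootNode()
let boldNode = BoldNode()
boldNode.appendChild(TextNode(text: "Test"))
root.appendChild(boldNode)
root.appendChild(TextNode(text: " Link"))

print(root.accept(PlainTextVisitor())) // Test Link
print(root.accept(MarkdownVisitor()))  // **Test** Link
```

Adding a new output format means adding a new visitor; the node classes stay untouched.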

Before converting to an abstract tree, we also need to…

MarkupComponent

Our tree structure itself does not hold any data (for example, a LinkMarkup node needs its URL information before it can be rendered).
Therefore, we define a separate container to store tree nodes together with their related data:

protocol MarkupComponent {
    associatedtype T
    var markup: Markup { get }
    var value: T { get }
    
    init(markup: Markup, value: T)
}

extension Sequence where Iterator.Element: MarkupComponent {
    func value(markup: Markup) -> Element.T? {
        return self.first(where:{ $0.markup === markup })?.value as? Element.T
    }
}

The corresponding implementation in the source code is MarkupComponent.

You can also declare Markup as Hashable and directly use a Dictionary to store values [Markup: Any]. However, Markup cannot be used as a regular type and must be written as any Markup.
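
A related variation on the Dictionary idea, shown here only as a sketch, is to key a side table by object identity with ObjectIdentifier, which works for any class instance without making the node type Hashable. Node and urlByNode are hypothetical names.

```swift
// Stand-in for a tree node such as LinkMarkup (illustrative only)
final class Node {}

// Side table keyed by object identity; entries are only meaningful while the nodes stay alive
var urlByNode: [ObjectIdentifier: String] = [:]

let linkNode = Node()
let otherNode = Node()
urlByNode[ObjectIdentifier(linkNode)] = "https://zhgchg.li"

print(urlByNode[ObjectIdentifier(linkNode)] ?? "no data")  // https://zhgchg.li
print(urlByNode[ObjectIdentifier(otherNode)] ?? "no data") // no data
```

The trade-off is that the table loses type information about which node kind carries which value, whereas the MarkupComponent container keeps each value typed.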

HTMLTag & HTMLTagName & HTMLTagNameVisitor

For the HTML Tag Name section, we added a layer of abstraction, allowing users to decide which tags need to be processed. This also makes future expansion easier. For example, the <strong> tag name can also correspond to BoldMarkup.

public protocol HTMLTagName {
    var string: String { get }
    func accept<V: HTMLTagNameVisitor>(_ visitor: V) -> V.Result
}

public struct A_HTMLTagName: HTMLTagName {
    public let string: String = WC3HTMLTagName.a.rawValue
    
    public init() {
        
    }
    
    public func accept<V>(_ visitor: V) -> V.Result where V : HTMLTagNameVisitor {
        return visitor.visit(self)
    }
}

public struct B_HTMLTagName: HTMLTagName {
    public let string: String = WC3HTMLTagName.b.rawValue
    
    public init() {
        
    }
    
    public func accept<V>(_ visitor: V) -> V.Result where V : HTMLTagNameVisitor {
        return visitor.visit(self)
    }
}

The corresponding implementation in the source code is HTMLTagNameVisitor.

Also refer to the W3C wiki listing HTML tag name enums: WC3HTMLTagName.swift

HTMLTag is simply a container object because we want to allow external specification of the styles corresponding to the HTML tag, so we declare a container to group them together:

struct HTMLTag {
    let tagName: HTMLTagName
    let customStyle: MarkupStyle? // Explained later in Render section
    
    init(tagName: HTMLTagName, customStyle: MarkupStyle? = nil) {
        self.tagName = tagName
        self.customStyle = customStyle
    }
}

The corresponding implementation in the source code is HTMLTag.

HTMLTagNameToHTMLMarkupVisitor

struct HTMLTagNameToMarkupVisitor: HTMLTagNameVisitor {
    typealias Result = Markup
    
    let attributes: [String: String]?
    
    func visit(_ tagName: A_HTMLTagName) -> Result {
        return LinkMarkup()
    }
    
    func visit(_ tagName: B_HTMLTagName) -> Result {
        return BoldMarkup()
    }
    //...
}

The corresponding implementation in the source code is HTMLTagNameToHTMLMarkupVisitor.

Convert to Abstract Syntax Tree with HTML Data

We need to convert the normalized HTML data into an abstract tree. First, declare a MarkupComponent data structure to store the HTML data:

struct HTMLElementMarkupComponent: MarkupComponent {
    struct HTMLElement {
        let tag: HTMLTag
        let tagAttributedString: NSAttributedString
        let attributes: [String: String]?
    }
    
    typealias T = HTMLElement
    
    let markup: Markup
    let value: HTMLElement
    init(markup: Markup, value: HTMLElement) {
        self.markup = markup
        self.value = value
    }
}

Convert to Markup Abstract Tree:

var htmlElementComponents: [HTMLElementMarkupComponent] = []
let rootMarkup = RootMarkup()
var currentMarkup: Markup = rootMarkup

let htmlTags: [String: HTMLTag]
init(htmlTags: [HTMLTag]) {
  self.htmlTags = Dictionary(uniqueKeysWithValues: htmlTags.map{ ($0.tagName.string, $0) })
}

// Start Tags Stack, ensure correct pop of tags
// Normalization has been done before, so errors are unlikely, just to be safe
var stackExpectedStartItems: [HTMLParsedResult.StartItem] = []
for thisItem in normalizationResult { // the normalized tokens from the previous step
    switch thisItem {
    case .start(let item):
        let visitor = HTMLTagNameToMarkupVisitor(attributes: item.attributes)
        let htmlTag = self.htmlTags[item.tagName] ?? HTMLTag(tagName: ExtendTagName(item.tagName))
        // Use Visitor to get the corresponding Markup
        let markup = visitor.visit(tagName: htmlTag.tagName)
        
        // Add self as a leaf node of the current branch
        // Become the current branch node
        htmlElementComponents.append(.init(markup: markup, value: .init(tag: htmlTag, tagAttributedString: item.tagAttributedString, attributes: item.attributes)))
        currentMarkup.appendChild(markup: markup)
        currentMarkup = markup
        
        stackExpectedStartItems.append(item)
    case .selfClosing(let item):
        // Directly add as a leaf node of the current branch
        let visitor = HTMLTagNameToMarkupVisitor(attributes: item.attributes)
        let htmlTag = self.htmlTags[item.tagName] ?? HTMLTag(tagName: ExtendTagName(item.tagName))
        let markup = visitor.visit(tagName: htmlTag.tagName)
        htmlElementComponents.append(.init(markup: markup, value: .init(tag: htmlTag, tagAttributedString: item.tagAttributedString, attributes: item.attributes)))
        currentMarkup.appendChild(markup: markup)
    case .close(let item):
        if let lastTagName = stackExpectedStartItems.popLast()?.tagName,
           lastTagName == item.tagName {
            // When encountering a Close Tag, return to the previous level
            currentMarkup = currentMarkup.parentMarkup ?? currentMarkup
        }
    case .rawString(let attributedString):
        // Directly add as a leaf node of the current branch
        currentMarkup.appendChild(markup: RawStringMarkup(attributedString: attributedString))
    }
}

// print(htmlElementComponents)
// [(markup: LinkMarkup, (tag: a, attributes: ["href":"zhgchg.li"]...)]

The operation result is shown in the above image.

The corresponding implementation in the source code is HTMLParsedResultToHTMLElementWithRootMarkupProcessor.swift.

At this point, we have actually completed the Selector functionality 🎉

public class HTMLSelector: CustomStringConvertible {
    
    let markup: Markup
    let componets: [HTMLElementMarkupComponent]
    init(markup: Markup, componets: [HTMLElementMarkupComponent]) {
        self.markup = markup
        self.componets = componets
    }
    
    public func filter(_ htmlTagName: String) -> [HTMLSelector] {
        let result = markup.childMarkups.filter({ componets.value(markup: $0)?.tag.tagName.isEqualTo(htmlTagName) ?? false })
        return result.map({ .init(markup: $0, componets: componets) })
    }

    //...
}

We can filter leaf node objects layer by layer.

The corresponding implementation in the source code is HTMLSelector.

Parser — HTML to MarkupStyle (Abstract of NSAttributedString.Key)

Next, we need to complete the conversion from HTML to MarkupStyle (NSAttributedString.Key).

NSAttributedString sets text styles through NSAttributedString.Key attributes. We abstract all fields of NSAttributedString.Key to correspond to MarkupStyle, MarkupStyleColor, MarkupStyleFont, and MarkupStyleParagraphStyle.

Purpose:

  • The original data structure for Attributes is [NSAttributedString.Key: Any?]. If exposed directly, it’s hard to control the values users provide, and incorrect values can cause crashes, such as .font: 123.

  • Styles need to be inheritable. For example, in <a><b>test</b></a>, the style of the text “test” should inherit both the link and bold styles (bold + link). Exposing the Dictionary directly makes it difficult to control inheritance properly.

  • The platform-specific iOS/macOS (UIKit/AppKit) objects need to be encapsulated.
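The inheritance requirement can be sketched with a reduced style type. `MiniStyle` and its three fields are hypothetical stand-ins, not the library's MarkupStyle; the sketch only assumes that every field is optional and that nil means "not set at this level".

```swift
import Foundation

// Hypothetical, reduced stand-in for MarkupStyle: every field is optional,
// and nil means "not specified at this level".
struct MiniStyle: Equatable {
    var isBold: Bool? = nil
    var link: URL? = nil
    var fontSize: Double? = nil

    // Keep values that are already set; take the missing ones from `from`.
    mutating func fillIfNil(from: MiniStyle?) {
        guard let from = from else { return }
        isBold = isBold ?? from.isBold
        link = link ?? from.link
        fontSize = fontSize ?? from.fontSize
    }
}

// <a><b>test</b></a>: the text starts from the <b> style, then inherits from <a>.
var textStyle = MiniStyle(isBold: true)
let linkStyle = MiniStyle(link: URL(string: "https://zhgchg.li"), fontSize: 13)
textStyle.fillIfNil(from: linkStyle)
print(textStyle)
```

Because the fields are typed, a value such as `.font: 123` cannot be expressed at all, which is the first point in the list above.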

MarkupStyle Struct

public struct MarkupStyle {
    public var font:MarkupStyleFont
    public var paragraphStyle:MarkupStyleParagraphStyle
    public var foregroundColor:MarkupStyleColor? = nil
    public var backgroundColor:MarkupStyleColor? = nil
    public var ligature:NSNumber? = nil
    public var kern:NSNumber? = nil
    public var tracking:NSNumber? = nil
    public var strikethroughStyle:NSUnderlineStyle? = nil
    public var underlineStyle:NSUnderlineStyle? = nil
    public var strokeColor:MarkupStyleColor? = nil
    public var strokeWidth:NSNumber? = nil
    public var shadow:NSShadow? = nil
    public var textEffect:String? = nil
    public var attachment:NSTextAttachment? = nil
    public var link:URL? = nil
    public var baselineOffset:NSNumber? = nil
    public var underlineColor:MarkupStyleColor? = nil
    public var strikethroughColor:MarkupStyleColor? = nil
    public var obliqueness:NSNumber? = nil
    public var expansion:NSNumber? = nil
    public var writingDirection:NSNumber? = nil
    public var verticalGlyphForm:NSNumber? = nil
    //...

    // Inherited from...
    // Default: If field is nil, fill from the current data object 'from'
    mutating func fillIfNil(from: MarkupStyle?) {
        guard let from = from else { return }
        
        var currentFont = self.font
        currentFont.fillIfNil(from: from.font)
        self.font = currentFont
        
        var currentParagraphStyle = self.paragraphStyle
        currentParagraphStyle.fillIfNil(from: from.paragraphStyle)
        self.paragraphStyle = currentParagraphStyle
        //..
    }

    // MarkupStyle to NSAttributedString.Key: Any
    func render() -> [NSAttributedString.Key: Any] {
        var data: [NSAttributedString.Key: Any] = [:]
        
        if let font = font.getFont() {
            data[.font] = font
        }

        if let ligature = self.ligature {
            data[.ligature] = ligature
        }
        //...
        return data
    }
}

public struct MarkupStyleFont: MarkupStyleItem {
    public enum FontWeight {
        case style(FontWeightStyle)
        case rawValue(CGFloat)
    }
    public enum FontWeightStyle: String {
        case ultraLight, light, thin, regular, medium, semibold, bold, heavy, black
        // ...
    }
    
    public var size: CGFloat?
    public var weight: FontWeight?
    public var italic: Bool?
    //...
}

public struct MarkupStyleParagraphStyle: MarkupStyleItem {
    public var lineSpacing:CGFloat? = nil
    public var paragraphSpacing:CGFloat? = nil
    public var alignment:NSTextAlignment? = nil
    public var headIndent:CGFloat? = nil
    public var tailIndent:CGFloat? = nil
    public var firstLineHeadIndent:CGFloat? = nil
    public var minimumLineHeight:CGFloat? = nil
    public var maximumLineHeight:CGFloat? = nil
    public var lineBreakMode:NSLineBreakMode? = nil
    public var baseWritingDirection:NSWritingDirection? = nil
    public var lineHeightMultiple:CGFloat? = nil
    public var paragraphSpacingBefore:CGFloat? = nil
    public var hyphenationFactor:Float? = nil
    public var usesDefaultHyphenation:Bool? = nil
    public var tabStops: [NSTextTab]? = nil
    public var defaultTabInterval:CGFloat? = nil
    public var textLists: [NSTextList]? = nil
    public var allowsDefaultTighteningForTruncation:Bool? = nil
    public var lineBreakStrategy: NSParagraphStyle.LineBreakStrategy? = nil
    //...
}

public struct MarkupStyleColor {
    let red: Int
    let green: Int
    let blue: Int
    let alpha: CGFloat
    //...
}

Implementation corresponding to MarkupStyle in the source code

The predefined browser color names listed on the W3C wiki are also mapped to their color name text and R, G, B values in an enum: MarkupStyleColorName.swift

HTMLTagStyleAttribute & HTMLTagStyleAttributeVisitor

Here, let’s mention these two objects again, because HTML tags allow styling through CSS; similarly, we apply the same abstraction used for HTMLTagName to the HTML style attribute.

For example, HTML might give: <a style="color:red;font-size:14px">RedLink</a>, which means the link should be red with a font size of 14px.

public protocol HTMLTagStyleAttribute {
    var styleName: String { get }
    
    func accept<V: HTMLTagStyleAttributeVisitor>(_ visitor: V) -> V.Result
}

public protocol HTMLTagStyleAttributeVisitor {
    associatedtype Result
    
    func visit(styleAttribute: HTMLTagStyleAttribute) -> Result
    func visit(_ styleAttribute: ColorHTMLTagStyleAttribute) -> Result
    func visit(_ styleAttribute: FontSizeHTMLTagStyleAttribute) -> Result
    //...
}

public extension HTMLTagStyleAttributeVisitor {
    func visit(styleAttribute: HTMLTagStyleAttribute) -> Result {
        return styleAttribute.accept(self)
    }
}

Implementation corresponding to HTMLTagStyleAttribute in the source code

HTMLTagStyleAttributeToMarkupStyleVisitor

struct HTMLTagStyleAttributeToMarkupStyleVisitor: HTMLTagStyleAttributeVisitor {
    typealias Result = MarkupStyle?
    
    let value: String
    
    func visit(_ styleAttribute: ColorHTMLTagStyleAttribute) -> Result {
        // Regex extract Color Hex or Mapping from HTML Pre-defined Color Name, see Source Code
        guard let color = MarkupStyleColor(string: value) else { return nil }
        return MarkupStyle(foregroundColor: color)
    }
    
    func visit(_ styleAttribute: FontSizeHTMLTagStyleAttribute) -> Result {
        // Regex extract 10px -> 10, see Source Code
        guard let size = self.convert(fromPX: value) else { return nil }
        return MarkupStyle(font: MarkupStyleFont(size: CGFloat(size)))
    }
    // ...
}

Implementation corresponding to the source code in HTMLTagAttributeToMarkupStyleVisitor.swift

The visitor is initialized with the attribute's value; each visit method converts that value into the corresponding MarkupStyle field according to the attribute type.
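As a rough illustration of the string handling involved, here is a minimal sketch. `pixels(from:)` and `styleDeclarations(from:)` are hypothetical helper names, and the regex is an assumption; the library's own convert(fromPX:) and splitting logic may differ, so treat the source code as the reference.

```swift
import Foundation

// Hypothetical helper: extracts the number from values such as "14px".
// Returns nil when the value is not a plain pixel size.
func pixels(from value: String) -> Double? {
    let trimmed = value.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
    // Match an integer or decimal number that is directly followed by a trailing "px".
    guard let range = trimmed.range(of: #"^\d+(\.\d+)?(?=px$)"#, options: .regularExpression) else {
        return nil
    }
    return Double(String(trimmed[range]))
}

// Hypothetical helper: splits "color:red;font-size:14px"
// into ["color": "red", "font-size": "14px"].
func styleDeclarations(from style: String) -> [String: String] {
    var result: [String: String] = [:]
    for declaration in style.split(separator: ";") {
        // maxSplits: 1 keeps any further ":" inside the value.
        let parts = declaration.split(separator: ":", maxSplits: 1)
        guard parts.count == 2 else { continue }
        let key = parts[0].trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
        let value = parts[1].trimmingCharacters(in: .whitespacesAndNewlines)
        result[key] = value
    }
    return result
}

let declarations = styleDeclarations(from: "color:red; font-size:14px;")
print(declarations, pixels(from: declarations["font-size"] ?? "") as Any)
```

Each key would then be matched against the registered style attributes, and the value handed to the visitor.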

HTMLElementMarkupComponentMarkupStyleVisitor

After introducing the MarkupStyle object, we will convert the Normalization’s HTMLElementComponents result into MarkupStyle.

// MarkupStyle Policy
public enum MarkupStylePolicy {
    case respectMarkupStyleFromCode // Prioritize code-based styles, fill in with HTML Style Attribute
    case respectMarkupStyleFromHTMLStyleAttribute // Prioritize HTML Style Attribute, fill in with code-based styles
}

struct HTMLElementMarkupComponentMarkupStyleVisitor: MarkupVisitor {

    typealias Result = MarkupStyle?
    
    let policy: MarkupStylePolicy
    let components: [HTMLElementMarkupComponent]
    let styleAttributes: [HTMLTagStyleAttribute]

    func visit(_ markup: BoldMarkup) -> Result {
        // .bold is just the default style defined in MarkupStyle, please refer to Source Code
        return defaultVisit(components.value(markup: markup), defaultStyle: .bold)
    }
    
    func visit(_ markup: LinkMarkup) -> Result {
        // .link is just the default style defined in MarkupStyle, please refer to Source Code
        var markupStyle = defaultVisit(components.value(markup: markup), defaultStyle: .link) ?? .link
        
        // Get the HtmlElement corresponding to LinkMarkup from HtmlElementComponents
        // Find href parameter in HtmlElement's attributes (HTML URL string)
        if let href = components.value(markup: markup)?.attributes?["href"] as? String,
           let url = URL(string: href) {
            markupStyle.link = url
        }
        return markupStyle
    }

    // ...
}

extension HTMLElementMarkupComponentMarkupStyleVisitor {
    // Get the customized MarkupStyle specified in the HTMLTag container
    private func customStyle(_ htmlElement: HTMLElementMarkupComponent.HTMLElement?) -> MarkupStyle? {
        guard let customStyle = htmlElement?.tag.customStyle else {
            return nil
        }
        return customStyle
    }
    
    // Default action
    func defaultVisit(_ htmlElement: HTMLElementMarkupComponent.HTMLElement?, defaultStyle: MarkupStyle? = nil) -> Result {
        var markupStyle: MarkupStyle? = customStyle(htmlElement) ?? defaultStyle
        // Check whether the HTMLElement's attributes contain a `style` attribute
        guard let styleString = htmlElement?.attributes?["style"],
              styleAttributes.count > 0 else {
            // None found
            return markupStyle
        }

        // Has Style Attributes
        // Split Style Value string into array
        // font-size:14px;color:red -> ["font-size":"14px","color":"red"]
        let styles = styleString.split(separator: ";").filter { $0.trimmingCharacters(in: .whitespacesAndNewlines) != "" }.map { $0.split(separator: ":") }
        
        for style in styles {
            guard style.count == 2 else {
                continue
            }
            // e.g. font-size
            let key = style[0].trimmingCharacters(in: .whitespacesAndNewlines)
            // e.g. 14px
            let value = style[1].trimmingCharacters(in: .whitespacesAndNewlines)
            
            if let styleAttribute = styleAttributes.first(where: { $0.isEqualTo(styleName: key) }) {
                // Use the HTMLTagStyleAttributeToMarkupStyleVisitor above to convert back to MarkupStyle
                let visitor = HTMLTagStyleAttributeToMarkupStyleVisitor(value: value)
                if var thisMarkupStyle = visitor.visit(styleAttribute: styleAttribute) {
                    // When Style Attribute has a converted value...
                    // Merge with previous MarkupStyle result
                    thisMarkupStyle.fillIfNil(from: markupStyle)
                    markupStyle = thisMarkupStyle
                }
            }
        }
        
        // If there is a default Style
        if var defaultStyle = defaultStyle {
            switch policy {
            case .respectMarkupStyleFromHTMLStyleAttribute:
                // Style Attribute MarkupStyle takes priority,
                // then merge defaultStyle result
                markupStyle?.fillIfNil(from: defaultStyle)
            case .respectMarkupStyleFromCode:
                // defaultStyle takes priority,
                // then merge Style Attribute MarkupStyle result
                defaultStyle.fillIfNil(from: markupStyle)
                markupStyle = defaultStyle
            }
        }
        
        return markupStyle
    }
}

Implementation corresponding to the source code in HTMLTagAttributeToMarkupStyleVisitor.swift

We define some default styles in MarkupStyle. If no custom style is specified in code for a Markup's Tag, the default style is used.

There are two style inheritance strategies:

  • respectMarkupStyleFromCode:
    Use the default style as the base; then check the Style Attributes to see what styles can be added. If a value already exists, ignore it.

  • respectMarkupStyleFromHTMLStyleAttribute:
    Prioritize Style Attributes; then check the default styles to see what can be added. Ignore if the value already exists.
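The difference between the two strategies is easiest to see on a conflicting field. The sketch below uses hypothetical reduced types (`SketchStyle`, `Policy`, `merge`) rather than the library's MarkupStyle and MarkupStylePolicy, and assumes the same "fill only nil fields" merge rule.

```swift
// Hypothetical, reduced stand-ins for MarkupStyle and MarkupStylePolicy.
struct SketchStyle: Equatable {
    var color: String? = nil
    var fontSize: Double? = nil
    var isBold: Bool? = nil

    // Keep values that are already set; take the missing ones from `from`.
    mutating func fillIfNil(from: SketchStyle?) {
        guard let from = from else { return }
        color = color ?? from.color
        fontSize = fontSize ?? from.fontSize
        isBold = isBold ?? from.isBold
    }
}

enum Policy { case respectCode, respectHTMLStyleAttribute }

// The preferred side is the base; the other side only fills the gaps.
func merge(codeStyle: SketchStyle, htmlStyle: SketchStyle, policy: Policy) -> SketchStyle {
    switch policy {
    case .respectCode:
        var result = codeStyle
        result.fillIfNil(from: htmlStyle)
        return result
    case .respectHTMLStyleAttribute:
        var result = htmlStyle
        result.fillIfNil(from: codeStyle)
        return result
    }
}

let codeStyle = SketchStyle(color: "blue", isBold: true)   // style set in code
let htmlStyle = SketchStyle(color: "red", fontSize: 14)    // style="color:red;font-size:14px"

print(merge(codeStyle: codeStyle, htmlStyle: htmlStyle, policy: .respectCode))
print(merge(codeStyle: codeStyle, htmlStyle: htmlStyle, policy: .respectHTMLStyleAttribute))
```

Only `color` conflicts here, so it is the only field whose result depends on the policy; `fontSize` and `isBold` come through in both cases.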

HTMLElementWithMarkupToMarkupStyleProcessor

Convert the normalization results into AST & MarkupStyleComponent.

Declare a new MarkupComponent to store the corresponding MarkupStyle this time:

struct MarkupStyleComponent: MarkupComponent {
    typealias T = MarkupStyle
    
    let markup: Markup
    let value: MarkupStyle
    init(markup: Markup, value: MarkupStyle) {
        self.markup = markup
        self.value = value
    }
}

Simple traversal of a Markup Tree & HTMLElementMarkupComponent structure:

let styleAttributes: [HTMLTagStyleAttribute]
let policy: MarkupStylePolicy
    
func process(from: (Markup, [HTMLElementMarkupComponent])) -> [MarkupStyleComponent] {
  var components: [MarkupStyleComponent] = []
  let visitor = HTMLElementMarkupComponentMarkupStyleVisitor(policy: policy, components: from.1, styleAttributes: styleAttributes)
  walk(markup: from.0, visitor: visitor, components: &components)
  return components
}
    
func walk(markup: Markup, visitor: HTMLElementMarkupComponentMarkupStyleVisitor, components: inout [MarkupStyleComponent]) {
        
  if let markupStyle = visitor.visit(markup: markup) {
    components.append(.init(markup: markup, value: markupStyle))
  }
        
  for markup in markup.childMarkups {
    walk(markup: markup, visitor: visitor, components: &components)
  }
}

// print(components)
// [(markup: LinkMarkup, MarkupStyle(link: https://zhgchg.li, color: .blue)]
// [(markup: BoldMarkup, MarkupStyle(font: .init(weight: .bold))]

Implementation corresponding to the source code in HTMLElementWithMarkupToMarkupStyleProcessor.swift

The process result is shown in the above image.

Render — Convert To NSAttributedString

Now that we have the HTML Tag abstract tree structure and the corresponding MarkupStyle for each HTML Tag, the final step is to generate the final NSAttributedString rendering result.

MarkupNSAttributedStringVisitor

For comparison, the native way to convert HTML to NSAttributedString on iOS is the following snippet; ZMarkupParser replaces it with the visitor shown after it:

import UIKit

let htmlString = "<p>This is <strong>bold</strong> and <em>italic</em> text.</p>"

if let data = htmlString.data(using: .utf8) {
    do {
        let attributedString = try NSAttributedString(
            data: data,
            options: [.documentType: NSAttributedString.DocumentType.html,
                      .characterEncoding: String.Encoding.utf8.rawValue],
            documentAttributes: nil)
        // Use attributedString as needed
    } catch {
        print("Error creating attributed string: \(error)")
    }
}

The ZMarkupParser visitor, by contrast, builds the result from the Markup tree:

struct MarkupNSAttributedStringVisitor: MarkupVisitor {
    typealias Result = NSAttributedString
    
    let components: [MarkupStyleComponent]
    // root / base MarkupStyle, specified externally, e.g., can set the font size for the entire string
    let rootStyle: MarkupStyle?
    
    func visit(_ markup: RootMarkup) -> Result {
        // Look down to RawString objects
        return collectAttributedString(markup)
    }
    
    func visit(_ markup: RawStringMarkup) -> Result {
        // Return Raw String
        // Collect all MarkupStyles in the chain
        // Apply Style to NSAttributedString
        return applyMarkupStyle(markup.attributedString, with: collectMarkupStyle(markup))
    }
    
    func visit(_ markup: BoldMarkup) -> Result {
        // Look down to RawString objects
        return collectAttributedString(markup)
    }
    
    func visit(_ markup: LinkMarkup) -> Result {
        // Look down to RawString objects
        return collectAttributedString(markup)
    }
    // ...
}

private extension MarkupNSAttributedStringVisitor {
    // Apply Style to NSAttributedString
    func applyMarkupStyle(_ attributedString: NSAttributedString, with markupStyle: MarkupStyle?) -> NSAttributedString {
        guard let markupStyle = markupStyle else { return attributedString }
        let mutableAttributedString = NSMutableAttributedString(attributedString: attributedString)
        mutableAttributedString.addAttributes(markupStyle.render(), range: NSMakeRange(0, mutableAttributedString.string.utf16.count))
        return mutableAttributedString
    }

    func collectAttributedString(_ markup: Markup) -> NSMutableAttributedString {
        // collect from downstream
        // Root -> Bold -> String("Bold")
        //      \
        //       > String("Test")
        // Result: Bold Test
        // Recursively look down layer by layer for raw strings, visit and combine into final NSAttributedString
        return markup.childMarkups.compactMap({ visit(markup: $0) }).reduce(NSMutableAttributedString()) { partialResult, attributedString in
            partialResult.append(attributedString)
            return partialResult
        }
    }
    
    func collectMarkupStyle(_ markup: Markup) -> MarkupStyle? {
        // collect from upstream
        // String("Test") -> Bold -> Italic -> Root
        // Result: style: Bold+Italic
        // Look up layer by layer for parent tag's markup style
        // Then inherit styles layer by layer
        var currentMarkup: Markup? = markup.parentMarkup
        var currentStyle = components.value(markup: markup)
        while let thisMarkup = currentMarkup {
            guard let thisMarkupStyle = components.value(markup: thisMarkup) else {
                currentMarkup = thisMarkup.parentMarkup
                continue
            }

            if var thisCurrentStyle = currentStyle {
                thisCurrentStyle.fillIfNil(from: thisMarkupStyle)
                currentStyle = thisCurrentStyle
            } else {
                currentStyle = thisMarkupStyle
            }

            currentMarkup = thisMarkup.parentMarkup
        }
        
        if var currentStyle = currentStyle {
            currentStyle.fillIfNil(from: rootStyle)
            return currentStyle
        } else {
            return rootStyle
        }
    }
}

Implementation corresponding to the source code in MarkupNSAttributedStringVisitor.swift

The operation process and results are shown in the above image.
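The two directions of traversal (text collected downstream, styles collected upstream) can be sketched on a tiny tree. `Node`, `inheritedTraits`, and `render` are hypothetical names for illustration only, and the sketch simplifies styles to a set of trait names that are unioned, whereas the real visitor merges MarkupStyle fields with fillIfNil.

```swift
// Hypothetical mini tree: a node is either a tag (with traits) or raw text.
final class Node {
    weak var parent: Node?
    var children: [Node] = []
    let traits: Set<String>   // e.g. ["bold"]; empty for raw text
    let text: String?         // non-nil only for raw text leaves

    init(traits: Set<String> = [], text: String? = nil) {
        self.traits = traits
        self.text = text
    }

    @discardableResult
    func append(_ child: Node) -> Node {
        child.parent = self
        children.append(child)
        return self
    }
}

// Upstream: walk the parent chain and union every ancestor's traits.
func inheritedTraits(of node: Node) -> Set<String> {
    var result = node.traits
    var current = node.parent
    while let ancestor = current {
        result.formUnion(ancestor.traits)
        current = ancestor.parent
    }
    return result
}

// Downstream: depth-first, emit each raw text together with its inherited traits.
func render(_ node: Node) -> [(text: String, traits: Set<String>)] {
    if let text = node.text {
        return [(text: text, traits: inheritedTraits(of: node))]
    }
    return node.children.flatMap { render($0) }
}

// <a>Li<b>nk</b></a><b>Bold</b>
let root = Node()
let link = Node(traits: ["link"])
    .append(Node(text: "Li"))
    .append(Node(traits: ["bold"]).append(Node(text: "nk")))
let bold = Node(traits: ["bold"]).append(Node(text: "Bold"))
root.append(link).append(bold)

let runs = render(root)
for run in runs {
    print(run.text, run.traits.sorted())
}
```

The three runs correspond to the three attribute ranges in the NSAttributedString dump that follows.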

In the end, we can get:

Li{
    NSColor = "Blue";
    NSFont = "<UICTFont: 0x145d17600> font-family: \".SFUI-Regular\"; font-weight: normal; font-style: normal; font-size: 13.00pt";
    NSLink = "https://zhgchg.li";
}nk{
    NSColor = "Blue";
    NSFont = "<UICTFont: 0x145d18710> font-family: \".SFUI-Semibold\"; font-weight: bold; font-style: normal; font-size: 13.00pt";
    NSLink = "https://zhgchg.li";
}Bold{
    NSFont = "<UICTFont: 0x145d18710> font-family: \".SFUI-Semibold\"; font-weight: bold; font-style: normal; font-size: 13.00pt";
}

🎉🎉🎉🎉Completed🎉🎉🎉🎉

At this point, we have completed the entire process of converting an HTML String to NSAttributedString.

Stripper — Remove HTML Tags

Removing HTML tags is relatively simple and only requires:

func attributedString(_ markup: Markup) -> NSAttributedString {
  if let rawStringMarkup = markup as? RawStringMarkup {
    return rawStringMarkup.attributedString
  } else {
    return markup.childMarkups.compactMap({ attributedString($0) }).reduce(NSMutableAttributedString()) { partialResult, attributedString in
      partialResult.append(attributedString)
      return partialResult
    }
  }
}

Corresponding to the implementation in the source code MarkupStripperProcessor.swift

Similar to Render, but simply finds RawStringMarkup and returns the content.
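The same idea on plain strings, as a minimal sketch: `MiniMarkup` and `strip` are hypothetical and only mirror the shape of the tree, not the library's Markup types.

```swift
// Hypothetical mini AST: a tag with children, or a raw text leaf.
enum MiniMarkup {
    case tag(name: String, children: [MiniMarkup])
    case raw(String)
}

// Ignore the tags and concatenate the raw text found below them.
func strip(_ markup: MiniMarkup) -> String {
    switch markup {
    case .raw(let text):
        return text
    case .tag(_, let children):
        return children.map(strip).joined()
    }
}

// <b>Test<a>Link</a></b>
let tree = MiniMarkup.tag(name: "b", children: [
    .raw("Test"),
    .tag(name: "a", children: [.raw("Link")]),
])
print(strip(tree))
```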

Extend — Dynamic Expansion

To cover HTML Tags and Style Attributes that are not built in, an extension point lets callers add their own dynamically from code.

public struct ExtendTagName: HTMLTagName {
    public let string: String
    
    public init(_ w3cHTMLTagName: WC3HTMLTagName) {
        self.string = w3cHTMLTagName.rawValue
    }
    
    public init(_ string: String) {
        self.string = string.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
    }
    
    public func accept<V>(_ visitor: V) -> V.Result where V : HTMLTagNameVisitor {
        return visitor.visit(self)
    }
}
// to
final class ExtendMarkup: Markup {
    weak var parentMarkup: Markup? = nil
    var childMarkups: [Markup] = []

    func accept<V>(_ visitor: V) -> V.Result where V : MarkupVisitor {
        return visitor.visit(self)
    }
}

//----

public struct ExtendHTMLTagStyleAttribute: HTMLTagStyleAttribute {
    public let styleName: String
    public let render: ((String) -> (MarkupStyle?)) // Dynamically change MarkupStyle using closure
    
    public init(styleName: String, render: @escaping ((String) -> (MarkupStyle?))) {
        self.styleName = styleName
        self.render = render
    }
    
    public func accept<V>(_ visitor: V) -> V.Result where V : HTMLTagStyleAttributeVisitor {
        return visitor.visit(self)
    }
}

ZHTMLParserBuilder

Finally, we use the Builder Pattern to allow external modules to quickly construct the objects required by ZMarkupParser, while implementing proper access level control.

public final class ZHTMLParserBuilder {
    
    private(set) var htmlTags: [HTMLTag] = []
    private(set) var styleAttributes: [HTMLTagStyleAttribute] = []
    private(set) var rootStyle: MarkupStyle?
    private(set) var policy: MarkupStylePolicy = .respectMarkupStyleFromCode
    
    public init() {
        
    }
    
    public static func initWithDefault() -> Self {
        var builder = Self.init()
        for htmlTagName in ZHTMLParserBuilder.htmlTagNames {
            builder = builder.add(htmlTagName)
        }
        for styleAttribute in ZHTMLParserBuilder.styleAttributes {
            builder = builder.add(styleAttribute)
        }
        return builder
    }
    
    public func set(_ htmlTagName: HTMLTagName, withCustomStyle markupStyle: MarkupStyle?) -> Self {
        return self.add(htmlTagName, withCustomStyle: markupStyle)
    }
    
    public func add(_ htmlTagName: HTMLTagName, withCustomStyle markupStyle: MarkupStyle? = nil) -> Self {
        // Only one tagName can exist
        htmlTags.removeAll { htmlTag in
            return htmlTag.tagName.string == htmlTagName.string
        }
        
        htmlTags.append(HTMLTag(tagName: htmlTagName, customStyle: markupStyle))
        
        return self
    }
    
    public func add(_ styleAttribute: HTMLTagStyleAttribute) -> Self {
        styleAttributes.removeAll { thisStyleAttribute in
            return thisStyleAttribute.styleName == styleAttribute.styleName
        }
        
        styleAttributes.append(styleAttribute)
        
        return self
    }
    
    public func set(rootStyle: MarkupStyle) -> Self {
        self.rootStyle = rootStyle
        return self
    }
    
    public func set(policy: MarkupStylePolicy) -> Self {
        self.policy = policy
        return self
    }
    
    public func build() -> ZHTMLParser {
        // ZHTMLParser init is internal only, cannot be directly initialized externally
        // Can only be initialized via ZHTMLParserBuilder
        return ZHTMLParser(htmlTags: htmlTags, styleAttributes: styleAttributes, policy: policy, rootStyle: rootStyle)
    }
}

Corresponding implementation in the source code ZHTMLParserBuilder.swift

initWithDefault includes all implemented HTMLTagName/Style Attributes by default.

public extension ZHTMLParserBuilder {
    static var htmlTagNames: [HTMLTagName] {
        return [
            A_HTMLTagName(),
            B_HTMLTagName(),
            BR_HTMLTagName(),
            DIV_HTMLTagName(),
            HR_HTMLTagName(),
            I_HTMLTagName(),
            LI_HTMLTagName(),
            OL_HTMLTagName(),
            P_HTMLTagName(),
            SPAN_HTMLTagName(),
            STRONG_HTMLTagName(),
            U_HTMLTagName(),
            UL_HTMLTagName(),
            DEL_HTMLTagName(),
            TR_HTMLTagName(),
            TD_HTMLTagName(),
            TH_HTMLTagName(),
            TABLE_HTMLTagName(),
            IMG_HTMLTagName(handler: nil),
            // ...
        ]
    }
}

public extension ZHTMLParserBuilder {
    static var styleAttributes: [HTMLTagStyleAttribute] {
        return [
            ColorHTMLTagStyleAttribute(),
            BackgroundColorHTMLTagStyleAttribute(),
            FontSizeHTMLTagStyleAttribute(),
            FontWeightHTMLTagStyleAttribute(),
            LineHeightHTMLTagStyleAttribute(),
            WordSpacingHTMLTagStyleAttribute(),
            // ...
        ]
    }
}

ZHTMLParser init is only internal, so it cannot be directly initialized externally. It can only be initialized through ZHTMLParserBuilder.

ZHTMLParser encapsulates Render/Selector/Stripper operations:

public final class ZHTMLParser: ZMarkupParser {
    let htmlTags: [HTMLTag]
    let styleAttributes: [HTMLTagStyleAttribute]
    let rootStyle: MarkupStyle?

    internal init(...) {
    }
    
    // Get link style attributes
    public var linkTextAttributes: [NSAttributedString.Key: Any] {
        // ...
    }
    
    public func selector(_ string: String) -> HTMLSelector {
        // ...
    }
    
    public func selector(_ attributedString: NSAttributedString) -> HTMLSelector {
        // ...
    }
    
    public func render(_ string: String) -> NSAttributedString {
        // ...
    }
    
    // Allow rendering NSAttributedString inside nodes using HTMLSelector results
    public func render(_ selector: HTMLSelector) -> NSAttributedString {
        // ...
    }
    
    public func render(_ attributedString: NSAttributedString) -> NSAttributedString {
        // ...
    }
    
    public func stripper(_ string: String) -> String {
        // ...
    }
    
    public func stripper(_ attributedString: NSAttributedString) -> NSAttributedString {
        // ...
    }
    
  // ...
}

Implementation corresponding to the source code in ZHTMLParser.swift

UIKit Issues

The most common use of NSAttributedString is to display it in a UITextView, but be aware:

  • The link style in UITextView is uniformly determined by the linkTextAttributes setting and does not consider the NSAttributedString.Key settings. Individual link styles cannot be set; this is why ZMarkupParser.linkTextAttributes exists as an interface.

  • UILabel currently does not support changing link styles, and since UILabel lacks TextStorage, loading NSTextAttachment images requires additional handling of UILabel.

public extension UITextView {
    func setHtmlString(_ string: String, with parser: ZHTMLParser) {
        self.setHtmlString(NSAttributedString(string: string), with: parser)
    }
    
    func setHtmlString(_ string: NSAttributedString, with parser: ZHTMLParser) {
        self.attributedText = parser.render(string)
        self.linkTextAttributes = parser.linkTextAttributes
    }
}
public extension UILabel {
    func setHtmlString(_ string: String, with parser: ZHTMLParser) {
        self.setHtmlString(NSAttributedString(string: string), with: parser)
    }
    
    func setHtmlString(_ string: NSAttributedString, with parser: ZHTMLParser) {
        let attributedString = parser.render(string)
        attributedString.enumerateAttribute(NSAttributedString.Key.attachment, in: NSMakeRange(0, attributedString.string.utf16.count), options: []) { (value, effectiveRange, _) in
            guard let attachment = value as? ZNSTextAttachment else {
                return
            }
            
            attachment.register(self)
        }
        
        self.attributedText = attributedString
    }
}

Therefore, by extending UIKit, external code only needs to call setHtmlString() to complete the binding.

Complex Rendering Items — Item List

Notes on implementing item lists.

Using <ol> / <ul> to wrap <li> in HTML indicates a list of items:

<ul>
    <li>ItemA</li>
    <li>ItemB</li>
    <li>ItemC</li>
    <!-- ... -->
</ul>

Using the same parsing method as before, we can obtain other list items and know the current list index in visit(_ markup: ListItemMarkup) (thanks to the conversion to AST).

func visit(_ markup: ListItemMarkup) -> Result {
  let siblingListItems = markup.parentMarkup?.childMarkups.filter({ $0 is ListItemMarkup }) ?? []
  let position = (siblingListItems.firstIndex(where: { $0 === markup }) ?? 0)
  // ... build the list marker from `position` and return the result (see source code)
}

NSParagraphStyle has an NSTextList object that can be used to display list items, but it does not allow customization of the space width (I personally find the space too wide). If there is a space between the bullet and the string, line breaks will occur there, causing the display to look a bit odd, as shown below:

The Better result might be achievable by setting headIndent, firstLineHeadIndent, and NSTextTab, but testing showed that with long strings or size changes the result still cannot be displayed perfectly.

For now we only achieve the Acceptable result, by inserting the combined list marker string at the beginning of each item's string.

We only use NSTextList.MarkerFormat for list item symbols, not NSTextList directly.

The supported list symbols can be found here: MarkupStyleList.swift
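The "insert the marker at the beginning of the string" approach can be sketched on plain strings as follows. `Marker` and `renderList` are hypothetical; the library takes its marker styles from NSTextList.MarkerFormat and works on NSAttributedString instead.

```swift
// Hypothetical marker styles, standing in for NSTextList.MarkerFormat.
enum Marker {
    case disc
    case decimal(start: Int)

    // Text inserted in front of the list item at the given zero-based position.
    func prefix(at position: Int) -> String {
        switch self {
        case .disc:
            return "\u{2022} "
        case .decimal(let start):
            return "\(start + position). "
        }
    }
}

// Prepend the marker to every item and put each item on its own line.
func renderList(_ items: [String], marker: Marker) -> String {
    return items.enumerated()
        .map { index, item in marker.prefix(at: index) + item }
        .joined(separator: "\n")
}

print(renderList(["ItemA", "ItemB", "ItemC"], marker: .decimal(start: 1)))
```

The position passed to `prefix(at:)` plays the role of the sibling index computed in visit(_ markup: ListItemMarkup).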

Final display result: ( <ol><li> )

Complex Rendering Items — Table

This is similar to the item list implementation, but for tables.

In HTML, <table> creates a table, which wraps <tr> for table rows, which in turn wrap <td>/<th> for table cells:

<table>
  <tr>
    <th>Company</th>
    <th>Contact</th>
    <th>Country</th>
  </tr>
  <tr>
    <td>Alfreds Futterkiste</td>
    <td>Maria Anders</td>
    <td>Germany</td>
  </tr>
  <tr>
    <td>Centro comercial Moctezuma</td>
    <td>Francisco Chang</td>
    <td>Mexico</td>
  </tr>
</table>

Testing shows that the native NSAttributedString.DocumentType.html renders tables with NSTextBlock, an AppKit (macOS) API that is not publicly available in UIKit on iOS, which lets it display HTML table styles and content in full.

A bit of cheating! We can’t use Private API 🥲

    func visit(_ markup: TableColumnMarkup) -> Result {
        let attributedString = collectAttributedString(markup)
        let siblingColumns = markup.parentMarkup?.childMarkups.filter({ $0 is TableColumnMarkup }) ?? []
        let position = (siblingColumns.firstIndex(where: { $0 === markup }) ?? 0)
        
        // Whether a desired width is specified externally, can set .max to avoid truncated string
        var maxLength: Int? = markup.fixedMaxLength
        if maxLength == nil {
            // If not specified, find the string length of the first row in the same column as max length
            if let tableRowMarkup = markup.parentMarkup as? TableRowMarkup,
               let firstTableRow = tableRowMarkup.parentMarkup?.childMarkups.first(where: { $0 is TableRowMarkup }) as? TableRowMarkup {
                let firstTableRowColumns = firstTableRow.childMarkups.filter({ $0 is TableColumnMarkup })
                if firstTableRowColumns.indices.contains(position) {
                    let firstTableRowColumnAttributedString = collectAttributedString(firstTableRowColumns[position])
                    let length = firstTableRowColumnAttributedString.string.utf16.count
                    maxLength = length
                }
            }
        }
        
        if let maxLength = maxLength {
            // If the field exceeds maxLength, truncate the string
            if attributedString.string.utf16.count > maxLength {
                attributedString.mutableString.setString(String(attributedString.string.prefix(maxLength))+"...")
            } else {
                attributedString.mutableString.setString(attributedString.string.padding(toLength: maxLength, withPad: " ", startingAt: 0))
            }
        }
        
        if position < siblingColumns.count - 1 {
            // Add spaces as spacing; external can specify how many spaces for spacing width
            attributedString.append(makeString(in: markup, string: String(repeating: " ", count: markup.spacing)))
        }
        
        return attributedString
    }
    
    func visit(_ markup: TableRowMarkup) -> Result {
        let attributedString = collectAttributedString(markup)
        attributedString.append(makeBreakLine(in: markup)) // Add newline, see Source Code for details
        return attributedString
    }
    
    func visit(_ markup: TableMarkup) -> Result {
        let attributedString = collectAttributedString(markup)
        attributedString.append(makeBreakLine(in: markup)) // Add newline, see Source Code for details
        attributedString.insert(makeBreakLine(in: markup), at: 0) // Add newline, see Source Code for details
        return attributedString
    }

The final display effect is as shown in the picture below:

Not perfect, but acceptable.
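The heart of the column logic above is fitting each cell into a fixed width. Here is a minimal, standalone sketch of that idea, not the ZMarkupParser implementation: `fitCell` and `renderRow` are hypothetical names, and the sketch counts Swift Characters in plain strings, whereas the visitor above works on attributed strings and UTF-16 lengths.

```swift
/// Hypothetical helper: truncate (with "...") or pad (with spaces) a cell to `width` characters.
func fitCell(_ text: String, width: Int) -> String {
    let width = max(0, width)
    if text.count > width {
        // Too long: keep the first `width` characters and mark the cut.
        return String(text.prefix(width)) + "..."
    }
    // Short enough: pad on the right so every cell in the column has the same length.
    return text + String(repeating: " ", count: width - text.count)
}

/// Hypothetical helper: fit every cell to its column width and join them with `spacing` spaces.
func renderRow(_ cells: [String], widths: [Int], spacing: Int = 1) -> String {
    let gap = String(repeating: " ", count: max(0, spacing))
    return zip(cells, widths)
        .map { cell, width in fitCell(cell, width: width) }
        .joined(separator: gap)
}

let header = renderRow(["Company", "Contact"], widths: [7, 7])
let cut = fitCell("Centro comercial Moctezuma", width: 6)
```

Note that padding with spaces only lines columns up exactly when the text is rendered in a monospaced font.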

Complex Rendering Item — Image

Finally, let’s discuss the biggest challenge: loading remote images into NSAttributedString.

Using <img> to represent images in HTML:

<img src="https://user-images.githubusercontent.com/33706588/219608966-20e0c017-d05c-433a-9a52-091bc0cfd403.jpg" width="300" height="125"/>

You can specify the desired display size using the width / height HTML attributes.

Displaying images in NSAttributedString is much more complicated than expected, and there is no good existing implementation. I ran into some of these issues earlier while working on text wrapping in UITextView, and after researching again I still found no perfect solution.

For now, set aside the native limitation that NSTextAttachment cannot be reused and does not release memory. The first goal is to download an image from a remote source into an NSTextAttachment, insert it into the NSAttributedString, and have the content update automatically.

This work is split out into a smaller, separate project so that it is easier to optimize and to reuse in other projects later:

It is mainly based on the Asynchronous NSTextAttachments series, but replaces the final update step (the UI has to be refreshed after the download for the image to appear) and adds a Delegate/DataSource for external extension.

The operation process and relationships are shown in the above diagram.

  • Declare ZNSTextAttachmentable, an abstraction over the NSTextStorage object (built into UITextView) and over UILabel itself (UILabel has no NSTextStorage).
    Its only operation is replacing the attachment in the attributed string at a given NSRange. (func replace(attachment: ZNSTextAttachment, to: ZResizableNSTextAttachment))

  • The idea is to first wrap the imageURL, the PlaceholderImage, and the desired display size in a ZNSTextAttachment, and to display the placeholder image right away.

  • When the system needs this image on the screen, it calls the image(forBounds…) method, and at this point, we start downloading the Image Data.

  • DataSource allows external control over how to download or implement Image Cache Policy. By default, it uses URLSession to request image data directly.

  • After downloading, create a new ZResizableNSTextAttachment and implement the custom image size logic in attachmentBounds(for…)

  • Call the replace(attachment: ZNSTextAttachment, to: ZResizableNSTextAttachment) method to replace the position of ZNSTextAttachment with ZResizableNSTextAttachment

  • Send didLoad Delegate notification for external integration when needed

  • Completed
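The "custom image size logic" step can be illustrated in isolation. The following is only a sketch under assumptions, not the ZNSTextAttachment code: `Size` and `displaySize` are hypothetical names, the optional width/height stand for the values of the HTML attributes, and `maxWidth` stands for the available line width. It fills in a missing dimension from the image's aspect ratio and scales the result down if it is wider than the container.

```swift
/// Hypothetical plain size type, used so the sketch does not depend on UIKit/CoreGraphics.
struct Size: Equatable {
    var width: Double
    var height: Double
}

/// Hypothetical helper: choose the size at which a downloaded image should be displayed.
func displaySize(natural: Size, attrWidth: Double?, attrHeight: Double?, maxWidth: Double) -> Size {
    // Without a valid natural size there is no aspect ratio to work with.
    guard natural.width > 0, natural.height > 0 else { return Size(width: 0, height: 0) }

    var size: Size
    switch (attrWidth, attrHeight) {
    case let (w?, h?):
        size = Size(width: w, height: h)                                   // both given: use as-is
    case let (w?, nil):
        size = Size(width: w, height: w * natural.height / natural.width)  // derive the height
    case let (nil, h?):
        size = Size(width: h * natural.width / natural.height, height: h)  // derive the width
    case (nil, nil):
        size = natural                                                     // nothing given
    }

    // Never draw wider than the container; shrink proportionally instead.
    if size.width > maxWidth {
        let scale = maxWidth / size.width
        size = Size(width: maxWidth, height: size.height * scale)
    }
    return size
}

let fromAttributes = displaySize(natural: Size(width: 600, height: 250),
                                 attrWidth: 300, attrHeight: nil, maxWidth: 390)
```

In a real attachment subclass, a value like this would be turned into the rectangle returned from attachmentBounds(for…).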

For detailed code, please refer to the Source Code.

The reason for not using NSLayoutManager.invalidateLayout(forCharacterRange: range, actualCharacterRange: nil) or NSLayoutManager.invalidateDisplay(forCharacterRange: range) to refresh the UI is that the UI does not update correctly. Since the range is already known, directly triggering a replacement of the NSAttributedString ensures the UI updates properly.

The final display result is as follows:

<span style="color:red">Hello</span>Hello hello <br />
<img src="https://user-images.githubusercontent.com/33706588/219608966-20e0c017-d05c-433a-9a52-091bc0cfd403.jpg"/>

Testing & Continuous Integration

Besides Unit Tests, this project also has Snapshot Tests as integration tests, which make it easy to test and compare the final NSAttributedString as a whole.

The main logic is covered by Unit Tests and integration tests, and the final Test Coverage is around 85%.

[ZMarkupParser — codecov](https://app.codecov.io/gh/ZhgChgLi/ZMarkupParser){:target="_blank"}

Snapshot Test

Import the framework directly:

import SnapshotTesting
// ...
func testShouldKeepNSAttributedString() {
  let parser = ZHTMLParserBuilder.initWithDefault().build()
  let textView = UITextView()
  textView.frame.size.width = 390
  textView.isScrollEnabled = false
  textView.backgroundColor = .white
  textView.setHtmlString("html string...", with: parser)
  textView.layoutIfNeeded()
  assertSnapshot(matching: textView, as: .image, record: false)
}
// ...

Comparing the final rendered result directly ensures that adjustments and integration have not introduced any issues.

Codecov Test Coverage

Codecov.io (free for public repos) can be used to track Test Coverage; just install the Codecov GitHub App and configure it.

Once Codecov is connected to the GitHub repo, you can also add a codecov.yml file to the project’s root directory.

comment:                  # this is a top-level key
  layout: "reach, diff, flags, files"
  behavior: default
  require_changes: false  # if true: only post the comment if coverage changes
  require_base: no        # [yes :: must have a base report to post]
  require_head: yes       # [yes :: must have a head report to post]

With this configuration file, Codecov automatically comments the CI results on each PR after it is opened.

Continuous Integration

GitHub Actions CI integration: ci.yml

name: CI

on:
  workflow_dispatch:
  pull_request:
    types: [opened, reopened]
  push:
    branches:
    - main

jobs:
  build:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v3
      - name: spm build and test
        run: |
          set -o pipefail
          xcodebuild test -workspace ZMarkupParser.xcworkspace -testPlan ZMarkupParser -scheme ZMarkupParser -enableCodeCoverage YES -resultBundlePath './scripts/TestResult.xcresult' -destination 'platform=iOS Simulator,name=iPhone 14,OS=16.1' build test | xcpretty
      - name: Codecov
        uses: codecov/codecov-action@v3.1.1
        with:
          xcode: true
          xcode_archive_path: './scripts/TestResult.xcresult'

This configuration runs build and test when a PR is opened/reopened, when code is pushed to the main branch, or when the workflow is triggered manually, and then uploads the test coverage report to Codecov.

Regex

I get a little better at regular expressions every time I use them. I did not end up using them much this time, but because I initially wanted to extract paired HTML tags with a regex, I studied how to write them more carefully.

Some cheat-sheet notes from what I learned this time…

  • ?: allows the ( ) to group the match but does not capture the result
    e.g. (?:https?:\/\/)?(?:www\.)?example\.com will return the entire URL in https://www.example.com instead of https:// and www

  • .+? Non-greedy match (returns the nearest match)
    e.g. <.+?> in <a>test</a> returns <a> and </a> instead of the entire string

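  • (?=XYZ) is a lookahead: combined with .+? it matches any string up to (but not including) the substring XYZ. Note that the similar-looking [^XYZ] instead matches any single character other than X, Y, or Z.
    e.g. (?:__)(.+?(?=__))(?:__) (any string until __) applied to a string such as __test__ will capture test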

  • ?R recursively searches inward for the same pattern
    e.g. \((?:[^()]|((?R)))+\) applied to (simple) (and(nested)) will match (simple), (and(nested)), and (nested)

  • (?<GroupName>…) names a capture group, and \k<GroupName> later matches the same text that group matched
    e.g. (?<tagName><a>).*(\k<tagName>)

  • (?(X)yes|no) matches yes if the Xth capture group (or Group Name) has a value, otherwise matches no
    Not supported in Swift for now
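A few of these rules can be checked quickly with NSRegularExpression from Foundation. This is only a small sketch: `allMatches` is a hypothetical helper, the sample inputs are made up for illustration, and the backreference pattern is a tidied-up variant of the example above rather than a pattern taken from the project.

```swift
import Foundation

/// Hypothetical helper: returns every whole match of `pattern` in `text`.
func allMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, options: [], range: fullRange).compactMap { result in
        Range(result.range, in: text).map { String(text[$0]) }
    }
}

// Non-greedy vs. greedy: .+? stops at the nearest ">", .+ runs to the last one.
let lazy = allMatches(of: #"<.+?>"#, in: "<a>test</a>")      // ["<a>", "</a>"]
let greedy = allMatches(of: #"<.+>"#, in: "<a>test</a>")     // ["<a>test</a>"]

// Lookahead: match everything up to, but not including, "__".
let untilMarker = allMatches(of: #".+?(?=__)"#, in: "test__") // ["test"]

// Named group plus backreference: the closing tag must repeat the opening tag's name.
let paired = allMatches(of: #"<(?<tagName>[a-z]+)>.*?</\k<tagName>>"#,
                        in: "<b>x</b><i>y</b>")               // ["<b>x</b>"]

// Non-capturing groups group without capturing, so this pattern has no capture groups.
let urlRegex = try! NSRegularExpression(pattern: #"(?:https?://)?(?:www\.)?example\.com"#)
let groupCount = urlRegex.numberOfCaptureGroups               // 0
```

Raw string literals (#"…"#) keep the patterns readable because backslashes do not need to be doubled.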

Other Great Regex Articles:

Swift Package Manager & Cocoapods

This was also my first time developing with SPM & CocoaPods, which was quite interesting. SPM is really convenient; however, if two projects depend on the same package and both are open at the same time, one of them may fail to find the package and not build properly.

ZMarkupParser has been published to CocoaPods, but I haven’t tested whether it works properly there, since I use SPM 😝.

ChatGPT

Based on my actual development experience, I found it most useful for polishing the Readme; during development itself, I did not feel any significant benefit. Mid-to-senior-level questions often got unclear or even incorrect answers (for example, I asked about some regex rules and the answers were not quite accurate), so in the end I still went back to Google to find the correct solutions manually.

The same goes for asking it to write code: unless it is a simple Code Gen Object, don’t expect it to produce the entire tool architecture directly. (At least for now, it seems Copilot might be more helpful for coding tasks.)

It can, however, give general guidance on knowledge gaps, so that we quickly get a rough idea of how something should be done. When our grasp of a topic is too weak, it is hard to find the right direction on Google quickly, and in such cases ChatGPT is quite helpful.

Disclaimer

After more than three months of research and development, I am exhausted, but I want to clarify that this approach is only a feasible result from my study. It may not be the best solution and could still be optimized. This project is more like a starting point, hoping to find a perfect solution for Markup Language to NSAttributedString. Contributions are highly welcome; many aspects still require collective effort to improve.

Contributing

[ZMarkupParser](https://github.com/ZhgChgLi/ZMarkupParser){:target="_blank"} [⭐](https://github.com/ZhgChgLi/ZMarkupParser){:target="_blank"}

Here are some improvements that come to mind right now (2023/03/12). They will be recorded in the Repo later:

  1. Performance/algorithm optimization. Although it is faster and more stable than the native NSAttributedString.DocumentType.html, there is still much room for improvement, and I believe its performance is definitely not as good as XMLParser. Hopefully, one day it can reach the same performance while keeping customization and automatic error correction.

  2. Support more HTML Tags and more Style Attribute conversion and parsing.

  3. Further optimize ZNSTextAttachment so that it can be reused and releases memory; this may require studying CoreText.

  4. Support Markdown parsing. Since the underlying abstraction is not limited to HTML, Markdown parsing only requires building the Markdown-to-Markup objects properly. That is why I named it ZMarkupParser instead of ZHTMLParser, hoping that one day it can also support Markdown to NSAttributedString.

  5. Support Any to Any, e.g. HTML to Markdown or Markdown to HTML. Since we have the original AST (the Markup objects), conversion between any Markup formats is possible.

  6. Implement CSS !important to enhance the inheritance strategy of the abstract MarkupStyle.

  7. Enhance the HTML Selector feature, which currently only has the most basic filter functionality.

  8. There is much more; feel free to open an issue.
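Items 4 and 5 rest on the idea that one Markup tree can feed several renderers. Below is a toy sketch of that idea under loud assumptions: `Node`, `renderHTML`, and `renderMarkdown` are hypothetical and far simpler than ZMarkupParser’s real Markup classes, and no escaping of special characters is performed.

```swift
/// Hypothetical, minimal markup tree (not ZMarkupParser's Markup types).
enum Node {
    case text(String)
    case bold([Node])
    case italic([Node])
    case link(href: String, children: [Node])
}

/// Render the tree as HTML.
func renderHTML(_ nodes: [Node]) -> String {
    nodes.map { node -> String in
        switch node {
        case .text(let string):
            return string
        case .bold(let children):
            return "<b>" + renderHTML(children) + "</b>"
        case .italic(let children):
            return "<i>" + renderHTML(children) + "</i>"
        case .link(let href, let children):
            return "<a href=\"\(href)\">" + renderHTML(children) + "</a>"
        }
    }.joined()
}

/// Render the same tree as Markdown.
func renderMarkdown(_ nodes: [Node]) -> String {
    nodes.map { node -> String in
        switch node {
        case .text(let string):
            return string
        case .bold(let children):
            return "**" + renderMarkdown(children) + "**"
        case .italic(let children):
            return "*" + renderMarkdown(children) + "*"
        case .link(let href, let children):
            return "[" + renderMarkdown(children) + "](" + href + ")"
        }
    }.joined()
}

// One tree, two output formats.
let tree: [Node] = [.bold([.text("Test"), .link(href: "https://example.com", children: [.text("Link")])])]
let html = renderHTML(tree)
let markdown = renderMarkdown(tree)
```

Each new output format is then just one more function (or visitor) over the same tree, while each new input format only needs a parser that produces the tree.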

If you want to support the project but lack the time, you can give it a ⭐ so that more people see this Repo, increasing the chance that GitHub experts will contribute!

Summary

[ZMarkupParser](https://github.com/ZhgChgLi/ZMarkupParser){:target="_blank"}

That sums up all the technical details and my journey developing ZMarkupParser. It took me almost three months of after-work and weekend time, countless research and practice sessions, writing tests, improving test coverage, and setting up CI. Only then did I have a somewhat presentable result. I hope this tool helps those facing similar challenges, and I also hope everyone can work together to make this tool even better.

[pinkoi.com](https://www.pinkoi.com){:target="_blank"}

It is currently used in our company’s iOS app, pinkoi.com, with no issues found so far. 😄

Further Reading

If you have any questions or feedback, feel free to contact me.


This post was originally published on Medium (View original post), and automatically converted and synced by ZMediumToMarkdown.

This post is licensed under CC BY 4.0 by the author.