ZMarkupParser HTML String to NSAttributedString Converter Tool

Convert an HTML string to NSAttributedString, with configurable styles for the corresponding NSAttributedString.Key attributes.

ℹ️ℹ️ℹ️ The following content is translated by OpenAI.


[ZhgChgLi](https://github.com/ZhgChgLi) / [ZMarkupParser](https://github.com/ZhgChgLi/ZMarkupParser)

Features

  • Written entirely in Swift, without relying on any parser libraries. It uses Regex to find HTML tags, tokenizes them to analyze and correct tag validity (fixing unclosed and misaligned tags), and then builds an abstract syntax tree. The Visitor Pattern then maps HTML tags to abstract styles to produce the final NSAttributedString.
  • Supports HTML rendering (to NSAttributedString), stripping HTML tags, and selector functionality.
  • Automatically analyzes and corrects tag validity (fixing unclosed tags and misaligned tags):
    <br> -> <br/>
    <b>Bold<i>Bold+Italic</b>Italic</i> -> <b>Bold<i>Bold+Italic</i></b><i>Italic</i>
    <Congratulation!> -> <Congratulation!> (treated as a string)
  • Supports custom style specifications, e.g., <b></b> -> weight: .semibold & underline: 1
  • Allows for custom HTML tag parsing, e.g., parsing <zhgchgli></zhgchgli> into desired styles.
  • Includes a structural design that facilitates HTML tag expansion. Currently, it supports basic styles, as well as rendering for ul/ol/li lists and hr separators. Future expansions will allow for quick support of additional HTML tags.
  • Supports parsing styles from the HTML style attribute, e.g., <b style="font-size: 20px"></b> -> Bold + font size 20 px.
  • Compatible with iOS/macOS.
  • Supports converting HTML color names to UIColor/NSColor.
  • Test coverage: 80%+
  • Supports parsing of <img> images, <ul> lists, <table> tables, and more HTML tags.
  • Offers higher performance than NSAttributedString.DocumentType.html.
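
The tag-correction behavior listed above can be sketched with a stack of open tags. This is a minimal illustration under stated assumptions, not ZMarkupParser's actual implementation: the names `Token`, `tokenize`, `normalize`, `serialize`, and `corrected` are hypothetical, the scanner is hand-written rather than Regex-based, attributes are ignored, and the set of void tags is assumed.

```swift
// Hypothetical token model; not the library's own types.
enum Token: Equatable {
    case open(String)
    case close(String)
    case selfClosing(String)
    case text(String)
}

// Assumption: these tags never need a closing tag.
let voidTags: Set<String> = ["br", "hr", "img"]

// Interprets the text between "<" and ">" as a tag, or returns nil if it is not a plain tag name.
func parseTag(_ inner: Substring) -> Token? {
    var body = inner
    var isClosing = false
    var isSelfClosing = false
    if body.hasPrefix("/") {
        isClosing = true
        body = body.dropFirst()
    } else if body.hasSuffix("/") {
        isSelfClosing = true
        body = body.dropLast()
    }
    guard !body.isEmpty, body.allSatisfy({ $0.isLetter || $0.isNumber }) else { return nil }
    let name = body.lowercased()
    if isClosing { return .close(name) }
    return isSelfClosing ? .selfClosing(name) : .open(name)
}

// Splits the input into tags and text; anything that is not a valid tag stays text.
func tokenize(_ html: String) -> [Token] {
    var tokens: [Token] = []
    var text = ""
    var rest = Substring(html)
    while let first = rest.first {
        if first == "<", let end = rest.firstIndex(of: ">") {
            let inner = rest[rest.index(after: rest.startIndex)..<end]
            if let tag = parseTag(inner) {
                if !text.isEmpty { tokens.append(.text(text)); text = "" }
                tokens.append(tag)
                rest = rest[rest.index(after: end)...]
                continue
            }
        }
        text.append(first)
        rest = rest.dropFirst()
    }
    if !text.isEmpty { tokens.append(.text(text)) }
    return tokens
}

// Repairs nesting: closes inner tags before an outer close, reopens them afterwards,
// drops stray closing tags, and closes whatever is still open at the end.
func normalize(_ tokens: [Token]) -> [Token] {
    var output: [Token] = []
    var stack: [String] = []
    for token in tokens {
        switch token {
        case .text, .selfClosing:
            output.append(token)
        case .open(let name):
            if voidTags.contains(name) {
                output.append(.selfClosing(name))
            } else {
                stack.append(name)
                output.append(token)
            }
        case .close(let name):
            guard let index = stack.lastIndex(of: name) else { continue }
            let reopened = Array(stack[(index + 1)...])
            for tag in reopened.reversed() { output.append(.close(tag)) }
            output.append(.close(name))
            stack.removeSubrange(index...)
            for tag in reopened {
                output.append(.open(tag))
                stack.append(tag)
            }
        }
    }
    for tag in stack.reversed() { output.append(.close(tag)) }
    return output
}

func serialize(_ tokens: [Token]) -> String {
    tokens.map { token -> String in
        switch token {
        case .open(let name): return "<\(name)>"
        case .close(let name): return "</\(name)>"
        case .selfClosing(let name): return "<\(name)/>"
        case .text(let text): return text
        }
    }.joined()
}

func corrected(_ html: String) -> String {
    serialize(normalize(tokenize(html)))
}
```

Calling `corrected("<b>Bold<i>Bold+Italic</b>Italic</i>")` with this sketch yields the same corrected string as the misaligned-tag example in the list, and `<Congratulation!>` passes through unchanged because it is not a valid tag name.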

Performance Analysis

[Performance Benchmark](https://quickchart.io/chart-maker/view/zm-73887470-e667-4ca3-8df0-fe3563832b0b)

  • Test environment: 2022 / M2 / 24 GB memory / macOS 13.2 / Xcode 14.1
  • X-axis: HTML character count
  • Y-axis: Rendering time (seconds)

*Additionally, NSAttributedString.DocumentType.html crashes (EXC_BAD_ACCESS) once the string exceeds roughly 54,600 characters.

Try It Out

You can directly download the project, open ZMarkupParser.xcworkspace, and select the ZMarkupParser-Demo target to build and run for testing.

Installation

Supports SPM/CocoaPods. Please refer to the README.

Usage

Style Declaration

MarkupStyle, MarkupStyleColor, and MarkupStyleParagraphStyle wrap the corresponding NSAttributedString.Key attributes. MarkupStyle exposes the following properties:

var font: MarkupStyleFont
var paragraphStyle: MarkupStyleParagraphStyle
var foregroundColor: MarkupStyleColor? = nil
var backgroundColor: MarkupStyleColor? = nil
var ligature: NSNumber? = nil
var kern: NSNumber? = nil
var tracking: NSNumber? = nil
var strikethroughStyle: NSUnderlineStyle? = nil
var underlineStyle: NSUnderlineStyle? = nil
var strokeColor: MarkupStyleColor? = nil
var strokeWidth: NSNumber? = nil
var shadow: NSShadow? = nil
var textEffect: String? = nil
var attachment: NSTextAttachment? = nil
var link: URL? = nil
var baselineOffset: NSNumber? = nil
var underlineColor: MarkupStyleColor? = nil
var strikethroughColor: MarkupStyleColor? = nil
var obliqueness: NSNumber? = nil
var expansion: NSNumber? = nil
var writingDirection: NSNumber? = nil
var verticalGlyphForm: NSNumber? = nil
...

You can declare styles to apply to HTML tags as desired:

let myStyle = MarkupStyle(font: MarkupStyleFont(size: 13), backgroundColor: MarkupStyleColor(name: .aquamarine))

HTML Tags

Declare the HTML tags to render and their corresponding Markup Styles. The currently predefined HTML tag names are as follows:

A_HTMLTagName(), // <a></a>
B_HTMLTagName(), // <b></b>
BR_HTMLTagName(), // <br></br>
DIV_HTMLTagName(), // <div></div>
HR_HTMLTagName(), // <hr></hr>
I_HTMLTagName(), // <i></i>
LI_HTMLTagName(), // <li></li>
OL_HTMLTagName(), // <ol></ol>
P_HTMLTagName(), // <p></p>
SPAN_HTMLTagName(), // <span></span>
STRONG_HTMLTagName(), // <strong></strong>
U_HTMLTagName(), // <u></u>
UL_HTMLTagName(), // <ul></ul>
DEL_HTMLTagName(), // <del></del>
IMG_HTMLTagName(handler: ZNSTextAttachmentHandler), // <img> and image downloader
TR_HTMLTagName(), // <tr>
TD_HTMLTagName(), // <td>
TH_HTMLTagName(), // <th>
...and more
...

When the parser encounters a declared tag such as <a>, it applies the MarkupStyle associated with that tag.

To extend HTML tag names:

let zhgchgli = ExtendTagName("zhgchgli")

HTML Style Attributes

As mentioned earlier, HTML can specify styles through the style attribute. The library abstracts this as well, so you can choose which style properties are supported and add your own. The currently predefined HTML style attributes are as follows:

ColorHTMLTagStyleAttribute(), // color
BackgroundColorHTMLTagStyleAttribute(), // background-color
FontSizeHTMLTagStyleAttribute(), // font-size
FontWeightHTMLTagStyleAttribute(), // font-weight
LineHeightHTMLTagStyleAttribute(), // line-height
WordSpacingHTMLTagStyleAttribute(), // word-spacing
...

To extend style attributes:

ExtendHTMLTagStyleAttribute(styleName: "text-decoration", render: { value in
  var newStyle = MarkupStyle()
  if value == "underline" {
    newStyle.underlineStyle = NSUnderlineStyle.single // property name as declared on MarkupStyle above
  } else {
    // ...  
  }
  return newStyle
})
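
To make the idea concrete, here is a rough sketch of how a style attribute string could be split into declarations and dispatched to per-property render closures, similar in spirit to the `render` closure above. It is an illustration only, not the library's code: `InlineStyle`, `trimmed`, `parseStyleAttribute`, `renderers`, and `resolveStyle(from:)` are hypothetical names, and only `px` font sizes are handled.

```swift
// Hypothetical, simplified stand-in for a style value; not MarkupStyle.
struct InlineStyle: Equatable {
    var fontSize: Double?
    var isUnderlined: Bool?
    var colorName: String?
}

// Removes leading and trailing whitespace without needing Foundation.
func trimmed(_ text: Substring) -> String {
    var text = text
    while let first = text.first, first.isWhitespace { text = text.dropFirst() }
    while let last = text.last, last.isWhitespace { text = text.dropLast() }
    return String(text)
}

// Splits "name: value; name: value" into pairs, skipping malformed declarations.
func parseStyleAttribute(_ style: String) -> [(name: String, value: String)] {
    var declarations: [(name: String, value: String)] = []
    for declaration in style.split(separator: ";") {
        let parts = declaration.split(separator: ":", maxSplits: 1)
        guard parts.count == 2 else { continue }
        let name = trimmed(parts[0]).lowercased()
        let value = trimmed(parts[1])
        guard !name.isEmpty, !value.isEmpty else { continue }
        declarations.append((name: name, value: value))
    }
    return declarations
}

// One closure per supported CSS property; unknown properties are simply ignored.
let renderers: [String: (String, inout InlineStyle) -> Void] = [
    "font-size": { value, style in
        let number = value.hasSuffix("px") ? String(value.dropLast(2)) : value
        style.fontSize = Double(number)
    },
    "text-decoration": { value, style in
        if value == "underline" {
            style.isUnderlined = true
        }
    },
    "color": { value, style in
        style.colorName = value
    },
]

func resolveStyle(from styleAttribute: String) -> InlineStyle {
    var result = InlineStyle()
    for (name, value) in parseStyleAttribute(styleAttribute) {
        renderers[name]?(value, &result)
    }
    return result
}

let example = resolveStyle(from: "font-size: 20px; text-decoration: underline; unknown: x")
```

Keeping one closure per property name means supporting a new property is a matter of adding one dictionary entry, which is roughly the extension point that ExtendHTMLTagStyleAttribute offers.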

Usage

import ZMarkupParser

let parser = ZHTMLParserBuilder.initWithDefault().set(rootStyle: MarkupStyle(font: MarkupStyleFont(size: 13))).build()

initWithDefault automatically includes predefined HTML tag names, default corresponding MarkupStyles, and predefined style attributes.

set(rootStyle:) allows you to specify the default style for the entire string, or you can choose not to specify one.
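
One way to picture a root style is as a fallback: a tag's own style wins wherever it sets a value, and everything it leaves unset comes from the root. The sketch below illustrates that idea only. `SketchStyle` and `filled(from:)` are hypothetical names, and the assumption that ZMarkupParser resolves styles exactly this way is not confirmed by this article.

```swift
// Hypothetical, simplified style with optional fields; not MarkupStyle.
struct SketchStyle: Equatable {
    var fontSize: Double?
    var isBold: Bool?
    var colorName: String?

    // Returns a copy in which every unset (nil) field is taken from `fallback`.
    func filled(from fallback: SketchStyle) -> SketchStyle {
        SketchStyle(
            fontSize: fontSize ?? fallback.fontSize,
            isBold: isBold ?? fallback.isBold,
            colorName: colorName ?? fallback.colorName
        )
    }
}

// The root style describes the whole string; the tag style only sets what it changes.
let rootStyle = SketchStyle(fontSize: 13, isBold: false, colorName: "black")
let boldTagStyle = SketchStyle(isBold: true)

// Bold text keeps the root font size and color but overrides the weight.
let resolved = boldTagStyle.filled(from: rootStyle)
```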

Customization

// Renders the extended HTML tag <zhgchgli></zhgchgli> with the markup style you specify.
let customTagParser = ZHTMLParserBuilder.initWithDefault().add(ExtendTagName("zhgchgli"), withCustomStyle: MarkupStyle(backgroundColor: MarkupStyleColor(name: .aquamarine))).build()
// Renders <b></b> with the markup style you specify instead of the default bold markup style.
let semiboldParser = ZHTMLParserBuilder.initWithDefault().add(B_HTMLTagName(), withCustomStyle: MarkupStyle(font: MarkupStyleFont(size: 18, weight: .style(.semibold)))).build()

HTML Rendering

let attributedString = parser.render(htmlString) // NSAttributedString

// work with UITextView
textView.setHtmlString(htmlString)
// work with UILabel
label.setHtmlString(htmlString)

HTML Stripping

parser.stripper(htmlString)

Selector HTML String

let selector = parser.selector(htmlString) // HTMLSelector e.g. input: <a><b>Test</b>Link</a>
selector.first("a")?.first("b")?.attributedString // will return Test
selector.filter("a").attributedString // will return Test Link

// render from selector result (reuses the selector created above)
if let boldSelector = selector.first("a")?.first("b") {
    parser.render(boldSelector)
}
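
As a mental model for what a selector does, the sketch below walks a tiny node tree. It is a conceptual illustration rather than the library's HTMLSelector: `MarkupNode`, `NodeSelector`, and `plainText` are hypothetical names, matching only direct children is an assumption, and the text is concatenated without any separator.

```swift
// Hypothetical parsed tree: either a text leaf or an element with children.
enum MarkupNode {
    case text(String)
    case element(tag: String, children: [MarkupNode])
}

struct NodeSelector {
    let node: MarkupNode

    // Direct child elements of the wrapped node whose tag equals `tag`.
    func filter(_ tag: String) -> [NodeSelector] {
        guard case .element(_, let children) = node else { return [] }
        return children.compactMap { child -> NodeSelector? in
            if case .element(let childTag, _) = child, childTag == tag {
                return NodeSelector(node: child)
            }
            return nil
        }
    }

    // The first matching direct child, or nil when there is none.
    func first(_ tag: String) -> NodeSelector? {
        filter(tag).first
    }

    // Concatenated text of the node and all of its descendants.
    var plainText: String {
        switch node {
        case .text(let string):
            return string
        case .element(_, let children):
            return children.map { NodeSelector(node: $0).plainText }.joined()
        }
    }
}

// Tree for the input <a><b>Test</b>Link</a>, wrapped in a synthetic root element.
let root = MarkupNode.element(tag: "root", children: [
    .element(tag: "a", children: [
        .element(tag: "b", children: [.text("Test")]),
        .text("Link"),
    ]),
])

let rootSelector = NodeSelector(node: root)
let boldText = rootSelector.first("a")?.first("b")?.plainText
let linkText = rootSelector.first("a")?.plainText
```

Because `first` returns an optional, each further step needs optional chaining, which is why the chained calls use `?.` at every level.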

Async

If you need to render long strings, you can use the async method to prevent UI blocking.

parser.render(htmlString) { _ in /* ... */ }
parser.stripper(htmlString) { _ in /* ... */ }
parser.selector(htmlString) { _ in /* ... */ }

Know-how

  • The hyperlink style in UITextView is determined by linkTextAttributes, so a link style set through NSAttributedString.Key may appear to have no effect.
  • UILabel does not support custom styling for URLs, so here too a style set through NSAttributedString.Key may appear to have no effect.
  • For rendering complex HTML, it is still necessary to use WKWebView (including JS/table rendering).

Technical Principles and Development Story: “The Journey of Building an HTML Parser”

Contributions and Issues Welcome

If you have any questions or suggestions, feel free to contact me.


This article was first published on Medium.

Automatically converted and synchronized using ZMediumToMarkdown and Medium-to-jekyll-starter.

This post is licensed under CC BY 4.0 by the author.