ZMarkupParser HTML String to NSAttributedString Converter Tool
Convert HTML String to NSAttributedString with corresponding key style settings
ℹ️ℹ️ℹ️ The following content is translated by OpenAI.
Click here to view the original Chinese version. | 點此查看本文中文版
ZMarkupParser HTML String to NSAttributedString Converter Tool
Convert HTML String to NSAttributedString with corresponding key style settings.
ZhgChgLi / ZMarkupParser
Features
- Developed entirely in Swift, it uses Regex to parse HTML tags, followed by tokenization to analyze and correct tag validity (fixing unclosed tags and misaligned tags), and finally converts to an abstract syntax tree. The Visitor Pattern is then used to map HTML tags to abstract styles, resulting in the final NSAttributedString output, without relying on any parser libraries.
- Supports HTML rendering (to NSAttributedString), stripping HTML tags, and selector functionality.
- Automatically analyzes and corrects tag validity (fixing unclosed tags and misaligned tags):
<br>
→<br/>
<b>Bold<i>Bold+Italic</b>Italic</i>
→<b>Bold<i>Bold+Italic</i></b><i>Italic</i>
<Congratulation!>
→<Congratulation!>
(treated as a string) - Supports custom style specifications, e.g.,
<b></b>
→weight: .semibold & underline: 1
- Allows for custom HTML tag parsing, e.g., parsing
<zhgchgli></zhgchgli>
into desired styles. - Includes a structural design that facilitates HTML tag expansion. Currently, it supports basic styles, as well as rendering for
ul/ol/li
lists andhr
separators. Future expansions will allow for quick support of additional HTML tags. - Supports style parsing from
style
HTML attributes. HTML can specify text styles throughstyle
, and this library can also support styles specified instyle
, e.g.,<b style="font-size: 20px"></b>
→Bold + font size 20 px
. - Compatible with iOS/macOS.
- Supports HTML color names to UIColor/NSColor.
- Test coverage: 80%+
- Supports parsing of
<img>
images,<ul>
lists,<table>
tables, and more HTML tags. - Offers higher performance than
NSAttributedString.DocumentType.html
.
Performance Analysis
- Test environment: 2022/M2/24GB Memory/macOS 13.2/XCode 14.1
- X-axis: HTML character count
- Y-axis: Rendering time (seconds)
*Additionally, NSAttributedString.DocumentType.html
crashes (EXC_BAD_ACCESS) for strings longer than 54,600+ characters.
Try It Out
You can directly download the project, open ZMarkupParser.xcworkspace
, and select the ZMarkupParser-Demo
target to build and run for testing.
Installation
Supports SPM/Cocoapods. Please refer to the Readme.
Usage
Style Declaration
MarkupStyle/MarkupStyleColor/MarkupStyleParagraphStyle, encapsulating NSAttributedString.Key.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
var font: MarkupStyleFont
var paragraphStyle: MarkupStyleParagraphStyle
var foregroundColor: MarkupStyleColor? = nil
var backgroundColor: MarkupStyleColor? = nil
var ligature: NSNumber? = nil
var kern: NSNumber? = nil
var tracking: NSNumber? = nil
var strikethroughStyle: NSUnderlineStyle? = nil
var underlineStyle: NSUnderlineStyle? = nil
var strokeColor: MarkupStyleColor? = nil
var strokeWidth: NSNumber? = nil
var shadow: NSShadow? = nil
var textEffect: String? = nil
var attachment: NSTextAttachment? = nil
var link: URL? = nil
var baselineOffset: NSNumber? = nil
var underlineColor: MarkupStyleColor? = nil
var strikethroughColor: MarkupStyleColor? = nil
var obliqueness: NSNumber? = nil
var expansion: NSNumber? = nil
var writingDirection: NSNumber? = nil
var verticalGlyphForm: NSNumber? = nil
...
You can declare styles to apply to HTML tags as desired:
1
let myStyle = MarkupStyle(font: MarkupStyleFont(size: 13), backgroundColor: MarkupStyleColor(name: .aquamarine))
HTML Tags
Declare the HTML tags to render and their corresponding Markup Styles. The currently predefined HTML tag names are as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
A_HTMLTagName(), // <a></a>
B_HTMLTagName(), // <b></b>
BR_HTMLTagName(), // <br></br>
DIV_HTMLTagName(), // <div></div>
HR_HTMLTagName(), // <hr></hr>
I_HTMLTagName(), // <i></i>
LI_HTMLTagName(), // <li></li>
OL_HTMLTagName(), // <ol></ol>
P_HTMLTagName(), // <p></p>
SPAN_HTMLTagName(), // <span></span>
STRONG_HTMLTagName(), // <strong></strong>
U_HTMLTagName(), // <u></u>
UL_HTMLTagName(), // <ul></ul>
DEL_HTMLTagName(), // <del></del>
IMG_HTMLTagName(handler: ZNSTextAttachmentHandler), // <img> and image downloader
TR_HTMLTagName(), // <tr>
TD_HTMLTagName(), // <td>
TH_HTMLTagName(), // <th>
...and more
...
When parsing the <a>
tag, the specified MarkupStyle will be applied.
To extend HTML tag names:
1
let zhgchgli = ExtendTagName("zhgchgli")
HTML Style Attributes
As mentioned earlier, HTML supports specifying styles through style attributes. This is also abstracted to allow for specifying supported styles and extensions. The currently predefined HTML style attributes are as follows:
1
2
3
4
5
6
7
ColorHTMLTagStyleAttribute(), // color
BackgroundColorHTMLTagStyleAttribute(), // background-color
FontSizeHTMLTagStyleAttribute(), // font-size
FontWeightHTMLTagStyleAttribute(), // font-weight
LineHeightHTMLTagStyleAttribute(), // line-height
WordSpacingHTMLTagStyleAttribute(), // word-spacing
...
To extend style attributes:
1
2
3
4
5
6
7
8
9
ExtendHTMLTagStyleAttribute(styleName: "text-decoration", render: { value in
var newStyle = MarkupStyle()
if value == "underline" {
newStyle.underline = NSUnderlineStyle.single
} else {
// ...
}
return newStyle
})
Usage
1
2
3
import ZMarkupParser
let parser = ZHTMLParserBuilder.initWithDefault().set(rootStyle: MarkupStyle(font: MarkupStyleFont(size: 13)).build()
initWithDefault
automatically includes predefined HTML tag names, default corresponding MarkupStyles, and predefined style attributes.
set(rootStyle:)
allows you to specify the default style for the entire string, or you can choose not to specify one.
Customization
1
2
let parser = ZHTMLParserBuilder.initWithDefault().add(ExtendTagName("zhgchgli"), withCustomStyle: MarkupStyle(backgroundColor: MarkupStyleColor(name: .aquamarine))).build() // will use the markup style you specify to render the extended HTML tag <zhgchgli></zhgchgli>
let parser = ZHTMLParserBuilder.initWithDefault().add(B_HTMLTagName(), withCustomStyle: MarkupStyle(font: MarkupStyleFont(size: 18, weight: .style(.semibold)))).build() // will use the markup style you specify to render <b></b> instead of the default bold markup style
HTML Rendering
1
2
3
4
5
6
let attributedString = parser.render(htmlString) // NSAttributedString
// work with UITextView
textView.setHtmlString(htmlString)
// work with UILabel
label.setHtmlString(htmlString)
HTML Stripping
1
parser.stripper(htmlString)
Selector HTML String
1
2
3
4
5
6
7
let selector = parser.selector(htmlString) // HTMLSelector e.g. input: <a><b>Test</b>Link</a>
selector.first("a")?.first("b").attributedString // will return Test
selector.filter("a").attributedString // will return Test Link
// render from selector result
let selector = parser.selector(htmlString) // HTMLSelector e.g. input: <a><b>Test</b>Link</a>
parser.render(selector.first("a")?.first("b"))
Async
If you need to render long strings, you can use the async method to prevent UI blocking.
1
2
3
parser.render(String) { _ in }...
parser.stripper(String) { _ in }...
parser.selector(String) { _ in }...
Know-how
- The hyperlink style in UITextView is determined by linkTextAttributes, which may lead to situations where NSAttributedString.key is set but does not appear to take effect.
- UILabel does not support specifying URL styles, which may also lead to situations where NSAttributedString.key is set but does not appear to take effect.
- For rendering complex HTML, it is still necessary to use WKWebView (including JS/table rendering).
Technical Principles and Development Story: “The Journey of Building an HTML Parser”
Contributions and Issues Welcome
If you have any questions or suggestions, feel free to contact me.
This article was first published on Medium ➡️ Click Here
Automatically converted and synchronized using ZMediumToMarkdown and Medium-to-jekyll-starter.