Exploring iOS 12 CoreML — Using Machine Learning to Automatically Predict Article Categories, with Self-Training Models!
Discover how to convert or train models with CoreML 2.0 and apply them in real products.
ℹ️ℹ️ℹ️ The following content is translated by OpenAI.
Following up on the previous article about using machine learning on iOS, this piece dives into using CoreML.
First, a brief history: Apple released CoreML in 2017, alongside the Vision framework mentioned in the last article. In 2018, it quickly followed up with CoreML 2.0, which not only improved performance but also added support for training your own custom CoreML models.
Introduction
If you’ve only heard the term “machine learning” but aren’t quite sure what it means, here’s a simple explanation:
“Predicting the outcome of future events based on past experiences.”
For example: If I always add ketchup to my egg pancake, after a few visits, the breakfast shop owner will remember, “Hey, handsome, ketchup?” If I respond, “Yes,” she predicts correctly; if I say, “No, because it’s radish cake + egg pancake,” she remembers and adjusts her prediction next time.
Input data: egg pancake, cheese egg pancake, egg pancake + radish cake, radish cake, egg
Output data: add ketchup / do not add ketchup
Model: the owner’s memory and judgment
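To make the analogy concrete, here is a minimal sketch of the owner's "memory and judgment" as code. It is only a counting toy, not how CoreML works internally, and the type and method names (OwnerMemory, remember, predict) are made up for this illustration.

```swift
// A toy "breakfast shop owner": remembers past orders and predicts the
// answer seen most often for each order. All names are hypothetical.
struct OwnerMemory {
    // order -> (answer -> number of times that answer was given)
    private var counts: [String: [String: Int]] = [:]

    // Learn from one past experience.
    mutating func remember(order: String, answer: String) {
        counts[order, default: [:]][answer, default: 0] += 1
    }

    // Predict the most frequent answer, or nil for an order never seen before.
    func predict(order: String) -> String? {
        guard let answers = counts[order] else { return nil }
        return answers.max { $0.value < $1.value }?.key
    }
}

var owner = OwnerMemory()
owner.remember(order: "egg pancake", answer: "add ketchup")
owner.remember(order: "egg pancake", answer: "add ketchup")
owner.remember(order: "egg pancake + radish cake", answer: "do not add ketchup")

let guess = owner.predict(order: "egg pancake")
let unknown = owner.predict(order: "radish cake")
print(guess ?? "no idea yet")
```

The input data are the orders, the output data are the ketchup answers, and the dictionary of counts plays the role of the model.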
Honestly, my understanding of machine learning has been purely theoretical until now, so if there are any mistakes, please feel free to correct me.
Speaking of which, I must give a shout-out to Apple for making machine learning accessible. With just a basic understanding, anyone can operate it without needing extensive knowledge, lowering the entry barrier. After implementing this example, I finally felt a tangible connection to machine learning, sparking my interest in the field.
Getting Started
The first step, of course, is the “model” mentioned earlier. Where do models come from?
There are three ways:
- Find pre-trained models online and convert them to CoreML format.
The Awesome-CoreML-Models GitHub project collects many pre-trained models.
For model conversion, refer to the official website or other online resources.
- Download pre-trained models from Apple’s Machine Learning website (mainly for learning or testing purposes).
- Use tools to train your own model🏆.
So, what can you do?
- Image recognition 🏆
- Text content classification🏆
- Word segmentation
- Language detection
- Named entity recognition
For word segmentation, refer to Natural Language Processing in iOS Apps: An Introduction to NSLinguisticTagger.
Today’s Main Focus — Text Content Classification + Self-Training Models
In simple terms, we provide the machine with “text content” and “categories” to train it to classify future data. For example: “Click to see the latest offers!” or “Claim your $1000 shopping voucher now” => “Advertisement”; “Alan sent you a message” or “Your account is about to expire” => “Important Notice.”
Practical applications include spam detection, label generation, and classification prediction.
p.s. I haven’t figured out what to train for image recognition yet, so I didn’t explore that; interested readers can check out this article where the official GUI training tool for images is provided — it’s very convenient!
Required Tools: macOS Mojave or later + Xcode 10
Training Tool: BlankSpace007/TextClassiferPlayground (the official tool only provides a GUI training tool for images, while text requires custom coding; this is a third-party tool provided by a community expert).
Preparing Training Data:
Each training record consists of a “text” field (the content) and a “label” field (its category); both .json and .csv files are supported.
Here, I export the training data from MySQL using phpMyAdmin:
SELECT `title` AS `text`, `type` AS `label` FROM `posts` WHERE `status` = '1'
Change the export format to JSON:
[
  {"type":"header","version":"4.7.5","comment":"Export to JSON plugin for PHPMyAdmin"},
  {"type":"database","name":"db"},
  {"type":"table","name":"posts","database":"db","data":
// Remove the above
    [
      {
        "label":"",
        "text":""
      }
    ]
// Remove the below
  }
]
Open the downloaded JSON file and keep only the array inside the “data” field of the table entry; delete everything before and after it.
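If you would rather not trim the file by hand, the same cleanup can be scripted. The following is a rough sketch that assumes the export has exactly the structure shown above (a top-level array containing one entry with "type":"table" and a "data" array); extractTrainingRows and ExportError are hypothetical names, and the sample row reuses the advertisement example from this article.

```swift
import Foundation

enum ExportError: Error {
    case unexpectedStructure
}

// Pulls the rows out of the first "table" entry of a phpMyAdmin JSON export
// and re-encodes them as a plain JSON array of {"label", "text"} objects.
func extractTrainingRows(fromExport export: Data) throws -> Data {
    let root = try JSONSerialization.jsonObject(with: export)
    guard let entries = root as? [[String: Any]],
          let table = entries.first(where: { ($0["type"] as? String) == "table" }),
          let rows = table["data"] as? [[String: Any]] else {
        throw ExportError.unexpectedStructure
    }
    return try JSONSerialization.data(withJSONObject: rows, options: [.prettyPrinted])
}

// Sample input shaped like the export above.
let sample = """
[
  {"type":"header","version":"4.7.5","comment":"Export to JSON plugin for PHPMyAdmin"},
  {"type":"database","name":"db"},
  {"type":"table","name":"posts","database":"db","data":
    [{"label":"Advertisement","text":"Click to see the latest offers!"}]
  }
]
"""

let cleaned = try extractTrainingRows(fromExport: Data(sample.utf8))
print(String(decoding: cleaned, as: UTF8.self))
```

In practice you would read the exported file into Data and write the result back out as the file you drag into the training tool.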
Using the Training Tool:
After downloading the training tool, click on TextClassifer.playground to open the Playground.
Click the button marked with the red box to run the Playground, then click the one marked with the green box to switch the view display.
Drag the JSON file into the GUI tool.
Open the console below to check training progress; seeing “Test Accuracy” means model training is complete.
With a large dataset, training can take a while; how long depends on your computer’s processing power.
Fill in basic information and click “Save.”
Save the trained model file.
CoreML model file.
At this point, your model is trained! Isn’t that easy?
Specific Training Method:
- First, segment the input sentences (e.g., “I want to know what to prepare for a wedding” becomes “I want, to know, wedding, need, prepare, what”), and then perform a series of machine learning calculations based on their categories.
- Split the training data into groups, for example: 80% for training and 20% for testing and validation.
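The training tool takes care of the split for you, but the idea behind the second point is easy to sketch. This is only an illustration of an 80/20 split on an in-memory array; splitSamples is a hypothetical helper, not part of the tool or of any Apple framework.

```swift
// Shuffles the samples and splits them into a training part and a testing part.
// A trainingFraction of 0.8 mirrors the 80% / 20% example.
func splitSamples<T>(_ samples: [T], trainingFraction: Double) -> (training: [T], testing: [T]) {
    let shuffled = samples.shuffled()
    let rawCut = Int((Double(shuffled.count) * trainingFraction).rounded())
    // Clamp so that an out-of-range fraction cannot produce an invalid index.
    let cut = min(max(rawCut, 0), shuffled.count)
    return (Array(shuffled[..<cut]), Array(shuffled[cut...]))
}

let samples = (1...10).map { "sample \($0)" }
let parts = splitSamples(samples, trainingFraction: 0.8)
print(parts.training.count, parts.testing.count)
```

The model only learns from the training part; the held-back testing part is what a figure such as “Test Accuracy” is measured on.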
At this point, most of the work is done; next, just add the model file to your iOS project and write a few lines of code.

Drag the model file (*.mlmodel) into the project.
Code Section:
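import CoreML

// `textClassifier` is the class Xcode generates from the model file added to the project.
if #available(iOS 12.0, *),
   let prediction = try? textClassifier().prediction(text: "Text content to predict") {
    // `label` holds the predicted category.
    let type = prediction.label
    print("I think it's...\(type)")
}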
Done!
Questions to Explore:
- Can it support re-learning?
- Can the mlmodel file be converted to other platforms?
- Can models be trained on iOS?
As for these three points, the information I’ve found so far suggests that none of them are possible.
Conclusion:
I am currently applying this in a practical app to predict article categories when posting.
I trained the model with only about 100 samples, and the current prediction accuracy is around 35%, mainly for experimental purposes.
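For reference, an accuracy figure like this is simply the share of predictions that match the real category. A minimal sketch, with a hypothetical accuracy helper and made-up labels that are not taken from my actual data:

```swift
// Accuracy = number of correct predictions / total number of predictions.
func accuracy(predicted: [String], actual: [String]) -> Double {
    precondition(predicted.count == actual.count, "Both arrays must have the same length")
    guard !actual.isEmpty else { return 0 }
    let correct = zip(predicted, actual).filter { $0.0 == $0.1 }.count
    return Double(correct) / Double(actual.count)
}

// Illustrative labels only.
let predictedLabels = ["Advertisement", "Important Notice", "Advertisement", "Advertisement"]
let actualLabels = ["Advertisement", "Important Notice", "Important Notice", "Advertisement"]
let score = accuracy(predicted: predictedLabels, actual: actualLabels)
print("Accuracy: \(score * 100)%")
```

Here three of the four predictions match, so the result is 75%.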
— — — — —
It’s that simple to complete my first machine learning project! There’s still a long way to go in understanding how the background works, but I hope this project inspires everyone!
References: WWDC2018 Create ML (Part 2).
If you have any questions or feedback, feel free to contact me.
This article was first published on Medium ➡️ Click Here
Automatically converted and synchronized using ZMediumToMarkdown and Medium-to-jekyll-starter.