Exploring Vision — Automatic Face Detection and Cropping for App Profile Pictures (Swift)
Practical Applications of Vision
[2024/08/13 Update]
- Please refer to the new article and API: “iOS Vision framework x WWDC 24 Discover Swift enhancements in the Vision framework Session”
Without further ado, here’s a before-and-after image:
Before vs After Optimization — Marriage App
Recently, with the release of iOS 12, I noticed the newly available CoreML machine learning framework. I found it quite interesting and started to think about how I could apply it to our current products.
An introductory CoreML article has since been published: “Using Machine Learning to Automatically Predict Article Categories, Including Self-Training the Model”
CoreML provides interfaces for training and integrating machine learning models for text and images into apps. My initial idea was to use CoreML for face detection to address the issue of profile pictures being cropped awkwardly in the app, as shown in the left image above. When a face appears near the edges, it can easily get cut off due to scaling and cropping.
After some research, I realized that I was a bit behind the curve; this functionality was already introduced in iOS 11 with the “Vision” framework, which supports text detection, face detection, image matching, QR code detection, object tracking, and more.
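For orientation, each of these capabilities maps to a request class in the Vision API. A quick, non-exhaustive sketch of the iOS 11 request types (names as in Apple’s documentation):

```swift
import Vision

// A few of the request types Vision offers since iOS 11 (non-exhaustive):
let textRequest = VNDetectTextRectanglesRequest()       // text detection
let faceRequest = VNDetectFaceRectanglesRequest()       // face detection (used in this article)
let landmarksRequest = VNDetectFaceLandmarksRequest()   // facial landmarks (eyes, nose, mouth)
let barcodeRequest = VNDetectBarcodesRequest()          // barcode / QR code detection
```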
The face detection feature is what we are using here, and after optimization, it looks like the right image above; it detects faces and crops the image accordingly.
Let’s Get Started:
First, we’ll implement a feature to mark the position of detected faces and get acquainted with how to use Vision.
Demo App
The finished result is shown above: it marks the positions of the faces in the photo.
P.S. It can only mark “faces”; it doesn’t include the entire head or hair 😅
This code is divided into two main parts. The first part addresses the white space left over when scaling the original image to fit into an ImageView. In simple terms, we want the size of the Image to match the size of the ImageView; if we assign the image directly, it can end up misaligned, as shown below:
You might think about changing the ContentMode to fill, fit, or redraw, but that would distort the image or cut it off.
```swift
let ratio = UIScreen.main.bounds.size.width
// This is because I set the left and right alignment of the UIImageView to 0, with an aspect ratio of 1:1.
let sourceImage = UIImage(named: "Demo2")?.kf.resize(to: CGSize(width: ratio, height: CGFloat.leastNonzeroMagnitude), for: .aspectFill)
// Using Kingfisher's image resizing feature, with width as the base and height flexible.
imageView.contentMode = .redraw
// Set contentMode to redraw to fill the space.
imageView.image = sourceImage
// Assign the image.
imageViewConstraints.constant = (ratio - (sourceImage?.size.height ?? 0))
imageView.layoutIfNeeded()
imageView.sizeToFit()
// This part modifies the imageView's constraints; for details, see the complete example at the end.
```
This is how we handle the image.
Resizing and cropping here are handled with Kingfisher, but you can swap in another library or your own implementation.
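If you would rather not pull in Kingfisher just for this, a minimal crop helper can be written with Core Graphics alone. This is only a sketch of the idea, not what the sample project uses:

```swift
import UIKit

extension UIImage {
    /// Crops the image to `rect`, given in this image's point coordinates.
    /// A minimal Core Graphics alternative to Kingfisher's crop helper.
    func cropped(to rect: CGRect) -> UIImage? {
        // CGImage works in pixels, so convert the point-based rect using the image's scale.
        let pixelRect = CGRect(x: rect.origin.x * scale,
                               y: rect.origin.y * scale,
                               width: rect.size.width * scale,
                               height: rect.size.height * scale)
        guard let cropped = cgImage?.cropping(to: pixelRect) else { return nil }
        return UIImage(cgImage: cropped, scale: scale, orientation: imageOrientation)
    }
}
```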
The second part focuses on the main code for face detection:
```swift
if #available(iOS 11.0, *), let image = sourceImage, let ciImage = CIImage(image: image) {
    // Supported only in iOS 11 and later; create a CIImage from the source image for Vision to process
    let completionHandle: VNRequestCompletionHandler = { request, error in
        if let faceObservations = request.results as? [VNFaceObservation] {
            // Detected faces
            DispatchQueue.main.async {
                // Update UI on the main thread
                let size = self.imageView.frame.size
                faceObservations.forEach({ (faceObservation) in
                    // Coordinate system transformation
                    let translate = CGAffineTransform.identity.scaledBy(x: size.width, y: size.height)
                    let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -size.height)
                    let transRect = faceObservation.boundingBox.applying(translate).applying(transform)
                    let markerView = UIView(frame: transRect)
                    markerView.backgroundColor = UIColor(red: 0/255, green: 255/255, blue: 0/255, alpha: 0.3)
                    self.imageView.addSubview(markerView)
                })
            }
        } else {
            print("No faces detected")
        }
    }
    // Face detection request
    let baseRequest = VNDetectFaceRectanglesRequest(completionHandler: completionHandle)
    let faceHandle = VNImageRequestHandler(ciImage: ciImage, options: [:])
    DispatchQueue.global().async {
        // Face detection takes time, so we run it in a background thread to avoid freezing the UI
        do {
            try faceHandle.perform([baseRequest])
        } catch {
            print("Error: \(error)")
        }
    }
} else {
    print("Not supported")
}
```
The key point to note is the coordinate system conversion: the detection results are normalized coordinates (0 to 1) with the origin at the bottom-left, so we need to convert them into the actual coordinates of the enclosing ImageView before we can use them.
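To make that conversion explicit, here is a small helper distilled from the transforms above (a sketch; the function name is my own):

```swift
import UIKit

/// Converts a Vision boundingBox (normalized 0...1, origin at the bottom-left)
/// into a UIKit rect (points, origin at the top-left) for a view of the given size.
func uiKitRect(for boundingBox: CGRect, in size: CGSize) -> CGRect {
    // Scale the normalized rect up to the view's size...
    let scaled = boundingBox.applying(CGAffineTransform(scaleX: size.width, y: size.height))
    // ...then flip the y-axis so the origin moves to the top-left corner.
    return CGRect(x: scaled.origin.x,
                  y: size.height - scaled.origin.y - scaled.height,
                  width: scaled.width,
                  height: scaled.height)
}
```

Vision also ships a VNImageRectForNormalizedRect() helper for the scaling step, but the y-axis flip is still up to you.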
Now, let’s move on to the main event of the day — cropping the profile picture based on the detected face position.
```swift
let ratio = UIScreen.main.bounds.size.width
// This is because I set the left and right alignment of the UIImageView to 0, with an aspect ratio of 1:1; for details, see the complete example at the end.
let sourceImage = UIImage(named: "Demo")
imageView.contentMode = .scaleAspectFill
// Use scaleAspectFill mode to fill the space.
imageView.image = sourceImage
// Directly assign the original image; we will manipulate it later.
if let image = sourceImage, #available(iOS 11.0, *), let ciImage = CIImage(image: image) {
    let completionHandle: VNRequestCompletionHandler = { request, error in
        if request.results?.count == 1, let faceObservation = request.results?.first as? VNFaceObservation {
            // One face detected
            let size = CGSize(width: ratio, height: ratio)
            let translate = CGAffineTransform.identity.scaledBy(x: size.width, y: size.height)
            let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -size.height)
            let finalRect = faceObservation.boundingBox.applying(translate).applying(transform)
            let center = CGPoint(x: (finalRect.origin.x + finalRect.width / 2 - size.width / 2),
                                 y: (finalRect.origin.y + finalRect.height / 2 - size.height / 2))
            // Calculate the center point of the face's bounding box
            let newImage = image.kf.resize(to: size, for: .aspectFill).kf.crop(to: size, anchorOn: center)
            // Crop the image based on the center point
            DispatchQueue.main.async {
                // Update UI on the main thread
                self.imageView.image = newImage
            }
        } else {
            print("Detected multiple faces or no faces")
        }
    }
    let baseRequest = VNDetectFaceRectanglesRequest(completionHandler: completionHandle)
    let faceHandle = VNImageRequestHandler(ciImage: ciImage, options: [:])
    DispatchQueue.global().async {
        // Face detection takes time, so we run it in a background thread to avoid freezing the UI
        do {
            try faceHandle.perform([baseRequest])
        } catch {
            print("Error: \(error)")
        }
    }
} else {
    print("Not supported")
}
```
The logic is similar to marking the face positions; the difference is that a profile picture has a fixed size (e.g., 300x300), so we can skip the earlier step of fitting the Image to the ImageView.
Another difference is that we need to calculate the center point of the face’s bounding box and use this point for cropping the image.
Red dot indicates the center point of the face’s bounding box.
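As a rough illustration of the same idea without Kingfisher, the sketch below (the function name and clamping behavior are my own assumptions, not the sample project's code) computes a square crop rect centered on the face in the image's own coordinates and keeps it inside the bounds:

```swift
import UIKit
import Vision

/// Returns a square crop rect (in the image's point coordinates) centered on the face,
/// clamped so it never falls outside the image. A sketch, not the sample project's code.
func faceCenteredCropRect(for face: VNFaceObservation, imageSize: CGSize, side: CGFloat) -> CGRect {
    // boundingBox is normalized with a bottom-left origin, so flip y when computing the center.
    let box = face.boundingBox
    let faceCenter = CGPoint(x: box.midX * imageSize.width,
                             y: (1 - box.midY) * imageSize.height)
    // Center the square on the face, then clamp it so it stays inside the image.
    var origin = CGPoint(x: faceCenter.x - side / 2, y: faceCenter.y - side / 2)
    origin.x = min(max(origin.x, 0), max(imageSize.width - side, 0))
    origin.y = min(max(origin.y, 0), max(imageSize.height - side, 0))
    return CGRect(origin: origin, size: CGSize(width: side, height: side))
}
```

The resulting rect can then be handed to a crop routine such as the Core Graphics helper sketched earlier, instead of Kingfisher's crop(to:anchorOn:).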
Final Result:
The moment before cropping shows the original image position.
Complete App Example:
The code has been uploaded to GitHub: Click here.
If you have any questions or feedback, feel free to contact me.
This article was first published on Medium ➡️ Click Here