Vision Face Detection | Swift App Auto-Crop Profile Pictures with Precision

Discover how Swift developers automate face detection and cropping in profile picture uploads using Vision. Solve manual editing pain points and enhance user experience with accurate, real-time facial recognition and cropping techniques.

This post was translated with AI assistance — let me know if anything sounds off!


Vision Exploration — Automatic Face Detection and Cropping for App Avatar Upload (Swift)

Vision Practical Applications

[2024/08/13 Update]

Without further ado, here is the finished product:

Before Optimization vs After Optimization — [結婚吧APP](https://itunes.apple.com/tw/app/%E7%B5%90%E5%A9%9A%E5%90%A7-%E4%B8%8D%E6%89%BE%E6%9C%80%E8%B2%B4-%E5%8F%AA%E6%89%BE%E6%9C%80%E5%B0%8D/id1356057329?ls=1&mt=8){:target="_blank"}

Apple recently released iOS 12, which brought an update to the CoreML machine learning framework. I found it quite interesting and started thinking about where it could be integrated into our current products.

CoreML Preview Article Now Published: Automatically Predict Article Categories Using Machine Learning, Including Training the Model Yourself

CoreML provides interfaces for training text and image machine learning models and integrating them into apps. My original idea was to use CoreML for face detection, to solve the problem of heads or faces being cut off by the app’s image cropping feature. As the left image above shows, when a face sits near the edge, scaling and cropping can easily cut off part of it.

After some online research, I realized I was behind the times: this capability had already shipped in iOS 11 as the “Vision” framework, which supports text detection, face detection, image matching, QR code detection, object tracking, and more.

Here we use its face detection feature; the optimized result is shown in the right image above: the app locates the face and crops the image centered on it.

Let’s Get Started:

First, let’s create a function that can mark face locations to get a basic understanding of how to use Vision.

Demo APP

As the finished demo above shows, it can mark the positions of the faces in a photo.

P.S. Only the “face” area gets marked; the bounding box doesn’t cover the whole head or the hair 😅

The code is divided into two main parts. The first part deals with the blank space that appears when the original image is scaled to fit an ImageView. Simply put, we want the ImageView’s size to match the image’s size exactly; assigning the image directly causes the alignment problems shown below.

You might think of simply switching contentMode to fill, fit, or redraw, but those either distort the image or cut it off.

```swift
let ratio = UIScreen.main.bounds.size.width
// My UIImageView is pinned at 0 to the left and right edges, with a 1:1 aspect ratio

let sourceImage = UIImage(named: "Demo2")?.kf.resize(to: CGSize(width: ratio, height: CGFloat.leastNonzeroMagnitude), for: .aspectFill)
// Use Kingfisher's resize: width fixed to the screen width, height left flexible

imageView.contentMode = .redraw
// contentMode set to .redraw so the image fills the view

imageView.image = sourceImage
// Assign the image

imageViewConstraints.constant = (ratio - (sourceImage?.size.height ?? 0))
imageView.layoutIfNeeded()
imageView.sizeToFit()
// Adjust the imageView's constraints; see the full example at the end of the article
```

The above is the preprocessing done on the image.

For the cropping itself we rely on Kingfisher, but it could be swapped for another library or a custom method, for example the CoreGraphics sketch below.
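This is only a minimal sketch of such a custom method; the helper name `cropped(_:to:)` is hypothetical, and it assumes `rect` is expressed in the image’s pixel coordinate space:

```swift
import UIKit

/// A minimal CoreGraphics-based crop (a possible stand-in for Kingfisher's crop).
/// Assumes `rect` is expressed in the image's pixel coordinate space.
func cropped(_ image: UIImage, to rect: CGRect) -> UIImage? {
    guard let cgImage = image.cgImage?.cropping(to: rect) else { return nil }
    return UIImage(cgImage: cgImage, scale: image.scale, orientation: image.imageOrientation)
}
```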

Part Two: Straight to the Code

```swift
if #available(iOS 11.0, *) {
    // Vision is supported only on iOS 11 and later
    let completionHandle: VNRequestCompletionHandler = { request, error in
        if let faceObservations = request.results as? [VNFaceObservation] {
            // Faces detected
            
            DispatchQueue.main.async {
                // Switch back to the main thread before touching UIView
                let size = self.imageView.frame.size
                
                faceObservations.forEach { faceObservation in
                    // Coordinate-system conversion:
                    // scale the normalized rect up to the view's size, then flip the y-axis
                    let translate = CGAffineTransform.identity.scaledBy(x: size.width, y: size.height)
                    let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -size.height)
                    let transRect = faceObservation.boundingBox.applying(translate).applying(transform)
                    
                    let markerView = UIView(frame: transRect)
                    markerView.backgroundColor = UIColor(red: 0, green: 1, blue: 0, alpha: 0.3)
                    self.imageView.addSubview(markerView)
                }
            }
        } else {
            print("No faces detected")
        }
    }
    
    // Detection request
    let baseRequest = VNDetectFaceRectanglesRequest(completionHandler: completionHandle)
    let faceHandle = VNImageRequestHandler(ciImage: ciImage, options: [:])
    DispatchQueue.global().async {
        // Detection takes time, so run it on a background thread to avoid blocking the UI
        do {
            try faceHandle.perform([baseRequest])
        } catch {
            print("Throws: \(error)")
        }
    }
} else {
    print("Not supported")
}
```

The main thing to watch is the coordinate-system conversion: Vision returns each result as a normalized rectangle in the image’s coordinate space, with the origin at the bottom-left, so we have to convert it into the UIKit coordinates of the enclosing ImageView before we can use it.
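For reuse, the same conversion can be wrapped in a small helper. This is just a sketch equivalent to the two affine transforms above; the function name `uiKitRect(for:in:)` is my own, not part of Vision:

```swift
import UIKit
import Vision

/// Converts a Vision boundingBox (normalized, origin at the bottom-left)
/// into a UIKit rect (points, origin at the top-left) inside `containerSize`.
func uiKitRect(for boundingBox: CGRect, in containerSize: CGSize) -> CGRect {
    // Scale the normalized rect up to the container's size...
    let scaled = boundingBox.applying(CGAffineTransform(scaleX: containerSize.width, y: containerSize.height))
    // ...then flip the y-axis so the origin moves to the top-left.
    return CGRect(x: scaled.origin.x,
                  y: containerSize.height - scaled.origin.y - scaled.height,
                  width: scaled.width,
                  height: scaled.height)
}
```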

Next, let’s move on to today’s main task — cropping the profile picture according to the face position.

```swift
let ratio = UIScreen.main.bounds.size.width
// My UIImageView is pinned at 0 to the left and right edges, with a 1:1 aspect ratio; see the full example at the end

let sourceImage = UIImage(named: "Demo")

imageView.contentMode = .scaleAspectFill
// Use scaleAspectFill so the image fills the view

imageView.image = sourceImage
// Assign the original image directly; we'll replace it after processing

if let image = sourceImage, #available(iOS 11.0, *), let ciImage = CIImage(image: image) {
    let completionHandle: VNRequestCompletionHandler = { request, error in
        if request.results?.count == 1, let faceObservation = request.results?.first as? VNFaceObservation {
            // Exactly one face detected
            let size = CGSize(width: ratio, height: ratio)
            
            let translate = CGAffineTransform.identity.scaledBy(x: size.width, y: size.height)
            let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -size.height)
            let finalRect = faceObservation.boundingBox.applying(translate).applying(transform)
            
            let center = CGPoint(x: finalRect.origin.x + finalRect.width / 2 - size.width / 2,
                                 y: finalRect.origin.y + finalRect.height / 2 - size.height / 2)
            // Offset of the face's center from the canvas center, used as the crop anchor
            
            let newImage = image.kf.resize(to: size, for: .aspectFill).kf.crop(to: size, anchorOn: center)
            // Crop the image around that anchor point
            
            DispatchQueue.main.async {
                // Update the UIImageView on the main thread
                self.imageView.image = newImage
            }
        } else {
            print("Multiple faces or no face detected")
        }
    }
    let baseRequest = VNDetectFaceRectanglesRequest(completionHandler: completionHandle)
    let faceHandle = VNImageRequestHandler(ciImage: ciImage, options: [:])
    DispatchQueue.global().async {
        do {
            try faceHandle.perform([baseRequest])
        } catch {
            print("Throws: \(error)")
        }
    }
} else {
    print("Not supported")
}
```

The principle is the same as marking face positions. The difference is that the profile picture has a fixed size (e.g., 300×300), so we can skip the first part, where the image had to be fitted to the ImageView.

Another difference is that we need to calculate the center point of the face area and use it as the anchor for cropping the image; a quick worked example follows.
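As a sanity check with hypothetical numbers: on a 300×300 canvas, if the converted face rect were (x: 180, y: 40, width: 100, height: 100), the anchor works out as follows:

```swift
// Hypothetical values, for illustration only
let canvas = CGSize(width: 300, height: 300)
let faceRect = CGRect(x: 180, y: 40, width: 100, height: 100)

// Offset of the face's center from the canvas's center (the crop anchor):
let anchor = CGPoint(x: faceRect.midX - canvas.width / 2,   // 230 - 150 =  80
                     y: faceRect.midY - canvas.height / 2)  //  90 - 150 = -60
// The crop window shifts 80pt right and 60pt up, keeping the face centered.
```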

The red dot marks the center point of the face area

Final Result:

The moment just before the snap is the original image position

Complete APP Example:

The code has been uploaded to GitHub: Click here

If you have any questions or feedback, feel free to contact me.




This post is licensed under CC BY 4.0 by the author.