On-Device ML on iOS: Core ML — AI Mobile Coders

What is Core ML?

Core ML is Apple’s framework for running machine-learning models on-device across iPhone, iPad and Mac. It’s tightly integrated with Apple hardware and schedules work across the CPU, GPU and Neural Engine (a dedicated machine-learning accelerator built into Apple silicon) for fast, energy-efficient inference.

The .mlmodel format

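Core ML models ship as .mlmodel files (or .mlpackage for the newer model-package format), which Xcode compiles to .mlmodelc when it builds your app. You can get one from Apple’s model gallery, convert a model with coremltools, or train a simple one with Create ML (Apple’s no-code training app). When you add a model to an Xcode project, Xcode auto-generates a Swift class for it, so you can call the model with typed inputs and outputs.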
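To make the .mlmodel to .mlmodelc step concrete, here is a minimal sketch of loading the compiled model from the app bundle without the generated class. The function name loadBundledModel and the ModelLoadingError type are hypothetical names introduced for this example; it assumes the model was added to the app target so that Xcode compiled it into the bundle.

```swift
import CoreML
import Foundation

// Hypothetical error type for this sketch.
enum ModelLoadingError: Error {
    case missingResource(String)
}

/// Loads a compiled Core ML model (.mlmodelc) from a bundle.
/// Assumes Xcode compiled `<name>.mlmodel` into the bundle at build time.
func loadBundledModel(named name: String, in bundle: Bundle = .main) throws -> MLModel {
    guard let url = bundle.url(forResource: name, withExtension: "mlmodelc") else {
        throw ModelLoadingError.missingResource(name)
    }
    return try MLModel(contentsOf: url, configuration: MLModelConfiguration())
}
```

In most apps the generated class is the more convenient route; loading by URL is mainly useful when the model name is only known at runtime.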

Running a model

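import CoreML

// Xcode generates the MyImageClassifier class from MyImageClassifier.mlmodel
let model = try MyImageClassifier(configuration: MLModelConfiguration())

// pixelBuffer is a CVPixelBuffer already in the size and format the model expects
let input = MyImageClassifierInput(image: pixelBuffer)
let output = try model.prediction(input: input)

// Output property names come from the model's declared outputs
print(output.classLabel)                 // e.g. "cat"
print(output.classLabelProbs)            // e.g. ["cat": 0.92, ...]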
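The probability dictionary is unordered, so showing the top few guesses takes a small sort. The sketch below uses a hypothetical helper, topPredictions, and assumes the model exposes its probabilities as [String: Double], as in the snippet above; the sample dictionary is made up for illustration.

```swift
/// Returns the `k` most probable labels, highest probability first.
func topPredictions(_ probs: [String: Double], k: Int) -> [(label: String, probability: Double)] {
    probs
        .sorted { $0.value > $1.value }          // descending by probability
        .prefix(max(0, k))                       // guard against negative k
        .map { (label: $0.key, probability: $0.value) }
}

// Illustrative values only; in an app, pass output.classLabelProbs instead.
let sampleProbs = ["cat": 0.92, "dog": 0.05, "fox": 0.03]
for prediction in topPredictions(sampleProbs, k: 2) {
    print(prediction.label, prediction.probability)
}
// cat 0.92
// dog 0.05
```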

Vision framework for images

For image tasks, pair Core ML with Apple’s Vision framework. Vision handles resizing, orientation and running the model with little code — ideal for classification, detection and face/landmark tasks.

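import Vision

// Wrap the Core ML model for Vision; the generated class exposes it as .model
let visionModel = try VNCoreMLModel(for: model.model)

let request = VNCoreMLRequest(model: visionModel) { request, _ in
    if let top = (request.results as? [VNClassificationObservation])?.first {
        print(top.identifier, top.confidence)
    }
}
let handler = VNImageRequestHandler(cgImage: cgImage)
try handler.perform([request])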
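The resizing and orientation handling can be made explicit. The following is a sketch, not the only way to do it: classifyImage is a hypothetical function name, and .centerCrop is just one reasonable choice of scaling option (Vision also offers .scaleFit and .scaleFill).

```swift
import CoreML
import ImageIO
import Vision

/// Classifies an image with Vision, stating how it should be oriented and scaled.
func classifyImage(_ cgImage: CGImage,
                   orientation: CGImagePropertyOrientation,
                   using mlModel: MLModel) throws {
    let visionModel = try VNCoreMLModel(for: mlModel)

    let request = VNCoreMLRequest(model: visionModel) { request, error in
        if let error = error {
            print("Vision request failed:", error)
            return
        }
        let observations = request.results as? [VNClassificationObservation] ?? []
        for observation in observations.prefix(3) {
            print(observation.identifier, observation.confidence)
        }
    }
    // How Vision fits the image to the model's input size.
    request.imageCropAndScaleOption = .centerCrop

    // Passing the orientation lets Vision rotate the pixels before inference.
    let handler = VNImageRequestHandler(cgImage: cgImage,
                                        orientation: orientation,
                                        options: [:])
    try handler.perform([request])
}
```

Photos from the camera are often stored rotated, so passing the real orientation rather than assuming .up tends to matter for accuracy.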

Create ML: train without code

Apple’s Create ML app lets you train image, text, sound and tabular models by dragging in example data — no ML coding. It exports a .mlmodel ready for your app. Great for custom, app-specific classifiers.

The Neural Engine advantage

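On Apple chips that include a Neural Engine, Core ML can run supported models on it, which is typically much faster and more power-efficient than running them on the CPU. You don’t write special code: by default Core ML decides how to split work between the CPU, GPU and Neural Engine, and the computeUnits property of MLModelConfiguration lets you restrict that choice when you need to. This is a large part of why on-device AI feels so smooth on iPhones.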
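Hardware selection is automatic by default, but it can be constrained through MLModelConfiguration. A minimal sketch, where makeConfiguration and its allowAccelerators parameter are hypothetical names for this example:

```swift
import CoreML

/// Builds a configuration that either lets Core ML use any compute unit
/// or restricts it to the CPU (useful for debugging or comparing results).
func makeConfiguration(allowAccelerators: Bool) -> MLModelConfiguration {
    let config = MLModelConfiguration()
    // .all lets Core ML choose among CPU, GPU and Neural Engine.
    config.computeUnits = allowAccelerators ? .all : .cpuOnly
    return config
}

// Usage with the generated class from earlier (assumed to exist in your project):
// let model = try MyImageClassifier(configuration: makeConfiguration(allowAccelerators: true))
```

Restricting to .cpuOnly is a handy way to check whether a numerical difference or a crash is tied to a particular compute unit.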

Keeping the UI responsive

Run predictions off the main thread and update the UI back on the main thread, exactly as with any heavy work in iOS.
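One way to do this is a small Grand Central Dispatch helper. This is a sketch under assumptions: predictInBackground is a hypothetical name, and the callbackQueue parameter exists only so the completion can be redirected in tests; in an app you would leave it at its default, the main queue.

```swift
import Foundation

/// Runs `work` on a background queue, then delivers the result on `callbackQueue`.
func predictInBackground<Output: Sendable>(
    callbackQueue: DispatchQueue = .main,
    _ work: @escaping @Sendable () throws -> Output,
    completion: @escaping @Sendable (Result<Output, Error>) -> Void
) {
    DispatchQueue.global(qos: .userInitiated).async {
        // Capture either the prediction or the thrown error.
        let result = Result(catching: work)
        callbackQueue.async {
            completion(result)
        }
    }
}

// Usage sketch (model, input and label are assumed to exist in your app):
// predictInBackground({ try model.prediction(input: input).classLabel }) { result in
//     if case .success(let text) = result { label.text = text }
// }
```

Swift concurrency (async/await with a Task) is an equally valid choice; the point is only that inference never runs on the main thread and UI updates always do.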

Common mistakes

Passing an image in the wrong format/size instead of using Vision to handle it.
Blocking the main thread with inference.
Shipping a large model without checking its impact on app size or optimising it during conversion.
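For the first mistake, it helps to know what the model actually expects before feeding it pixels by hand. The sketch below reads the image constraints from the model description; printImageInputs is a hypothetical helper name.

```swift
import CoreML

/// Prints the expected size and pixel format of every image input on a model.
func printImageInputs(of model: MLModel) {
    for (name, feature) in model.modelDescription.inputDescriptionsByName {
        // Non-image inputs have no image constraint; skip them.
        guard let constraint = feature.imageConstraint else { continue }
        print("\(name): \(constraint.pixelsWide)x\(constraint.pixelsHigh),",
              "pixel format code \(constraint.pixelFormatType)")
    }
}

// Usage with the generated class from earlier (assumed to exist in your project):
// printImageInputs(of: model.model)
```

If the numbers do not match your source images, that is the cue to let Vision do the scaling rather than resizing buffers manually.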

Summary: Core ML runs models on-device on Apple hardware, using the Neural Engine where it is available and the model supports it. Use the Vision framework for image tasks, Create ML to train custom models without code, and always run inference off the main thread.