How ML Models Work (for App Developers)

🗓 May 31, 2026 ⏱ 3 min read

You don’t need to be a data scientist

To use AI in apps, you don’t need to understand the deep maths. But a clear mental model of what a model is and how inference works will make you far more effective. Here’s the practical version.

A model is a trained function

Think of a model as a function that has “learned” from examples. You give it an input and it returns an output:

  • Image classifier: input = a photo → output = “cat: 0.92, dog: 0.05”.
  • Object detector: input = a photo → output = boxes + labels.
  • Text model: input = a sentence → output = sentiment, a translation, or generated text.

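The numbers (like 0.92) are confidence scores: the model’s own estimate of how likely each label is, not a guarantee that it is right. Your app chooses a threshold (e.g. only act if confidence > 0.7) and decides what to do when nothing clears it.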
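The thresholding idea can be sketched in a few lines of Kotlin. This is a minimal illustration, not any library's API: `pickLabel` and its `Map<String, Float>` input are hypothetical names chosen for the example, and the 0.7 default simply mirrors the figure above.

```kotlin
// Hypothetical helper: returns the best label only if its score clears the threshold.
// A null result means "not confident enough", so the app can fall back to a default.
fun pickLabel(scores: Map<String, Float>, threshold: Float = 0.7f): String? {
    // maxByOrNull returns null for an empty map, so an empty result is handled too.
    val best = scores.maxByOrNull { it.value } ?: return null
    return if (best.value > threshold) best.key else null
}
```

Calling `pickLabel(mapOf("cat" to 0.92f, "dog" to 0.05f))` returns "cat", while a close call such as 0.55 vs 0.45 returns null. Returning null instead of a guess forces the calling code to handle the uncertain case explicitly.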

Training vs inference

  • Training: feeding a model many examples (often millions) so it learns patterns. Done ahead of time, off-device, by ML engineers, or already done for you in a pre-trained model. Heavy and slow.
  • Inference: using the trained model to get a result for new input. This is what your app does, and it must be fast.

As a mobile developer you almost always do inference only, using a model someone already trained.
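Because inference has to be fast, it helps to measure it early. The sketch below wraps any inference call in a timer; `Timed` and `timeInference` are hypothetical names, and the `infer` lambda stands in for whatever call your ML runtime actually exposes.

```kotlin
// Result of a timed call: the model output plus elapsed wall-clock time.
data class Timed<T>(val value: T, val millis: Double)

// Runs an inference call and reports how long it took.
// `infer` is a placeholder for the real model invocation.
fun <T> timeInference(infer: () -> T): Timed<T> {
    val start = System.nanoTime()
    val result = infer()
    val elapsedNanos = System.nanoTime() - start
    return Timed(result, elapsedNanos / 1_000_000.0)
}
```

A single measurement is noisy, and the first run is often slower than later ones, so in practice you would call this several times and look at the typical value rather than one sample.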

Inputs and outputs are tensors

Models speak in numbers. An image becomes a grid of pixel values; text becomes a sequence of token IDs. These multi-dimensional numeric arrays are called tensors. A big part of mobile ML code is pre-processing (turning your data into the tensor the model expects) and post-processing (turning the output tensor into something useful).

// conceptual flow
val input = preprocess(bitmap)      // resize, normalize -> tensor
val output = model.run(input)       // inference
val label = postprocess(output)     // pick the top score -> "cat"
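The conceptual flow above can be made concrete with plain arrays. This is a sketch under stated assumptions: the pixels are packed ARGB integers (the format Android's `Bitmap.getPixels` produces), the image has already been resized, the model wants RGB floats in the 0–1 range, and `model` is a stand-in lambda rather than a real runtime call. The function names are illustrative only.

```kotlin
// Pre-process: unpack ARGB pixels into a flat float tensor in R, G, B order, scaled to 0..1.
// Resizing is assumed to have happened before this step.
fun preprocess(argbPixels: IntArray): FloatArray {
    val tensor = FloatArray(argbPixels.size * 3)
    for ((i, pixel) in argbPixels.withIndex()) {
        tensor[i * 3] = ((pixel shr 16) and 0xFF) / 255f     // red
        tensor[i * 3 + 1] = ((pixel shr 8) and 0xFF) / 255f  // green
        tensor[i * 3 + 2] = (pixel and 0xFF) / 255f          // blue
    }
    return tensor
}

// Post-process: pick the label with the highest score and return it with that score.
fun postprocess(scores: FloatArray, labels: List<String>): Pair<String, Float> {
    require(scores.isNotEmpty()) { "Model returned no scores" }
    require(scores.size == labels.size) { "Expected one score per label" }
    val best = scores.indices.maxByOrNull { scores[it] } ?: error("unreachable: scores is non-empty")
    return labels[best] to scores[best]
}

// Glue: `model` is a placeholder for the real inference call.
fun classify(
    argbPixels: IntArray,
    labels: List<String>,
    model: (FloatArray) -> FloatArray
): Pair<String, Float> = postprocess(model(preprocess(argbPixels)), labels)
```

Passing the model in as a function keeps the pre- and post-processing testable without loading a real model, which is where most of the bugs in this kind of code tend to hide.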

Pre-processing matters

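Models are picky: they expect a specific input size (e.g. 224×224 pixels) and value range (e.g. 0–1). A tensor of the wrong size is often rejected with an error, but a wrong scale (say, 0–255 where the model expects 0–1) runs without complaint and returns garbage results, which makes it a very common beginner bug. Always match the model’s documented input format.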
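One way to catch these mistakes early is to check the tensor against the model's documented format before running it. The sketch below assumes a flat `FloatArray` input; `InputSpec` and `validateInput` are hypothetical names, and the real numbers must come from your model's documentation.

```kotlin
// Hypothetical description of what a model expects; fill it in from the model's docs.
data class InputSpec(
    val width: Int,
    val height: Int,
    val channels: Int,
    val min: Float,
    val max: Float
) {
    val elementCount: Int get() = width * height * channels
}

// Returns a list of problems; an empty list means the tensor matches the spec.
fun validateInput(tensor: FloatArray, spec: InputSpec): List<String> {
    val problems = mutableListOf<String>()
    if (tensor.size != spec.elementCount) {
        problems += "expected ${spec.elementCount} values, got ${tensor.size}"
    }
    // NaN is treated as out of range so it cannot slip through the comparisons.
    val outOfRange = tensor.count { it.isNaN() || it < spec.min || it > spec.max }
    if (outOfRange > 0) {
        problems += "$outOfRange values outside ${spec.min}..${spec.max}"
    }
    return problems
}
```

For a 224×224 RGB model expecting 0–1 values, `InputSpec(224, 224, 3, 0f, 1f)` requires 150,528 floats. A tensor still holding raw 0–255 values would be flagged here instead of silently producing wrong predictions. A check like this costs a pass over the data, so you might keep it in debug builds only.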

Where models come from

  • Pre-trained models — ready to use (TensorFlow Hub, Hugging Face, Apple/Google model zoos).
  • Transfer learning — take a pre-trained model and fine-tune it on your data (a middle ground).
  • Custom models — trained by your ML team for a specific job.

Common mistakes

  • Feeding input in the wrong size/scale (the #1 cause of wrong results).
  • Ignoring confidence scores and trusting every prediction.
  • Trying to train on the phone — do inference on-device, train elsewhere.

Summary: A model is a trained function; your app runs inference on it. Convert your data into the tensor the model expects (pre-process), run it, and interpret the output with confidence thresholds (post-process). Use pre-trained models; you rarely train on mobile.