On-Device NLP & Text — AI Mobile Coders

Language features on mobile

Beyond vision, phones do a lot with text and language: detecting the language, recognising names and dates, classifying sentiment, offering smart replies, and increasingly running small language models entirely on-device. Many of these are available through ready-made APIs.

Common on-device NLP tasks

Language identification — which language is this text?
Entity extraction — find phone numbers, addresses, dates, money in text (and make them tappable).
Sentiment / text classification — positive/negative, spam/not-spam, topic tags.
Smart reply — suggest short responses in a chat.
On-device translation — translate without the cloud.

Apple’s Natural Language framework

import NaturalLanguage

let text = "I love this app!"
let tagger = NLTagger(tagSchemes: [.sentimentScore])
tagger.string = text
// The tag's raw value is the score as a string, or nil if no score is available
let (sentiment, _) = tagger.tag(at: text.startIndex, unit: .paragraph, scheme: .sentimentScore)
print(sentiment?.rawValue ?? "no score")   // a score from -1 (negative) to 1 (positive)

Google ML Kit (Android & iOS)

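import com.google.mlkit.nl.languageid.LanguageIdentification

val languageIdentifier = LanguageIdentification.getClient()
languageIdentifier.identifyLanguage("Bonjour le monde")
    .addOnSuccessListener { code -> println(code) }   // "fr", or "und" if undetermined
    .addOnFailureListener { e -> println("Language ID failed: $e") }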
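
The same listener pattern covers the entity extraction task from the list above. What follows is a minimal sketch, assuming an Android project that already includes the ML Kit entity extraction dependency and text written in English. The function name findTappableEntities and the printed labels are illustrative, not part of ML Kit.

```kotlin
import com.google.mlkit.nl.entityextraction.Entity
import com.google.mlkit.nl.entityextraction.EntityExtraction
import com.google.mlkit.nl.entityextraction.EntityExtractionParams
import com.google.mlkit.nl.entityextraction.EntityExtractorOptions

// Hypothetical helper: prints each entity found in `text` with a simple label.
fun findTappableEntities(text: String) {
    // Assumption: the text is English; pick the model that matches your users.
    val extractor = EntityExtraction.getClient(
        EntityExtractorOptions.Builder(EntityExtractorOptions.ENGLISH).build()
    )
    // The language model is downloaded once if missing, then runs on-device.
    extractor.downloadModelIfNeeded()
        .addOnSuccessListener {
            extractor.annotate(EntityExtractionParams.Builder(text).build())
                .addOnSuccessListener { annotations ->
                    for (annotation in annotations) {
                        for (entity in annotation.entities) {
                            val label = when (entity.type) {
                                Entity.TYPE_PHONE -> "phone"
                                Entity.TYPE_ADDRESS -> "address"
                                Entity.TYPE_DATE_TIME -> "date/time"
                                Entity.TYPE_MONEY -> "money"
                                else -> "other"
                            }
                            // annotatedText is the matched substring of the input
                            println("$label: ${annotation.annotatedText}")
                        }
                    }
                }
                .addOnFailureListener { e -> println("Annotation failed: $e") }
        }
        .addOnFailureListener { e -> println("Model download failed: $e") }
}
```

Each annotation also carries start and end offsets, which is what you would use to make the matched span tappable in the UI. Call close() on the extractor when it is no longer needed.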

How text becomes numbers (tokenization)

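Language models can’t read words directly. They split text into tokens (whole words or pieces of words) and map each token to an integer ID. This “tokenization” is the text equivalent of preprocessing an image, such as resizing it: it turns raw input into the numeric form the model expects. High-level APIs hide it, but it’s good to know it’s happening, especially when working with on-device language models, whose input limits are usually counted in tokens rather than characters.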
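
To make the idea concrete, here is a toy tokenizer in plain Kotlin. It is only a sketch: the six-entry vocabulary is made up for illustration, and the greedy longest-match rule is one simple strategy, not the scheme any particular model uses. Real models ship their own vocabulary and tokenizer.

```kotlin
// Hypothetical toy vocabulary; real models have tens of thousands of entries.
val vocab: Map<String, Int> = mapOf(
    "<unk>" to 0, "un" to 1, "believ" to 2, "able" to 3, "token" to 4, "s" to 5
)

// Greedy longest-match: at each position, take the longest piece found in the vocabulary.
fun tokenize(word: String, vocab: Map<String, Int>): List<Int> {
    val ids = mutableListOf<Int>()
    var start = 0
    while (start < word.length) {
        var end = word.length
        var matched = false
        while (end > start) {
            val id = vocab[word.substring(start, end)]
            if (id != null) {
                ids.add(id)
                start = end
                matched = true
                break
            }
            end--
        }
        if (!matched) {
            // No piece fits here: emit the unknown-token ID and skip one character.
            ids.add(vocab.getValue("<unk>"))
            start++
        }
    }
    return ids
}

fun main() {
    println(tokenize("unbelievable", vocab))  // [1, 2, 3]  ("un" + "believ" + "able")
    println(tokenize("tokens", vocab))        // [4, 5]     ("token" + "s")
}
```

Note that one word became three tokens. This is why a short-looking message can use more of a model's input budget than its word count suggests.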

Small language models on-device

A growing trend is running small language models locally for private, offline text generation and understanding (e.g. Google’s on-device Gemini Nano via ML Kit GenAI, or compact open models). They’re limited compared to cloud LLMs but keep data on the phone and work offline. For heavy generation you still use the cloud (next lesson).

Common mistakes

Sending text to the cloud for tasks a built-in on-device API already handles.
Assuming on-device models match cloud LLM quality — they’re smaller.
Ignoring languages and scripts your users actually use.
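
One way to guard against the last mistake is to check the detected language tag before enabling a text feature. The sketch below is plain Kotlin and assumes the tag comes from a language identifier such as ML Kit's, which reports "und" when it cannot decide. The set of tested tags and the function name are hypothetical.

```kotlin
// Hypothetical list: the language tags this app's text features were actually tested with.
val testedLanguageTags = setOf("en", "fr", "hi")

// ML Kit's language identifier returns "und" when the language is undetermined.
const val UNDETERMINED = "und"

// Exact match on the tag, so a different script of a tested language is not assumed to work.
fun canUseOnDeviceFeature(languageTag: String): Boolean =
    languageTag != UNDETERMINED && languageTag in testedLanguageTags

fun main() {
    println(canUseOnDeviceFeature("fr"))       // true
    println(canUseOnDeviceFeature("hi-Latn"))  // false: romanized Hindi was not tested
    println(canUseOnDeviceFeature("und"))      // false
}
```

Matching the full tag is deliberately strict. Hindi typed in Latin letters is reported as "hi-Latn", and a feature tested only on Devanagari text may handle it poorly.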

Summary: Phones can identify languages, extract entities and classify sentiment on-device through Apple’s Natural Language and Google’s ML Kit, and can even run small language models locally. Use these for fast, private text features, and reach for the cloud only for heavy generation.