Running AI in the browser: lessons from background removal

We're building AI background removal that runs entirely in your browser. Here's what we've learned about WebGPU, model size, and the trade-offs between speed and quality.

Background removal used to be a Photoshop task. Then it became a server-side AI task (remove.bg, Canva, Adobe). Now it's becoming a browser task — AI models that run on your device, with no upload.

We're building this for pictoolkit. The principle is straightforward, but the engineering has real trade-offs.

The basic approach

Background removal works by training a neural network to identify the "foreground" pixels of an image (the subject) and produce a mask. Pixels inside the mask are preserved; pixels outside it become transparent.
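As a rough illustration of the masking step, here is a minimal sketch that writes a mask into the alpha channel of RGBA pixel data. The function name `applyMask` and the mask layout (one value per pixel in the range 0 to 1, where 1 means foreground) are assumptions for this example, not the format of any particular model's output.

```typescript
// Apply a foreground mask to RGBA pixel data by scaling the alpha channel.
// Assumption: `mask` holds one value per pixel in [0, 1], where 1 = foreground.
function applyMask(rgba: Uint8ClampedArray, mask: Float32Array): Uint8ClampedArray {
  if (rgba.length !== mask.length * 4) {
    throw new Error("mask must have exactly one value per RGBA pixel");
  }
  const out = new Uint8ClampedArray(rgba); // copy so the input stays untouched
  for (let i = 0; i < mask.length; i++) {
    // Scaling alpha (instead of thresholding) keeps soft edges semi-transparent.
    out[i * 4 + 3] = Math.round(rgba[i * 4 + 3] * mask[i]);
  }
  return out;
}
```

In a browser, the `rgba` array could come from the `data` field of a canvas `ImageData` object, which is already a `Uint8ClampedArray`.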

Several excellent open-source models exist:

  • U²-Net — the OG. Good quality, manageable size (~170 MB).
  • BiRefNet — current state of the art. Excellent edges, especially for fine details like hair.
  • RMBG-1.4 — by BRIA AI. Solid quality, designed to be production-ready.
  • SAM (Segment Anything) — by Meta. Different approach (interactive segmentation).

For automated background removal, BiRefNet or RMBG-1.4 are the strongest open options. We're testing both.

The model size problem

The biggest constraint of browser AI is download size. Users won't wait for a 500 MB model. They might tolerate 50 MB. Anything more, and they bounce.

RMBG-1.4 is about 80 MB. BiRefNet is bigger — over 200 MB in its full form. Both need to be downloaded once and cached, but that initial download is friction.
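The "download once, then cache" behaviour can be sketched as a cache-first loader. Everything named here (`ModelStore`, `loadModel`, `memoryStore`) is hypothetical and only illustrates the pattern; in a real page the store would presumably be backed by the browser's Cache API or IndexedDB so it survives reloads.

```typescript
// Minimal storage interface (hypothetical). A browser implementation could
// wrap the Cache API or IndexedDB behind these two methods.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, bytes: Uint8Array): Promise<void>;
}

// Cache-first load: only hit the network when the model is not stored yet.
async function loadModel(
  url: string,
  store: ModelStore,
  download: (url: string) => Promise<Uint8Array>,
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached) return cached;
  const bytes = await download(url);
  await store.put(url, bytes);
  return bytes;
}

// In-memory store for demonstration; it does not persist across page loads.
function memoryStore(): ModelStore {
  const map = new Map<string, Uint8Array>();
  return {
    get: async (key: string) => map.get(key),
    put: async (key: string, bytes: Uint8Array) => {
      map.set(key, bytes);
    },
  };
}
```

Keying the store by URL means a new model version published under a new URL is fetched fresh, while repeat visits skip the download.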

Mitigations we're trying:

  • Quantization. Converting weights from FP32 to FP16 halves the model size, and INT8 quarters it, usually with little quality loss at inference time.
  • Pruning. Removing weights that contribute little to the output. Size reductions are modest, and pruning harder starts to hurt quality.
  • Distillation. Training a smaller model to mimic the larger one. Most promising but requires retraining.

Realistic target: get a high-quality model to under 50 MB.
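To show where the size saving from quantization comes from, here is a minimal sketch of symmetric per-tensor INT8 quantization: each 4-byte float weight becomes a 1-byte integer plus one shared scale factor. This is an illustration only, with hypothetical names; real export toolchains typically quantize per channel and handle activations as well.

```typescript
// One shared scale plus one signed byte per weight.
interface QuantizedTensor {
  scale: number;
  values: Int8Array;
}

// Map floats in [-maxAbs, maxAbs] onto integers in [-127, 127].
function quantizeInt8(weights: Float32Array): QuantizedTensor {
  let maxAbs = 0;
  for (let i = 0; i < weights.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(weights[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127; // avoid dividing by zero
  const values = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    values[i] = Math.round(weights[i] / scale);
  }
  return { scale, values };
}

// Recover approximate floats; the error per weight is at most scale / 2.
function dequantize(q: QuantizedTensor): Float32Array {
  const out = new Float32Array(q.values.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = q.values[i] * q.scale;
  }
  return out;
}
```

The stored bytes shrink by 4× (ignoring the single scale value), and the rounding error is what shows up as quality loss, which is why it needs checking on real images rather than assuming.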

The inference speed problem

Once downloaded, the model needs to run in reasonable time. Users tolerate ~3 seconds for a single image. More than that feels broken.

On a recent laptop with WebGPU, RMBG-1.4 takes about 800ms for a 1024×1024 image. That's acceptable. On a 2018 laptop with only WebGL, it takes 5-8 seconds. That's too slow.

The mitigation is detecting capabilities and choosing models accordingly:

  • WebGPU available: Use the higher-quality model. ~1s per image.
  • WebGL only: Use a smaller, faster model. Slightly lower quality but tolerable speed.
  • No GPU acceleration: Don't run the model on this device. Offer a server-side option (with explicit consent) or politely decline.
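The three tiers above can be sketched as a small detection step plus a lookup. The names `detectBackend`, `planFor`, and the model labels are hypothetical. The detection relies on `navigator.gpu.requestAdapter()` and `canvas.getContext()`, accessed through an untyped `globalThis` so the sketch also loads outside a browser.

```typescript
type Backend = "webgpu" | "webgl" | "none";

interface Plan {
  backend: Backend;
  model: "high-quality" | "small" | null; // hypothetical model tiers
  runLocally: boolean;
}

// Pure mapping from detected backend to what we offer the user.
function planFor(backend: Backend): Plan {
  switch (backend) {
    case "webgpu":
      return { backend, model: "high-quality", runLocally: true };
    case "webgl":
      return { backend, model: "small", runLocally: true };
    default:
      // No local inference: offer server-side processing with consent, or decline.
      return { backend, model: null, runLocally: false };
  }
}

// Probe the environment. Returns "none" when run outside a browser.
async function detectBackend(): Promise<Backend> {
  const g = globalThis as any;
  try {
    // navigator.gpu can exist without a usable adapter, so request one.
    const adapter = await g.navigator?.gpu?.requestAdapter();
    if (adapter) return "webgpu";
  } catch {
    // Treat any failure as "WebGPU unavailable" and keep probing.
  }
  const canvas = g.document?.createElement?.("canvas");
  if (canvas?.getContext?.("webgl2") || canvas?.getContext?.("webgl")) {
    return "webgl";
  }
  return "none";
}
```

A caller would run something like `planFor(await detectBackend())` once at startup. Keeping `planFor` pure makes the tiering logic easy to test without a GPU.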

WebGPU is great but immature

WebGPU is the future of browser AI. It gives JavaScript direct access to GPU compute, much as CUDA does for native ML frameworks. For our use case, it's 5-10× faster than WebGL.

The catch: WebGPU first shipped by default in Chromium-based browsers in 2023, and Firefox and Safari only followed later. Mobile support is still uneven. Some users have it disabled. The API itself is still evolving.

For now, we treat WebGPU as the preferred path with WebGL fallback. As adoption grows, WebGPU will become the default.

The quality vs. speed trade-off

Every neural network model has a quality knob — usually the number of layers, parameters, or input resolution. More of any of these means better quality but slower inference.

We're testing several configurations:

  • Fast mode: Smaller model, 512×512 max input. ~400ms per image. Quality good for most photos.
  • Balanced mode: Medium model, 1024×1024 input. ~1s per image. Quality matches commercial services.
  • Quality mode: Full BiRefNet, 2048×2048 input. ~4s per image. Best quality, especially for hair and fur.

The user picks. Most pick balanced.
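As a sketch of how the mode choice could translate into a model input size, the helper below downscales an image so its longer side fits the mode's limit while keeping the aspect ratio. The `MAX_SIDE` table mirrors the limits listed above; the function name and the aspect-preserving behaviour are assumptions, since some models expect a fixed square input and would need padding or stretching instead.

```typescript
type Mode = "fast" | "balanced" | "quality";

// Longest allowed input side per mode, in pixels.
const MAX_SIDE: Record<Mode, number> = {
  fast: 512,
  balanced: 1024,
  quality: 2048,
};

// Downscale to fit the mode's limit; never upscale smaller images.
function inputSize(
  width: number,
  height: number,
  mode: Mode,
): { width: number; height: number } {
  const scale = Math.min(1, MAX_SIDE[mode] / Math.max(width, height));
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}
```

For example, a 4000×3000 photo in balanced mode would be processed at 1024×768, and the resulting mask scaled back up to the original size before it is applied.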

The edge cases

AI background removal works great on common cases (people, products on plain backgrounds, clear subjects). It struggles on:

  • Transparent objects like wine glasses or windows.
  • Hair against a busy background — even the best models can miss fine strands.
  • Multiple subjects with overlap.
  • Motion blur at the subject boundary.
  • Specular reflections that look like background patterns.

For these cases, traditional interactive tools (where the user manually refines the mask) still beat automated AI. We're considering offering both modes.

Privacy as a feature

Doing this in the browser solves the biggest concern with online background removal: your images stay yours. No server sees them. No model is trained on them. No backup retains them.

This matters more than people realize. Product photos before launch, identity documents, personal photos with sensitive backgrounds — these are exactly the cases where people want background removal but worry about upload services.

Our pitch: same quality as the paid services, completely private.

When will this ship?

We're aiming for early 2026. The model size is the main remaining hurdle — we want it under 50 MB without sacrificing quality. As soon as we hit that target, we ship.


