OpenCV 5 Is Here: The Biggest Leap in Years for Computer Vision

542 | ternaus | 3 days ago | 101 | HN

The thing I love about OpenCV is that it remains hands down the best library for simply loading images and video. I've never even used any of its fancy computer vision features, but if I need to load a video file and look at the pixels - which I did need to do recently for an art project - OpenCV does it in about four lines of code.

pzo 6 hours ago | parent | next

Quite a good release, although I'm not sure why they invest so much time in their own ONNX engine. I don't think they have enough staff or deep enough pockets to compete with ONNXRuntime, CoreAI, ExecuTorch, and LiteRT.

I'm happy they added an option for ONNXRuntime. I wish their cv.dnn was mostly a unified wrapper around many different backends (ONNXRuntime, ExecuTorch, LiteRT, CoreAI), plus maybe some tooling around it (performance metrics tools, model downloads, etc.). The Transformers(.js) approach looks better to me.

I wish they also invested more time in better production-ready camera I/O (for mobile, device/format discovery, manual settings, depth-map support, etc.) and a better HighGUI that could use different backends (Skia, WebGPU) and work on mobile.

ftchd 10 hours ago | parent | next

> One practical detail is worth knowing. The new engine is CPU-only at the moment, so if you select a non-CPU backend and target (for example CUDA or OpenVINO through setPreferableBackend and setPreferableTarget), you will want the classic engine.

So there's room for even better performance!

boredemployee 1 hour ago | parent | next

How can I learn the practical side of computer vision in 2026?

I'm not interested in understanding papers or the math behind it, but rather in how to put a system into production, whether that's object detection, running 20 cameras in parallel on a single computer, sizing hardware for a specific task, and so on.

Any tips?
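As a rough illustration of the hardware-sizing part of the question, a back-of-envelope estimate could start from per-frame inference latency. This is only a sketch under loud assumptions: `workers_needed` is a hypothetical helper, the 5 frames per second per camera is invented for the example, the 185 ms latency is borrowed from a benchmark reported elsewhere in this thread, and decoding, I/O, and batching are ignored entirely.

```python
import math


def workers_needed(cameras, fps_per_camera, latency_ms):
    """Estimate how many sequential inference workers keep up with the load.

    Each worker is assumed to process one frame at a time, taking
    `latency_ms` milliseconds per frame, with no batching.
    """
    busy_seconds_per_second = cameras * fps_per_camera * latency_ms / 1000
    return math.ceil(busy_seconds_per_second)


# 20 cameras (from the question), 5 fps each (assumed), 185 ms per frame (assumed)
print(workers_needed(20, 5, 185))
```

The point of the sketch is the shape of the calculation: total frames per second times seconds per frame gives the number of workers that must be busy at once.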

GreenSalem 5 hours ago | parent | next

AI-written release post, and it shows...

arcanine 9 hours ago | parent | next

They really improved the performance. I tested the YOLOv8 medium segmentation model on an 11th-gen Intel i7 CPU.

OpenCV 4.11: ~255 ms
OpenCV 5.0.0: ~185 ms

with the same code.
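To put those two timings in relative terms, the arithmetic can be sketched as below. The variable names are illustrative, and the inputs are the approximate figures reported above, so the results are approximate too.

```python
old_ms = 255.0  # reported per-inference time on OpenCV 4.11
new_ms = 185.0  # reported per-inference time on OpenCV 5.0.0

speedup = old_ms / new_ms               # how many times faster the new run is
reduction = (old_ms - new_ms) / old_ms  # fraction of the time saved
old_fps = 1000.0 / old_ms               # frames per second before
new_fps = 1000.0 / new_ms               # frames per second after

print(f"speedup: {speedup:.2f}x, time saved: {reduction:.1%}")
print(f"throughput: {old_fps:.1f} fps -> {new_fps:.1f} fps")
```

That works out to roughly a 1.38x speedup, or about 27% less time per inference.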

shelled 7 hours ago | parent | next

A few years ago I was using OpenCV in a commercial Android SDK (it might still be in use; also, iOS provided almost all of those "needs" ready-made and Android just didn't, and neither did Firebase or the Jetpack suites/tools). I was the one who had added it to the SDK. There was a lot I/we could do, but as an Android developer (with barely any exposure to CV or even C/C++) what I felt we lacked was documentation and a community. We struggled even with stripping out the parts that we did not want to ship with our SDK. Speed was such an issue. The problem was that, for someone who just wanted to use the lib (on mobile), a lot of things felt esoteric and out of reach, i.e. difficult. It didn't have to be. Sadly, LLMs weren't at full speed back then: barely usable, not even talked about. Something like this would have been a perfect use case for AI/LLMs: a coder, not from the specific field the tool was made in, but able to take full advantage of its capabilities in a nuanced/selective manner.

hbcondo714 2 days ago | parent | next

> LLMs and VLMs, Running Inside OpenCV…Qwen 2.5, Gemma 3, PaliGemma, and the GPT-2 / GPT-4 family

Why these specific models / versions?

maelito 8 hours ago | parent | next

Can it detect the speed of the car without any hand-made measurement?

globalnode 10 hours ago | parent | next

does this mean i'm actually able to try object detection in opencv now? i mean i know basic image processing techniques, and i know "in theory" how ML works, but i've never really seen a case where i can just say "here's an image, now detect all the apples". there's always 1. find a model that has the knowledge, 2. hook it up to an inference engine, 3. do something useful. i always get stuck at 1.

Magnets 7 hours ago | parent | next

The announcement itself is pure AI slop

charankilari 7 hours ago | parent | next

wow, it's been ages

leoncos 3 days ago | parent | next

When I use Codex/Claude to complete a computer vision task, such as extracting assets from an image, OpenCV is their default solution. However, I believe that using YOLO and other methods is outdated. The best solution now is to directly use Nano Banana or other AI image models. A paper has proven that image generation models can perform most CV tasks well. I believe the new OpenCV should become a wrapper for VLM or AI image models.

oliveiracwb 9 hours ago | parent

Computer vision was the formative school for many autodidacts. Although I acquired substantial knowledge from articles translated via Power Translator and Babylon (whose outputs closely mirror those of any 2-million-parameter SLM), it was OpenCV that made concepts like convolutions, softmax, minmax, and others finally click for me. I have consistently viewed OpenCV as an intrinsically open, educational, and adaptable library. Any developer can dissect its codebase to extract a specific filter or algorithmic implementation and tailor it to their requirements. It is certainly not cruising at the velocity of trillion-dollar capital. But it holds its altitude. And it will always be there.
