Sunday, September 18, 2011

NV path rendering

A while ago NVIDIA released drivers with their NV_path_rendering extension. GL_NV_path_rendering is a relatively new OpenGL extension which allows rendering of stroked and filled paths on the GPU.

I'd heard about it before, but lately I've been too busy with other stuff to look at, well, anything.

I decided to spend a few seconds looking at the videos NVIDIA posted on their site. They compare NV_path_rendering with Skia, Cairo and Qt. Pretty neat. Some of the demos were using huge paths, clipped by another weird path and using perspective transforms. Qt was very slow. It was time for me to abandon my dream of teaching my imaginary hamster how to drive a stick shift and once again look at path rendering.

You see, I've written path rendering code so many times that one of my favorite pastimes was creating ridiculous paths that no one would ever think about rendering and seeing how fast I could render them. Qt's OpenGL code was always unbelievably good at rendering paths no one would ever render. Clearly these people were trying to outcrazy me.

Fortunately there's an SDK posted on the NVIDIA site and it's really well done. It even compiles and works on GNU/Linux. Probably the best demo code for a new extension that I've ever seen. The extension itself is very well done as well. It's very robust, though ultimately it's the implementation that I care about. I have just one workstation with an NVIDIA card in it, a measly Quadro 600, running on a dual-processor Xeon E5405, but it was enough to play with it.

The parts using Qt were using the raster engine though. I looked at the code and decided to write something that would render the same thing but using just Qt. The results were a little surprising. Qt OpenGL could render tiger.svg, scaling and rotating it, at about 270fps, while the NV_path_rendering demo was running at about 72fps. Here are both of them running side by side:

(The numbers are lower for both on account of them running at the same time, of course.) As you can see, Qt is almost 4x faster. I figured it might be related to the different SVG implementations and rendering techniques used, so I quickly hacked the demo NVIDIA posted to open a brand new window (you need to click on it to start rendering) and render to a QGLPixelBuffer, but using the same SVG and rendering code as their NV_path_rendering demo code. The results were basically the same.
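To put the "almost 4x" in perspective, here is a small sketch that turns the standalone figures quoted above (about 270fps and about 72fps) into per-frame times and a ratio. The function and constant names are made up for this illustration and are not part of either demo.

```python
def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def speedup(faster_fps: float, slower_fps: float) -> float:
    """How many times more frames per second the faster renderer delivers."""
    if slower_fps <= 0:
        raise ValueError("slower_fps must be positive")
    return faster_fps / slower_fps


# Approximate standalone numbers from the post (Quadro 600, tiger.svg).
QT_OPENGL_FPS = 270.0
NVPR_FPS = 72.0

if __name__ == "__main__":
    print(f"Qt OpenGL:         {frame_time_ms(QT_OPENGL_FPS):.2f} ms/frame")
    print(f"NV_path_rendering: {frame_time_ms(NVPR_FPS):.2f} ms/frame")
    print(f"Ratio:             {speedup(QT_OPENGL_FPS, NVPR_FPS):.2f}x")
```

Since the fps figures are rough, the ratio should be read as rough too.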

I posted the code for the Qt demo and the patch to nvpr_svg on github: https://github.com/zackr/qt_svg

The patch is larger than it should be because it also changed the line endings of the saved files from DOS to Unix, but you shouldn't have any issues applying it.

So from a quick glance it doesn't seem like there are any performance benefits to using NV_path_rendering; in fact, Qt would likely be quite a bit slower with it. Having said that, NVIDIA's implementation looks very robust and a lot more numerically stable. I've spent a little bit of time looking at the individual pixels and came away very impressed.

In general the extension is in a little bit of a weird situation. On one hand, unlike OpenVG, which creates a whole new API, it's the proper way of introducing GPU path rendering; on the other hand, pretty much every vector graphics toolkit out there already implements GPU-based path rendering. Obviously the implementations differ and some might profit from the extension, but for Qt the question is whether the quality matters more than the performance, specifically whether the quality improves enough to justify the performance hit.

I think the extension's success will largely depend on whether it's promoted to at least an EXT or, ideally, an ARB extension, meaning all the drivers support it. Using it would make the implementations of path rendering in toolkits/vector graphics libs a lot simpler and give driver developers a central place to optimize a pretty crucial part of the modern graphics stack. Unfortunately, if you still need to maintain the non-NV_path_rendering paths then it doesn't make a whole lot of sense. A Mesa3D implementation would be trivial simply because I've already implemented path rendering for OpenVG using the Gallium3D interface, so it'd be a matter of moving that code, but I'm just not sure if anyone will actually be using this extension. All in all, it's a very well done extension but it might be a little too late.
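The "still need to maintain the non-NV_path_rendering paths" problem boils down to a runtime check along these lines. This is only a sketch: the backend names and both functions are hypothetical, and the extension string is passed in as plain text, whereas a real program would get it from the GL context (for example via glGetString(GL_EXTENSIONS)).

```python
NVPR_EXTENSION = "GL_NV_path_rendering"


def has_extension(extensions: str, name: str) -> bool:
    """True if `name` is one of the space-separated entries in `extensions`.

    Whole tokens are compared on purpose: a plain substring search would
    also match a longer extension name that merely starts with `name`.
    """
    return name in extensions.split()


def pick_path_backend(extensions: str) -> str:
    """Choose a (hypothetical) path rendering backend for a toolkit."""
    if has_extension(extensions, NVPR_EXTENSION):
        return "nv_path_rendering"
    # The toolkit's own GPU path renderer has to stay around for every
    # driver that doesn't expose the extension.
    return "toolkit"


if __name__ == "__main__":
    # Made-up extension strings, just to exercise both branches.
    print(pick_path_backend("GL_ARB_multisample GL_NV_path_rendering"))
    print(pick_path_backend("GL_ARB_multisample GL_EXT_framebuffer_object"))
```

As long as the second branch exists, the toolkit carries two path renderers instead of one, which is exactly why the extension only simplifies things once every driver supports it.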

14 comments:

Gaz Davidson said...

Are you 100% sure that vsync is disabled? 60FPS looks suspiciously like the refresh-rate of your monitor!

Zack said...

@Gaz Davidson: yes, the refresh rate of my monitor is not ~72hz. Like I mentioned the numbers drop because on the screenshot it's running side by side with the Qt implementation.

vdp said...

I didn't quite understand what you were doing differently from the NVIDIA demo that would explain the dramatic performance difference?

Also, could they have been using more modern NVIDIA hardware that benefits much more from their approach than your hardware does?

Rsh said...

@vdp: As Zack said, they were using the raster engine instead of the OpenGL one.

Harsh86 said...

@Zack: How fast were the respective benchmarks running individually? As in not side by side?

Zack said...

@Harsh86: Quoting the blog to which you're responding: "Qt OpenGL could render tiger.svg scaling and rotating it at about 270fps, while the NV_path_rendering was running at about 72fps".

Tails3903 said...

@Zack: The Qt code is running 4x AA like the NVIDIA demo, right?

Also, is this massive performance increase coming from the Qt code or something at the driver level? I'm curious if you will see the same performance difference running the Qt app on Windows, since your path rendering optimizations are only found in the G3D Linux driver, right?

Zack said...

@Tails3903: Yea, it is. You can always just double check it on github.

It's all about the difference in algorithms used by Qt and nv_path_rendering implementation.

The tests were running on the proprietary NVIDIA driver because Gallium3D doesn't support nv_path_rendering, and anyway I was interested in the performance in an environment native to this extension. It would probably be a lot more interesting to see the differences on high-end machines than on anything else.

peda said...

your application: 811 fps
nvidia sdk: 114 fps

(GeForce GTX 460)

Zack said...

@peda: Ah, thanks a lot. Very interesting, I thought their implementation would scale a lot better.

Mike234 said...

@Zack: Is it possible to pull off something like the nvpr_tiger3d demo in Qt using Qt's super fast OpenGL paint engine?

Rendering resolution independent 2D paths in a 3D scene may be one area where NV Path Rendering can't be beat.

Mathias Panzenböck (panzi) said...

Your ./qt_svg gives me about 700fps and ./nvpr_svg -nosync -svg svg/complex/tiger.svg -spin -animate gives me about 78fps.

System:
Fedora Linux 16 x86_64
KDE 4.8.3 with desktop effects enabled
GeForce 8800 GTS 512/PCIs/SSE2
Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz
8 GB RAM

Mark Kilgard said...

Zack,

Thanks for evaluating NV_path_rendering.

I want to encourage you to re-evaluate the latest NV_path_rendering because the latest Release 300 drivers (301.42 is a WHQL driver now) have remarkable NV_path_rendering performance improvements.

I built your qt_svg example on a system with a GeForce GTX 460. I measured your qt_svg example spinning the tiger at 184 to 160 fps.

With the 295.51 driver (prior to the NV_path_rendering tuning in Release 300), I measured 190 fps running nvpr_svg at the same resolution as your qt_svg example.

By just upgrading to the 301.42 driver, the same nvpr_svg configuration runs at 460 to 430 fps.

If I run at 4 samples/pixel as qt_svg does, the performance is over 720 fps.

A faster GPU can run even faster.

Give NV_path_rendering another spin with a Release 300 driver. I think you'll like the new results.

- Mark Kilgard, NVIDIA

Anonymous said...

I am curious as to how complex is "too complex" for these approaches. I am playing with nv_path_rendering (total newbie) with an SVG file with lots of complexity (probably around 20M vertices) and may be running into the limits of my GPU hardware.