Week 4: Optimizing iterators

So what happened last week? Except that Slovakia upset Russia in Hockey at Vancouver, Slovakia is also fixing Krita performance thanks to the donors, from all around the world, who made that possible. Enough about nationality, let’s talk about what happened.

The plan for the last week was to optimize iterators. We use tiled buffers for our memory management. You usually need undo function in the image processing application and storing the whole image is not a good idea. You rather break the image into tiled parts, rectangles and you store and restore them. But affects the whole application, you have to iterate pixels in filter, you have paint with the brush and that means to composite one image over another image. So basically iterators are very important for Krita. Its speed is priority of course.

We knew, from the measures  I did first week, that fetching those tiles is a slow process. That’s why we have a rectangle iterator which iterates the image tile by tile. But sometimes you need to iterate the image in lines. E.g. in the filters or under some special conditions when you need the access to the pixel before and after in the same line.

That’s why we have a line iterators, we have a vertical and a horizontal iterator. I was optimizing the line iterators. Instead of fetching the tiles every width/height of the tile, we now pre-fetch the tiles to some buffer, lock them and work with them, til we can skip to the next row of tiles. The tiles are 64×64 big by the way. Another important thing is that Krita is threaded and one thread is doing a projection updates (the final image you see in Krita) and other can work on some other stuff like painting or filtering. So we have to lock&unlock the tiles. I patched the vertical and the horizontal iterator.

Ben Dansie: Monkey

We managed to get speedup like 1.28x til 6.21x faster iterating. Cyrille Berger then suggested that we have quite complicated structure for iterators with many function calls, so let’s try a new API for iterator with less function calls. So my task was to code or better said port the horizontal iterator to the new API. The iterator is ready for testing. According the benchmarks, the new iterator is 1.5x faster then old API. Also Cyrille added a lot of custom magic to it so that it is faster. We are closer to the data manager speed, which limits the speed of iterators.

On Monday I also fixed the abr brush so that we can paint with abr brushes, but the work is not finished yet. This week I will work on it and also I will try to use the new iterator in KisPainter for compositing and explore the time results. Then at the end of the week I fly to Netherlands, where we are having our Krita sprint in Deventer, in Boudewijn’s home. I will spent 11 days there with other Krita hackers. So a lot of coding in front of us. We will be visiting the Blender team also. I’m looking forward to it!

So far we make progress, Sven Langkamp made me happy saying that the speedup for painting even with some filters is amazing now compared to what it was before. Also the artist, Ben Dansie,  from the project Durian, the Blender project of the open movie, used Krita for making some art. Cool, isn’t it? Now back to rediscover your Krita code ™.

This entry was posted in Krita. Bookmark the permalink.

3 Responses to Week 4: Optimizing iterators

  1. Rsh says:

    Have you basically made the iterators have inline functions? And if yes, is the speedup with -O2 compile option or -O3? -O3 inlines even more agressively and can give massive speedups due to lack of function calls (it can also thrash CPU cache, but in my experience it does more good than bad)

  2. LukasT says:

    There are many calls to virtual functions, so in-lining is not possible.

  3. Michael Thaler says:

    The Curiously recurring template pattern (http://en.wikipedia.org/wiki/Curiously_Recurring_Template_Pattern) can be used to inline virtual functions (actually it replaces dynamic polymorphism by static polymorphism). However, it is kind of limited, so I don’t know if it would work in this case.

Leave a Reply