Week 9: Mirroring feature

March 29th, 2010

Last week we finished the part of the action plan regarding the performance. I started to work on usability improvements. The first usability improvement is canvas mirroring.

Canvas mirroring is useful in your painting workflow. Boudewijn explained that artists use it to check for errors in their paintings. In classical painting you need some hardware for that: you have to find a real mirror and see if your painting looks creepy or not. In Krita you don’t have to buy a new mirror, you can use the canvas mirroring feature.

It is also useful when you paint something that is supposed to be “almost” symmetrical, such as the ears of your hero or the outline of a head. You can paint the first half, mirror the canvas, paint the second half and check back quickly. Another possibility is to use the symmetry feature I wrote for Softbrush lately.

In Krita you could already do this by transforming the image and its data, but that can be slow if you are painting on big canvases. Krita aims to be a professional tool, so we want to support big canvases. That is why we need this as a projection feature too: we do not transform the image data, we transform the projection of the image data.

I started to work on this with the feeling that it would not be about a lot of code. The task is rather about orienting yourself in a complex part of Krita called the projection and in the coordinate systems we have. You also have to find out and understand how scrolling and zooming work.

I contacted the MyPaint maintainer maxy with a short mail asking how mirroring is done in MyPaint. The answer was basically what I expected, and it confirmed that I might be going in the right direction.

So first I started with mirroring the coordinates of the tools. We have a few coordinate systems in the canvas. The first coordinate you get is the widget coordinate. That one is transformed according to the scrolling offset. Then the document origin, the top left coordinate of the image in the canvas, also has to be taken into account. Next the zooming transformation follows. After this you are in the document-level coordinate system. On this level I mirror the coordinate. That means that the document is changed in a mirrored manner. So far this works very nicely. For vertical mirroring you need a transformation like scale(-1,1) followed by translate(width,0), where width is the width of the document in my case. We mirror just vertically.
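The scale-then-translate step can be sketched with plain 3×3 matrices. This is only an illustration, not the Krita code: the helper names (`scale`, `translate`, `compose`, `mirror_point`) are made up for this example, and it assumes points are treated as column vectors.

```python
def scale(sx, sy):
    # 3x3 homogeneous scaling matrix
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def translate(tx, ty):
    # 3x3 homogeneous translation matrix
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def compose(first, second):
    """Matrix that applies `first`, then `second` (column-vector convention)."""
    return [[sum(second[r][k] * first[k][c] for k in range(3))
             for c in range(3)] for r in range(3)]


def apply(m, x, y):
    # Transform the point (x, y) by the matrix m
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def mirror_point(x, y, doc_width):
    """scale(-1, 1) followed by translate(doc_width, 0): x becomes doc_width - x."""
    m = compose(scale(-1.0, 1.0), translate(doc_width, 0.0))
    return apply(m, x, y)
```

Applying the mirror twice gives back the original point, so the same transformation can be used to map events from the widget to the document and back.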

The next step is to mirror the projection of the image. I started to work in the QPainter canvas as that one is the more complicated one. First I mirrored the extra QImage buffer we have in the canvas to be able to use RasterOp operations with Arthur. QImage::mirror() was just a quick and dirty hack to see what would break. Tool outlines were broken, as they don’t know about the mirroring, and some features worked in a mirrored way. I removed the dirty hack and used QTransform for transforming the QPainter. This time I left the QPainter transformed for the tools too, so the outlines were fixed.

The OpenGL canvas was similar to the QPainter canvas: I just needed to do the transformation and again set up the QPainter, which is used even in the OpenGL-based canvas for painting the tool outlines.

Then I started to work on fixing some related, quite complicated bugs. I spent a lot of time on them without finding a solution. The issue does not seem to be complicated: you just need to mirror some coordinates. But it is hard to know which ones if you did not write the projection code. The problem is that when you scroll and zoom in on a part of the image, it becomes transparent. I suppose that the projection thinks that part of the image is not visible, so it does not update it, or something like that. But I don’t really know if that is true and I can’t fix it either. A similar problem occurs in the OpenGL canvas, and I was not able to fix that either. I was trying, but without luck. The trial and error method did not bring any good results.

Then I spent time integrating the feature into Krita. You can mirror the canvas with View->Mirror Image (CTRL+I). To sum it up, the feature is in the koffice-ko feature branch as trunk is frozen. It works, with bugs we know about. We just need to fix them. The week I spent on it was a nice start for this feature. Let’s hope the feature will be bug-free for Krita 2.3.

New brush in Krita: Softbrush

March 22nd, 2010

I’m quite busy these days. I don’t have much time to blog about the nice stuff I have done in Krita in my “spare-thesis time”. I’m writing my thesis about the brush engines I implemented for Krita and I wanted to have a brush that is very common among digital painters. I noticed that they heavily use the basic Pixel brush. I decided I wanted to have some special pixel brush.

I started by changing the function that produces the brush mask and affects its softness. I selected a Gaussian as it is a nice function and I experimented with it, but I found it complicated to control (you set up sigma, uh, what is sigma, you artists ask?). So let’s add a different function to the brush mask code. Oh, let’s put this decision into the artist’s hands and give him a curve he can model as he wants. We already have a nice widget for that in Krita, so let’s use it. So you can set up the brush mask by a curve!

Default curve: set up brush softness with the curve

In the picture you can see Curve and Gaussian. The top red point represents the value of the brush mask at its center and the red point down in the bottom right corner represents the value at the edge of the brush mask. So far the brush mask supports only an elliptical shape. I also have plans for a rectangular shape, but I need to find a way to extend the 24-hour limit of the day so that I can code it :) You can add as many control points as you wish and tweak the softness between the center and the edges.

You noticed the Gaussian is also there. The mode and its code work, but it will need some more love in the future. Sometimes it gives nice results too. I will see how it ends up.
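A minimal sketch of the idea, assuming a piecewise-linear curve (the real curve widget may interpolate differently) and made-up names `curve_value` and `mask_value`: the curve maps the normalized distance from the center of the mask (0) to its edge (1) onto a mask value.

```python
import math


def curve_value(points, t):
    """Piecewise-linear curve through sorted (x, y) control points."""
    t = min(max(t, points[0][0]), points[-1][0])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if t <= x1:
            if x1 == x0:
                return y1
            return y0 + (y1 - y0) * (t - x0) / (x1 - x0)
    return points[-1][1]


def mask_value(points, dx, dy, radius_x, radius_y):
    """Mask value at offset (dx, dy) from the center of an elliptical brush."""
    d = math.hypot(dx / radius_x, dy / radius_y)  # 0 at the center, 1 on the edge
    if d > 1.0:
        return 0.0  # outside the ellipse
    return curve_value(points, d)


# Default curve: full value at the center, zero at the edge.
DEFAULT_CURVE = [(0.0, 1.0), (1.0, 0.0)]
```

Adding a control point such as `(0.5, 1.0)` keeps the inner half of the brush fully opaque and makes only the outer half soft.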

So let’s make a crazy setting of the curve, and you can end up with a brush mask like this one with interesting edges:

Funny mask: stroke with custom curve to change brush mask

As you can see, the brush supports pressure with no problems. Thanks to Cyrille’s sensors you can change the size by various attributes of your tablet, or you can use some other ones like time or distance. The brush also supports rotation, so you can set up an ellipse shape of the brush with a diameter and an aspect ratio and do some funny stuff like this:

Rotation Dynamics of the soft brush in action

You set up an angle for your shape and you can allow Krita to manage the rotation for you. As you noticed, the opacity has been changing too: yes, you can control opacity with pressure. Those attributes are present in the Pixel brush in Krita as well. Let’s go for some new ones.

I added a density parameter to the brush so that you can simulate, in a non-photorealistic way, brushes like charcoal, chalk and crayon. Basically you set up how many pixels from the brush mask will be used in a single dab. It is a very nice attribute. I managed to simulate a little charcoal drawing; here is my “developer” painting:
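One way to picture the density parameter is a random test per mask pixel. This is just a sketch under that assumption; the function name `apply_density` and the per-pixel probability are illustrative, not the actual Softbrush code.

```python
import random


def apply_density(mask, density, rng):
    """Keep each mask pixel with probability `density`, zero the others.

    mask    -- 2D list of mask values
    density -- fraction of pixels to keep, between 0.0 and 1.0
    rng     -- a random.Random instance, so results can be reproduced
    """
    return [[v if rng.random() < density else 0.0 for v in row]
            for row in mask]


# A small 4x4 mask used in a single dab with half of its pixels dropped.
dab = apply_density([[1.0] * 4 for _ in range(4)], 0.5, random.Random(42))
```

With a density of 1.0 the dab is the full mask; lower values leave holes, which gives the grainy charcoal look.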

Softbrush density demo: Lukáš Tvrdý – House with charcoal

When you paint, your hand can shake. Or maybe your hand is very stiff and it doesn’t shake when you paint :D . So I added a parameter for you. It is called Jitter movement. You can set up how much the brush will be jittered when you paint. Maybe you want to draw straight lines with a shaky feeling. This is an example of some mild jittering:

Softbrush mild jittering

If you exaggerate the parameter, you can end up with the result that the spray brush, my other brush, produces for you:

Softbrush crazy jittering

The last feature, which is a little experimental, is called HSV dynamics. I suppose you know what HSV is: it is a color model which describes the color in terms of tint (hue), saturation and intensity (value). It was invented for artists so that they can control the color they want to choose more powerfully. I decided to change the parameters of the color dynamically during a stroke. This is still WIP (work in progress), but so far you are able to grow the parameter, shrink it and control it by pressure. Here is a demo. The feature with pressure can easily produce images with more gradients and you can avoid the pick-that-color-and-put-it-here problem a little bit:
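To make the growing and shrinking concrete, here is a small sketch using Python’s standard `colorsys` module. The function names and the per-dab step parameters are assumptions for illustration; they are not the names used in Krita.

```python
import colorsys


def shift_hsv(rgb, dh=0.0, ds=0.0, dv=0.0):
    """Shift an RGB color (components in 0..1) in HSV space."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    h = (h + dh) % 1.0                   # hue wraps around the color wheel
    s = min(max(s + ds, 0.0), 1.0)       # saturation and value are clamped
    v = min(max(v + dv, 0.0), 1.0)
    return colorsys.hsv_to_rgb(h, s, v)


def stroke_colors(rgb, dabs, hue_step, sat_step, val_step):
    """Color of every dab when each parameter grows (or shrinks) per dab."""
    return [shift_hsv(rgb, i * hue_step, i * sat_step, i * val_step)
            for i in range(dabs)]
```

A positive step makes the parameter grow along the stroke and a negative step makes it shrink; the step could just as well be scaled by pressure.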

HSV demo with Softbrush: hue growing, saturation shrinking and value growing

You can control the progression by a curve again, and you get more control over the HSV by using the pressure option with a tablet. These features are for 2.2. For 2.3 I already have a new feature. It is called mirroring and it was inspired by David Revoy’s video tutorial. He is using something very similar in the video (link to 4:40).

You define the vertical axis with the softbrush with CTRL+LEFT click and then you paint in mirror mode. If you set up the axis outside the canvas, the mode is turned off.

As you paint, your brush mask is mirrored and then painted in the mirror mode. It is WIP too; we plan to make it more general so that every brush engine can use it. I got the idea while I was working on canvas mirroring.

I produced a video, watch it here in action.

I plan to blog about other new achievements, but again time is the issue. Besides working on Krita full-time, I am writing my thesis, so I don’t have time these days. If you want to help me, you can produce some art with my brush engines and then I can include your work in my thesis! I’m interested in pictures created with the spray, softbrush, sumi-e (will be renamed to hairy brush), deform and particle brushes and their combinations. Some of the pictures you see here are already part of my thesis. You need to compile trunk for it, but it is easy!

So far enkithan and n-pigeon have provided a lot of nice paintings for me. I picked the picture that n-pigeon did lately with just the soft brush and the spray brush.
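The mirror mode described above can be sketched as follows. The function names are hypothetical, and the check for an axis outside the canvas simply follows the behaviour described in the text.

```python
def mirror_mask(mask):
    """Flip a 2D brush mask left to right."""
    return [row[::-1] for row in mask]


def mirror_dab(x, y, axis_x):
    """Reflect a dab position across the vertical axis x = axis_x."""
    return (2.0 * axis_x - x, y)


def mirrored_stroke(points, axis_x, canvas_width):
    """Return every dab followed by its mirrored twin.

    If the axis lies outside the canvas, mirroring is turned off and the
    stroke is returned unchanged.
    """
    if not (0.0 <= axis_x <= canvas_width):
        return list(points)
    result = []
    for x, y in points:
        result.append((x, y))
        result.append(mirror_dab(x, y, axis_x))
    return result
```

For an asymmetric brush tip the mirrored dab would be painted with `mirror_mask(mask)` rather than the original mask.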

Softbrush & spraybrush: n-pigeon – Watercolor

Week 8: Vectorization cancelled

March 22nd, 2010

In week 8 I started working on a part of Krita I had never touched before. It is actually a library inside KOffice called Pigment. Pigment is responsible for color management. It contains some colorspaces we actually don’t use in Krita because they are too simple for Krita’s needs. They are intended for use by other applications in KOffice. So far those applications do not have color management, but they can if the developers want it. Pigment contains many useful classes which operate on the color in some colorspace in many ways, like doing the math or computing the histogram. It also contains the implementation of the composite operations. And this was where my interest lay, as we wanted to try to vectorize the composite operations by using SSE instructions and the vectorization feature in GCC 4.x.

So first I started to write benchmarks for the various composite operations. And then I started to work with the GCC feature. Vectorization of the composite operations is already implemented in GEGL by Øyvind Kolås. GIMP 2.6.x is also using MMX, SSE and SSE2, so I had inspiration and I was trying to map it to our implementation of composite operations. GEGL is using nice code, but GIMP is using assembler directly. I don’t have much experience with assembler. I wish I had taken the assembler lessons last year at the university. It is not hard to code something, but it is hard to do it correctly. I read somewhere on the Inkscape mailing list that they had some assembler code to speed up some work, but it ended up being slower than the code optimized by the compiler.

I had quite a hard time and I did not manage to implement the vectorization, even with the help I regularly got from Cyrille, boud, other Krita hackers and a GEGL hacker. We stopped on Wednesday because we discovered that the issue is more complicated than we thought and it would require much more than two days to finish. Maybe another week or even weeks. And the result might not be faster. One of the problems we discovered was that the RGBA 8-bit colorspace uses the unsigned char data type (quint8 in Qt) for memory storage, but when you do a composite operation, you have to cast it to a bigger data type like int32. Why? If you have a pixel in quint8 with value 255 and another pixel with value 200 and you add or multiply them, you overflow the data type. And the result is bad. If you cast to the bigger type, you have solved that issue. I studied the GIMP code and how it is implemented there. It is solved using MMX instructions for this case. The MMX technology supports both saturating and wraparound modes. You can read about that in more detail here. Another issue was that GIMP does not compute every composite operation with vectorization; only some composite operations are implemented this way.
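The 255 and 200 example can be worked through in a few lines. Python integers never overflow, so the 8-bit wraparound is simulated here with a mask; the function names are just for illustration.

```python
def add_wraparound(a, b):
    """What an 8-bit register keeps after an overflowing add."""
    return (a + b) & 0xFF


def add_saturating(a, b):
    """Saturating add: the result is clamped to the largest 8-bit value."""
    return min(a + b, 255)


def multiply_u8(a, b):
    """Multiply two channel values treated as fractions of 255.

    The intermediate product needs up to 16 bits (255 * 255 = 65025),
    which is why the operands are widened before the operation.
    """
    wide = a * b
    return (wide + 127) // 255  # rounded division back to 0..255


print(add_wraparound(255, 200))   # 199, clearly wrong for a color
print(add_saturating(255, 200))   # 255
print(multiply_u8(255, 200))      # 200
```

Wraparound turns a very bright sum into a mid-gray value, while the saturating mode keeps it at full white, which is the behaviour you want for colors.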

So we decided to start working on the next item in the action plan, which is mirroring of the canvas and possibly rotation. On Thursday and Friday I was back in the canvas code. So far I have working code for mirroring the events from input devices. Now the hardest part will be to implement mirroring in the projection. The projection is the code responsible for correctly displaying the zoomed image. It computes the image you see in the canvas when you scroll or move, when some tool paints its outline or when some part of the image is changed by some tool. The task will continue for the OpenGL canvas too, as we have two canvases in Krita.

You see, the vectorization week did not bring any speedup, but I don’t want you to be sad, so I decided to write a blog post about other Krita work I do in my spare time. Read it here.

Week 7: 12 times faster smudge

March 15th, 2010

This week of the sponsored Krita developer was a little shorter. On Monday I traveled back from Holland to Slovakia. I was at the Krita Hackfest 2010. We worked quite hard there, so we decided that I would take a break for 2 days. So I started to work on Thursday and finished on Friday.

As I wrote in my previous blog post, I started to work on the smudge brush, because according to the bug report it was horribly slow. Although I wrote many brush engines, this one is not mine. But I have a lot of experience with brush engines, so I hoped to use it.

Thanks to the consultations with Cyrille Berger, the author of this brush engine, I managed to speed up the smudge. I removed an unnecessary memory device and then I had to write the famous custom bitBlt function which takes selections saved in a different memory class into account. The performance bottleneck was the transformation of our paint device into a selection.

Some time ago we introduced the fixed paint device, which should speed up the composition of the brush masks because it is a lightweight memory device, but in the smudge it introduced some workarounds which slowed down the performance. I removed those workarounds, but it was complicated, which is why nobody did it before. I ported some of my paintops to use the fixed paint device because according to our benchmarks it is really faster.

Here is the table with the performance times of smudge in our KisStrokeBenchmark, where the brush engine draws one big stroke:

1. 7,519 msec per iteration (total: 7519, iterations: 1)
2. 5,275 msec per iteration (total: 5275, iterations: 1)
3. 1,202 msec per iteration (total: 1202, iterations: 1)
4. 653 msec per iteration (total: 653, iterations: 1)

You can see that the speedup factor is almost twelve. The table lists the runs made while optimizing, from the initial time to the final optimization.

On Friday I was done, so I started to work on a vectorization of the compositing operations. Sounds cool, nah? First I wrote benchmarks for our composite operations and now I will continue by writing example code which uses the vectorization in gcc4.x. Here is what I’m going to study and use. The point is to speed up the composition of course.
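The factor follows directly from the first and last rows of the table. A short check, using the times listed above:

```python
# Times in msec per iteration, in the order they were measured.
timings_ms = [7519, 5275, 1202, 653]


def speedups(timings):
    """Speedup of every measurement relative to the first one."""
    first = timings[0]
    return [first / t for t in timings]


for step, factor in enumerate(speedups(timings_ms), start=1):
    print(f"step {step}: {factor:.2f}x")
```

The last line printed is 11.51x, so "almost twelve" is a slightly generous rounding of about eleven and a half.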

Week 5&6: Krita hackfest

March 7th, 2010

I’m writing this blogpost from Boud’s kitchen. After 10 days of the Krita hackfest I feel quite dizzy. What was I doing?

According to the action plan, these two weeks could be spent on some unfinished business. So we decided to work further on the Adobe Photoshop brush integration. I fixed the type of the brush internally. Photoshop brushes for Krita consist of the alpha-map and the presets of Photoshop’s brush engine. I ported the GIMP plug-in which loads the bitmaps in the abr brush. But the presets were not parsed.

Krita team: Where is the vision?

Frob wrote the Python script which I mentioned previously in my blogpost. Frob and Alexandre Prokoudine are working on reverse engineering Photoshop’s resources like brushes and gradients. Their work also includes parsing of the brush engine presets. So I ported the script from Python to Qt/C++ for Krita. The last step is to integrate the script. That has not been done yet.

Then I arrived in Deventer, the Netherlands. Dmitry Kazakov was waiting for me at the airport and we went together to Deventer from Amsterdam. The weekend was gone so fast. We discussed a lot of issues together with Peter Sikking, an interaction designer. That was very, very interesting for me. I also met other Krita hackers I hadn’t met yet, like Adam or Vera. It is nice to be together, because you can be very productive. Together with Dmitry we fixed a bug in a few minutes. It would have taken much more time if we had done it like Report Bug -> Assign -> Fix.

Week 6 was the hackfest for me, Sven, Cyrille and Boud here in Deventer. We decided to give 1/4 computation of the brush mask a try to get faster painting. This time it was implemented correctly. Before, I tried to copy 1/4 of the mask to the other parts of the mask. This time I decided to compute the brush mask with bi-linear interpolation. I spent 2 days on it, fixing some artefacts that started to occur. The final measurements showed that the 1/4 computation is more expensive than the function itself, as this time atan2 was not involved. Another benchmark with atan2 involved showed that the code is equally fast as the version without the interpolation. Maybe some bigger brushes will be faster, but I need to investigate. The code is in trunk, we just need to find a use for it.

Another optimization was for the flood fill tool. Write a benchmark, run valgrind, identify the bottlenecks, try to fix them, do it a few times and I was done. Cyrille helped me a lot; he is a very good developer and he always had some good advice on how to get a speedup. Finally I managed to make flood fill run 3.5 times faster.
Here is a table where you can see how the time dropped:

Start of the optimization
1. 5,921 msec
2. 5,752 msec
3. 5,574 msec
4. 3,581 msec
5. 1,711 msec
End of the optimization

The flood fill is still quite slow in Krita. I tested on a 4096×4096 pixel image. The performance problem now is that there is a difference function used in the flood fill algorithm and that function converts the pixels to LAB. I suppose that in GIMP it is probably done in the RGB 8-bit colorspace. But in Krita there are many colorspaces and the best way to compute the difference is on the L channel in LAB, as Cyrille told me.

Then we were in Amsterdam in the Blender studios. We met Ton Roosendaal and the Durian project team. It was exciting for us. I managed to talk with Angela, the Durian artist from Canada. I had a nice chat with her, a lovely person. I also managed to talk with Ben, the artist who tried Krita and also produced some art with it. I was in Amsterdam for the first time. I managed to see arrogant Amsterdam bikers, nice architecture and the canals. We had a long but nice walk around the town.

At the hackfest we discovered more performance issues. So I spent some time on the smudge brush. I started with simple fixes and then moved on to some complicated issues. I’m writing a bitBlt function for our fixed device. The fixed device is used for brush masks and it is faster than using a paint device with tiles, as the brush masks are usually small and they do not require tiles and undo. We need another bitBlt function for the case when the selection is stored in a fixed device. So far the smudge is 1.42x faster.
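To show where the difference function sits in the algorithm, here is a minimal flood fill sketch. It is not the Krita implementation: the names are hypothetical, the image is a plain 2D list of lightness-like values, and the difference is a simple absolute difference standing in for a comparison on the L channel.

```python
from collections import deque


def flood_fill(image, start, new_value, threshold, difference):
    """Fill the 4-connected region around `start`.

    A pixel belongs to the region when difference(pixel, seed) is at most
    `threshold`. The difference function runs once for every candidate
    pixel, so its cost dominates on big images.
    """
    height, width = len(image), len(image[0])
    sx, sy = start
    seed = image[sy][sx]
    visited = [[False] * width for _ in range(height)]
    visited[sy][sx] = True
    queue = deque([(sx, sy)])
    filled = 0
    while queue:
        x, y = queue.popleft()
        image[y][x] = new_value
        filled += 1
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and not visited[ny][nx]:
                visited[ny][nx] = True  # every pixel is tested only once
                if difference(image[ny][nx], seed) <= threshold:
                    queue.append((nx, ny))
    return filled


def lightness_difference(a, b):
    # Stand-in for a difference computed on the L channel
    return abs(a - b)
```

On a 4096×4096 image the difference function may be called millions of times, so any colorspace conversion inside it shows up directly in the benchmark.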

Krita headquarter

The Krita guys also started to work on various stuff that Peter Sikking proposed, e.g. we have a new widget and a scratch box, so I had to fix the paintops too. I also rewrote the deform brush. I’m trying to unify the brush engines in Krita and I wanted to add some new features to old brush engines, so a rewrite was needed. The code was almost one and a half years old. I did a lot of other work on brush engines too, usually many fixes etc., but that is for another blogpost, maybe later.

I would like to thank Boudewijn and Irina for taking care of us. The home-made food was the “best thing ever”, as Vera would say. I had a nice time with the Krita team at the Rempts’ house. Thank you for inviting me!

Busy week. I’m really tired and I’m looking forward to some rest and my home.

Wacom tablet does not work in Qt

February 27th, 2010

I’m in Deventer, where we are having the Krita sprint. I’m trying to implement some new stuff in Krita and I can’t :’( The tablet stopped working for me in Krita and basically in Qt. I don’t see tablet events (the Qt example in widgets/tablet does not show anything), so no pressure, no tilt, just mouse events.

I have Fedora 12 and Qt 4.6.2. linuxwacom was replaced by xorg-x11-drv-wacom, so there are no utilities like wacdump or xidump now. I removed the old configuration lines in xorg.conf. They were needed a long time ago, when I used to set up my Wacom tablet there. I depend on evdev now. The tablet works in mypaint and in Gimp. I have pressure support there.

I don’t have tablet pressure in the Qt example in qt4/examples/widgets/tablet nor in Krita.

Cyrille proposed to check this command:

xlsatoms | grep -i wacom
303     Wacom Tablet Area
304     Wacom Rotation
305     Wacom Pressurecurve
306     Wacom Serial IDs
307     Wacom Strip Buttons
308     Wacom Wheel Buttons
309     Wacom TwinView Resolution
310     Wacom Display Options
311     Wacom Screen Area
312     Wacom Proximity Threshold
313     Wacom Capacity
314     Wacom Pressure Threshold
315     Wacom Sample and Suppress
316     Wacom Enable Touch
317     Wacom Hover Click
318     Wacom Tool Type
319     Wacom Button Actions
430     Wacom Stylus
431     Wacom Cursor
432     Wacom Eraser

So it looks ok. Have you got any idea how to fix it? It seems that it is a Fedora issue; other Krita developers have working tablet support out of the box in OpenSuse. I checked if the tablet support is compiled in on Fedora, and it is.

Krita hackers are offering me a CD of OpenSuse. I like my Fedora, I am not giving up so far… Please, help if you can ;)

Sven Langkamp has the same problem as I do; he is also on Fedora 12 these days.

Update 28.2.2010:
I filed a bug report for Fedora https://bugzilla.redhat.com/show_bug.cgi?id=569132

Also for Qt http://bugreports.qt.nokia.com/browse/QTBUG-8599

UPDATE: It works now with some patch. Fedora will ship some update soon. Thanks to Thomas Zander for the patch and Fedora guys for providing me test build!

Week 4: Optimizing iterators

February 22nd, 2010

So what happened last week? Except that Slovakia upset Russia in Hockey at Vancouver, Slovakia is also fixing Krita performance thanks to the donors, from all around the world, who made that possible. Enough about nationality, let’s talk about what happened.

The plan for the last week was to optimize iterators. We use tiled buffers for our memory management. You usually need an undo function in an image processing application and storing the whole image is not a good idea. You rather break the image into tiles, rectangular parts, and you store and restore them. But this affects the whole application: you have to iterate over pixels in filters, and you have to paint with the brush, which means compositing one image over another image. So basically iterators are very important for Krita. Their speed is a priority of course.

We knew, from the measurements I did in the first week, that fetching those tiles is a slow process. That’s why we have a rectangle iterator which iterates the image tile by tile. But sometimes you need to iterate the image in lines, e.g. in the filters or under some special conditions when you need access to the pixel before and after in the same line.

That’s why we have line iterators: a vertical and a horizontal iterator. I was optimizing the line iterators. Instead of fetching the tiles every width/height of the tile, we now pre-fetch the tiles into a buffer, lock them and work with them until we can skip to the next row of tiles. The tiles are 64×64 pixels big, by the way. Another important thing is that Krita is threaded: one thread is doing the projection updates (the final image you see in Krita) and another can work on some other stuff like painting or filtering. So we have to lock and unlock the tiles. I patched the vertical and the horizontal iterator.
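The pre-fetching idea can be illustrated with a toy tiled image that counts how often a tile is fetched. Everything here is a simplified assumption (class and function names, a dictionary as the tile store, no locking); only the 64×64 tile size comes from the text.

```python
TILE = 64  # tile edge length in pixels


class TiledImage:
    def __init__(self):
        self.tiles = {}
        self.fetches = 0

    def fetch_tile(self, col, row):
        """Look up (or create) a tile; counted because this is the slow step."""
        self.fetches += 1
        return self.tiles.setdefault(
            (col, row), [[0] * TILE for _ in range(TILE)])


def iterate_lines_naive(image, width, height):
    """Fetch a tile again every time a line crosses into it."""
    for y in range(height):
        for col in range((width + TILE - 1) // TILE):
            line = image.fetch_tile(col, y // TILE)[y % TILE]
            for x in range(min(TILE, width - col * TILE)):
                yield col * TILE + x, y, line[x]


def iterate_lines_prefetched(image, width, height):
    """Fetch each row of tiles once and reuse it for all lines inside it."""
    cols = (width + TILE - 1) // TILE
    for row in range((height + TILE - 1) // TILE):
        row_tiles = [image.fetch_tile(col, row) for col in range(cols)]
        for ty in range(min(TILE, height - row * TILE)):
            for col, tile in enumerate(row_tiles):
                line = tile[ty]
                for x in range(min(TILE, width - col * TILE)):
                    yield col * TILE + x, row * TILE + ty, line[x]
```

For a 128×128 image the naive version fetches tiles 256 times and the pre-fetched version only 4 times, while both visit the pixels in the same order. In a threaded program the pre-fetched tiles would also have to be locked while the iterator holds them.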

Ben Dansie: Monkey

We managed to get iterating from 1.28x to 6.21x faster. Cyrille Berger then suggested that we have a quite complicated structure for iterators with many function calls, so let’s try a new API for the iterator with fewer function calls. So my task was to code, or better said port, the horizontal iterator to the new API. The iterator is ready for testing. According to the benchmarks, the new iterator is 1.5x faster than the old API. Cyrille also added a lot of custom magic to it so that it is faster. We are closer to the data manager speed, which limits the speed of iterators.

On Monday I also fixed the abr brush so that we can paint with abr brushes, but the work is not finished yet. This week I will work on it and I will also try to use the new iterator in KisPainter for compositing and explore the time results. Then at the end of the week I fly to the Netherlands, where we are having our Krita sprint in Deventer, in Boudewijn’s home. I will spend 11 days there with other Krita hackers. So there is a lot of coding in front of us. We will be visiting the Blender team too. I’m looking forward to it!

So far we make progress, Sven Langkamp made me happy saying that the speedup for painting even with some filters is amazing now compared to what it was before. Also the artist, Ben Dansie,  from the project Durian, the Blender project of the open movie, used Krita for making some art. Cool, isn’t it? Now back to rediscover your Krita code ™.

Week 3: Photoshop brushes

February 15th, 2010

Last week was devoted to adding support for abr and vbr brushes. The abr file format comes from Adobe Photoshop, the latter comes from Gimp. The motivation for this was the big collections on the web. Sites like deviantart.com are full of artists’ collections of abr brushes. The motivation is also interoperability with GIMP and with the proprietary Adobe Photoshop. vbr brush support was postponed because it was too much work just to support abr brushes.

On the first day I started with exploring the abr format, so that we would know each other a bit. I started to write simple code to parse it, but I found out that the issue is quite complicated. Valek Filippov is working on a Python script which parses abr brushes, so I started with that one. Important note: we support only brush format v1, v2 and v6.2 (Photoshop CS2). The newest format is not documented at all and there is no open-source implementation around. It would require a lot of reverse engineering, and that would need much more time than one week.

The next day I realized that it would go too slowly to just work with the script and I felt that we needed to draw soon. So Boudewijn and I googled for other sources and we found a GIMP plug-in which is a maintained version of the work based on this code. It parses just the bitmaps from the abr file format. That’s the basic thing you need to parse out of the brush file format. There are many other options saved in abr brushes, like the settings of the brush engine (attributes like diameter, spacing, angle…), but we don’t parse them yet. Valek Filippov’s script does. We have a chance to have better support for the brushes, as the settings look like our brush engine presets.

I was porting the ANSI C code, which used glib, to Qt. It took two days to port it correctly. The next two days were devoted to integration into Krita. The architecture for the brushes is powerful, but it is not easy, so I spent time investigating how it works and what you need to do so that your brushes are correctly displayed and painted. We load all the brushes into memory when Krita starts. Krita uses a nice resource architecture, but it did not have support for importing one file as many resources. An abr brush consists of many bitmaps, which we use as brush masks so far. Sven Langkamp helped with that. Boudewijn Rempt helped a lot by doing preliminary integration code and then I was fixing it so that it works. We decided to reuse the brush chooser dialog we use for gbr brushes (another Gimp brush format). For the future it would be cool to have more features in the brush chooser dialog, or maybe we will need to write a different widget for abr brushes. Some categories would be useful.

Painting with abr brushes does not work yet. I will work on the bugs. We get the correct image in the brush generation code, but we paint a plain colored rectangle instead. There are some minor bugs that need to be fixed too.

For the future it would be useful to support the attributes of the brush engine saved in the format. Our brush engine should support most of them, maybe some features will have to be coded and some are hard to implement as there is no specification for them.

Here is screenshot showing loaded abr brush into our brush chooser. The code is already in trunk, so you can at least see your brushes in the chooser. I hope we will be able to paint tonight!

13:57 update: The problem is fixed and now I paint with the abr brushes!

Brush chooser with Abr brush masks

Week 2: Optimized Autobrush. Faster painting.

February 8th, 2010

The second week of our action plan for optimizing Krita was devoted to optimizing painting in Krita. Although there are many great paintops in Krita, digital painters tend to use the simple default brush engine, which we call Pixel Brush, most of the time. Painters can use GIMP brushes here, or the text brush, but the most used brush tip is called Autobrush. You can set up brush attributes like shape (circle, rectangle) and you can change the ratio to get an ellipse. Then you can change softness by vertical and horizontal fading. If you play with spikes and ratio, you get stars and other funny shapes. The brush has many dynamic attributes thanks to Cyrille Berger’s work on a concept called sensors. E.g. the tablet can control the size by pressure, by tilt or anything you want (e.g. the drawing angle is interesting) and it can be tuned by a curve.

The brush mask, which is stamped on the canvas at regular intervals as you stroke, is computed by KisCircleMask::valueAt(). It is a computationally expensive function according to the valgrind logs we made a week ago and many times before. David Revoy, a team member of the Durian project, said that using a 70px brush on a 2500×2500 image was very slow in Krita. So we needed to optimize that.

I started with exploration of the code. I’m not the author of the autobrush, though I did most of the paintops in Krita (10 paintops out of 19 are mine, many are experimental). The first catch was the interpolation in the brush mask computation. We called valueAt() 4 times per pixel of the brush mask. Cyrille and I found out that the valueAt function used to take integer parameters a long, long time ago and the values at the fractional (double) brush mask pixel positions were computed with interpolation. So I decided to remove the interpolation, as the function has been capable of taking double input for a long time. And the results of valueAt() are more precise than interpolation. The benefit was great. Painting was 4x faster. The benchmark for random lines with size changing according to pressure dropped from 18 seconds to 4 seconds on a 4096×4096 image. Check it in the wiki; the related table is called Just with performance fix.

From the valgrind logs we noticed that the atan2 function is called too often. Chickenpump suggested some cool old-school tricks in the comments, so we gave that a try. I used double hashing with a QHash inside a QHash for the 2D function atan2, but that was very slow due to the low cache hit ratio and expensive hashing. Then Cyrille posted some links to free code which implemented a fast atan2 with an internal lookup table, so I ported that code to Krita. Cyrille did some magic stuff like fixed-precision computation at library loading time and some little tune-ups to speed up the fast atan2 computation, and we managed to get painting around 1.3x faster than without the fast atan2 function. There is probably some more room for optimization, as the fast atan2 implementation uses quite a small lookup table. I also tested another implementation, but it had a precision problem: an error of 3 degrees. That is too much for us, so I dropped it.

I remembered a quite interesting magic function for the fast inverse square root used in Quake III. I gave it a try, as we use the inverse square root in valueAt() too. I found out by benchmarking that the fast inverse square root is slower than computing the inverse square root directly (1/sqrt(x)). It used to be 4 times faster a long time ago. Probably Intel has already implemented that in processors, or the compiler could not optimize the fast version as effectively. Again, we dropped that.
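For readers who have not seen the trick, a lookup-table atan2 can be sketched roughly like this. This is not the code we ported to Krita: the class name FastAtan2, the table size of 1024 and the nearest-entry lookup are all assumptions made for illustration, and it uses plain doubles rather than fixed precision.

```cpp
#include <cmath>
#include <vector>

constexpr double kPi = 3.14159265358979323846;

// Illustrative lookup-table atan2. The table stores atan(t) for t in [0, 1];
// every other direction is reduced to that range by symmetry.
class FastAtan2
{
public:
    explicit FastAtan2(int tableSize = 1024)
        : m_table(tableSize + 1)
    {
        for (int i = 0; i <= tableSize; ++i) {
            m_table[i] = std::atan(static_cast<double>(i) / tableSize);
        }
    }

    double operator()(double y, double x) const
    {
        const double ax = std::fabs(x);
        const double ay = std::fabs(y);
        if (ax == 0.0 && ay == 0.0) {
            return 0.0;
        }
        const int last = static_cast<int>(m_table.size()) - 1;
        // Keep the ratio in [0, 1] by dividing the smaller by the larger.
        const bool swapped = ay > ax;
        const double ratio = swapped ? ax / ay : ay / ax;
        double angle = m_table[static_cast<int>(ratio * last + 0.5)];
        if (swapped) {
            angle = kPi / 2.0 - angle;  // second octant of the first quadrant
        }
        if (x < 0.0) {
            angle = kPi - angle;        // left half-plane
        }
        return y < 0.0 ? -angle : angle; // lower half-plane
    }

private:
    std::vector<double> m_table;
};
```

With 1024 entries and nearest-entry lookup the error stays below roughly 0.0005 radians; a smaller table is faster to build and friendlier to the CPU cache, but less precise, which is exactly the trade-off mentioned above.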
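For reference, the well-known trick looks roughly like this. The sketch uses memcpy instead of the original pointer cast so that it stays valid C++; the function names are mine, and it assumes a positive input and a 32-bit IEEE float.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// The Quake III style approximation of 1/sqrt(x) for positive x.
float fastInvSqrt(float x)
{
    const float half = 0.5f * x;
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // reinterpret the float as an integer
    bits = 0x5f3759df - (bits >> 1);       // magic constant gives a first guess
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y * (1.5f - half * y * y);      // one Newton-Raphson refinement step
}

// The straightforward version it was benchmarked against.
float plainInvSqrt(float x)
{
    return 1.0f / std::sqrt(x);
}
```

The approximation is only good to a fraction of a percent after one refinement step, so if it is not faster there is no reason to keep it.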

Most of the use cases for painting include brush masks which are symmetrical. The algorithm could compute just 1/4 of the mask. Next step was implementing this.

The first version used 4 pointers into memory: it computed 1 pixel and copied it to the 3 corresponding positions in the other regions. I managed to get another 1.7x speed-up (from 3.555 ms to 1.9 ms).

Memory access is very important and can slow down computation. It is like when you use the setPixel/pixel methods to access pixels in a pixel buffer: you are supposed to use scanlines, which is faster. Here is some interesting article about it. If you don’t have something to read, here is also a nice CPU memory bible.

4 iterators. The first version used 4 iterators over the image pixels: one computation per pixel, then copy the values.

So I decided to make it a little faster just by using two pointers. I compute 1/4 of the mask and copy this part to the NW region. Then I copy the rows into the lower part of the mask in the correct order, which means I mirror them.

2 iterators version. The improved version used two iterators and a memcpy for the second half of the brush mask.

I found out on Friday evening that it does not work, though. Circle masks seem symmetrical from the user’s point of view, but they are not. The brush mask respects sub-pixel precision in Krita, so the edge pixels of the circle are not symmetrical. The sub-pixel precision is visible when you work at a big zoom level. I have an idea for computing 1/4 of the brush mask, but I decided to postpone it.
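The two-pointer idea can be sketched like this. It is a simplified illustration, not the Krita code: maskValue() is a hypothetical soft circle, the mask is a plain 8-bit buffer, and the sketch assumes that the brush centre sits exactly in the middle of the mask.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical mask function: soft circle, 255 in the centre, 0 outside.
std::uint8_t maskValue(double dx, double dy, double radius)
{
    const double d = std::sqrt(dx * dx + dy * dy) / radius;
    return d >= 1.0 ? 0 : static_cast<std::uint8_t>(255.0 * (1.0 - d) + 0.5);
}

// Reference: compute every pixel of the mask.
std::vector<std::uint8_t> fullMask(int size)
{
    std::vector<std::uint8_t> mask(size * size);
    const double c = (size - 1) / 2.0;
    for (int y = 0; y < size; ++y) {
        for (int x = 0; x < size; ++x) {
            mask[y * size + x] = maskValue(x - c, y - c, size / 2.0);
        }
    }
    return mask;
}

// Compute only the top-left quarter, mirror each value to the right half of
// the row, then memcpy the finished row to its mirrored row at the bottom.
std::vector<std::uint8_t> quarterMask(int size)
{
    std::vector<std::uint8_t> mask(size * size);
    const double c = (size - 1) / 2.0;
    const int half = (size + 1) / 2;
    for (int y = 0; y < half; ++y) {
        std::uint8_t *top = &mask[y * size];
        for (int x = 0; x < half; ++x) {
            const std::uint8_t v = maskValue(x - c, y - c, size / 2.0);
            top[x] = v;
            top[size - 1 - x] = v;
        }
        const int mirroredRow = size - 1 - y;
        if (mirroredRow != y) {   // the middle row of an odd mask has no twin
            std::memcpy(&mask[mirroredRow * size], top, size);
        }
    }
    return mask;
}
```

With the centre exactly in the middle, quarterMask(n) and fullMask(n) produce identical buffers, and only a quarter of the maskValue() calls are needed.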

Other possibilities are still around:

  1. Mip-mapping: pre-compute various levels of the brush mask into a buffer and interpolate between the masks. We do this for the GIMP brush. We would interpolate two pre-computed brush masks instead of computing a single mask. Maybe it would be faster, maybe not. The reason for the mipmap in our GIMP brush painting was to increase the quality of scaled brushes, as Adrian Page, a hacker in the Krita team, wrote me in an email. Mip-mapping would require splitting the rotation from the mask computation, which can lead to different results for the brush mask. Right now the rotation is part of the mask computation; with mip-mapping we would take the un-rotated mask and rotate it by image processing (rotating the image). Some conformance rendering tests would be needed. The advantage would be support for rotation of GIMP brushes in Krita.
  2. Cache the brush mask for mouse users: cache the dab as long as the mask doesn’t change. This would be nice if we did not compute sub-pixel precision. But we do that, so the cache hit ratio would be very small. It would be usable at 100% zoom, when the sub-pixel position is zero. And of course a big condition checking for parameter changes would be required.
  3. Compute the mask on the graphics card using shaders: that would be cool, and I have some initial experience with shaders, but the integration would be harder and probably too experimental for our plan. I’m mentioning this because we discussed it with Sven Langkamp in Oslo, and so that it is not forgotten.
  4. We will probably do some garbage recycling: memory allocation is slow, so we can benefit from recycling memory. It is a matter of discussion on IRC at #krita on freenode. You are welcome to join.
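To illustrate the second possibility and why sub-pixel precision hurts it, here is a minimal one-entry dab cache. Everything in it (DabKey, makeKey, DabCache, the soft-circle mask) is hypothetical and only shows the idea: the cached mask can be reused only when the diameter and the sub-pixel offset are the same as for the previous dab.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// The parameters that decide whether a cached mask can be reused.
struct DabKey {
    int diameter;
    double fracX;   // sub-pixel offset of the dab centre, in [0, 1)
    double fracY;
    bool operator==(const DabKey &o) const
    {
        return diameter == o.diameter && fracX == o.fracX && fracY == o.fracY;
    }
};

DabKey makeKey(int diameter, double x, double y)
{
    return DabKey{diameter, x - std::floor(x), y - std::floor(y)};
}

// Remembers the last computed mask and counts hits and misses.
class DabCache
{
public:
    const std::vector<std::uint8_t> &mask(const DabKey &key)
    {
        if (m_valid && key == m_key) {
            ++hits;
        } else {
            ++misses;
            m_mask = computeMask(key);
            m_key = key;
            m_valid = true;
        }
        return m_mask;
    }

    int hits = 0;
    int misses = 0;

private:
    // Hypothetical soft circle whose centre is shifted by the sub-pixel offset.
    static std::vector<std::uint8_t> computeMask(const DabKey &key)
    {
        const int n = key.diameter;
        std::vector<std::uint8_t> m(n * n);
        const double cx = (n - 1) / 2.0 + key.fracX;
        const double cy = (n - 1) / 2.0 + key.fracY;
        for (int y = 0; y < n; ++y) {
            for (int x = 0; x < n; ++x) {
                const double d = std::sqrt((x - cx) * (x - cx) + (y - cy) * (y - cy)) / (n / 2.0);
                m[y * n + x] = d >= 1.0 ? 0 : static_cast<std::uint8_t>(255.0 * (1.0 - d) + 0.5);
            }
        }
        return m;
    }

    bool m_valid = false;
    DabKey m_key{};
    std::vector<std::uint8_t> m_mask;
};
```

Dabs at whole-pixel positions, such as (10, 10) followed by (11, 12), share a key and hit the cache. As soon as a position like (11.25, 12) comes in, the key differs and the mask has to be recomputed, which is what happens almost all the time with a tablet or at zoom levels other than 100%.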

The final time of the computation in the benchmark for random lines is 1,449.2 ms. It dropped from 18,576 ms, so the painting was 16 times faster. But I reverted the 1/4 brush mask speed-up, so the current time is 3.555 ms and painting will be 6 times faster. The speed is considered to be usable for big brushes now. I invite you to check out trunk and try to play with big brushes. 200 px is now very usable on my laptop. What about yours?

I updated my WordPress blog. I dropped the previous classic WordPress theme and selected the default one (lazy developer). I did not like the font in the previous one, and I don’t have much time to play with web design these days. But at least I customized the default Kubrick theme: I changed the fixed width of the theme to wider values, and I also made a simple custom header with some random strokes from my paintops in Krita. I hope you will like it. Every image in the blog post is made in Krita.

First week of Krita full-time hacking

February 1st, 2010

I finished all my exams on time, so last Monday I could start working full-time on Krita. I’ve now been at it for a full week! How is it going?

According to the plan, the first week was aimed at measuring the speed of Krita. We talked about our bottlenecks on IRC regularly. We also talked about them in Oslo. But we didn’t have any numbers. Sven Langkamp did some benchmarks using QTime on the iterators, there were some performance tests scattered around the unit tests, and so on. Boudewijn decided to use the new benchmark features from Qt 4.6. So I wrote 10 benchmark classes, where we benchmarked things like the internal memory management for image data, which is our tile engine, and the access classes called iterators, which give us access to the pixels. They allow us to iterate over pixels in various ways: vertically, horizontally, in small rectangles, randomly etc.

Another important thing is compositing. That is the work of a class called KisPainter (something like QPainter in Qt, but with different, complicated features not available in QPainter). We benchmarked the speed of the bitBlt operation with two types of memory storage. The first is KisPaintDevice and the second one is called KisFixedPaintDevice. The latter is a lightweight version of the former. It is similar to QPaintDevice in Qt, but again with more complicated features.

Øyvind Kolås a.k.a. pippin, a GEGL developer, is around. Pippin shared his knowledge about benchmarking with us on IRC (btw come and visit us at #krita on irc.freenode.net). We decided to make tests for the blur and brightness/contrast filters. The first one is a convolution filter; the second one is there because we wanted to be able to compare with GIMP.

We also made a benchmark for the image projection. The projection benchmark loads an image in Krita’s native format, computes the whole image constructed from various types of layers like group, filter, adjustment layer etc. and in the end we again save the file into native Krita format. Our focus through this plan is to speed up painting. So we can’t avoid stroke benchmarks.

Result image from the benchmark of the strokes

Thanks go to Sven Langkamp for his work on saving and loading presets. Using paintop presets, we can use the benchmark code I wrote for any paintop. We benchmarked our default autobrush paintop, which is the paintop digital painters use most. I’m very happy about this benchmark, as I can test my other paintops easily: all I need to do is create a preset for a paintop, save it in Krita and run the benchmark with the preset.

The results of the benchmarks are on our wiki.

So how does it look? The data manager, which is responsible for undo/redo and basically for storing and retrieving data, is fast. It allows us to read/write data at very high speeds: from 1333.3 Mb/s to 1628.4 Mb/s, according to the benchmark results. We benchmarked on a 4096×4096 RGBA image (64 Mb) which was read/written 100 times. For comparison, a memcpy between two buffers of the same size as the image runs at almost the same speed. There is benchmark code for memcpy in KisDatamanagerBenchmark; you can try it yourself if you want :)

The horizontal and vertical iterators are 11 times slower than the rectangular iterator. The reason is that there is no caching. From the valgrind logs we see that fetching and switching tiles is very slow, so we need to implement caching there. Every 64 pixels a new tile is fetched and switched to. Why? Our tiles are 64 pixels in size. This slows down the iteration. We will add caching of tiles to both iterators to avoid the switching and fetching; we don’t do that now. The rectangular iterator is quite fast and does not offer so many opportunities for improvement. The random iterator, on the other hand, is the slowest one. Again, fetching and switching tiles is expensive. Some caching strategy would be handy for moving around the image, but the use case for the random accessor is different, so the cache strategy should be somehow adaptive. The random accessor is 13 times slower than the rectangular one.

The compositing operation of KisPainter, also known as BitBlt, is used very often. There is room for improvement, because it currently uses a slow iterator, the random iterator. The speed is very decent, but we will try to make it a lot faster.

Filters are very slow compared to GIMP, at least according to the numbers pippin provided. But first we need to fix underlying issues like the aforementioned iterators to improve their performance. We blur the image at a speed of 0.8 Mb/s. Here we need to optimize the convolution painter. The speed could also get better when we optimize the horizontal iterator.

Strokes are slow because the brush mask has to be recomputed for every dab, that is every time the brush touches the canvas at a certain point. That’s slow. You can see from the valgrind log that the math function atan2 is slow. We have to cache the result to avoid this.
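To show what kind of caching I mean, here is a toy model. It is not Krita’s tile engine: TileStore, Tile, fillUncached and fillCached are made-up names, and the store simply counts how often a tile is fetched. The uncached walk fetches a tile at every 64-pixel boundary of every row; the cached walk fetches each tile once per band of 64 rows and reuses the pointers.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

constexpr int kTileSize = 64;

struct Tile {
    Tile() : pixels(kTileSize * kTileSize, 0) {}
    std::vector<std::uint8_t> pixels;
};

// Toy tile store: creates tiles on demand and counts every fetch.
class TileStore
{
public:
    Tile *fetch(int col, int row)
    {
        ++fetchCount;
        return &m_tiles[std::make_pair(col, row)];
    }
    int fetchCount = 0;

private:
    std::map<std::pair<int, int>, Tile> m_tiles;
};

// Row-by-row walk without caching: a fetch at every tile boundary of every row.
void fillUncached(TileStore &store, int width, int height, std::uint8_t value)
{
    for (int y = 0; y < height; ++y) {
        Tile *tile = nullptr;
        for (int x = 0; x < width; ++x) {
            if (x % kTileSize == 0) {
                tile = store.fetch(x / kTileSize, y / kTileSize);
            }
            tile->pixels[(y % kTileSize) * kTileSize + (x % kTileSize)] = value;
        }
    }
}

// Same walk, but the tiles of the current band of 64 rows are fetched once
// and the pointers are reused for all rows in that band.
void fillCached(TileStore &store, int width, int height, std::uint8_t value)
{
    const int cols = (width + kTileSize - 1) / kTileSize;
    std::vector<Tile *> rowCache(cols, nullptr);
    for (int y = 0; y < height; ++y) {
        if (y % kTileSize == 0) {
            for (int c = 0; c < cols; ++c) {
                rowCache[c] = store.fetch(c, y / kTileSize);
            }
        }
        for (int x = 0; x < width; ++x) {
            rowCache[x / kTileSize]->pixels[(y % kTileSize) * kTileSize + (x % kTileSize)] = value;
        }
    }
}
```

For a 256×128 area the uncached walk does 512 fetches and the cached walk only 8, while both write the same pixels. How much that buys in the real tile engine depends on how expensive a fetch really is, which is what the benchmarks are for.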

The conclusion at the end of the first week is that we need to cache iterators and cache the brush mask. Some parts of Krita have very nice speed like the tile engine. Then we have slow parts like iterators. I think we can gather a good performance boost with our plan.