Archive for February, 2010

Wacom tablet does not work in Qt

Saturday, February 27th, 2010

I’m in Deventer, where we are having the Krita sprint. I’m trying to implement some new stuff in Krita and I can’t :’( The tablet stopped working for me in Krita and basically in all of Qt. I don’t see any tablet events (the Qt example in widgets/tablet does not show anything), so no pressure, no tilt, just mouse events.

I have Fedora 12 and Qt 4.6.2. linuxwacom was replaced by xorg-x11-drv-wacom, so there are no utilities like wacdump or xidump now. I removed the old configuration lines from xorg.conf. They were needed a long time ago, when I used to set up my Wacom tablet that way. I depend on evdev now. The tablet works in MyPaint and in GIMP, and I have pressure support there.

I don’t get tablet pressure in the Qt example in qt4/examples/widgets/tablet, nor in Krita.

Cyrille suggested checking the output of this command:

xlsatoms | grep -i wacom
303     Wacom Tablet Area
304     Wacom Rotation
305     Wacom Pressurecurve
306     Wacom Serial IDs
307     Wacom Strip Buttons
308     Wacom Wheel Buttons
309     Wacom TwinView Resolution
310     Wacom Display Options
311     Wacom Screen Area
312     Wacom Proximity Threshold
313     Wacom Capacity
314     Wacom Pressure Threshold
315     Wacom Sample and Suppress
316     Wacom Enable Touch
317     Wacom Hover Click
318     Wacom Tool Type
319     Wacom Button Actions
430     Wacom Stylus
431     Wacom Cursor
432     Wacom Eraser

So it looks OK. Do you have any idea how to fix it? It seems to be a Fedora issue: other Krita developers have working tablet support out of the box on OpenSuse. I checked whether tablet support is compiled in on Fedora, and it is.

The Krita hackers are offering me an OpenSuse CD. I like my Fedora, so I’m not giving up yet… Please help if you can ;)

Sven Langkamp has the same problem as I do; he is also on Fedora 12 these days.

Update 28.2.2010:
I filed a bug report for Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=569132

Also for Qt http://bugreports.qt.nokia.com/browse/QTBUG-8599

UPDATE: It works now with a patch. Fedora will ship an update soon. Thanks to Thomas Zander for the patch and to the Fedora guys for providing me with a test build!

Week 4: Optimizing iterators

Monday, February 22nd, 2010

So what happened last week? Besides Slovakia upsetting Russia in hockey in Vancouver, Slovakia is also fixing Krita’s performance, thanks to the donors from all around the world who made that possible. Enough about nationality, let’s talk about what happened.

The plan for last week was to optimize iterators. We use tiled buffers for our memory management. An image processing application usually needs an undo function, and storing the whole image for every step is not a good idea. Instead you break the image into tiles (rectangles) and store and restore only those. But that affects the whole application: filters have to iterate over pixels, and painting with a brush means compositing one image over another. So iterators are very important for Krita, and their speed is a priority, of course.

We knew from the measurements I did in the first week that fetching those tiles is a slow process. That’s why we have a rectangle iterator, which iterates over the image tile by tile. But sometimes you need to iterate over the image in lines, e.g. in filters, or under special conditions when you need access to the pixels before and after the current one in the same line.

That’s why we have line iterators: a vertical and a horizontal one. I was optimizing the line iterators. Instead of fetching a tile every time we cross a tile’s width or height, we now pre-fetch the tiles into a buffer, lock them and work with them until we can skip to the next row of tiles. The tiles are 64×64 pixels, by the way. Another important thing is that Krita is threaded: one thread does the projection updates (the final image you see in Krita) while others can work on other things like painting or filtering. So we have to lock and unlock the tiles. I patched both the vertical and the horizontal iterator.
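The pre-fetching idea can be illustrated with a small Python sketch. This is not Krita's C++ code: `TileStore`, `fill_naive` and `fill_prefetched` are hypothetical names made up for the illustration, locking is left out, and only the 64×64 tile size comes from the post. The sketch just counts how many tile fetches each strategy needs when walking the image line by line.

```python
TILE = 64  # tile edge length in pixels, as stated in the post


class TileStore:
    """Toy tiled image that counts tile fetches (hypothetical, not a Krita class)."""

    def __init__(self, width, height):
        self.cols = (width + TILE - 1) // TILE
        self.rows = (height + TILE - 1) // TILE
        self.tiles = {}
        self.fetches = 0

    def fetch(self, col, row):
        self.fetches += 1
        # Tiles are created lazily as flat lists of TILE * TILE zeros.
        return self.tiles.setdefault((col, row), [0] * (TILE * TILE))


def fill_naive(store, width, height, value):
    """Fetch a tile again every time a scanline crosses a tile border."""
    for y in range(height):
        for x0 in range(0, width, TILE):
            tile = store.fetch(x0 // TILE, y // TILE)
            row_start = (y % TILE) * TILE
            for x in range(x0, min(x0 + TILE, width)):
                tile[row_start + (x - x0)] = value


def fill_prefetched(store, width, height, value):
    """Fetch one whole row of tiles, then reuse it for all scanlines it covers."""
    for y0 in range(0, height, TILE):
        row_of_tiles = [store.fetch(col, y0 // TILE) for col in range(store.cols)]
        for y in range(y0, min(y0 + TILE, height)):
            row_start = (y % TILE) * TILE
            for x in range(width):
                row_of_tiles[x // TILE][row_start + x % TILE] = value


naive = TileStore(256, 128)
fill_naive(naive, 256, 128, 1)
prefetched = TileStore(256, 128)
fill_prefetched(prefetched, 256, 128, 1)
print(naive.fetches, prefetched.fetches)
```

Both functions write the same pixels; the difference is only in how often the (expensive) fetch is called. In a threaded setting the pre-fetched row would also be the natural unit to lock and unlock.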

Ben Dansie: Monkey

We managed to get a speedup ranging from 1.28x to 6.21x faster iterating. Cyrille Berger then pointed out that our iterators have a quite complicated structure with many function calls, so we should try a new iterator API with fewer function calls. So my task was to code, or better said port, the horizontal iterator to the new API. The iterator is ready for testing. According to the benchmarks, the new iterator is 1.5x faster than the old API. Cyrille also added a lot of custom magic to make it faster. We are getting closer to the speed of the data manager, which limits the speed of the iterators.

On Monday I also fixed the abr brush so that we can paint with abr brushes, but the work is not finished yet. This week I will work on it, and I will also try to use the new iterator in KisPainter for compositing and look at the timing results. Then at the end of the week I fly to the Netherlands, where we are having our Krita sprint in Deventer, at Boudewijn’s home. I will spend 11 days there with other Krita hackers, so there is a lot of coding ahead of us. We will also be visiting the Blender team. I’m looking forward to it!

So far we are making progress. Sven Langkamp made me happy by saying that the speedup for painting, even with some filters, is amazing now compared to what it was before. Also, the artist Ben Dansie from project Durian, the Blender open movie project, used Krita for making some art. Cool, isn’t it? Now back to rediscover your Krita code ™.

Week 3: Photoshop brushes

Monday, February 15th, 2010

Last week was devoted to adding support for abr and vbr brushes. The abr file format comes from Adobe Photoshop, the latter comes from GIMP. The motivation was the big collections on the web: sites like deviantart.com are full of artists’ collections of abr brushes. Another motivation is interoperability with GIMP and with the proprietary Adobe Photoshop. vbr brush support was postponed because just supporting abr brushes was already too much work.

On the first day I started exploring the abr format, so that we got to know each other a bit. I started to write simple code to parse it, but I found out that the issue is quite complicated. Valek Filippov is working on a Python script which parses abr brushes, so I started with that one. Important note: we support only brush format v1, v2 and v6.2 (Photoshop CS2). The newest format is not documented at all and there is no open-source implementation around. It would require a lot of reverse engineering, and that takes so long that we would need much more than one week for it.

The next day I realized that working just with the script would go too slowly, and I felt that we needed to draw soon. So Boudewijn and I googled for other sources, and we found a GIMP plug-in which is a maintained version of the work based on this code. It parses just the bitmaps from the abr file format. That’s the basic thing you need to parse out of the brush file format. There are many other options saved in abr brushes, like the settings of the brush engine (attributes like diameter, spacing, angle…), but we don’t parse them yet. Valek Filippov’s script does. We have a chance to get better support for the brushes, as the settings look like our brush engine presets.

I was porting the ANSI C code, which used glib, to Qt. It took two days to port it correctly. The next two days were devoted to integration into Krita. The architecture for the brushes is powerful, but it is not easy, so I spent time investigating how it works and what you need to do so that your brushes are correctly displayed and painted. We load all the brushes into memory when Krita starts. Krita uses a nice resource architecture, but it did not have support for importing one file as many resources. An abr brush consists of many bitmaps, which we use as brush masks so far. Sven Langkamp helped with that. Boudewijn Rempt helped a lot by doing the preliminary integration code, and then I fixed it so that it works. We decided to reuse the brush chooser dialog we use for gbr brushes (another GIMP brush format). For the future it would be cool to have more features in the brush chooser dialog, or maybe we will need to write a different widget for abr brushes. Some categories would be useful.

Painting with abr brushes does not work yet. I will work on the bugs. We get the correct image in the brush generation code, but we paint a plain colored rectangle instead. There are some minor bugs that need to be fixed too.

For the future it would be useful to support the attributes of the brush engine saved in the format. Our brush engine should support most of them, maybe some features will have to be coded and some are hard to implement as there is no specification for them.

Here is a screenshot showing an abr brush loaded into our brush chooser. The code is already in trunk, so you can at least see your brushes in the chooser. I hope we will be able to paint tonight!

13:57 update: The problem is fixed and now I paint with the abr brushes!

Brush chooser with Abr brush masks

Week 2: Optimized Autobrush. Faster painting.

Monday, February 8th, 2010

The second week of our action plan for optimizing Krita was devoted to optimizing painting. Although there are many great paintops in Krita, digital painters tend to spend most of their time with the simple default brush engine, which we call Pixel Brush. Painters can use GIMP brushes or the text brush here, but the most used brush tip is called Autobrush. You can set up brush attributes like shape (circle, rectangle) and you can change the ratio to get an ellipse. Then you can change the softness with vertical and horizontal fading. If you play with spikes and ratio, you get stars and other funny shapes. The brush has many dynamic attributes thanks to Cyrille Berger’s work on a concept called sensors. E.g. a tablet can control the size by pressure, by tilt or by anything you want (the drawing angle is an interesting one), and the response can be tuned by a curve.

The brush mask, which is stamped on the canvas at regular intervals as you stroke, is computed by KisCircleMask::valueAt(). It is a computationally expensive function according to the valgrind logs we made a week ago and many times before. David Revoy, a team member of the Durian project, said that using a 70px brush on a 2500×2500 image was very slow in Krita. So we needed to optimize that.

I started by exploring the code. I’m not the author of the autobrush, though I did most of the paintops in Krita (10 paintops out of 19 are mine, many of them experimental). The first catch was the interpolation in the brush mask computation. We called valueAt() 4 times per pixel of the brush mask. Cyrille and I found out that the valueAt function used to take integer parameters a long, long time ago, so the values at the fractional (double) pixel positions of the brush mask were computed by interpolation. So I decided to remove the interpolation, as the function has been able to take double input for a long time. The results of valueAt() are also more precise than interpolation. The benefit was great: painting was 4x faster. The benchmark for random lines with size changing according to pressure dropped from 18 seconds to 4 seconds on a 4096×4096 image. Check it in the wiki; the related table is called Just with performance fix.
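To make the difference concrete, here is a minimal Python sketch of the two strategies. It is not Krita's code: `CountingMask` is a hypothetical stand-in for the mask function with a simple linear falloff, and the old path is assumed to be a plain bilinear blend of the four surrounding integer positions, which the post does not spell out.

```python
import math


class CountingMask:
    """Hypothetical mask function that counts how often it is evaluated."""

    def __init__(self, radius):
        self.radius = radius
        self.calls = 0

    def value_at(self, x, y):
        self.calls += 1
        # Linear falloff: 0.0 at the centre, 1.0 at the radius and beyond.
        return min(1.0, math.hypot(x, y) / self.radius)


def interpolated(mask, x, y):
    """Assumed old approach: evaluate four integer positions and blend them."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * mask.value_at(x0, y0) + fx * mask.value_at(x0 + 1, y0)
    bottom = (1 - fx) * mask.value_at(x0, y0 + 1) + fx * mask.value_at(x0 + 1, y0 + 1)
    return (1 - fy) * top + fy * bottom


def direct(mask, x, y):
    """New approach: one evaluation at the fractional position itself."""
    return mask.value_at(x, y)


old_mask = CountingMask(radius=10.0)
new_mask = CountingMask(radius=10.0)
old_value = interpolated(old_mask, 3.3, 4.7)
new_value = direct(new_mask, 3.3, 4.7)
print(old_mask.calls, new_mask.calls)
```

One call instead of four per mask pixel is in line with the roughly 4x speedup reported above, and the direct value is the exact one rather than a blend of neighbours.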

From the valgrind logs we noticed that the atan2 function is called too often. “Chickenpump” suggested some cool old-school tricks in the comments, so we gave them a try. I used double hashing, a QHash inside a QHash, for the 2D function atan2, but that was very slow due to the low cache hit ratio and expensive hashing. Then Cyrille posted some links to free code which implemented a fast atan2 with an internal lookup table, so I ported that code to Krita. Cyrille did some magic like computation with fixed precision at library loading time and some little tune-ups to speed up the fast atan2 computation, and we managed to get a further speedup of around 1.3x compared to painting without the fast atan2 function. There is probably some more room for optimization, as the fast atan2 implementation uses a quite small lookup table. I also tested some other implementations, but they had problems with precision: the error was 3 degrees. That is too much for us, so I dropped that.
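For readers who have not seen the trick, here is a minimal Python sketch of an atan2 built on a lookup table. It is not the code that was ported to Krita: the table size, the nearest-entry lookup and the name `fast_atan2` are assumptions of this sketch. In Python it will not beat `math.atan2`; it only shows the structure that pays off in compiled code.

```python
import math

TABLE_SIZE = 1024  # assumption: resolution picked for this sketch
# atan() for ratios in [0, 1], computed once up front.
ATAN_TABLE = [math.atan(i / TABLE_SIZE) for i in range(TABLE_SIZE + 1)]


def fast_atan2(y, x):
    """Approximate atan2(y, x) with one division and one table lookup."""
    if x == 0.0 and y == 0.0:
        return 0.0
    ax, ay = abs(x), abs(y)
    if ay <= ax:
        # First octant: ratio is in [0, 1], look it up directly.
        angle = ATAN_TABLE[round(ay / ax * TABLE_SIZE)]
    else:
        # Second octant: use atan(r) = pi/2 - atan(1/r).
        angle = math.pi / 2 - ATAN_TABLE[round(ax / ay * TABLE_SIZE)]
    if x < 0:
        angle = math.pi - angle  # mirror into the left half-plane
    return -angle if y < 0 else angle  # mirror into the lower half-plane


worst = max(
    abs(fast_atan2(y / 7.0, x / 7.0) - math.atan2(y / 7.0, x / 7.0))
    for x in range(-50, 51)
    for y in range(-50, 51)
)
print(math.degrees(worst))
```

The table size is the knob that trades memory for precision, which is why a small table leaves room for tuning and why some implementations end up with errors that are too large.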

I remembered a quite interesting magic function for the fast inverse square root used in Quake III. So I gave it a try, as we use the inverse square root in valueAt() too. I found out by benchmarking that the fast inverse square root is slower than computing the inverse square root directly (1/sqrt(x)). It used to be 4 times faster a long time ago. Probably Intel has implemented that in processors already. Or the compiler’s optimization was not so effective for the fast version. Again, we dropped that.
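For reference, the trick reinterprets the bits of a 32-bit float as an integer, subtracts half of it from a magic constant, and refines the guess with one Newton-Raphson step. The sketch below redoes it in Python with the `struct` module purely as an illustration; the function name is made up, and Python is of course not where you would benchmark it.

```python
import math
import struct


def fast_inv_sqrt(x):
    """Approximate 1/sqrt(x) for x > 0 using the well-known bit trick."""
    # Reinterpret the 32-bit float bit pattern as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bits gives a rough first guess.
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration to tighten the estimate.
    return y * (1.5 - 0.5 * x * y * y)


for value in (0.25, 1.0, 2.0, 4.0, 100.0):
    print(value, fast_inv_sqrt(value), 1.0 / math.sqrt(value))
```

With a single refinement step the result is only accurate to a fraction of a percent, so even where the trick is faster it trades precision for speed.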

Most of the use cases for painting include brush masks which are symmetrical. The algorithm could compute just 1/4 of the mask. Next step was implementing this.

The first version used 4 pointers into the memory: it computed 1 pixel and copied the value to the 3 mirrored pixels in the other regions. I managed to get another 1.7x speedup (from 3.555 ms to 1.9 ms).

Memory access is very important and can slow down computation. It is like using the setPixel/pixel methods to access pixels in a pixel buffer: you are supposed to use scanlines, which is faster. Here is an interesting article about it. If you have nothing to read, here is also a nice CPU memory bible.

First version used 4 iterators over the image pixels: one computation per pixel, then copy the values.

So I decided to make it a little faster by using just two pointers. I compute 1/4 of the mask and copy this part to the NW region. Then I copy the rows into the lower part of the mask in the correct order, i.e. mirrored.

Improved version used two iterators and memcpy for the second half of the brush mask.

I found out on Friday evening that it does not work, though. Circle masks seem symmetrical from the user’s point of view, but they are not. The brush mask respects sub-pixel precision in Krita, so the edge pixels of the circle are not symmetrical. The sub-pixel precision is visible when you work at a high zoom level. I have an idea for computing 1/4 of the brush mask anyway, but I decided to postpone it.
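The symmetry problem is easy to reproduce in a toy model. The Python sketch below is not Krita's mask code: `mask_value` is a hypothetical soft circle, and the sub-pixel position is modelled as a plain `offset` added to the centre. It compares a fully computed mask with one where only a quarter is computed and mirrored four ways.

```python
import math


def mask_value(px, py, cx, cy, radius):
    """Hypothetical soft circle: 1.0 at the centre, fading to 0.0 at the radius."""
    return max(0.0, 1.0 - math.hypot(px - cx, py - cy) / radius)


def full_mask(size, offset):
    """Evaluate every pixel; offset shifts the centre by a sub-pixel amount."""
    centre = (size - 1) / 2 + offset
    return [[mask_value(x, y, centre, centre, size / 2) for x in range(size)]
            for y in range(size)]


def mirrored_mask(size, offset):
    """Evaluate only the top-left quarter and copy each value to 3 mirror pixels."""
    centre = (size - 1) / 2 + offset
    half = size // 2  # size is assumed to be even in this sketch
    mask = [[0.0] * size for _ in range(size)]
    for y in range(half):
        for x in range(half):
            v = mask_value(x, y, centre, centre, size / 2)
            mask[y][x] = v
            mask[y][size - 1 - x] = v
            mask[size - 1 - y][x] = v
            mask[size - 1 - y][size - 1 - x] = v
    return mask


print(full_mask(8, 0.0) == mirrored_mask(8, 0.0))    # centred dab
print(full_mask(8, 0.25) == mirrored_mask(8, 0.25))  # dab shifted by a quarter pixel
```

With the centre exactly in the middle the mirrored mask matches the full one, but as soon as the centre moves by a fraction of a pixel the two differ, which is the effect described above.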

Other possibilities are still around:

  1. Mip-mapping: pre-compute various levels of the brush mask into a buffer and interpolate between the masks. We do this for GIMP brushes. We would interpolate two pre-computed brush masks instead of computing a single mask. Maybe it could be faster, maybe not. The reason we have mip-mapping in GIMP brush painting was to increase the quality of scaled brushes, as Adrian Page, a hacker in the Krita team, wrote to me in an email. Mip-mapping would require splitting the rotation from the mask computation, which can lead to different results in the brush mask. Right now we take the rotation into account in the mask computation; with mip-mapping we would rotate the un-rotated mask by image processing (rotating the image). Some conformance rendering tests would be needed. The advantage would be support for rotating GIMP brushes in Krita.
  2. Cache the brush mask for mouse users: cache the dab, as the mask doesn’t change. This would be nice if we did not compute sub-pixel precision. But we do, so the cache hit ratio would be very small. It would be usable at 100% zoom, when the sub-pixel position is zero. And of course a big condition checking for parameter changes would be required.
  3. Compute the mask on the graphics card using shaders: that would be cool. I have some initial experience with shaders, but the integration would be harder and probably too experimental for our plan. I’m mentioning this because we discussed it with Sven Langkamp in Oslo, and so that it is not forgotten.
  4. We will probably do some garbage recycling: memory allocation is slow, so we can benefit from recycling memory. It is a matter of discussion on IRC at #krita on freenode. You are welcome to join.
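Option 2 in the list above can be sketched in a few lines of Python. This is an illustration only: `DabCache` and its key layout are hypothetical, and the real parameter check would have to cover far more than a diameter. The cache key includes the sub-pixel offset, which is exactly why it helps a mouse at integer positions and hardly helps a tablet.

```python
import math


class DabCache:
    """Hypothetical dab cache keyed by brush diameter and sub-pixel offset."""

    def __init__(self, compute):
        self.compute = compute  # function(diameter, sub_x, sub_y) -> mask
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, diameter, x, y):
        # Only the fractional part of the position changes the mask.
        key = (diameter, x - math.floor(x), y - math.floor(y))
        if key in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache[key] = self.compute(*key)
        return self.cache[key]


def dummy_mask(diameter, sub_x, sub_y):
    """Placeholder for the expensive mask computation."""
    return (diameter, sub_x, sub_y)


mouse = DabCache(dummy_mask)
for x, y in [(10.0, 20.0), (11.0, 20.0), (12.0, 21.0), (13.0, 22.0)]:
    mouse.get(70, x, y)

tablet = DabCache(dummy_mask)
for x, y in [(10.13, 20.4), (11.57, 20.9), (12.91, 21.2), (13.28, 22.6)]:
    tablet.get(70, x, y)

print(mouse.hits, mouse.misses, tablet.hits, tablet.misses)
```

Any dynamic attribute, such as size driven by pressure, would have to become part of the key as well, which shrinks the hit ratio further.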

The final time of the computation in the benchmark for random lines is 1,449.2 ms. It dropped from 18,576 ms, so painting was 16x faster. But I reverted the 1/4 brush mask speedup, so the current speed is 3.555 ms. Painting will be 6x faster. The speed is considered to be usable for big brushes now. I invite you to check out trunk and try playing with big brushes. 200 px is now very usable on my laptop. What about yours?

I updated my WordPress blog. I dropped the previous classic WordPress theme and selected the default one (lazy developer). I did not like the font in the previous one. I don’t have much time to play with web design these days. But at least I customized the default Kubrick theme. I changed the fixed width of the theme to wider values. I also made a simple custom header with some random strokes done with my paintops in Krita. I hope you will like it. Every image in the blog post is made in Krita.

First week of Krita full-time hacking

Monday, February 1st, 2010

I finished all my exams on time, so last Monday I could start working full-time on Krita. I’ve now been at it for a full week! How is it going?

According to the plan, the first week was aimed at measuring the speed of Krita. We talked about our bottlenecks on IRC regularly. We also talked about them in Oslo. But we didn’t have any numbers. Sven Langkamp did some benchmarks on iterators using QTime, there were some performance tests scattered among the unit tests, etc. Boudewijn decided to use the new benchmark features from Qt 4.6. So I wrote 10 benchmark classes that benchmark things like the internal memory management for image data (our tile engine) and the classes we use to access pixels, called iterators. They allow iterating over pixels in various ways: vertically, horizontally, in small rectangles, randomly, etc.

Another important thing is compositing. That is the work of a class called KisPainter (something like QPainter in Qt, but with different, complicated features not available in QPainter). We benchmarked the speed of the bitBlt operation with two types of memory storage. The first is KisPaintDevice and the second is called KisFixedPaintDevice. The latter is a lightweight version of the first one. It is similar to QPaintDevice in Qt, but again with more complicated features.

Øyvind Kolås a.k.a. pippin, a GEGL developer, is around. Pippin shared his knowledge about benchmarking with us on IRC (btw, come and visit us at #krita on irc.freenode.net). We decided to make tests for the blur and brightness/contrast filters. The first one is a convolution filter; the second one is here because we wanted to be able to compare with GIMP.

We also made a benchmark for the image projection. The projection benchmark loads an image in Krita’s native format, computes the whole image constructed from various types of layers like group, filter, adjustment layer etc. and in the end we again save the file into native Krita format. Our focus through this plan is to speed up painting. So we can’t avoid stroke benchmarks.

Result image from the benchmark of the strokes

Thanks go to Sven Langkamp for his work on saving and loading presets. Using paintop presets, we can use the benchmark code I wrote for any paintop. We benchmarked our default autobrush paintop. It is the most used paintop among digital painters. I’m very happy about this benchmark, as I can test my other paintops easily: all I need to do is create a preset for a paintop, save it in Krita and run the benchmark with the preset.

The results of the benchmarks are on our wiki.

Here is how it looks: the data manager, which is responsible for undo/redo and basically for storing and retrieving data, is fast. It allows us to read and write data at very high speeds: from 1333.3 Mb/s to 1628.4 Mb/s, according to the benchmark results. We benchmarked on a 4096×4094 RGBA image (64 Mb), which was read/written 100 times. For comparison, memcpy between two buffers of the same size as the image runs at almost the same speed. There is benchmark code for memcpy in KisDatamanagerBenchmark, you can try it yourself if you want :)

The horizontal and vertical iterators are 11 times slower than the rectangular iterator. The reason is that there is no caching. From the valgrind logs we see that fetching and switching tiles is very slow, so we need to implement caching there. Every 64 pixels a new tile is fetched and switched to. Why? Our tiles are 64 pixels wide. This slows down the iteration. We will add caching to both iterators to avoid the switching and fetching. We will cache tiles, which we don’t do now. The rectangular iterator is quite fast and does not offer many opportunities for improvement. The random iterator, on the other hand, is the slowest one. Again, fetching and switching tiles is expensive. Some caching strategy would be handy for moving around the image. But the use case for the random accessor is different, so the cache strategy should be somehow adaptive. The random accessor is 13 times slower than the rectangular one.

The compositing operation of KisPainter, also known as BitBlt, is used very often. There is room for improvement, because currently it uses a slow iterator, the random iterator. The speed is very decent, but we will try to make it a lot faster.

Filters are very slow compared to GIMP, at least according to the numbers pippin provided. But first we need to fix underlying issues like the aforementioned iterators to improve their performance. We blur the image at a speed of 0.8 Mb/s. Here we need to optimize the convolution painter. The speed could also be better once we optimize the horizontal iterator.

Strokes are slow because the brush mask has to be recomputed for every dab, that is, every time the brush touches the canvas at a certain point. That’s slow. You can see from the valgrind log that the math function atan2 is slow. We have to cache the result to avoid this.

The conclusion at the end of the first week is that we need to cache iterators and cache the brush mask. Some parts of Krita have very nice speed like the tile engine. Then we have slow parts like iterators. I think we can gather a good performance boost with our plan.