2010-07-25

Canon P-150 scanning timings

The second power cable doesn't seem to be all that effective in boosting the speed of the scanner.

Using a custom SANE frontend, these are the timings for scanning 3 duplex A4 pages in B&W at 600 DPI with a single USB cable connected:

0: TIMING 17555283 usecs
1: TIMING 25179 usecs
2: TIMING 14859625 usecs
3: TIMING 24944 usecs
4: TIMING 15765811 usecs
5: TIMING 8385 usecs

With the second USB cable connected:
0: TIMING 14738761 usecs
1: TIMING 21129 usecs
2: TIMING 13816726 usecs
3: TIMING 17589 usecs
4: TIMING 13548819 usecs
5: TIMING 25447 usecs

The timings vary quite a bit (plus or minus one second). I suspect the backend does some silly synchronization dance with the scanner. Still, using the second cable causes the scanner to scan marginally faster (it seems that the scanner doesn't pause as often with the second cable).
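
To put rough numbers on "marginally faster", here is a minimal Python sketch that parses the two logs above and compares the slow calls. It assumes that the entries taking more than a second are the ones where the scanner actually moves paper, and that the millisecond entries are just the other side of the sheet being handed over. The log itself doesn't say what each index measures, so treat that split (and the one-second threshold) as a guess.

```python
from statistics import mean

ONE_CABLE = """
0: TIMING 17555283 usecs
1: TIMING 25179 usecs
2: TIMING 14859625 usecs
3: TIMING 24944 usecs
4: TIMING 15765811 usecs
5: TIMING 8385 usecs
"""

TWO_CABLES = """
0: TIMING 14738761 usecs
1: TIMING 21129 usecs
2: TIMING 13816726 usecs
3: TIMING 17589 usecs
4: TIMING 13548819 usecs
5: TIMING 25447 usecs
"""


def parse_timings(text):
    """Return the microsecond values from lines like '0: TIMING 17555283 usecs'."""
    values = []
    for line in text.strip().splitlines():
        parts = line.split()
        # Expected shape: ['0:', 'TIMING', '17555283', 'usecs']
        if len(parts) == 4 and parts[1] == "TIMING":
            values.append(int(parts[2]))
    return values


def slow_calls_seconds(values, threshold_usecs=1_000_000):
    """Keep only the calls above the (assumed) threshold, converted to seconds."""
    return [v / 1e6 for v in values if v > threshold_usecs]


one = slow_calls_seconds(parse_timings(ONE_CABLE))
two = slow_calls_seconds(parse_timings(TWO_CABLES))
saving = 1 - mean(two) / mean(one)

print(f"one cable:  mean {mean(one):.2f} s, spread {max(one) - min(one):.2f} s")
print(f"two cables: mean {mean(two):.2f} s, spread {max(two) - min(two):.2f} s")
print(f"time saved per slow call: {saving:.1%}")
```

With only three slow calls per run, the spread within a run is of the same order as the difference between the runs, so the result is a hint rather than a measurement.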

I should also mention that there seems to be a stability issue with the second cable (the scanner jams internally and the backend returns "Error during device I/O" at sane_start()).

Also, while developing the frontend, it became clear that the backend/scanner combination copes badly when problems appear. It seems that the backend is unable to reset the scanner at all, and one must resort to physically closing the scanner case (which powers off the scanner) and opening it again. Luckily the feeding mechanism is pretty simple to clear of jams (the scanner doesn't autofeed the page if it starts in a jammed position).

2010-07-24

Canon P-150 and Linux

The time is ripe for a blog methinks, and what better way to start it than a rant about proprietary drivers from a large multinational and a product review (all in one!).

I set out to find a suitable scanner for developing OCR software related to accounting and invoice processing. Since using Windows is a big no-no for me personally (due to many reasons), I knew that the project would become "interesting" very quickly. Another factor contributing to the problem was that my task was outside the regular hobbyist needs, which limits the amount of pre-existing information available via googling.

The scanner I was looking for would preferably have all of these features (in this order):
  • Linux drivers
  • Support duplex scanning (a lot of paper invoices in Finland are printed on both sides)
  • Support multiple page feeding without operator intervention
  • Take as little space on the desktop as possible.
  • Be not too expensive.
  • Be relatively easy to find in Finland
Now, since I'd really like to support HP in their Linux support (hplip), that was the natural starting point for my enquiries. Sadly, there doesn't seem to be anything from HP that I could choose. Also, in the document imaging category, prices climb quite quickly, and it then becomes quite hard to find reviews and comparisons of devices on the web.

After some time spent googling and reading semiautomated link-hoarding review sites (always fun), I ended up with the Canon P-150. It fits the bill for all of my requirements, and at least some of the existing reviews had positive outcomes.

Now, I've fought many a battle with Canon and their Linux "support" over the years. So, even though they advertise that a "SANE compatible Linux driver" is available, I took that with a grain of salt. But they have a driver at least, how bad can it be?

Let's use a scale of 0-10, where 0 means that the vendor is a complete Microsoft lackey and 10 means .. well, I don't really know what. I'd like to write Intel or HP here, but really, even they're closer to 6-7 on this scale. So, let's assume 10 means a vendor that supports Linux on all the products they sell, or at least provides complete documentation on how to implement 100% support on Linux. None of the product vendors in the current mass market fit this bill.

So, back to Canon. Previously I'd have placed them somewhere around 3-4 for Linux support on average.

Using a spare day (it's still my holiday), I bought the scanner and started playing with it. True to the existing reviews, the scanner is quite compact and does seem like a nice piece of hardware.

Some sore points about the product (none of which were show-stoppers for me):
  • Using gloss-finished black plastic parts is bad. While it definitely adds to the wow-factor, having your fingerprints all over the device does not.
  • The guides on the device are all plastic. While this won't be a problem for stationary device use, it might become a problem if you lug the device around or decide to pack it away from the desktop. This will probably end up breaking the guides in the end.
  • You might need two USB ports to power the device. I've only used one USB port so far and haven't tested whether the scanning is any faster with more power over the USB.
(2010-07-26 update after scanning about 500 two-sided paper sheets at 600DPI, B&W): The ADF needs serious hand-holding. It's not possible to leave the scanner working on its own. The first and the last page to be scanned especially need manual "twiggling" in order to be fed into the mechanism. Having more papers helps, but this is a serious drawback in the ADF. Another issue is that now and then the scanner decides to go into "ADF jammed" state (or just sits there while it has fed a bit of the next page), and will not come out of it short of a full power-cycle. So, I wouldn't recommend this scanner for automated pipelines, since an operator is necessary at all times. Which is a shame, since the scanner has otherwise performed quite nicely.

Now, back to Linux.

The drivers available from the Canon support site (where they are relatively easy to find) come as ZIP files.

Inside the ZIP (d1024mux.zip) you'll find a deb, rpm and source tarball for both P-150 and DR-150 for the SANE backend.

Some issues with the current drivers (1.00 - 0.02, which is the first release and I doubt there will be a subsequent release):
  • The deb and RPM files are for 32-bit mode only. While 64-bit Linux distros nowadays run mixed binaries easily, having only 32-bit drivers is an issue for SANE backends. The backend is loaded via dlopen (as an .so file), and a 32-bit .so can't be loaded into a 64-bit program (utilizing SANE). This means that you can't use the binary packages if you're running a 64-bit Linux (without doing all kinds of irritating operations first).
  • The Debian packaging control file is done wrong. It uses temporary paths as the file members, and the end result is dpkg -L spewing out paths that just aren't there (most of the files are placed under /opt/Canon/ at install time).
  • The source tarball is also interesting. It is a mixture of proprietary binary only code, pre-built binary components and source code. How nice.
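
The 32-bit versus 64-bit mismatch from the first bullet is easy to check for before dlopen gets a chance to fail. The sketch below reads the ELF identification bytes of a file: byte 4 (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit ones. The function names are made up for this example, and nothing here is specific to Canon's packages.

```python
ELF_MAGIC = b"\x7fELF"
# EI_CLASS values from the ELF specification.
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def elf_class(header):
    """Classify the first bytes of a file as 32-bit ELF, 64-bit ELF, or neither."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not ELF"
    return ELF_CLASSES.get(header[4], "unknown")


def elf_class_of_file(path):
    """Read just enough of the file at path to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))
```

Calling elf_class_of_file() on the frontend binary and on the backend .so it is supposed to load should give the same answer. If one says "32-bit" and the other "64-bit", the binary package is not going to work in that process.
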
So, assuming you're running a 64-bit host, what to do? I wish I could say "easy", but.. I guess SANE doesn't really support proprietary binary drivers so the process is easy to muck up like Canon has.

So, the process goes more or less like this (assuming a Debian-like target):
  1. Retrieve the source tarball for sane-backends-1.0.19. This is the version that is mentioned in the somewhat terse README from Canon.
  2. Extract it into a directory parallel to "cndrvsane-p150-1.00-0.2". Yes, it needs to be parallel, since the makefile rules within cndrvsane use relative paths (../../sane-backends-1.0.19).. How nice.
  3. configure and make the sane-backends first. Do not install. Also, my build didn't actually even finish, but it doesn't seem to matter. The only things that are necessary from this step are the sane-backends config.h and the dependency files (courtesy of libtool, our "helper").
  4. Switch to the cndrvsane source directory
  5. Run configure
  6. fakeroot make -f debian/rules binary
  7. This should result in the proper deb file that can be installed. The file list will still be wrong as per the original deb, but at least the architecture is mostly correct. Most of the files will install under /opt/Canon.
  8. Make a symlink from /usr/local/lib/canondr to /opt/Canon/lib/canondr . Based on stracing (with -f) scanimage, this is the path under which the backends are accessed for some reason and the symlink is not otherwise made properly (I was too lazy to fix the deb control files, and it shouldn't be my job).
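
Since the relative path in step 2 is the easiest thing to get wrong, here is a small Python sketch that checks the directory layout and lists the build commands from steps 3 to 6 in order. The directory names are the ones quoted above. The helper names are invented for this example, and "./configure" is an assumption about how configure is invoked in both trees. Installing the resulting deb and making the symlink from step 8 are left as manual steps.

```python
from pathlib import Path

SANE_DIR = "sane-backends-1.0.19"
CANON_DIR = "cndrvsane-p150-1.00-0.2"


def missing_dirs(workdir):
    """Return the expected source directories that are absent from workdir."""
    root = Path(workdir)
    return [name for name in (SANE_DIR, CANON_DIR) if not (root / name).is_dir()]


def build_plan(workdir):
    """Return (directory, command) pairs mirroring steps 3 to 6 of the list."""
    root = Path(workdir)
    missing = missing_dirs(root)
    if missing:
        raise FileNotFoundError(f"missing in {root}: {', '.join(missing)}")
    sane, canon = root / SANE_DIR, root / CANON_DIR
    return [
        (sane, ["./configure"]),  # step 3: only config.h and friends are needed
        (sane, ["make"]),         # step 3: may stop early, which seemed harmless
        (canon, ["./configure"]),  # steps 4 and 5
        (canon, ["fakeroot", "make", "-f", "debian/rules", "binary"]),  # step 6
    ]
```

Each pair can be handed to subprocess.run(command, cwd=directory). Because the sane-backends build may not run to completion, it is probably wiser not to abort the whole sequence on a non-zero exit status from that particular make.
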
The proprietary bit is the 32-bit binary called "canondr_backendp150". It's an application written in C++ that links against libpthread. What are the odds of it being deadlock free? Your guess is as good as mine. Since the backend will run in a separate process from your SANE frontend, it can stay 32-bit (as long as your system can run 32-bit C++ programs).

The client and library shim parts are under GPL (although the file copyright headers do suggest that Canon reserves all rights to them, which to my mind is just plain wrong, especially since the shims don't seem to do much of anything except pass the stuff onto the backend). IANAL.

What's left is writing up a proper udev rule for the scanner and then playing with the scanner using scanimage. Other pages cover this well enough, so good luck with that (just remember to switch the scanner's auto-connect mode off).
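
For the udev part, a rule that hands the device node to a scanner group usually has the shape generated below. This is only a sketch: the function is made up for this example, the "scanner" group and 0660 mode follow the usual Debian convention rather than anything in Canon's README, and the IDs must be read from lsusb for the actual device. The product ID used in the test call is a placeholder, not the P-150's real ID.

```python
import re


def udev_rule(vendor_id, product_id, group="scanner", mode="0660"):
    """Build one udev rule line granting a group access to a USB device.

    Both IDs are four lowercase hex digits, as printed by lsusb.
    """
    for value in (vendor_id, product_id):
        if not re.fullmatch(r"[0-9a-f]{4}", value):
            raise ValueError(f"expected four lowercase hex digits, got {value!r}")
    return (
        f'SUBSYSTEM=="usb", ATTRS{{idVendor}}=="{vendor_id}", '
        f'ATTRS{{idProduct}}=="{product_id}", MODE="{mode}", GROUP="{group}"'
    )


# Placeholder IDs for illustration only; substitute the values lsusb reports.
print(udev_rule("04a9", "ffff"))
```

The printed line goes into a file under /etc/udev/rules.d/, after which the scanner needs to be replugged for the rule to apply.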

The only feature that I've been unable to use so far is the top-panel button. It seems there are two principal ways of doing this:
  • scanbuttond, which uses libusb to poll the button states using its own backend code, which has been reverse engineered from USB traffic dumps. The project seems dead, or at least in deep hibernation. Needless to say, it doesn't support the P-150.
  • kscannerbuttons, which uses --wait-for-button functions in existing SANE backends. Since the proprietary P-150 backend doesn't support this option, kscannerbuttons can't be used.
Parting words to Canon:
It would be so much easier for me to recommend your products without your shenanigans with Linux support. Even having a public contact point for Linux issues would be nice, so I could report the issues. Heck, I could even send you patches if you'd just give me a chance.

So, Canon stays in the 3-4 category for now (it could be worse).