Maybe it’s a sign of my OCD, but one thing that keeps me awake at night is the amount of paperwork I get and must hold onto just in case I need it some day. I dread the knock on the door: two scary-looking MIBs from the IRS demanding form JU313378223 that they’d sent me six years ago, and if I don’t have it they’d gleefully pluck my savings straight out of my bank account to give to Dick Cheney’s old friends at Halliburton in Iraq.
Either way, the end result is that I wind up hoarding vast amounts of paperwork that I will probably never need, but I can’t be sure enough to throw it away.
If you are fortunate and have the organizational ability to file away the paperwork (or perhaps you don’t, but you have a spouse, boyfriend, or girlfriend who does), then you may even be able to dig out what you need when it finally turns out that you need it.
Then I asked myself: Weren’t we all supposed to be living in a paperless world by now?
So I started to look into possible solutions to the “paper hoarding” problem. I looked at “document management systems”, but they were all extremely expensive, and so far as I could tell, didn’t necessarily do what I needed.
After a five-minute installation and setup, which was relatively painless on my Mac, you just place a document in the scanner and hit the “Scan” button. It supports up to (I think) 30 pages at once, and it is very fast, scanning up to 18 pages a minute.
And here is the best bit. With a little bit of configuration, and using some bundled Optical Character Recognition (OCR) software, it will scan your documents, convert them to searchable PDFs, and save them – all automatically! This is very powerful when combined with the Mac’s Spotlight search, because it means that you can now do keyword searching of your scanned documents. This is a lot more convenient and flexible than having to rely on some manual filing system.
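For the scripting-inclined, the same keyword search can be run outside the Spotlight menu. This is only a minimal sketch, assuming the scans land in a single folder: it shells out to the Mac’s `mdfind` command-line tool, restricted to that folder with `-onlyin`. The function names and the folder path are my own placeholders, not anything the scanner software provides.

```python
import subprocess


def build_spotlight_command(folder, keyword):
    """Build an mdfind command that searches only inside `folder`."""
    return ["mdfind", "-onlyin", folder, keyword]


def search_scans(folder, keyword):
    """Return paths of files under `folder` that Spotlight matches for `keyword`.

    Only works on a Mac, where mdfind is available and the folder is indexed.
    """
    completed = subprocess.run(
        build_spotlight_command(folder, keyword),
        capture_output=True,
        text=True,
        check=True,
    )
    # mdfind prints one matching path per line.
    return [line for line in completed.stdout.splitlines() if line]
```

Calling `search_scans("/Users/me/Scans", "Halliburton")` (a hypothetical path) would list every indexed file in that folder that mentions the keyword, searchable PDFs included.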
The way the OCR works is interesting, because the recognized text is stored “under” the image of the text in the PDF. When you view or print the PDF you are still looking at the original scan, but when you search it, the matching piece of text is highlighted, like this:
It’s not perfect, but it’s close. A few of my current gripes are:
- While scanning is fast, the OCR process takes a few seconds per page. This wouldn’t be a big problem, except that when you are scanning multiple documents you need to wait for the previous document to be processed completely before scanning the next one. There may be some way to use scripting to get it to queue documents somehow, but it doesn’t do it out of the box.
- It also comes with bundled business card recognition software. Unfortunately, it doesn’t seem to work so well, and it’s not smart enough to decide automatically whether to treat something as a document or a business card.
- At over $400, it’s not cheap, although that is relative (it’s a hell of a lot cheaper than those document management systems I looked at).
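On the first gripe, the queuing I have in mind would look roughly like the sketch below. It is not something the bundled software does, and I have not wired it up to the scanner; `ocr_fn` is a hypothetical stand-in for whatever actually performs the OCR. The idea is simply that a background worker takes documents off a queue one at a time, so you can keep feeding in new scans while earlier ones are still being processed.

```python
import queue
import threading


def start_ocr_worker(ocr_fn):
    """Start a background thread that processes queued documents in order.

    `ocr_fn` is a placeholder for the real OCR step: it takes a document
    and returns the recognized result.
    """
    jobs = queue.Queue()
    results = []

    def worker():
        while True:
            doc = jobs.get()
            if doc is None:  # sentinel value: no more documents
                break
            results.append((doc, ocr_fn(doc)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return jobs, results, thread


if __name__ == "__main__":
    # Fake OCR step so the sketch runs without a scanner attached.
    jobs, results, thread = start_ocr_worker(lambda doc: f"text of {doc}")
    for name in ["bill.pdf", "tax-form.pdf"]:
        jobs.put(name)  # returns immediately; OCR happens in the background
    jobs.put(None)
    thread.join()
    print(results)
```

A single worker keeps the documents in the order they were scanned, which matters if you name the files by date or sequence.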
All in all, it’s a good solution. With an intelligent backup system (I use .Mac, but friends have recommended Jungle Disk so I’ll probably try that soon) you are pretty much guaranteed never to lose an important document, and you’ll be able to find it when you need it.