Scanning advice needed ! - Valentin Albillo - 02-23-2004 Hi all:
I've just got hold of a very good scanner and I'm
As you may see, the material is very varied, so the print [...]
The work involved is already utterly overwhelming as it is, so I might as well do it properly at the first try. The final result will surely be a great addition to the worldwide HP fan community pool of past knowledge. Thanks in advance and best regards from V.
Re: Scanning advice needed ! (somehow long...) - Vieira, Luiz C. (Brazil) - 02-23-2004 Hi, V; I'm glad you're gonna share your treasures with us here, and I cannot help telling you I'm a lot curious about it all. Thanks! For what I can tell you about my own experience scanning and OCR processing (not that much, but about 90% successful results), I enumerate some procedures of mine I take as useful; hope they help you, too.
- try sorting the originals into groups: typewritten, with and without graphics (treating graphics as either simple line art or complex gray-scale images), and handwritten, in colored or black pen;

About PDF generation and image storage: I think PDF is a great distribution format, and it somehow "protects" ownership. I found that the best Windows-based SW (I think) to compose images into a final PDF "booklet" is Imaging, the standard Windows image/scanner manager. I don't like to use Imaging to process images or drive scanners, but I like it for composing a set of images into pages of the same document. It shrinks and stretches images of different sizes to fit inside a default page size, shrunk images do not lose resolution in the final PDF (if you have the MoHPC CDs, have a look at the Portuguese version of the 82104A Owner's Handbook, HP67/97 compatibility) and the final size is fairly acceptable. I don't know the image processors used on other platforms (Mac, Linux-based PCs, etc.), but I know that generating a PDF from an original doc under Linux is a standard procedure (default), and you'll need some extra, non-standard plug-ins to generate PDF in Windows.

Wow! I wrote too much. When we write too many things, there is a potential margin of errors... Anyway, I think I did not forget the main themes. I'd like to add that this is all based on an original text I prepared (in Portuguese) for my friend José Ernesto, and I'm not sure I actually sent it at the time I wrote it... Zé, if I did not, please forgive me... <:^( Hope this helps you, Valentin. Cheers.
Luiz
Re: Scanning advice needed ! - one correction - Vieira, Luiz C. (Brazil) - 02-23-2004 Hi, folks; please, where you read: KEEP ALL OF YOUR ORIGINALS IN SAFE and process copies of them; read instead: KEEP ALL OF YOUR ORIGINAL FILES WITH ORIGINAL SCANNED IMAGES IN SAFE and process only the files with copies of them; Not so much a meaningful suggestion. Cheers.
Luiz (Brazil)
Re: Scanning advice needed ! - Terry Ingram - 02-23-2004 Hello Valentin, From a user's point of view, include only one page per scan. It's tempting to double up and save scanning time, as was done with the HP-71B Reference Manual on the MoHPC DVD. Thanks for sharing your information. Looking forward to seeing the SHARP information.
Terry
Re: Scanning advice needed ! (somehow long...) - Valentin Albillo - 02-23-2004 Hi, Luiz: Luiz posted: "I'm glad you're gonna share your treasures with us here, and I cannot help telling you I'm a lot curious about it all." Thanks a lot for your comprehensive and detailed help, I'll try and follow your savvy advice and I'm sure the results will be excellent. I'm not very sure whether I'll keep each page as a standalone image file or else I'll group them into PDF documents. Both approaches have their pros & cons. Another possibility is to group them into .cbz files, for very easy and convenient reading using CDisplay, while also compressing them a little more still. As for your curiosity, believe me, there are things very rarely seen, if at all, such as the long questionnaire that HP sent us HP-67 owners in order to survey for wish-list features to be incorporated in a next model (the HP-41C, for sure) .. a great many mostly unpublished HP-67/97, 25C, 34C, 41C, and 71B programs .. not to mention the incredibly interesting (if only for historical reasons) correspondence I had with many great PPC contributors at the very beginning of 41C's synthetics exploration. Reading them gives you the feeling of being transported back to the golden days of PPC, almost like reading new, unpublished issues :-) The one and only problem (apart from knowing proper scanning techniques) is that of sheer volume. It'll take a *long* while to get a significant portion of it in electronic format, and there's the problem of where to put that many megabytes so that they're accessible on-line to everyone interested. Not an easy matter, though I can always resort to rotating contents.
Thanks again and best regards from V.
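On the .cbz idea: a .cbz file is an ordinary ZIP archive of page images with a different extension, so bundling a folder of scans can be sketched in a few lines of Python. This is only a minimal sketch; the function name `make_cbz`, the file-naming scheme and the list of accepted extensions are assumptions for illustration, and which image formats a given reader accepts should be checked first. Note that images that are already compressed will not shrink much further inside the archive.

```python
import zipfile
from pathlib import Path


def make_cbz(image_dir, cbz_path, extensions=(".png", ".gif", ".tif", ".bmp")):
    """Bundle the page images found in image_dir into a .cbz (ZIP) archive.

    Pages are stored in sorted filename order, so zero-padded names such as
    page_001.png, page_002.png keep the reading order intact.
    Returns the list of stored file names.
    """
    pages = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for page in pages:
            # Store only the bare file name, not the full directory path.
            archive.write(page, arcname=page.name)
    return [p.name for p in pages]


if __name__ == "__main__":
    # Hypothetical folder and output names, for illustration only.
    print(make_cbz("scans/ppc_letters", "ppc_letters.cbz"))
```

Zero-padding the page numbers matters here, because a plain alphabetical sort would otherwise put page_10 before page_2.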
Re: Scanning advice needed ! - Valentin Albillo - 02-23-2004 Hi, Terry: Terry posted: "Thanks for sharing your information. Looking forward to seeing the SHARP information." Certainly, some 100+ unpublished programs for the SHARP PC-1211 (TRS-80 PC-1) and later models, many of them pretty curious indeed.
Thanks for your interest and best regards from V.
Re: SHARP models - Terry Ingram - 02-23-2004 Hi Valentin, I have the TRS-80 PC-1, PC-2, & PC-3. These are really outstanding machines, with the PC-1 a joy to use. But on the topic of sharing... I have the "Game", "Business Finance", & "Personal Finance" packs for the PC-2 (sharp 1500). These are supplied on cassette tapes, along with paper booklets. Do you know if anyone has been granted permission to share these materials? I would like to share this information too, but I'm concerned about infringement.
Terry
Re: Scanning advice needed ! - Dave Shaffer (Arizona) - 02-23-2004 Valentin, I second Luiz's choices of file types. I almost always produce PCX files as my first output from a scan. They are lossless and can always be changed into something else later if necessary. Further (lossless) compression (up to 80% or so) of the PCX files seems to be possible with ZIP. Producing PDF files from PCX files also seems to provide compression (with, as far as I can see, no loss of detail). I use Adobe Acrobat 5 for this. Under NO CIRCUMSTANCES use JPG compression. It's not too bad for general images, but horrible for text and line drawings (in B&W or color). As to resolution: if these are to be archival scans, it might be worth the time and space to "overscan." While 300 dpi will usually give acceptable results, I almost always use 600 dpi. Again, as Luiz notes, you can always throw things away AFTER you have scanned. Another point: I have found with all my scanners (mid-range from all kinds of manufacturers - HP, Epson, Canon, Microtek) that despite expectations, a B&W scan does not produce the same fine resolution as a gray scale scan. It must have something to do with the sensor. If I need the best resolution, I generally scan in gray and then adjust the contrast and brightness in post-processing to give an essentially B&W image (again, at the cost of scanning time and storage space).
The bottom line: make some practice scans, and play with your post-processing software to decide on an optimal strategy.
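The "scan in gray, then push it to B&W afterwards" step described above boils down to thresholding. The following is only a toy sketch on a nested list of 0-255 gray values, not what any particular imaging program does; the function name and the default threshold of 128 are assumptions, and in practice the threshold is something to tune on practice scans.

```python
def to_black_and_white(gray_rows, threshold=128):
    """Map 0-255 grayscale pixels to pure black (0) or pure white (255).

    Pixels at or above the threshold become white, the rest black.
    Raising the threshold darkens faint print; lowering it cleans up
    a grayish paper background.
    """
    return [
        [255 if pixel >= threshold else 0 for pixel in row]
        for row in gray_rows
    ]


if __name__ == "__main__":
    # A tiny made-up 2x4 "scan": dark ink on a light gray background.
    sample = [
        [30, 200, 210, 40],
        [220, 35, 45, 215],
    ]
    for row in to_black_and_white(sample):
        print(row)
```

Because the gray original is kept, the threshold can be changed and the conversion redone at any time, which is not possible when the scanner itself produces the B&W image.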
Thank you very much ! [no text] - Valentin Albillo - 02-23-2004 Indeed !
Best regards from V.
Re: Scanning advice needed ! - Bill (Smithville, NJ) - 02-23-2004 Hi Valentin, Here are some ideas that I use when scanning: For Black-White scans, use 300 DPI for Letter size, 600 DPI for smaller size, or for letter size with small text, such as program listings. Based on the originals, select a standard size paper space to scan to. That way all the scanned images are the same size. Use an imaging program to clean up the scans. I use the Kodak Imaging program that comes with Windows. Zoom in on the scan, then delete any artifacts such as staple/punch holes or dust specks. Although this takes quite a bit of time to do for each scanned page, it makes a world of difference to the finished product. I hate seeing black marks up the side of a scanned page. It also saves a lot of ink when it's printed out. If scanning a manual, include the blank pages - no need to scan them - just make sure the PDF file includes them. That way the PDF file can be printed to a double sided printer and recreate the original with the page numbers on the correct page edge. If the original requires, scan to either gray or color. Use an imaging program to increase contrast/color balance. Then reduce the number of colors to a lower level. I just finished scanning the Advantage module manual. Each page was scanned at 8.42" by 5.49", at 300 DPI color. Loaded each page into Microsoft Photo Editor, increased contrast to 60%. This made the background completely white. Then loaded each page into Kodak Imaging, erased the edge spiral binding marks. I left it at true color, but could have lowered it to 256 colors to save space. Added the blank pages and created PDF. The finished file when printed, using Adobe scale to Fit Page, creates a great Advantage Manual enlarged to letter size. I just bound it and use it for daily reference.
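To put rough numbers on the space trade-offs mentioned above, the uncompressed size of a page follows directly from its dimensions, resolution and color depth. The sketch below is plain arithmetic under stated assumptions: the function name is made up, "true color" is taken as 24 bits per pixel and "256 colors" as 8 bits per pixel, and row padding, palettes, file headers and compression are all ignored.

```python
def raw_scan_size(width_in, height_in, dpi, bits_per_pixel):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Ignores row padding and file headers; this is a lower-bound estimate.
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes


if __name__ == "__main__":
    # The 8.42" x 5.49" manual page at 300 DPI, true color vs 256 colors.
    print(raw_scan_size(8.42, 5.49, 300, 24))  # (2526, 1647, 12480966)
    print(raw_scan_size(8.42, 5.49, 300, 8))   # (2526, 1647, 4160322)

    # Letter-size B&W (1 bit per pixel) at 300 and 600 DPI.
    print(raw_scan_size(8.5, 11, 300, 1))      # (2550, 3300, 1051875)
    print(raw_scan_size(8.5, 11, 600, 1))      # (5100, 6600, 4207500)
```

Doubling the resolution quadruples the pixel count, and dropping from true color to 256 colors cuts the raw size to a third, so the choice of DPI and color depth dominates storage long before compression enters the picture.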
Re: SHARP models - Valentin Albillo - 02-23-2004 Terry posted:
"I have the TRS-80 PC-1, PC-2, & PC-3. These are really outstanding machines, with the PC-1 a joy to use." Agreed. It was in many aspects much better than its egregious contemporary, the HP-41C, though sadly many HP fundamentalists of the time wouldn't touch it with a ten feet pole. Good for them.
"I have the "Game", "Business Finance", & "Personal Finance" packs for the PC-2 (sharp 1500) [...] Do you know if anyone has been granted permission to share these
The following pacs are available for public download at Sharp PC-1500 computer (TRS-80 PC-2) resource page: PC-2 Business Finance among lots of other software, so it seems safe to assume that either permission has been granted or that site's webmaster couldn't care less about any copyright left on such old, obsolete, monetarily worthless software. Personally, I wouldn't give a damn, too, and would make it available as well. The risk of being 'sued' for doing so seems to me far less than the risk of an asteroid striking dead center on my very roof and I can live with that, thank you very much. Best regards from V.
Re: Scanning advice needed ! - Vieira, Luiz C. (Brazil) - 02-23-2004 Hello, Dave; thank you for your complementary and valuable information. After reading your post I noticed that I forgot to add some info about OCR, and that PCX B&W scanned images with 300 dpi or more (depending on sharpness and typeset size) may give you text files very close to the original information. I'm achieving good results on recovering text information from scanned images. And it's a fact that 600 dpi is a lot better, and most OCR software will generate fewer errors when applied to higher resolution images. If information is the target and time to do it is not an issue, text documents generated from images with OCR are the ones that will occupy the smallest space, as you surely know. And you can re-design the page as you wish. Anyway, if historical reasons are the target, then let's keep them as they are, with the highest resolution possible ;^) Best regards and thank you again.
Luiz (Brazil)
Re: SHARP models - Terry Ingram - 02-23-2004 Hi Valentin, Thanks for the link to the sharp/trs-80 resource page. I'm impressed with the amount of information & code listings available. I had also contemplated porting some of the sharp/trs-80 code listings into hp-71b code. The graphics/games may be a problem, but I'm guessing most of the business stuff would convert relatively easily. I have noticed the hp-71b handles strings in a non-conventional way, and some common BASIC statements are named differently. Not really sure if there would be any interest... I'm way off the original topic now. I should post a query regarding anyone's interest I guess.
Thanks,
Valentin, some of this is already scanned FYI - Gene - 02-23-2004 Gene: HI. Thought I'd make sure you knew that some of this is already available ... might save you some effort. Of course, if you don't like how it has already been done, feel free to scan it better. :-) You wrote: This includes...
a) tons and tons of program listings (most of them unpublished),
b) HP publications (Digest, Keynotes, Journal, brochures, marketing materials, courses, internal documentation),
GENE: All the keynotes are scanned and on disks offered by Jake Schwartz. He also has all (?) of the calculator/portable computer Journal articles scanned. I think the Digests are done.
c) user publications (Australian PPC Technical Notes),
e) hundreds of HP-calc related letters to and from other HP fans of old (like John McGechie's or Tom Cadwallader's), not to mention SHARP programs and materials.
Re: Scanning advice needed ! - Ángel Martin - 02-23-2004 Valentin, looking forward to *finally* seeing and enjoying the programs from "Matematica Avanzada"... if I could bias your selection, can you start with these first? Good luck with the task!
Best,
Re: Scanning advice needed ! - David Ramsey - 02-23-2004 OK, before I start, here are my bona fides:
* I was a programmer on both Macintosh and Windows versions of Caere's "OmniPage" OCR program for 8+ years.
* I've been scanning old calculator programs for Gene since October of last year.
That said, here's my advice:
1. Forget OCR for the program listings. Most of the documents you scan will likely be complex enough to require extensive manual tweaking. Also, as you noticed, the quality varies dramatically and even for the good documents, the OCR engine's recognition dictionary won't have most of the terms anyway. I've been scanning hundreds of old programs (ftp://ftp.neko.com/HP_Docs) and have simply reproduced each page as a 300dpi monochrome image (600dpi in a few cases).
2. PDF is the only way to go. You ABSOLUTELY don't want to use some proprietary solution or one that's only available on a specific platform.
3. A scanner with a sheet feeder is virtually a must. I've been using the HP 5550c, a nice little scanner that sells for $300 in the US. It's inexpensive, reasonably fast (about 4 pages per minute sustained), and can handle double-sided originals.
BTW, feel free to copy and distribute or archive any of the stuff I've done so far.
Re: Scanning advice needed ! - Ángel Martin - 02-24-2004 David, quite an impressive archive! I can only start to *imagine* all the work you've put to it, a great achievement. Thanks for sharing it with all of us.-
Best,
Re: Valentin, some of this is already scanned FYI - Valentin Albillo - 02-24-2004 Thanks for the caveat, Gene, much appreciated. I'll follow your advice and avoid re-scanning those materials, then.
Best regards from V.
Thank you very much for ALL your advices ! :-) - Valentin Albillo - 02-24-2004 Much appreciated, indeed.
Best regards from V.
Re: Valentin, some of this is already scanned FYI - Gene - 02-24-2004 Well, take a look at how they were scanned and then decide if you like it or not. :-) Can't wait to see the goodies you have.
Gene
Re: Thank you very much for ALL your advices ! :-) - Ángel Martin - 02-24-2004 Valentín, your project is intrinsically interesting to all of us, and thus the warm response. In fact, I can hardly contain my enthusiasm to finally see you unearth all those gems from their sleep. One thing that's been popping up in my mind is making the *ultimate* compilation of math programs, for which I'm more than sure yours will more than qualify. Can't you already see the "ADV MATH" ROM label?? :-)
Does that intrigue you? I'll be extremely happy to be your ROM compiler if you feel like getting into this project!
Regards,