OCR from dot matrix?

Moderator: Alastair

iamaran
Posts: 100
Joined: Tue Apr 06, 2010 9:01 pm

OCR from dot matrix?

#1 Post by iamaran » Tue Nov 23, 2010 9:57 pm

There are two solutions on the Stairway to Hell (BBC Micro) website that are scans of dot-matrix printouts.
Has OCR improved to the point where this could be done mostly automatically, with a little correction afterwards, or is someone (i.e. me!) going to have to type it all in?
The games are Cordelia and The Last Days of Doom.
http://www.stairwaytohell.com/gamehelp/index.html

dave
Posts: 606
Joined: Tue Aug 21, 2007 10:20 pm

Re: OCR from dot matrix?

#2 Post by dave » Wed Nov 24, 2010 11:46 am

The bad news is that the solution for Cordelia won't go through SimpleOCR.

The good news is that the solution for LDoD goes through with a bit of encouragement (i.e. load into GIMP, manually change some of the greys to black, set palette to black and white, save). It gets around 80% of the text correct first try. The last page took me about 15 minutes to do:
The player's true mission is now revealed to save the planet from the nothingness. The nothing can see into every corner of the player's mind and is convinced that organic life is worthless. Indeed, the history of the human race convinces it that humans are habitual murderers. The player must convince the nothing otherwise. His first test is to avoid conflict with three imitation 'cavemen' in a sequence which takes place entirely in his own mind.

The puzzle underlying the sequence is the old game 'Hare and Hounds', in which the player takes the role of the hare, and may move in any direction. The cavemen, or hounds, can only move forwards or sideways. If the hare and one hound meet, the game is lost. If the hare can pass the hounds, the hare (or player) has won. The programming for this puzzle was immense, since the cavemen have a winning strategy! Thus their strategy had to be programmed to be good - to make winning difficult for the player - but not perfect. This is much harder than programming perfect play!

[yes to axe] (without the axe, the cavemen attack when adjacent) [reject both landing sites] S, N, S, N, W, NE, N, N.

By a fluke the player has avoided conflict. The intelligence of his intellect is now tested by four IQ-style questions:

[answers are: ache, 422, rickshaw, stake].

The nothing is now uncertain. It poses another scenario, based on the player's memories of the Wild West, in which the player is ethically required to kill, but will confirm the nothing's belief that humans murder if he does so. A marshal needs the player's help against some gunfighters who will arrive imminently.

The solution will require some ingenuity. In a nearby saloon three cutout characters are standing, to give verisimilitude to the sequence. By first immobilising the marshal (using violence) the player can throw out each character to be shot at by the gunfighters, until their ammunition is expended. At the end of this dreamlike sequence, the marshal arrests the gunfighters and the player is once more confronted with the nothing:

[yes to revolvers] E, IN, HIT MARSHALL, GET ALL, THROW MINISTER, THROW DRUNK, THROW BARMAN

An interrogation then ensues between the nothing and the player's mind. Difficult philosophical questions are asked: was it ethical for the player's crew to die so that he could live, for example? There are no right answers to these questions! (So player responses make no difference.) However, it is the little dog which finally provides the chink in the nothing's armour, and brings the story to a satisfying close.

Details are deliberately omitted here, so that a reviewer can sit back and enjoy the ending!
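
For anyone who would rather script the GIMP step mentioned at the top of this post (push the greys to black, then reduce to a black-and-white palette), it amounts to a simple threshold on the greyscale values. Below is a minimal Python sketch of that idea, not the exact procedure used here. The function name `binarize`, the threshold values and the sample pixel values are all made up for illustration; real scans would need loading with an image library first, and the right cutoff would have to be found by trial.

```python
def binarize(pixels, threshold=128):
    """Map 8-bit greyscale values to pure black (0) or pure white (255).

    `pixels` is a list of rows, each row a list of ints in 0..255.
    Anything darker than `threshold` becomes black; the rest becomes white.
    The default of 128 is an arbitrary assumption, not a tuned value.
    """
    return [[0 if value < threshold else 255 for value in row] for row in pixels]


# Hypothetical sample: pale paper (240+), solid ink (60-120),
# and one faint grey dot (200) of the kind a worn ribbon leaves behind.
sample = [
    [250, 120, 90, 245],
    [240, 60, 200, 255],
]

cleaned = binarize(sample)                  # faint dot is lost to white
cleaned_dark = binarize(sample, threshold=210)  # higher cutoff keeps it as black
```

Raising the threshold is the scripted equivalent of manually changing some of the greys to black: faint dots survive, at the cost of possibly picking up paper noise.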

iamaran
Posts: 100
Joined: Tue Apr 06, 2010 9:01 pm

Re: OCR from dot matrix?

#3 Post by iamaran » Wed Nov 24, 2010 12:38 pm

Thanks for that, Dave. I'll get typing Cordelia!

**UPDATE** After cleaning up the image, rotating it, etc., the free online OCR (http://www.onlineocr.net/) has done a good job. I'll proof-read, correct and upload later.
