Constance Ring by Amalie Skram
This is a free digital edition of Constance Ring (English Wikipedia, Norwegian Wikipedia) by Amalie Skram (English Wikipedia, Norwegian Wikipedia). It is based on a scan of the 1943 edition published by Gyldendal, available at the Norwegian National Library.
The book itself is in the public domain, which allows republication of the material, even for commercial purposes.
However, this edition of the book is available under the CC BY-SA 4.0 license, which allows commercial reproduction as long as it is attributed back to this edition. See LICENSE.md for details.
Conversion
This book was created by downloading the PDF with all the pages from nb.no, extracting the images, and then performing OCR on them. The default OCR from the National Library wasn't very good.
The text files are then lightly massaged by removing headers and footers, turning chapter headers into Markdown headers, and de-hyphenating the text.
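The de-hyphenation step can be sketched roughly as follows. This is a minimal illustration, not the script used for this book: the function name `dehyphenate`, the regular expression, and the sample sentence are all assumptions made for the example. It joins a word fragment ending in a hyphen at the end of a line with the fragment that starts the next line.

```python
import re

# A word character, a hyphen at the end of a line, and a word character
# at the start of the next line.
_LINE_BREAK_HYPHEN = re.compile(r"(\w)-\n(\w)")


def dehyphenate(text: str) -> str:
    """Join words that were split across a line break with a hyphen."""
    return _LINE_BREAK_HYPHEN.sub(r"\1\2", text)


if __name__ == "__main__":
    # Made-up sample text, only for illustration.
    sample = "Hun gik lang-\nsomt hen til vinduet.\n"
    print(dehyphenate(sample))
```

Note that this simple rule also joins genuine compounds that happen to break at a hyphen, so the result still needs proofreading.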
The Epub is then generated by Pandoc.
Proofreading
Run `make spellcheck-words` to run hunspell over the document. It will read `dict` as the personal dictionary and output any misspelled words into `spellcheck-words`.
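For readers without the Makefile at hand, a comparable hunspell invocation might look like the sketch below. The document name `book.md` and the dictionary name `nb_NO` are assumptions, and the actual make target may use different options. The flags `-l` (list misspelled words), `-d` (dictionary) and `-p` (personal dictionary) are standard hunspell options.

```python
import subprocess


def hunspell_list_command(document: str, personal_dict: str = "dict",
                          language: str = "nb_NO") -> list[str]:
    """Build a hunspell command that lists misspelled words in a document."""
    return [
        "hunspell",
        "-l",                 # list misspelled words instead of running interactively
        "-d", language,       # main dictionary (assumed name)
        "-p", personal_dict,  # personal dictionary of accepted words
        document,
    ]


def misspelled_words(document: str) -> list[str]:
    """Run hunspell and return the unique misspelled words, sorted."""
    result = subprocess.run(hunspell_list_command(document),
                            capture_output=True, text=True, check=True)
    return sorted({line for line in result.stdout.splitlines() if line})
```

Calling `misspelled_words("book.md")` requires hunspell and the Norwegian dictionary to be installed.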
The book is written in a much older Norwegian than what the dictionary contains, so there are many misidentifications.
Process:
For each word in `spellcheck-words`:
- Check if the word is in the original PDF. If so, add it to the dictionary. This happens when the word is just too old.
- Check it with the manual spellcheck, `make spellcheck`. This is more tedious, as it runs through the entire document from the beginning.
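Adding confirmed words to the personal dictionary can be scripted along these lines. This is a sketch under the assumption that `dict` is a plain word list with one word per line. The function names are hypothetical and not part of this repository.

```python
from pathlib import Path


def merge_into_dictionary(existing: list[str], confirmed: list[str]) -> list[str]:
    """Return the word list with confirmed words added, deduplicated and sorted."""
    words = {w.strip() for w in existing if w.strip()}
    words.update(w.strip() for w in confirmed if w.strip())
    return sorted(words)


def update_dictionary_file(path: Path, confirmed: list[str]) -> None:
    """Rewrite the personal dictionary file (assumed: one word per line)."""
    existing = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    merged = merge_into_dictionary(existing, confirmed)
    path.write_text("\n".join(merged) + "\n", encoding="utf-8")
```

Keeping the file sorted and free of duplicates makes later diffs of the dictionary easier to review.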
Hunspell
Hunspell has affix (.aff) files and dictionary (.dic) files. The documentation claims that it should be possible to load single .dic files as "special dictionaries", however I can't get that to work.
However, the base Norwegian affix file can be used as the affix file for the spellchecker by creating a symlink, which solves the problem.
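The symlink workaround could be set up as in the sketch below. Both paths in the usage note are assumptions: the location of the system affix file varies between distributions, and the link name depends on what the custom .dic file is called.

```python
from pathlib import Path


def link_affix(base_aff: Path, link: Path) -> None:
    """Create `link` as a symlink to the base affix file, replacing a stale link."""
    if link.is_symlink():
        link.unlink()
    # Raises FileExistsError if `link` exists as a regular file.
    link.symlink_to(base_aff.resolve())
```

For example, `link_affix(Path("/usr/share/hunspell/nb_NO.aff"), Path("dict.aff"))` would make a hypothetical `dict.aff` point at the system affix file, so that hunspell finds a matching .aff next to the custom .dic file.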