Last month DYA reported that data website TheGenealogist had released new, higher resolution scans for the 1851, 1861 and 1871 censuses for England and Wales, adding to an upgraded set already available for 1891. These original page images are a major resource for family historians and we wanted to know more about the technology behind it all. So we spoke to TheGenealogist's head of online development, Mark Bayley:
What sort of scanning technology have you used to achieve these new scans?
As you can imagine, technology has come a long way since the days we first scanned the census, in early 2000. We recently purchased a new high-resolution, full greyscale film scanner for this project. We also scanned silver halide film rather than inferior diazo copies, to make sure we got the best image we could for every census page.
Loading speed is crucial for website users. How do you manage to load these images quickly as well as preserve the detail?
We use a technology called Deep Zoom, originally developed by Microsoft, which lets users view huge images (even those that are gigabytes in size) while still loading instantly, even on a mobile device. It does this by splitting an image into layers. The first layer is just thumbnail size, and the final layer is as if you had zoomed right in to see the individual pixels. We then chop each layer up into small 500 by 500 pixel tiles. These tiles are only around 5 to 7 KB each, so small that they load instantly, even on a slow mobile connection.
We then only show the tiles for the area the user has zoomed and panned to. So if they're fully zoomed out, they see a thumbnail instantly. If they zoom right in to see a person's name, they can see incredible detail, but only the tiles for that specific area are loaded, so again the view appears instantly.
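The layering and tiling Mark describes can be illustrated with a short sketch. This is not TheGenealogist's code: it assumes each layer is half the width and height of the one above it (the usual Deep Zoom convention, as we understand it), uses the 500 pixel tile size quoted above, and ignores details such as tile overlap. The function names and the 4000 by 3000 pixel image are hypothetical.

```python
import math

TILE_SIZE = 500  # tile edge in pixels, the figure quoted in the interview


def pyramid_levels(width, height, tile_size=TILE_SIZE):
    """Return (width, height, tile_cols, tile_rows) per layer, smallest first.

    Assumes each layer halves the one above it, down to a single pixel.
    """
    levels = []
    w, h = width, height
    while True:
        cols = math.ceil(w / tile_size)
        rows = math.ceil(h / tile_size)
        levels.append((w, h, cols, rows))
        if w == 1 and h == 1:
            break
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    levels.reverse()  # thumbnail-sized layers first, full resolution last
    return levels


def visible_tiles(view_x, view_y, view_w, view_h, level_w, level_h,
                  tile_size=TILE_SIZE):
    """Return the (col, row) tiles that a viewport overlaps on one layer."""
    # Clamp the viewport to the bounds of the layer.
    x0, y0 = max(0, view_x), max(0, view_y)
    x1 = min(level_w, view_x + view_w)
    y1 = min(level_h, view_y + view_h)
    if x0 >= x1 or y0 >= y1:
        return []  # viewport lies entirely outside the image
    first_col, last_col = x0 // tile_size, (x1 - 1) // tile_size
    first_row, last_row = y0 // tile_size, (y1 - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


if __name__ == "__main__":
    # Hypothetical 4000 x 3000 pixel page image.
    levels = pyramid_levels(4000, 3000)
    full_w, full_h, cols, rows = levels[-1]
    needed = visible_tiles(900, 400, 800, 600, full_w, full_h)
    print(f"{len(levels)} layers; the full-resolution layer has {cols * rows} tiles")
    print(f"an 800 x 600 view at full zoom needs {len(needed)} of them")
```

For this made-up image the full-resolution layer is cut into 48 tiles, yet an 800 by 600 pixel view of it touches only 6, which is why zooming in on one name does not mean downloading the whole page.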
What's the full resolution now available and what was it before?
The images were originally bitonal (i.e. black and white, with no shades in between), making it hard to see finer detail or where enumerators had left their marks over the writing. These were scanned at 200 dpi – that is dots per inch, a way of measuring digital resolution.
The new images have been scanned in greyscale, allowing you to see those hidden layers of markings, at 600 dpi. So we have tripled the resolution, meaning you can zoom in three times further. You have to look at comparison images to really see the difference. It's phenomenal!
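The arithmetic behind that jump is easy to check. The sketch below assumes a hypothetical 10 by 15 inch page (the real page sizes are not given here) and simply multiplies inches by dots per inch; it shows that tripling the dpi triples the width and height in pixels, which means nine times as many pixels in total.

```python
def scan_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a page of the given size scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


# Hypothetical page size in inches, chosen only for illustration.
PAGE_W_IN, PAGE_H_IN = 10, 15

old_w, old_h = scan_pixels(PAGE_W_IN, PAGE_H_IN, 200)  # original scans
new_w, new_h = scan_pixels(PAGE_W_IN, PAGE_H_IN, 600)  # new scans

linear_gain = new_w / old_w                      # zoom factor along each edge
pixel_gain = (new_w * new_h) / (old_w * old_h)   # growth in total pixel count

print(f"200 dpi: {old_w} x {old_h} pixels")
print(f"600 dpi: {new_w} x {new_h} pixels")
print(f"{linear_gain:g} times the linear detail, {pixel_gain:g} times the pixels")
```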
How would you say these new scans can make research easier for people?
The extra resolution and being able to see these hidden layers of writing with greyscale images mean you can discover extra information that was previously hidden, across the page, from occupation to birthplace or even a name.
Last month we had an article about details revealed in these scans relating to a murder case. Have they led to any other interesting discoveries so far?
As we've been checking the images, we've noticed a variety of interesting records, including some rather worn and torn pages that are now so much easier to read. One such example, from Gloucestershire in 1861, involved a couple who left a tavern early hoping to dodge the enumerator. When confronted, they refused to cooperate! Zooming in using the new scans reveals what the enumerator wrote: 'Left early in the morning of 8th April and refused to give any information.'
We've also found quite a few images containing rough, hand-drawn maps on the enumeration district description page, making it easier for us future genealogists to see where an ancestor might have lived.
On a more humorous note, we also found a doodle of a man (perhaps a self-portrait of the enumerator) on a page describing an enumeration district.
Find out more at TheGenealogist.co.uk/census