Stains as Spectral Curves

Over the course of the last few months, the Library of Stains team has been analyzing data collected from the hundreds of images we gathered during three trips to the Library of Congress, the University of Pennsylvania, and the Universities of Wisconsin and Iowa.

Multispectral data analysis can take many shapes and forms. As part of the Library of Stains project, the team has applied a methodology specific to stains.  This post takes you through the steps of that process as it relates to characterizing stains and what is, and is not, possible to know.

The project uses Image J software, freely accessible on the internet, and the Paleo Toolbox, which was designed by Dr William (Bill) Christens-Barry of Equipoise Imaging.

The data that come to us straight from the camera first need to be converted into .tif files.  This is done through Capture One, Phase One's software, and both sets of images can be saved in the same file. New software is being released soon and will do this step automatically.

Next, all .tif files need to be flattened.  Flattening is done in two steps.  First, you clean the flats: a series of shots of a white piece of paper taken on the same day as the images. Second, you flatten the set of images for a given side (or folio) to create an image that is evenly balanced against the white light spectrum.  Image J does the work here too, automatically processing the series of “flats” in line with the manuscript images. The flattening process is easy to learn but somewhat time-consuming.  The good news is that this step will also be done automatically with the new software.
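To make the idea of flattening concrete, here is a minimal sketch in Python. It assumes the conventional flat-field approach, in which each pixel of the manuscript image is divided by the matching pixel of the flat and rescaled by the flat's average. The post does not describe the exact procedure the Paleo Toolbox follows, so the function name, the tiny arrays and the rescaling choice are all illustrative assumptions.

```python
def flatten(image, flat):
    """Divide each pixel by the matching flat pixel, rescaled by the flat's mean.

    Assumes both inputs are lists of rows of equal size and that the flat
    contains no zero-valued pixels.
    """
    flat_pixels = [value for row in flat for value in row]
    flat_mean = sum(flat_pixels) / len(flat_pixels)
    return [
        [pixel * flat_mean / flat_pixel
         for pixel, flat_pixel in zip(image_row, flat_row)]
        for image_row, flat_row in zip(image, flat)
    ]


# Hypothetical 2x2 example: the right-hand column is lit 20% less brightly.
flat = [[1000.0, 800.0],
        [1000.0, 800.0]]
image = [[500.0, 400.0],
         [250.0, 200.0]]

flattened = flatten(image, flat)
for row in flattened:
    print(row)
```

In this toy example the uneven lighting disappears: after flattening, both pixels in each row carry the same value, which is what we would expect if the underlying material were uniform across the row.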

Once the files have been flattened, an intermediary step recreates a color image using Image J. This step takes the flattened .tif files from a specific side (folio) and, using the light information from the full stack of multispectral images, recreates a color image.  The Library of Stains team wanted .jpg images that could be annotated, so that the specific areas analyzed on a given folio could be identified.

Eventually the image will end up in Digital Mappa – a data curation software environment being used for our data visualization. In the meantime, to make our data analysis workflow as efficient as possible, we have put all the RGB color images into a PowerPoint presentation.



Now, with the PowerPoint up on one computer and Image J on another, we’re ready to begin analysis. We use Image J once again to plot the z-axis for each spot, which gives us the values for a spectral curve.  For the stain project, this meant plotting a z-axis for the substrate (parchment or paper), the inks, the red and blue pigments used for rubrication and decoration, and of course, as many stains as our hearts desired.

Moving from image files to spectral curves on an Excel spreadsheet requires a few steps:

  1. Using the 10 non-filtered wavelengths we imaged for each side, we opened them in Image J and configured them into one stack.
  2. Scrolling through them highlights how different components on a given folio react to different light wavelengths.
  3. Plotting the z-axis on a particular part of the image is easy with Image J.  Outline the portion to be plotted with a small rectangle and choose “Plot z-axis.” The results immediately appear as a curve on the screen.  The curve can also be viewed as a list of number values.
  4. These values are entered in the appropriate columns on an Excel spreadsheet and labelled with the element they belong to.  As shown below, we have columns for substrate, inks, pigments and stains.
  5. In order to see the true reflectance, the z-axis of a specific component needs to be adjusted against the values for the white color checker.  For this reason, the white color checker values are the first to be plotted and inserted into the appropriate columns for each of the material components.
  6. Following the methodology devised by colleagues at the Preservation Research and Testing Division of the Library of Congress, the spreadsheet automatically calculates reflectance by dividing the intensity of the ink, substrate, pigment or stain by the intensity of the white swatch on the color checker. This ratio is what is plotted on the spectral curve: the x axis shows light wavelengths, and the y axis, reflectance levels.
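The steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the team's spreadsheet: the function names, the rectangle coordinates and the pixel values are hypothetical, and the sketch assumes that the z-axis value for a rectangle is the mean pixel value inside it for each image in the stack. The ten wavelengths are the non-filtered bands listed in note 1 of the imaging post.

```python
# The ten non-filtered illumination bands, in nanometres.
WAVELENGTHS_NM = [370, 448, 476, 499, 519, 598, 636, 740, 850, 940]


def z_profile(stack, top, left, height, width):
    """Mean pixel value inside the rectangle, for each image in the stack."""
    profile = []
    for image in stack:
        values = [image[r][c]
                  for r in range(top, top + height)
                  for c in range(left, left + width)]
        profile.append(sum(values) / len(values))
    return profile


def reflectance(sample_profile, white_profile):
    """Divide the sample intensity by the white-swatch intensity, band by band."""
    return [sample / white
            for sample, white in zip(sample_profile, white_profile)]


# Hypothetical stack of ten 2x4 images: the two left-hand columns are a stain
# that brightens with wavelength, the two right-hand columns are the white swatch.
stack = [
    [[50.0 + 10 * i] * 2 + [200.0] * 2 for _ in range(2)]
    for i in range(len(WAVELENGTHS_NM))
]

stain = z_profile(stack, top=0, left=0, height=2, width=2)
white = z_profile(stack, top=0, left=2, height=2, width=2)
curve = reflectance(stain, white)

for nm, value in zip(WAVELENGTHS_NM, curve):
    print(f"{nm} nm\t{value:.3f}")
```

Each printed pair is one point on the spectral curve: wavelength on the x axis, reflectance on the y axis.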

Then comes the fun part.  We are able to begin to decipher the curves.

Spectral Curves for inks stains on University of Wisconsin manuscript MS 170A, no. 8.
Spectral Curves for all blue inks found in the University of Wisconsin manuscripts.
Spectral curves for all red inks found in the University of Iowa manuscripts.
Spectral Curves for possible wax stains in the University of Iowa manuscripts.

Preliminary results from the Universities of Iowa and Wisconsin show variations in ink curves, as well as curves on a number of folios that may indicate wax residues.  Further analysis is underway and final results, alongside the data itself, will be ready for open access by August 31, 2018.

With many thanks to Leah Pope Parker, PhD candidate in English at the University of Wisconsin, for her intellectual contributions to this project, as well as countless hours analyzing and visualizing data.


Multispectral Imaging: People, Processes & Technology

Michael B. Toth

President, R.B. Toth Associates

Alberto Campagnolo, Erin Connelly and Heather Wacha setting up manuscripts at SIMS for multispectral imaging as part of the “Stains Alive” project.

Alberto Campagnolo, Heather Wacha and Erin Connelly have discussed the Stains Alive Project, citing our imaging technology, work processes and data output in support of this unique scientific study into stains on ancient manuscripts. This builds on almost two decades of work we’ve put into developing our equipment and techniques. As we continue from the University of Pennsylvania’s Schoenberg Institute for Manuscript Studies (SIMS) and the Library of Congress to the Universities of Wisconsin and Iowa (I’ll try to ignore the forecasts of 16°F and snow showers), I thought I’d discuss what some consider the more mundane aspects of these projects, which often go unrecognized.

A small sample of the standardized data output from “Stains Alive” imaging at SIMS

The methodologies and technologies we use for multispectral imaging today are based on our 18 years of experience in narrowband multispectral imaging systems development. Yet for all the advances in the latest equipment – higher resolution sensors, better signal-to-noise ratio, improved illumination panels – success or failure of these projects depends on more than just the technology. A successful program also requires solid work processes and dedicated people. This is where systems integrators and program managers come in – not just to make sure all the technology is working together, but to ensure the project is fully supported by the processes and people as well. Without these – especially the latter – a project can yield some pretty pictures for scholars and conservators to gasp and drool over, but might not successfully produce and preserve the solid corpus of standardized data and metadata for future generations to study.

Multispectral imaging sequence of images in a darkened room, each illuminated by different wavelengths of narrowband LEDs shining on the manuscript from Ultraviolet to Infrared light

As Alberto noted in his blog, the current narrowband multispectral imaging system used for this project includes commercial-off-the-shelf hardware and software for digital spectral image capture and viewing with the integrated system. This includes customized image processing software developed by Bill Christens-Barry of Equipoise Imaging to allow users to exploit the spectral images, utilizing techniques developed in other scientific and cultural heritage studies.

Our Phase One high-pixel-count camera takes a series of high-quality digital images, each illuminated by a specific wavelength of light from banks of light emitting diodes (LEDs). Everyone tends to get excited about being able to observe multispectral imaging of manuscripts: the sequences of various colored lights are visually compelling, you are seeing new features on an object and are part of leading-edge studies. But Heather, Erin and Alberto are learning that after a few sequences of images in a dark room, many people don’t have the patience for more and excuse themselves. System operators like Meghan Hill Wilson and the PRTD team in the Library of Congress, the CHIC team at the John Rylands Library in Manchester, the digitization team in the Duke Libraries, Cerys Jones at UCL, and Damian Kasotakis in the Sinai are the unsung heroes of these projects, as they work in dark rooms day after day setting up and imaging manuscript leaf after manuscript leaf. This is where checklists are needed to make sure mistakes don’t creep in.

Compressed pseudocolor image of SIMS manuscript inner cover digitally processed from a sequence of captured images

The resulting image set is then digitally processed and combined to reveal residues and features in the object that are not visible to the eye in natural light. These processed images generated from the captured images provide the data needed for research into stains and residues. Lots of data! Each archival 16-bit TIFF image from our current 60 megapixel Phase One monochrome camera is about 117 MB in size, and we capture 15-18 images in a sequence. So each sequence yields about 2 GB of captured data. Multiply that by the number of leaves imaged and we are quickly piling up data. By the end of this short project, the team will have collected about a quarter terabyte of captured data alone.
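The arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted above (117 MB per image, 15 to 18 images per sequence, a quarter terabyte overall) and assumes decimal units, i.e. 1 GB = 1000 MB and a quarter terabyte = 250 GB; the function names are illustrative.

```python
IMAGE_MB = 117          # size of one archival 16-bit TIFF, as quoted above
QUARTER_TB_IN_GB = 250  # assumes decimal units: 1 TB = 1000 GB


def sequence_gb(images_per_sequence, image_mb=IMAGE_MB):
    """Captured data for one imaging sequence, in GB (1 GB = 1000 MB)."""
    return images_per_sequence * image_mb / 1000


def sequences_for(total_gb, images_per_sequence):
    """How many sequences of a given length add up to total_gb."""
    return total_gb / sequence_gb(images_per_sequence)


low, high = sequence_gb(15), sequence_gb(18)
print(f"One sequence: {low:.2f} to {high:.2f} GB")

fewest = sequences_for(QUARTER_TB_IN_GB, 18)
most = sequences_for(QUARTER_TB_IN_GB, 15)
print(f"A quarter terabyte is roughly {fewest:.0f} to {most:.0f} sequences")
```

In other words, a quarter terabyte of captured data corresponds to somewhere over a hundred imaging sequences, before any processed images are counted.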

Workshop on multispectral imaging system and processing tools at SIMS

While the processed images are usually stored as 8-bit TIFF images, with multiple processed images available from each sequence – including some larger pseudocolor images – they add up to yet more data to store. With open source image processing tools and training, scholars and conservators can now produce their own processed images to meet their research needs, and these images also need to be managed.


All these data require good metadata and file structures, for without them we would be blindly trying to find data across hard drives and the cloud. And when we found the data, we wouldn’t be able to remember details about the imaging, spectral illumination or object. This highlights the additional unsung heroes of our multispectral imaging: the data managers and administrators. The dean of this cadre is Doug Emery, whose pioneering work on data management and preservation on the Archimedes Palimpsest Project was “recognized” by Program Director Will Noel’s dedication in his book (below):

“To Doug Emery, Whose critical contribution to this project goes unrecorded in this book. Sorry. Metadata doesn’t sell. Thank you so much! Will Noel”

Data Manager Doug Emery and Data Administrator Susan Marshall standardizing and organizing Sinai Palimpsests Project data and metadata

Doug’s work, and that of so many others responsible for the metadata and data output, has proven critical to multispectral imaging programs ranging from various palimpsest projects to David Livingstone’s Diaries, Top Treasures at the Library of Congress, and mummy masks around the globe. Starting with the Archimedes Palimpsest Metadata Standard (really a specification) that Doug, Bill Christens-Barry and I developed over a decade ago, multispectral imaging data management has advanced on the shoulders of pioneers working with the International Image Interoperability Framework, Dublin Core, the Text Encoding Initiative, and others. Only with the diligence and attention to detail provided by dedicated data managers and administrators have large amounts of multispectral image data been archived and made available online for global access.

Training workshop for PACSCL members at SIMS on multispectral image processing and work flow to meet users’ diverse goals

For Stains Alive and our other multispectral imaging projects, we use the latest technology, which is always getting better. At SIMS and for the Philadelphia Area Consortium of Special Collections Libraries (PACSCL) we were able to try out the latest 100 MP Phase One camera back thanks to a loan from Digital Transitions. The CMOS sensor allowed us to autofocus and capture more detail in larger images, while capturing even more high-quality data. With these new cameras, illumination panels, processors and other technologies for multispectral imaging, we also have to continuously improve our work processes. Most of all, we need to ensure the people on the team have all the resources they need to carry out their goals.

And the journey begins

Last week we all convened in sunny Philadelphia to begin imaging stains from the Chemical Heritage Foundation and Penn Libraries manuscript collections.

With the generous help of Mike Toth from R.B. Toth Associates and Sarah Reidell, Margy E. Meyerson Head of Conservation of the Kislak Center for Special Collections, Rare Books and Manuscripts, the multispectral imaging system was set up in a small windowless–i.e. perfect for imaging!–room within the Conservation Studio.

System setup
Alberto Campagnolo and Sarah Reidell carefully setting up a manuscript for imaging
Heather Wacha, Erin Connelly, and Alberto Campagnolo setting up a manuscript

Over the course of two intense days we imaged stains from the pages and covers of fourteen manuscripts ranging from the 13th to the 16th century, thus beginning to build our dataset of stains. The manuscripts include nine alchemy texts from the Othmer collection, Chemical Heritage Foundation (Othmer MS 1 pictured below) and five medical texts from the Schoenberg Institute for Manuscript Studies and Penn Libraries collection.

We used a Phase One IQ260 Achromatic camera, a 60 megapixel 16-bit monochrome digital back with an 8964 x 6716 pixel CCD array at 6.0 micron pixel size, with an iXR body and 80mm lens producing 675 ppi resolution images. The special illumination necessary for multispectral imaging was provided by a third-generation LED light system designed by Dr. William (Bill) Christens-Barry of Equipoise Imaging that produces very specific and narrow bands of illumination, ranging from ultraviolet light (370nm) to the near infrared (940nm); see note 1. Because of the nature of the project, we also utilized long-pass green and red filters to detect fluorescence energy: the filters remove the illumination wavelength but let through the longer-wavelength fluorescence emission, which can be recorded in the captured image, thus allowing the characteristic spectra of substrate, colourant, or contaminant substances to be more completely determined and analyzed.
The camera-light-filter system is integrated within software that simplifies the operation and records unified metadata at each step.
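The camera figures quoted above fit together, as a short calculation shows. The sketch below takes the pixel dimensions, pixel size and 675 ppi resolution from the text and derives the pixel count, the physical size of the sensor and the area of the manuscript covered by a single frame. The variable names are illustrative, and the coverage figures are simple arithmetic from the stated resolution, not measurements.

```python
WIDTH_PX, HEIGHT_PX = 8964, 6716  # CCD array, as quoted above
PIXEL_SIZE_MICRONS = 6.0          # size of one sensor pixel
RESOLUTION_PPI = 675              # pixels per inch on the imaged object
MM_PER_INCH = 25.4

# Total pixel count, in megapixels.
megapixels = WIDTH_PX * HEIGHT_PX / 1_000_000

# Physical size of the sensor, in millimetres.
sensor_width_mm = WIDTH_PX * PIXEL_SIZE_MICRONS / 1000
sensor_height_mm = HEIGHT_PX * PIXEL_SIZE_MICRONS / 1000

# Area of the object covered by one frame at 675 ppi, in inches.
field_width_in = WIDTH_PX / RESOLUTION_PPI
field_height_in = HEIGHT_PX / RESOLUTION_PPI

print(f"{megapixels:.1f} megapixels")
print(f"Sensor: {sensor_width_mm:.1f} x {sensor_height_mm:.1f} mm")
print(f"Field of view: {field_width_in:.2f} x {field_height_in:.2f} in "
      f"({field_width_in * MM_PER_INCH:.0f} x {field_height_in * MM_PER_INCH:.0f} mm)")
```

At this resolution a single frame covers an area of roughly 13 by 10 inches, which is the space the manuscript and any reference cards in the scene have to share.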

The result of the imaging is a sequence of photographs, one for each different illumination and filter setting, as can be seen below.

Example of image stack for Left cover of Othmer 1, CHF
Animation of image stack for Left cover of Othmer 1, CHF

Different materials react differently to each wavelength, and details that are not visible in natural light begin to appear and be clearly noticeable. Notice, for example, how the stain in the cover above appears and disappears, depending on the illumination.
One detail of particular interest is an inscription in the upper part of the cover that was almost invisible to the naked eye, but that becomes immediately distinguishable and readable under infrared light (see detail images below).

Writing on Left cover of Othmer 1 (CHF)
Detail of writing on Left cover of Othmer 1 (CHF)

Capturing the photographs (and managing the metadata) is only the first step. For a deeper understanding of the data recorded and the variety of material responses to the different wavelengths, one needs to process the stack of images and analyze the data through statistical algorithms capable of simplifying it and of finding patterns in it.
This kind of analysis, thanks to colour reference cards positioned in the scene, can also reconstruct colour images, despite the fact that the camera is achromatic, i.e. agnostic to colour information (see below).

Reconstructed colour photograph of Othmer 1 (CHF)

One output that can prove particularly useful in distinguishing different components (i.e. materials reacting in different ways under the different lights) is a false colour image, where different components are assigned an arbitrary colour to help discern similar and dissimilar light responses.
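As a rough illustration of the false colour idea, the sketch below builds a composite by assigning three bands of the stack to the red, green and blue channels. This is a deliberately simple stand-in: it is not the statistical processing mentioned above, nor the method used by the project's processing software, and the function names, the choice of bands and the pixel values are all hypothetical.

```python
def scale_to_8bit(band):
    """Stretch one band to the 0-255 range (a flat band maps to all zeros)."""
    values = [value for row in band for value in row]
    lowest, highest = min(values), max(values)
    span = (highest - lowest) or 1
    return [[round(255 * (value - lowest) / span) for value in row]
            for row in band]


def false_colour(band_r, band_g, band_b):
    """Combine three bands into rows of (R, G, B) tuples."""
    r, g, b = (scale_to_8bit(band) for band in (band_r, band_g, band_b))
    return [
        [(r[i][j], g[i][j], b[i][j]) for j in range(len(r[i]))]
        for i in range(len(r))
    ]


# Hypothetical one-row images with three pixels: parchment, ink, stain.
ultraviolet = [[900, 100, 300]]  # the stain is dark under UV
green = [[900, 100, 600]]
infrared = [[900, 100, 900]]     # the stain all but vanishes in the infrared

# Arbitrary assignment: infrared -> red, green -> green, ultraviolet -> blue.
composite = false_colour(infrared, green, ultraviolet)
parchment, ink, stain = composite[0]
print("parchment:", parchment)
print("ink:", ink)
print("stain:", stain)
```

Parchment and ink respond evenly across the three bands and so stay neutral, while the stain, which responds differently in each band, takes on a distinct colour of its own.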

False colour detail of Othmer 1 (CHF)

It is through this kind of data analysis that we’ll try to distinguish and characterize stains in the coming months.

We thank Mike Toth, Bill Christens-Barry, James (Jim) Voelkel, William (Will) Noel, Doug Emery, Sarah Reidell, and everyone else involved with our imaging session at the University of Pennsylvania for their help and support.
We thank CLIR for their constant assistance (above and beyond financial support) and encouragement.

The team: (from the top left) Erin Connelly, Mike Toth, Bill Christens-Barry, Heather Wacha, Alberto Campagnolo

1. We imaged at: 370nm (UV); 448nm (deep blue); 476nm (blue); 499nm (cyan); 519nm (green); 598nm (amber); 636nm (red); 740nm (IR1); 850nm (IR2); 940nm (IR3). 370nm is ultraviolet, 448nm to 636nm are visible light, and 740nm to 940nm are infrared.

Going on a Stain Hunt

The imaging schedule for the #Stains Alive project has been set. PIs will soon be welcoming Mike Toth, multispectral imaging expert from R.B. Toth Associates, to the University of Pennsylvania, the University of Wisconsin – Madison, and the University of Iowa. But before he arrives with his imaging equipment, the first task is to single out interesting-looking stains in our respective collections.


University of Wisconsin, Special Collections, MS 257, f. 110r
University of Pennsylvania, MS Codex 115.

In other words, we’re going on a stain hunt. For us, what makes a stain a possible candidate for imaging is its size, shape, placement on the page, color, and the genre of manuscript in which it appears. Stains found in alchemical texts or books of recipes may not be the same kind of stains that appear in a Bible or literary text. Indeed, a note on the book of remedies at right suggests that the stain is due to “a chemical spilled on the ms by an alchemist.”

Information about the stains chosen for imaging is catalogued in a spreadsheet, starting with the call number and the folio on which the stain appears. Since a camera will be attached to a copy stand and sit above the manuscript, we also measure the x and y axes of the folio/manuscript, as well as the z axis, i.e. how far the folio comes up from its base. If the stain appears on the recto of a folio, the z axis is measured from the back cover (for those manuscripts written in European languages) up to the folio with the stain. If the stain appears on the verso of a folio, the z axis is measured from the front cover up to the folio. Changing the focus on the camera each time a new stain is placed underneath it takes time, so knowing how far each stain sits above the table helps us sequence the images so that the workflow is as efficient as possible.
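One simple way to use the z measurements is to sort the catalogue by height before imaging, so that each refocus is a small adjustment. The sketch below is an assumption about how such sequencing could be done, not a description of the team's spreadsheet; the call numbers, folios and heights are made up for illustration.

```python
# Hypothetical catalogue rows: call number, folio and measured height (z) in mm.
stains = [
    {"call_number": "MS A", "folio": "12r", "z_mm": 31.0},
    {"call_number": "MS B", "folio": "3v", "z_mm": 4.0},
    {"call_number": "MS C", "folio": "40r", "z_mm": 18.0},
    {"call_number": "MS D", "folio": "7v", "z_mm": 6.0},
]


def imaging_order(records):
    """Sort the records by height so consecutive shots need little refocusing."""
    return sorted(records, key=lambda record: record["z_mm"])


def focus_travel(records):
    """Total change in height between consecutive records, in mm."""
    return sum(abs(later["z_mm"] - earlier["z_mm"])
               for earlier, later in zip(records, records[1:]))


ordered = imaging_order(stains)
print("Catalogue order travel:", focus_travel(stains), "mm")
print("Sorted order travel:", focus_travel(ordered), "mm")
for record in ordered:
    print(record["call_number"], record["folio"], record["z_mm"])
```

With these made-up heights, sorting roughly halves the total focus travel. In practice the order would also have to respect which manuscripts are available on which day.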


University of Wisconsin, Special Collections, MS 170a, Box 1, no. 8

In between measuring and cataloguing these manuscripts, it’s always nice to step back from entering data and momentarily travel back through time to the room that once held the manuscript and the person who accidentally, or perhaps intentionally, left a stain: a visible physical trace of human interaction that has endured through the centuries and can be studied today with technology like multispectral imaging.

‘A Boke of Practyk:’ Stains of Medicine and Alchemy

Chemical Heritage Foundation, Othmer MS 2, fol. 41r

Many of the most interesting manuscript stains are found in the bindings and folios of soiled, heavily used medical and alchemical texts. The Middle English quote in the title of this post comes from the introduction of a stained fifteenth-century medical text, Lylye of Medicynes (Oxford, Bodleian Library MS Ashmole 1505, fol. 4r). This self-described ‘boke of practyk’ reflects the scientific content of these texts and their subsequent use for practical purposes by medical practitioners or alchemists. Signs of this practical use may be obvious, such as burn marks from furnaces, but innocuous-looking stains (e.g. appearing to be water damage) may contain hidden information about medicinal or chemical solutions, or even heavy metal contamination, which is not evident by sight alone. This is the kind of data we will be looking for through multispectral imaging of these stains.

Recipes to kill parasitic worms added to the treatise ‘De coloribus urinae,’ University of Pennsylvania, MS Codex 133, 16th c.


The majority of manuscripts identified as potential candidates for stain analysis in this project have never been accessible in either print or digitized formats. This is especially true of medical and alchemical texts, which are often less well-known than medieval literary works and less ‘beautiful’ than the illuminated and decorated texts commonly regarded as world treasures. The analysis of these manuscripts will provide truly novel data particular to the specific object, but also foundational to the examination of similar manuscripts in the future.

Soiled alchemical text, Chemical Heritage Foundation, Othmer MS 2, fol. 63r

We will be examining a range of scientific manuscripts across our partner institutions. Specifically, we are fortunate to be partnering with The Othmer Library of Chemical History at the Chemical Heritage Foundation, which houses over 140,000 objects relevant to chemical history, including an invaluable collection of medieval alchemy manuscripts. Until recently (through the work of the Bibliotheca Philadelphiensis project, 2016 – 2019), these manuscripts were undigitized and many have never had a printed edition.

Alchemy texts in particular are often known as ‘books of secrets’ both in medieval and modern language. The novel data contained in the stains of these alchemical texts will not only reveal interesting information about the history of the book and its practical use, but it will help researchers who will be touching and interacting with the physical object. We envision that scholarly audiences will use our data and methodology to advance knowledge into the provenance of manuscripts, their uses within a historical context, their working environment, their transmission, and their circulation. For conservators and librarians new information will help determine proper storage conditions, as well as health and safety issues, in particular the identification of heavy metal or chemical contamination in alchemical manuscripts.

Dirty Old Books

Free Library Lewis 003, f. 18v.

At some point in your career as a reader of books, you may have accidentally spilled coffee or left a stain on a book you were reading, just like someone did with the book in the image at left. Stains in and on books are usually seen as inconveniences at best and tragedies at worst. The Library of Stains project proposes to focus on these oft-disparaged “dirty” old books and the stains found in them, using them as a tool for gathering scientific data that will provide clues to how previous generations used and stored their reading material.  This project examines a variety of stains found on parchment, paper, and bindings from medieval manuscripts.  The data will provide a new approach for learning about the history of the book, book conservation, and the materiality of books, and will offer both scholars and the public an opportunity to engage with the intimate connection between readers and what they read.

The Library of Stains project is conceived broadly as a first foray into providing a fixed dataset for characterized stains that are commonly found on manuscripts, a sound, replicable methodology for gathering and analyzing the data, and a clear explanation for how to implement and use the database as a means to further the study of medieval manuscripts and their conservation. In so doing, the Library of Stains hopes to equip scholars with additional tools for analyzing their manuscripts vis à vis provenance, use, transmission, preservation and materiality.  The project also aims to engage both scholarly and public audiences with the intrigue of studying manuscripts traditionally pushed aside and dismissed due to their “dirty” or “stained” appearances. Contextual information will be provided concerning each manuscript studied in order to elicit public participation in the making and identifying of stains.  If not coffee stains, as humans we are probably all guilty of leaving some sort of stain, perhaps a tear on an old letter, or blood from an accidental cut on a recipe book.  This project will bring together a human audience in order to explore and study the human experience, be that a medieval person’s relationship to a manuscript or how that information relates to our interactions with books today.