From time to time I am reminded of the complexities and nuances inherent in web accessibility. This happens most often when I talk with someone new to the effort. I recently visited with a faculty member who is just beginning to address accessibility in her courses. She knows that PDFs need to be properly tagged to be accessible, and she was enthusiastic about doing it, even when it involved retrofitting accessibility elements into her existing PDFs. But she quickly reached a stalemate with articles in her course CMS. When this happened, she assumed she was doing something wrong, rather than recognizing that the problem was the document itself. The experience made me reflect on the fact that while we often talk with faculty members about making sure their PDFs are accessible, we might not give them a good enough starting point.
Readers of this blog who are familiar with PDF accessibility know that before you can address the accessibility of an existing PDF document, you must have a PDF that was properly created. However, many faculty members don’t create the PDFs they end up with. If the document is not composed of true text, you must run Optical Character Recognition (OCR) on it before other accessibility information can be added. While this step is becoming less common all the time (thank heavens), it is important that we help faculty and staff understand how to tell whether their PDFs contain true text to begin with.
This was the step that eluded the faculty member with whom I was talking. She did not know that she needed to determine if her PDF was a scanned image or not. Because of this she assumed that her first step would be the tagging of the document itself. Of course she could not do this because it was in fact an image. Then she assumed she was doing something wrong.
There are many ways for non-technical people to determine whether the PDF they wish to make accessible contains true text. Here are two ideas I shared with her, both of which were easy for her to implement. First, I asked her to see if she could select and copy some portion of the text; second, I told her to try searching for some text element on the page. Both were successful strategies for determining which pages she could and could not tag. Getting focus directly onto the text is critical if you are going to add an element to it (e.g., a heading or an ordered list).
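For support staff who need to check many documents at once, the two manual checks above can be roughly automated. What follows is only a minimal sketch, not a tool the faculty member used. It assumes the third-party pypdf library is installed, and the function names and the 20-character threshold are my own illustrative choices. A page that yields almost no extractable text is flagged as a likely scanned image that needs OCR before tagging.

```python
from typing import List

# Assumed threshold: pages yielding fewer non-whitespace characters than this
# are treated as image-only. Tune it for your own documents.
MIN_CHARS = 20


def classify_pages(page_texts: List[str], min_chars: int = MIN_CHARS) -> List[bool]:
    """Return True for each page whose extracted text looks like true text."""
    return [len("".join(text.split())) >= min_chars for text in page_texts]


def pages_needing_ocr(page_texts: List[str], min_chars: int = MIN_CHARS) -> List[int]:
    """Return 1-based page numbers that likely need OCR before tagging."""
    flags = classify_pages(page_texts, min_chars)
    return [number for number, has_text in enumerate(flags, start=1) if not has_text]


def extract_page_texts(pdf_path: str) -> List[str]:
    """Extract the text of each page using pypdf (pip install pypdf)."""
    # Imported here so the helpers above work even without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() returns an empty string for image-only pages.
    return [page.extract_text() or "" for page in reader.pages]
```

Calling `pages_needing_ocr(extract_page_texts("article.pdf"))`, where `article.pdf` is a placeholder file name, would list the pages to run through OCR first. Passing this check only means the text can be selected and searched; the document still has to be tagged afterward.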
If you are working with faculty on accessibility issues, or are a faculty or staff member working on accessibility yourself, you may have encountered similar situations, where a simple step was overlooked that proved significant for the person trying to create accessible content. We are very interested in hearing your experiences and sharing them with others. Please consider sharing your tips and experiences here.
I’ve always recommended returning to the original document the PDF came from (i.e., the word-processing, presentation, or spreadsheet file) and doing your corrections there. It is much easier to fix headings and add ALT descriptions in Word than in Adobe Acrobat Pro. In most cases, the re-converted file will bring over the corrected content and be more accessible. But my experiences have varied widely. It seems to depend on which versions of MS Office and Acrobat Pro you are using.
Also, the latest version of Office (MS Office 2013) does a pretty good job of importing a PDF (not an image/scan) file and converting it into a decent Word document. So, if the user does not have the original document and only has the PDF, they can try importing it into Office 2013 and working from there.
I too would be interested in hearing from others on their experiences in this realm.
~j