Abstract
Reading is a pervasive activity in our daily life. We read text printed in books and documents, shown on directional signs and advertisements, and displayed on computer and smartphone screens. People who are blind can read text using OCR on their smartphone; those with low vision may magnify onscreen content. But these tasks are not always easy. Reading a document with OCR requires taking a well-framed picture of it at an appropriate distance, something that is hard to do without visual feedback. Accessing "scene text" (e.g., a name tag or a directional sign) is even harder, as one first needs to figure out where the text might be. Screen magnification presents a different set of problems: one must manually control the center of magnification with the mouse or trackpad, all the while maintaining awareness of the current position in the document (the "page navigation problem").
In this talk, I will present several projects from my lab that address these problems. First, I will show how fast "text spotting" algorithms can be used to generate real-time feedback for blind users, indicating the presence of scene text in the camera's field of view or guiding the user to take a correctly framed picture of a document. I will then propose a simple gaze-contingent model for screen magnification control. Although our system currently uses an IR-based eye gaze tracker, we plan to integrate it with an appearance-based tracker that uses the computer's own camera. Throughout the talk, I will present experimental studies with blind and low-vision participants that motivate and validate the proposed technology.
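The abstract does not spell out the control law behind the gaze-contingent magnifier; the sketch below is only one plausible reading of "gaze-contingent magnification control," in which the magnified viewport follows an exponentially smoothed gaze estimate and is clamped to the screen bounds. Every name and parameter (MagnifierState, zoom, alpha) is an illustrative assumption, not the system presented in the talk.

```python
# A minimal sketch (assumptions only) of gaze-contingent magnification:
# the magnified viewport is recentered on a smoothed gaze estimate and
# clamped so it never leaves the screen.

from dataclasses import dataclass

@dataclass
class MagnifierState:
    screen_w: int          # full screen width in pixels
    screen_h: int          # full screen height in pixels
    zoom: float = 2.0      # magnification factor
    alpha: float = 0.2     # exponential-smoothing weight for gaze samples
    cx: float = 0.0        # current viewport center (x)
    cy: float = 0.0        # current viewport center (y)

    def update(self, gaze_x: float, gaze_y: float):
        """Feed one gaze sample; return the viewport rectangle (x, y, w, h)."""
        # Smooth the raw gaze estimate to damp tracker jitter.
        self.cx = (1 - self.alpha) * self.cx + self.alpha * gaze_x
        self.cy = (1 - self.alpha) * self.cy + self.alpha * gaze_y

        # The viewport covers 1/zoom of the screen in each dimension.
        vw, vh = self.screen_w / self.zoom, self.screen_h / self.zoom

        # Clamp the viewport so it stays inside the screen.
        x = min(max(self.cx - vw / 2, 0), self.screen_w - vw)
        y = min(max(self.cy - vh / 2, 0), self.screen_h - vh)
        return x, y, vw, vh


if __name__ == "__main__":
    mag = MagnifierState(screen_w=1920, screen_h=1080, zoom=3.0,
                         cx=960, cy=540)
    # Simulated gaze samples drifting toward the upper-left corner.
    for gx, gy in [(900, 500), (700, 400), (400, 250), (100, 80)]:
        print(mag.update(gx, gy))
```

The smoothing step reflects a common design choice for gaze-driven interfaces, since raw tracker output is noisy and a jittery viewport would make the page navigation problem worse rather than better.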
https://users.soe.ucsc.edu/~manduchi/