Welcome to OCTess

An automated spectral domain optical coherence tomography data extraction tool

About Our Tool

Using an optical character recognition machine learning model, we developed and validated an algorithm for extracting macular optical coherence tomography data. It achieves an accuracy of 99.97%, comparable to that of a human extractor, while being significantly more efficient.

Purpose

Manually extracting data from spectral domain optical coherence tomography (SD-OCT) reports in large databases is a time- and resource-intensive process. To address this, we created an optical character recognition (OCR) algorithm that automatically extracts clinical and demographic data from Cirrus SD-OCT macular cube reports. Read our paper here! (pending approval)

How it works

We accept only monocular PDF reports from the Zeiss Cirrus 5000 and 6000. Demographic variables for extraction include the patient’s name, birthdate, laterality (left or right eye), SD-OCT scan date, and gender. Clinical variables for extraction include superior, central superior, nasal, central nasal, inferior, central inferior, temporal, central temporal, and central macular thickness, as well as average cube volume, average cube thickness, signal strength, and foveal coordinates. Watch the video for a short demonstration. Read more about it here!
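As a rough illustration of the step from OCR text to structured variables, here is a minimal Python sketch. It is not the OCTess implementation: the record class, the field names, and the label wording in the patterns ("Signal Strength", "Central Subfield Thickness", and so on) are assumptions made for this example, and only a handful of the variables listed above are covered.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class MacularCubeRecord:
    """Hypothetical container for a few of the extracted variables."""
    laterality: Optional[str] = None             # "OD" (right eye) or "OS" (left eye)
    signal_strength: Optional[int] = None        # numerator of a score such as "9/10"
    central_thickness_um: Optional[int] = None   # central macular thickness
    cube_volume_mm3: Optional[float] = None      # average cube volume
    cube_avg_thickness_um: Optional[int] = None  # average cube thickness


# Assumed label wording: a real report may phrase or lay these out differently.
PATTERNS = {
    "laterality": (re.compile(r"\b(OD|OS)\b"), str),
    "signal_strength": (re.compile(r"Signal Strength:?\s*(\d+)\s*/\s*10", re.I), int),
    "central_thickness_um": (re.compile(r"Central Subfield Thickness:?\s*(\d+)", re.I), int),
    "cube_volume_mm3": (re.compile(r"Cube Volume:?\s*(\d+(?:\.\d+)?)", re.I), float),
    "cube_avg_thickness_um": (re.compile(r"Cube Average Thickness:?\s*(\d+)", re.I), int),
}


def parse_report_text(text: str) -> MacularCubeRecord:
    """Fill a record from OCR text; fields that are not found stay None."""
    record = MacularCubeRecord()
    for field_name, (pattern, cast) in PATTERNS.items():
        match = pattern.search(text)
        if match:
            setattr(record, field_name, cast(match.group(1)))
    return record
```

Leaving unmatched fields as None, rather than guessing a value, makes it easy to flag reports that need manual review.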

Tesseract

Our algorithm uses an open-source OCR engine called Tesseract (pyTesseract version 0.3.9) to convert images to text. “OCTess” (a portmanteau of OCT and Tesseract) was evaluated on two different Tesseract engines: a legacy engine that works by recognizing character patterns and a newer, recurrent neural network-based OCR engine. Both engines are publicly available and have been developed by Google. Learn more about Tesseract here!
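For readers who want to try the two engines themselves, the sketch below shows one way to switch between them through pytesseract. Tesseract selects its engine with the `--oem` option (0 for legacy, 1 for the neural network LSTM engine). The helper names, the default page segmentation mode of 6, and the assumption that the report page has already been rendered to an image are ours, not details taken from OCTess. The legacy engine also requires language data files that include the legacy model.

```python
# Map a readable engine name to Tesseract's OCR engine mode (--oem) value.
OEM_FLAGS = {"legacy": 0, "lstm": 1}


def tesseract_config(engine: str, psm: int = 6) -> str:
    """Build a Tesseract option string for the chosen engine.

    psm=6 (treat the image as a single uniform block of text) is an
    assumed default for this sketch, not a documented OCTess setting.
    """
    if engine not in OEM_FLAGS:
        raise ValueError(f"unknown engine {engine!r}; expected 'legacy' or 'lstm'")
    return f"--oem {OEM_FLAGS[engine]} --psm {psm}"


def ocr_image(image, engine: str = "lstm") -> str:
    """Run OCR on an image (for example a PIL image of a report page)."""
    # Imported here so the config helper works without pytesseract installed.
    import pytesseract

    return pytesseract.image_to_string(image, config=tesseract_config(engine))
```

Calling `ocr_image(page, engine="legacy")` and `ocr_image(page, engine="lstm")` on the same page gives a quick side-by-side comparison of the two engines.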

Seconds Per Document

Percent Accuracy

Second Improvement

Times Faster

Credits

Authors and Developers

This web app is an implementation of OCTess: An Optical Character Recognition Algorithm for Automated Data Extraction of Spectral Domain Optical Coherence Tomography Reports. Published in Retina (2023).


Michael Balas, MD(C)1; Josh Herman, MD(C)1; Nishaant (Shaan) Bhambra, MD2; Jack Longwell, HBSc3; Marko M Popovic, MD, MPH(C)4; Isabela M Melo, MD4,5; Rajeev H Muni, MD, MSc4,5

1 Temerty Faculty of Medicine, University of Toronto

2 Faculty of Medicine, McGill University

3 Department of Mathematics and Statistics, McMaster University

4 Department of Ophthalmology & Vision Sciences, University of Toronto

5 Department of Ophthalmology, St. Michael’s Hospital/Unity Health Toronto



App developed by Jack Longwell

F.A.Q

Frequently Asked Questions

Contact

Contact Us

Name:

Jack Longwell



Location:

36 Queen St E, Toronto, ON M5B 1W8