Automated OCR Validated for Use in Large Dutch Vaccine Study

    When it comes to entering data about thousands of patients at various 
    stages in a very large medical vaccine trial, it pays to think small in terms 
    of how many times human hands should intervene. 
    
    That's sound advice from Keith Passaur, president and owner of eDocfile, 
    Inc. (Valrico, Fla.), who recently delivered optical character recognition
    (OCR) software to organize and automate filing of up to 4,000 documents
    daily for the world's second-largest medical vaccine trial. 
    
    Currently under way in the Netherlands, the trial involves 85,000 patients
    at several remote health centers and 340,000 documents. The hospital
    running the trial had purchased File by OCR (eDocfile) to help manage and
    file documents. File by OCR extracts text from PDF or TIFF files. The
    extracted text is parsed and used to rename and relocate each file,
    building a file folder hierarchy.
    
    Passaur was tasked with modifying the OCR program so the study files
    could be named based on index information (patient number, center
    number, and document type). He received a six-page tri-fold form that
    each patient would fill out. On the third page of the form, a vertical,
    OCR-readable number gave the patient number and the number of the
    center where the form was generated.
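
    The naming step can be pictured with a short Python sketch. The article
    does not give the layout of the vertical number or the naming scheme, so
    the pattern (a three-digit center number, a hyphen, a six-digit patient
    number), the folder layout, and the names parse_index and
    build_target_path are assumptions made for illustration only. This is
    not File by OCR's actual logic.

```python
import re
from pathlib import PurePosixPath

# Hypothetical layout of the vertical number: "CCC-PPPPPP"
# (three-digit center number, hyphen, six-digit patient number).
INDEX_PATTERN = re.compile(r"\b(\d{3})-(\d{6})\b")


def parse_index(ocr_text):
    """Return (center, patient) found in the OCR text, or None if absent."""
    match = INDEX_PATTERN.search(ocr_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def build_target_path(root, center, patient, doc_type):
    """Build a destination such as root/center_012/patient_004567/consent.pdf."""
    return (
        PurePosixPath(root)
        / f"center_{center}"
        / f"patient_{patient}"
        / f"{doc_type}.pdf"
    )


index = parse_index("noise 012-004567 more noise")
if index is not None:
    # Prints: trial/center_012/patient_004567/consent
    # with a .pdf suffix, i.e. trial/center_012/patient_004567/consent.pdf
    print(build_target_path("trial", *index, "consent"))
```

    A document whose text yields no match (parse_index returns None) would
    be a natural candidate for manual filing.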
    
    Completed forms would be scanned in duplex mode at the remote centers,
    creating a two-page TIFF file to send to the hospital. There, the scanned
    image would be separated into six individual pages and the vertical
    number extracted for filing purposes. The file would then be reassembled
    into a five-page PDF (page 6 was blank) and filed based on the OCR
    contents. All documents would be processed in a batch after scanning,
    freeing users to move on to other tasks while the OCR process was
    under way.
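
    The page bookkeeping in this workflow (two scanned sides, three panels
    per side, one blank page dropped) can be sketched in Python. The sketch
    assumes the three panels on each side are of equal width and are read
    left to right, and that page 6 is the last panel in scan order. The
    article states neither, and the function names are hypothetical.

```python
def panel_boxes(width, height, panels=3):
    """Return (left, upper, right, lower) crop boxes for equal-width panels."""
    boxes = []
    for i in range(panels):
        left = i * width // panels
        right = (i + 1) * width // panels
        boxes.append((left, 0, right, height))
    return boxes


def plan_pages(sides=2, panels=3, blank_pages=(6,)):
    """List the (side, panel) pairs to keep, numbering pages from 1 in scan order."""
    kept = []
    page = 0
    for side in range(1, sides + 1):
        for panel in range(1, panels + 1):
            page += 1
            if page not in blank_pages:
                kept.append((side, panel))
    return kept


# A 3300 x 850 pixel side splits into three 1100-pixel-wide panels.
print(panel_boxes(3300, 850))
# Six panels minus the blank page 6 leaves five pages for the PDF.
print(len(plan_pages()))  # 5
```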
    
    Writing Scripts to Save Time
    
    Since each document would be processed in the same way, Passaur used
    Macro Scheduler, a Windows script-writing tool from MJT Net
    (Shaftesbury, UK), to automate the steps that initiate the file command
    lines.
    Says Passaur: "Macro Scheduler allows us to automate very complicated 
    functions, such as parsing out OCR text content and batch renaming of 
    files." eDocfile has used Macro Scheduler for about seven years to 
    automate repetitive steps in Windows-based software the company 
    develops for its clients. 
    
    Once the steps are put together in the script, they can easily be modified 
    for use in other programs to automate similar actions and reduce the 
    likelihood of error. "There was no reason a highly paid staff member should 
    have to manually perform these steps for every document coming in," 
    Passaur adds.
    
    Macro Scheduler was used to automate steps throughout the OCR 
    filing process, he notes. 
    
    Each center had been assigned a certain range of patient numbers, so to
    check for missing or misfiled documents, Passaur created a macro that
    validated the OCR results by comparing the patient numbers read from
    the forms against the list of numbers assigned to that center. Also,
    since all files must be accounted for, he wrote a macro to extract the
    patient number, center number, and trial stage from an Excel
    spreadsheet for validation.
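
    The range check described above amounts to a set comparison. The Python
    sketch below is an illustration, not the Macro Scheduler macro itself:
    the function name validate_center and the sample numbers are invented
    for the example.

```python
def validate_center(assigned, filed):
    """Compare filed patient numbers with the numbers assigned to one center.

    Returns (missing, unexpected): numbers assigned but not yet filed, and
    numbers filed under this center that fall outside its assigned list.
    """
    assigned_set = set(assigned)
    filed_set = set(filed)
    missing = sorted(assigned_set - filed_set)
    unexpected = sorted(filed_set - assigned_set)
    return missing, unexpected


# Hypothetical data: a center assigned patient numbers 1000 through 1004.
missing, unexpected = validate_center(range(1000, 1005), [1000, 1001, 1003, 2045])
print(missing)     # [1002, 1004]
print(unexpected)  # [2045]
```

    Numbers in the first list point to documents that have not arrived yet;
    numbers in the second suggest an OCR misread or a misfiled document.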
    
    Because OCR is not 100% accurate, Passaur used Macro Scheduler to 
    write scripts that would test and retest the captured data for errors. The net 
    result is that 1 out of every 1,000 documents has to be manually filed.
    
    The Macro Scheduler-modified program was installed in fall 2009, and the
    complete process was validated by an external auditing company. The
    hospital is processing 1,300 documents and more than 2,600 faxes daily.
    Users handle the three or four failures each day with the manual
    processing tools built into the software.
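
    The daily failure count squares with the 1-in-1,000 figure, as a quick
    Python calculation shows. It assumes that documents and faxes both pass
    through the same OCR filing and that the rate applies to both, which the
    article implies but does not state.

```python
documents_per_day = 1300
faxes_per_day = 2600      # the article says "more than 2,600"
manual_rate = 1 / 1000    # 1 manual filing per 1,000 items

expected_manual = (documents_per_day + faxes_per_day) * manual_rate
print(round(expected_manual, 1))  # 3.9
```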
    
    "With that volume, it pays to keep human hands where they belong, with 
    patients, and not keying in file codes," notes Passaur.
    
    For more information on File by OCR, visit http://www.edocfile.com/. For 
    information on Macro Scheduler, visit http://www.mjtnet.com/