Recognize entities in scanned PDFs
End-to-end example of a standard NER pipeline: import scanned images from cloud storage, preprocess them to improve image quality, recognize the text with Spark OCR, correct spelling mistakes to improve the OCR output, and finally run NER to extract entities.
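The order of the stages in that pipeline can be sketched in plain Python. This is only an illustration under loud assumptions: none of the functions below are Spark OCR APIs. The `Page` class, the stage functions, the `VOCABULARY` list and the `GAZETTEER` mapping are all hypothetical stand-ins, a string with noise characters plays the role of a scanned image, and the cloud-storage import step is omitted.

```python
import difflib
import re
from dataclasses import dataclass, field

# Hypothetical vocabulary and entity gazetteer, used only for this illustration.
VOCABULARY = ["patient", "was", "treated", "with", "aspirin", "in", "boston"]
GAZETTEER = {"aspirin": "DRUG", "boston": "LOCATION"}


@dataclass
class Page:
    raw: str                     # stand-in for the pixels of a scanned page
    text: str = ""               # filled in by the text recognition stage
    entities: list = field(default_factory=list)


def preprocess(page):
    # Stand-in for image cleanup: remove "speckle" characters from the raw page.
    page.raw = re.sub(r"[~^|]", "", page.raw)
    return page


def recognize_text(page):
    # Stand-in for OCR: the recognized text is the cleaned raw content, lowercased.
    page.text = page.raw.lower()
    return page


def correct_spelling(page):
    # Replace each token with its closest vocabulary word, if one is similar enough.
    fixed = []
    for token in page.text.split():
        match = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.7)
        fixed.append(match[0] if match else token)
    page.text = " ".join(fixed)
    return page


def extract_entities(page):
    # Stand-in for NER: a simple gazetteer lookup on the corrected tokens.
    page.entities = [(t, GAZETTEER[t]) for t in page.text.split() if t in GAZETTEER]
    return page


def run_pipeline(raw):
    # Apply the stages in the same order as the description above.
    page = Page(raw=raw)
    for stage in (preprocess, recognize_text, correct_spelling, extract_entities):
        page = stage(page)
    return page


if __name__ == "__main__":
    result = run_pipeline("Pat|ient was trea~ted with asp1rin in Bost0n")
    print(result.text)
    print(result.entities)
```

Spelling correction runs before entity extraction because the gazetteer lookup only matches exact tokens, so an uncorrected OCR error such as `asp1rin` would otherwise be missed.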
Extract tables from selectable PDF documents with the new features offered by Spark OCR.
Extract Data from FoundationOne Sequencing Reports
Use our transformer to parse patient info, genomic and biomarker findings, and gene lists.