Spark NLP in Action

Spark NLP – English

Extract Tables & Structured Data

Recognize entities in scanned PDFs

End-to-end example of a standard NER pipeline: import scanned images from cloud storage, preprocess them to improve their quality, recognize the text with Spark OCR, correct spelling mistakes to improve the OCR results, and finally run NER to extract entities.

Live Demo
Colab Notebook
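
The stage order described for the scanned-PDF example (preprocess, recognize text, correct spelling, run NER) can be illustrated with a minimal pure-Python sketch. Every function, dictionary, and sample string below is a hypothetical stand-in chosen for illustration; none of them are Spark OCR or Spark NLP APIs, and the linked notebook remains the reference for the real pipeline.

```python
from typing import Dict, List, Tuple

# All names below are hypothetical stand-ins, NOT Spark OCR / Spark NLP APIs.
# A "scanned page" is modeled as a noisy string so the sketch stays runnable.

# Assumed toy lookup tables used by the spelling and NER stand-ins.
CORRECTIONS: Dict[str, str] = {"pateint": "patient", "hospitl": "hospital"}
GAZETTEER: Dict[str, str] = {"Alice": "PERSON", "Boston": "LOCATION"}


def preprocess(page: str) -> str:
    """Stand-in for image cleanup: collapse stray whitespace and line breaks."""
    return " ".join(page.split())


def recognize_text(page: str) -> str:
    """Stand-in for OCR: the input is already text, so pass it through."""
    return page


def correct_spelling(text: str) -> str:
    """Replace known misspellings token by token; leave other tokens as-is."""
    return " ".join(CORRECTIONS.get(tok.lower(), tok) for tok in text.split())


def extract_entities(text: str) -> List[Tuple[str, str]]:
    """Toy NER: label tokens that appear in the gazetteer."""
    return [(tok, GAZETTEER[tok]) for tok in text.split() if tok in GAZETTEER]


def run_pipeline(page: str) -> List[Tuple[str, str]]:
    """Apply the stages in the order the description lists them."""
    text = correct_spelling(recognize_text(preprocess(page)))
    return extract_entities(text)


if __name__ == "__main__":
    print(run_pipeline("Alice  is a pateint\n in Boston"))
```

The point of the sketch is the ordering: spelling correction runs after text recognition and before NER, so the entity step sees cleaned-up text.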

Extract tables

Extract tables from selectable PDF documents with the new features offered by Spark OCR.

Live Demo
Colab Notebook

Extract Data from FoundationOne Sequencing Reports

Use our transformer to parse patient info, genomic and biomarker findings, and gene lists.

Live Demo
Colab Notebook