State of the Art Natural Language Understanding

John Snow Labs’ Spark NLP is an open-source text processing library for Python and Scala, built on top of Apache Spark and TensorFlow. It provides production-grade versions of the latest research in natural language processing, raising the bar on accuracy, speed, and scalability.


Unmatched Speed & Scale

Spark NLP trained 80x faster than spaCy locally on 2.6 MB of data.

Scale to a Spark cluster with zero code changes.


State-of-the-art accuracy

First production-grade versions of novel deep learning NLP research.

Use pre-trained models as a starting point and train them further to fit your data.


Most widely used in the enterprise

Widely deployed production-grade codebase.

New releases every 2 weeks since 2017.

Growing community.


Most Widely Used AI Frameworks and Tools

Why Us?


Spark NLP 2.0 obtained the best-performing results in peer-reviewed academic evaluations.


Spark NLP trains NLP models locally one to two orders of magnitude faster than spaCy.


Zero code changes are needed to scale a pipeline to any Spark cluster.

Out-of-the-Box Functionality

What We Offer

Trainable to understand your language

Spark NLP is optimized for training domain-specific NLP models, so you can adapt it to learn the nuances of the jargon and documents you must support.

We all speak many languages…

Introducing Spark NLP at Top Level AI Conferences

Spark NLP: How Roche automates knowledge extraction from pathology and radiology reports

Spark NLP in action: Intelligent, high-accuracy fact extraction from long financial documents

Spark NLP in action: How SelectData uses AI to better understand home health patients