Annotation Lab
Per Server / Per Hour
  • Human-in-the-loop AI, from annotation to published model APIs
  • High-productivity annotation UI with keyboard shortcuts & pre-annotations
  • Support for annotating text, images, video, audio, and HTML
  • Annotate entities, relationships, classification, normalization, and more
  • Increase inter-labeler agreement with guidelines, reviews, and version comparison
  • Active learning: Pre-annotate data, train regularly, and improve on the fly
  • Publish trained models as REST APIs
  • Define teams, projects, and users
  • Role-based access & permissions
  • Customize workflow & task tags
  • Full audit trail
Healthcare AI Lab
Per Server / Per Hour
  • Best-in-class toolset for a collaborating team of healthcare data scientists, analysts, annotators & ops engineers
  • Managed Python & R notebooks
  • Index, search & visualize documents
  • SQL Lab, visualization & dashboards
  • Model deployment & governance
  • + the Data Annotator
  • Includes Spark NLP for Healthcare
  • Includes Medical Terminologies
  • Includes Healthcare Data Library
  • Includes Life Science Data Library
  • Hardened for processing PHI data
  • Enterprise-grade identity manager: MFA, OTP, AD/LDAP integration, password controls, passive safety, etc.
  • 360 monitoring & alerting
  • Central log collection & audit trails
Cleanroom AI Lab
Per Server / Per Hour
  • Best-in-class platform for full lifecycle healthcare AI at enterprise scale
  • + 100 unified tools & services to take AI projects from concept to production
  • Everything in the Healthcare AI Server plus data integration, MLOps & SecOps
  • Hardened Kubernetes cluster supports on-premise, cloud, or hybrid setup
  • Elastic: Scale from 5 to 5,000 machines without downtime
  • High availability & failover
  • Allocate CPU & GPU to users & projects
  • Mix hybrid storage & compute nodes
  • Configurable auto-scaling
  • Integrate your own apps, with SSO
  • White labeled: brand as your own
  • Publish dashboards, models & data APIs to third parties



Free, forever, unlimited, for personal and commercial use. Spark NLP is released under an Apache 2.0 open-source license – including the pre-trained models and documentation.

Each license includes the software libraries in all supported languages, the pre-trained models that are included with it, premium support, and all updates to the software & models that are released during the subscription period.

Spark NLP for Healthcare and Spark NLP & OCR are licensed as an annual subscription, payable once a year in full. There are two license types: Per Server, which allows use of the software on one machine; and Per Cluster, which allows use of the software on an unlimited Apache Spark cluster.

No. The only limitation is that each license allows using the software on one server or one cluster, based on the license type you choose.

The software will stop processing documents, for both training and inference. If you choose to buy a license, we will provide you with new credentials that reactivate it. Otherwise, you must uninstall the software. In either case, data you have already processed is yours to keep.

Running the Software

Python, Java, and Scala.

Spark 2.3.x and 2.4.x.

We officially support AWS, Azure, Databricks, Cloudera, and GCP.

Yes. Spark NLP is used heavily in high-compliance industries like healthcare, life science, finance, and insurance where on-premise deployments are common. Most single-machine, Spark, Hadoop, and Kubernetes distributions are supported.

Yes. Make sure to allocate enough memory & compute power for your use case.

Yes. Make sure to allocate enough memory & compute power for your use case.

This depends heavily on your use case. For training custom models based on the BERT family of embeddings, at least 8 cores and 64GB of memory are recommended. For inference, as little as 1 core and 8GB may be enough. Using GPUs will usually provide faster execution at a higher cost.
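As a rough illustration of the sizing guidance above, the sketch below maps the two workloads to standard Spark properties for a single-machine (local mode) setup. The function name `spark_settings` and the profile names are hypothetical, chosen for this example only; the core and memory figures come from the text, and in practice you would leave some memory free for the operating system rather than give it all to Spark.

```python
def spark_settings(workload):
    """Return Spark properties sized to the guidance above (illustrative only)."""
    # Figures taken from the text: training on BERT-family embeddings wants
    # at least 8 cores and 64GB; inference may run on 1 core and 8GB.
    profiles = {
        "training": {"cores": 8, "memory_gb": 64},
        "inference": {"cores": 1, "memory_gb": 8},
    }
    profile = profiles[workload]  # raises KeyError for an unknown workload
    return {
        # local[N] runs Spark on one machine with N worker threads
        "spark.master": f"local[{profile['cores']}]",
        "spark.driver.memory": f"{profile['memory_gb']}g",
    }


settings = spark_settings("inference")

# With PySpark installed, the settings could be applied like this:
#   from pyspark.sql import SparkSession
#   builder = SparkSession.builder
#   for key, value in settings.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
```

Treat these numbers as starting points and adjust them after measuring your own workload.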


The cost depends on which edition you need (Healthcare or OCR), the type of license (per server or per cluster), the level of support (8x5 or 24x7), and the number of licenses you need. Please email us with those details and we’ll reply with an exact quote.

Online bank transfers (ACH or wire), checks, and all major credit cards.

Yes! Please email us to describe your situation and needs.


No. You install and run the software on your infrastructure. The software does not “call home” and no data or results are sent to John Snow Labs.

You do. We will never ever see them.

This is not a SaaS solution – instead, you run the software on your infrastructure. Nothing ever gets sent to John Snow Labs or another third party. Spark NLP is designed for high-compliance, locked-down environments.

No, after an initial installation & downloading of pre-trained models.



Yes. Spark NLP is designed to enable you to train & tune your own models for most tasks.

Yes. Here’s an example: (notebook)

Yes. Here’s an example: (notebook)

Yes. Here’s an example: (notebook)

Absolutely! Here’s an example: (notebook)

The full list is available here. Expect the list to keep growing over time.


Email support@johnsnowlabs.com, call us at +1-302-786-5227, or start a chat on spark-nlp.slack.com. Paying customers get a private Slack channel where they can ask questions privately.

Same business day 8x5 support is included with all subscriptions. We can also provide 24x7 support for production systems – please email us if you require it.

Yes. Spark NLP in Action includes links to runnable Google Colab notebooks in Python.