Spark NLP 6.3.1 focuses on strengthening distributed local LLM inference by upgrading the jsl-llamacpp backend to a newer llama.cpp release, while also delivering important improvements in document structure handling and metadata consistency.
This enables you to use the latest LLMs and embeddings compatible with llama.cpp and perform advanced ingestion of tables and images.
🔥 Highlights
Upgraded jsl-llamacpp backend to llama.cpp tag b7247, bringing upstream performance improvements, stability fixes, and expanded model compatibility for local LLM inference.
Improved Reader2X annotator capabilities with structural position metadata for tables and images, and integration with AutoGGUFVisionModel.
The jsl-llamacpp backend has been upgraded to llama.cpp tag b7247, applying upstream fixes and enabling the use of the latest LLMs. These changes benefit distributed LLM workloads in Spark NLP and affect the AutoGGUFModel, AutoGGUFEmbeddings, AutoGGUFVisionModel, and AutoGGUFReranker annotators:
Performance and memory improvements, plus bug fixes from upstream llama.cpp, for offline LLM inference within Spark NLP pipelines
Better support for newer GGUF/GGML model variants, meaning you can now load models such as gpt-oss, Qwen3, and embeddinggemma.
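As a quick sanity check before handing a local model file to one of the AutoGGUF annotators, you can inspect the GGUF header yourself. The sketch below is standalone and relies only on the documented GGUF file layout (a `GGUF` magic followed by a little-endian uint32 format version); it does not use any Spark NLP API, and the function name is our own:

```python
import struct

def read_gguf_version(data: bytes) -> int:
    """Return the GGUF format version from a file's first 8 bytes.

    GGUF files begin with the 4-byte magic b"GGUF" followed by a
    little-endian uint32 version number.
    """
    magic = data[:4]
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return struct.unpack_from("<I", data, 4)[0]

# Synthetic header for illustration (version 3, the current GGUF revision).
sample = b"GGUF" + struct.pack("<I", 3)
version = read_gguf_version(sample)
```

In practice you would read the first 8 bytes of the `.gguf` file on disk; a mismatched magic usually means a truncated download or a legacy GGML file that the upgraded backend may not accept.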
Structural Metadata for Document Readers
Previously, our document parsers (HTMLReader, XMLReader, WordReader, PowerPointReader, ExcelReader) relied heavily on positional or page-based coordinates for layout metadata.
However, non-PDF formats such as HTML, XML, DOC(X), PPT(X), and XLS(X) do not have fixed pages.
To ensure deterministic element referencing and structural traceability across all document types, we needed to adopt a unified DOM-like metadata model.
This change standardizes metadata extraction so every element can be uniquely identified and re-located within its source document, independent of visual layout.
These additions enable layout-aware downstream processing and more precise filtering, especially for HTML and rich document formats.
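To make the idea of deterministic, DOM-like element referencing concrete, here is a minimal sketch that derives a stable path for an element from its ancestry of (tag, sibling index) pairs. The function name and path format are illustrative assumptions for this note, not the actual metadata keys emitted by the Spark NLP readers:

```python
from typing import List, Tuple

def element_path(ancestors: List[Tuple[str, int]]) -> str:
    """Build a deterministic DOM-like path such as 'body[0]/div[0]/table[1]'.

    Each ancestor is a (tag, sibling_index) pair; the same element in the
    same source document always yields the same path, independent of any
    visual layout or page geometry.
    """
    return "/".join(f"{tag}[{index}]" for tag, index in ancestors)

# Second <table> inside the first <div> of <body>:
path = element_path([("body", 0), ("div", 0), ("table", 1)])
```

Because the path is derived purely from document structure, it stays valid for formats without fixed pages and lets downstream stages re-locate or filter elements reliably.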
Reader2Image Integration with AutoGGUFVisionModel
Previously, you could use Reader2Image to ingest images from various file formats into Spark NLP. However, processing was limited to Spark NLP's native VLM implementations (such as Qwen2VLTransformer).
Reader2Image now interoperates with AutoGGUFVisionModel on our llama.cpp backend by introducing flexible handling of encoded vs. decoded image bytes and optional prompt output.
Added a new boolean parameter useEncodedImageBytes to control whether the image result stores:
true: Encoded (compressed) file bytes for models like AutoGGUFVisionModel
false: Decoded pixel matrix for models such as Qwen2VLTransformer
Added an outputPromptColumn parameter to optionally output a separate prompt column containing text prompts as Spark NLP Annotations, which is the required input format for AutoGGUFVisionModel.
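To make the encoded-vs-decoded distinction concrete, the following standalone sketch builds a tiny 1x1 grayscale PNG in memory and shows both representations side by side. It uses only the Python standard library and does not call any Spark NLP API; the variable names are illustrative:

```python
import struct
import zlib

def chunk(tag: bytes, data: bytes) -> bytes:
    """Assemble one PNG chunk: length, tag+data, CRC32 of tag+data."""
    body = tag + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))

# IHDR: 1x1 image, 8-bit depth, color type 0 (grayscale), no interlace.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
# IDAT: one scanline = filter byte 0x00 followed by a single gray pixel 0x7f.
idat = zlib.compress(b"\x00\x7f")
png = (b"\x89PNG\r\n\x1a\n"
       + chunk(b"IHDR", ihdr)
       + chunk(b"IDAT", idat)
       + chunk(b"IEND", b""))

# useEncodedImageBytes = true  -> the result keeps the compressed file bytes
# as-is, which is what GGUF vision models consume.
encoded_bytes = png

# useEncodedImageBytes = false -> the result keeps the decoded pixel matrix
# (here a 1x1 grayscale matrix), as expected by transformers like
# Qwen2VLTransformer.
decoded_pixels = [[0x7F]]
```

The key point: the encoded form preserves the original container (PNG/JPEG headers and all), while the decoded form is the raw pixel data after decompression, so the right choice depends entirely on which model family sits downstream.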
Platform Setup Documentation
Added official documentation and instructions for setting up and running Spark NLP on Microsoft Fabric, simplifying configuration and improving developer onboarding on the platform. You can see them at Spark NLP - Installation
🐛 Bug Fixes
Sentence metadata is now consistently included in DocumentAssembler outputs when using LightPipeline.
Fixed an issue where resetting the cache in ResourceDownloader could fail under certain conditions.
Fixed a document parsing bug where some HTML elements (such as section titles or diagnosis entries) could appear multiple times in the parsed output.
Improved robustness when loading ONNX BertEmbeddings models with non-standard output tensor names.
❤️ Community Support
Slack β real-time discussion with the Spark NLP community and team
GitHub β issue tracking, feature requests, and contributions