New
v6.0.0
What's Changed
- Fixed memory leaks (#977)
  - This version fixed a long-standing issue where memory would rise over time, eventually leading to a crash.
- Reduced runtime and memory usage for most users by updating default formats (#916).
- Fixed compatibility with Electron main process (#925)
- Fixed bug where user-provided parameters were overwritten by defaults (#975).
Breaking Changes
- All output formats other than `text` are now disabled by default.
  - To re-enable the `hocr` output (for example), set the following: `worker.recognize(image, {}, { hocr: true })`
  - See here for a list of possible output formats.
- The JavaScript object output format (`blocks`) was tweaked.
  - Only the array of blocks (`blocks`) is returned.
    - Previous versions would automatically generate lists of every unit of text (`words`, `symbols`, etc.).
    - If needed, these should now be generated by the user.
  - Only text-based blocks are reported.
    - Previous versions reported non-text blocks when detected by Tesseract (e.g. line segments).
  - The shape of some objects was changed.
    - See the type declarations for reference on properties.
    - The main properties, `text` and `bbox`, are unchanged.
- Various functions and options previously marked as deprecated have been removed.
  - This includes `worker.initialize` and `worker.loadLanguage`, along with several deprecated options from v2.
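
As a minimal sketch of the output opt-in described in the breaking changes above, the helper below requests `hocr` alongside the default `text` output. `recognizeWithHocr` and `main` are hypothetical names, not part of tesseract.js. The `worker.recognize(image, {}, { hocr: true })` call is taken from these notes; reading the result from `data.hocr`, and passing the language to `createWorker('eng')` in place of the removed `worker.loadLanguage` and `worker.initialize`, are assumptions that should be checked against the library documentation.

```javascript
// Hypothetical helper: request hOCR output, which v6 no longer produces by default.
async function recognizeWithHocr(worker, image) {
  // The third argument selects output formats; `text` remains enabled by default.
  const { data } = await worker.recognize(image, {}, { hocr: true });
  return { text: data.text, hocr: data.hocr };
}

// Example wiring (defined but not invoked here). Assumes tesseract.js v6 is installed.
async function main(imagePath) {
  const { createWorker } = require('tesseract.js');
  // Assumption: the language is passed to createWorker directly, so no separate
  // loadLanguage/initialize calls are needed.
  const worker = await createWorker('eng');
  try {
    return await recognizeWithHocr(worker, imagePath);
  } finally {
    // Always release the worker, even if recognition throws.
    await worker.terminate();
  }
}
```

Taking the worker as a parameter keeps the helper independent of how the worker was created, which also makes it easy to exercise with a stub.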
See #993 for additional discussion about this release.
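
Because lists such as `words` are no longer generated automatically, callers who need them can walk the `blocks` array themselves. The sketch below assumes that blocks contain `paragraphs`, paragraphs contain `lines`, and lines contain `words`, using those property names; this nesting should be verified against the type declarations. `collectWords` is a hypothetical helper, not a tesseract.js function.

```javascript
// Hypothetical helper: flatten the nested `blocks` output into a flat list of words.
// Assumes each block has `paragraphs`, each paragraph `lines`, and each line `words`.
function collectWords(blocks) {
  return (blocks || []).flatMap((block) =>
    (block.paragraphs || []).flatMap((paragraph) =>
      // Missing or empty levels contribute nothing to the result.
      (paragraph.lines || []).flatMap((line) => line.words || [])
    )
  );
}
```

The input would typically come from a call such as `worker.recognize(image, {}, { blocks: true })`, assuming `blocks` is requested explicitly like any other non-default output, followed by `collectWords(data.blocks)`.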
New Contributors
- @IgorAufricht made their first contribution in https://github.com/naptha/tesseract.js/pull/971
Full Changelog: https://github.com/naptha/tesseract.js/compare/v5.1.1...v6.0.0