[P] Nanonets-OCR-s: An Open-Source Image-to-Markdown Model with LaTeX, Tables, Signatures, checkboxes & More

old.reddit.com

[P] Nanonets-OCR-s: An Open-Source Image-to-Markdown Model with LaTeX, Tables, Signatures, checkboxes & More

old.reddit.com

Lemmit.Online botMAB to

Machine LearningEnglish · 1 month ago

[P] Nanonets-OCR-s: An Open-Source Image-to-Markdown Model with LaTeX, Tables, Signatures, checkboxes & More

old.reddit.com

Posted in r/MachineLearning by u/SouvikMandal • 22 points and 4 comments

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/machinelearning by /u/SouvikMandal on 2025-06-12 15:41:51+00:00.

We’re excited to share Nanonets-OCR-s, a powerful and lightweight (3B) VLM model that converts documents into clean, structured Markdown. This model is trained to understand document structure and content context (like tables, equations, images, plots, watermarks, checkboxes, etc.).

🔍 Key Features:

LaTeX Equation Recognition Converts inline and block-level math into properly formatted LaTeX, distinguishing between $...$ and $$...$$.
Image Descriptions for LLMs Describes embedded images using structured <img> tags. Handles logos, charts, plots, and so on.
Signature Detection & Isolation Finds and tags signatures in scanned documents, outputting them in <signature> blocks.
Watermark Extraction Extracts watermark text and stores it within <watermark> tag for traceability.
Smart Checkbox & Radio Button Handling Converts checkboxes to Unicode symbols like ☑, ☒, and ☐ for reliable parsing in downstream apps.
Complex Table Extraction Handles multi-row/column tables, preserving structure and outputting both Markdown and HTML formats.

Huggingface / GitHub / Try it out:

Huggingface Model Card

Read the full announcement

Try it with Docext in Colab

You must log in or register to comment.

Chat

Machine Learning

machinelearning

You are not logged in. However you can subscribe from another Fediverse account, for example Lemmy or Mastodon. To do this, paste the following into the search field of your instance: [email protected]

Community locked: only moderators can create posts. You can still comment on posts.

This subreddit is temporarily closed in protest of Reddit killing third party apps, see /r/ModCoord and /r/Save3rdPartyApps for more information.

Visibility: Public

This community can be federated to other instances and be posted/commented in by their users.

1 user / day
1 user / week
1 user / month
5 users / 6 months
1 local subscriber
20 subscribers
2.4K Posts
1 Comment
Modlog

mods:
Lemmit.Online bot

[P] Nanonets-OCR-s: An Open-Source Image-to-Markdown Model with LaTeX, Tables, Signatures, checkboxes &amp; More

[P] Nanonets-OCR-s: An Open-Source Image-to-Markdown Model with LaTeX, Tables, Signatures, checkboxes &amp; More

[P] Nanonets-OCR-s: An Open-Source Image-to-Markdown Model with LaTeX, Tables, Signatures, checkboxes & More

This is an automated archive made by the Lemmit Bot.

[P] Nanonets-OCR-s: An Open-Source Image-to-Markdown Model with LaTeX, Tables, Signatures, checkboxes & More

[P] Nanonets-OCR-s: An Open-Source Image-to-Markdown Model with LaTeX, Tables, Signatures, checkboxes & More