How AxLiner Is Built | Handwritten OCR Engine

Model path

The extraction path starts with a vision-language model along the lines of Qwen2-VL and olmOCR: document pixels are turned into a representation the model can read, and the prompt is narrowed toward page reading. Each result stays attached to its file, page, and row, and remains reviewable.

Vision pass

Document pages become visual tokens after rotation, contrast, patch, and page-shape preparation.
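
The preparation steps named above can be sketched in a few lines. This is a minimal illustration and not AxLiner's actual preprocessing: the function names, the tiny patch size, and the white padding value are all assumptions, and a real vision encoder applies its own resizing and patching rules.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, pixel values 0..255


def stretch_contrast(img: Image) -> Image:
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat page: nothing to stretch
        return [row[:] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate a page that was scanned sideways."""
    return [list(col) for col in zip(*img[::-1])]


def pad_to_multiple(img: Image, patch: int, fill: int = 255) -> Image:
    """Pad right and bottom with `fill` so both sides divide evenly by `patch`."""
    height, width = len(img), len(img[0])
    new_w = -(-width // patch) * patch   # ceiling division
    new_h = -(-height // patch) * patch
    padded = [row + [fill] * (new_w - width) for row in img]
    padded += [[fill] * new_w for _ in range(new_h - height)]
    return padded


def to_patches(img: Image, patch: int) -> List[Image]:
    """Cut the page into patch x patch tiles in reading order."""
    tiles = []
    for top in range(0, len(img), patch):
        for left in range(0, len(img[0]), patch):
            tiles.append([row[left:left + patch] for row in img[top:top + patch]])
    return tiles


def prepare_page(img: Image, patch: int = 2, rotate: bool = False) -> List[Image]:
    """Hypothetical pipeline: rotation, contrast, page shape, then patches."""
    if rotate:
        img = rotate_90_clockwise(img)
    img = stretch_contrast(img)
    img = pad_to_multiple(img, patch)
    return to_patches(img, patch)
```

Calling `prepare_page` on a 3x3 page with `patch=2` pads it to 4x4 and returns four tiles. Each tile stands in for what would become a visual token after encoding.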

Document prompt

A 7B-class extraction path in the style of Qwen2-VL and olmOCR is guided toward handwriting, forms, and tables.

Cell graph

Headers, rows, totals, merged areas, and column relationships are rebuilt before export.

Owned job flow

Queue state, file metadata, retries, and downloads sit around the model so a batch can survive real use.

Table structure

A workbook is more demanding than plain OCR text. The pipeline has to preserve the idea of a header, detect when writing belongs to the next row, keep totals attached to their column, and avoid turning a ruled table into a paragraph. Preview, editable cells, corrected downloads, and batch comparison all depend on that consistent schema.
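
One way to picture such a schema is a list of cells with explicit spans, plus a marked totals row. The sketch below is illustrative only: `Cell`, `Table`, and their fields are hypothetical names, not AxLiner's internal types.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    row: int
    col: int
    text: str
    row_span: int = 1  # values above 1 describe a merged region
    col_span: int = 1


@dataclass
class Table:
    headers: List[str]
    cells: List[Cell]
    total_row: Optional[int] = None  # index of the totals row, if any

    def grid(self) -> List[List[str]]:
        """Expand cells into a rectangular grid; merged regions repeat their text."""
        n_rows = max(c.row + c.row_span for c in self.cells)
        grid = [[""] * len(self.headers) for _ in range(n_rows)]
        for c in self.cells:
            for r in range(c.row, c.row + c.row_span):
                for k in range(c.col, c.col + c.col_span):
                    grid[r][k] = c.text
        return grid

    def column(self, header: str) -> List[str]:
        """Body values of one column, excluding the totals row."""
        idx = self.headers.index(header)
        return [row[idx] for i, row in enumerate(self.grid()) if i != self.total_row]

    def total_matches(self, header: str) -> bool:
        """Check that the total still belongs to its column's body values."""
        if self.total_row is None:
            raise ValueError("table has no totals row")
        idx = self.headers.index(header)
        body = sum(float(v) for v in self.column(header) if v)
        return abs(body - float(self.grid()[self.total_row][idx])) < 1e-9
```

Because every cell carries its row, column, and span, a preview, an editable grid, and an export can all be derived from the same structure. A check like `total_matches("Amount")` flags a total that drifted into the wrong column during extraction.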

Vision: page patches, rotation, low contrast (prepared)

Reading: handwriting tokens and table prompts (tuned)

Structure: cells, headers, merged regions (mapped)

Export: workbook schema and review state (ready)

Batch layer

Queue admission, workers, storage metadata, file ownership, and retry-safe outputs carry the run beyond one web request. Job state, result files, and review actions stay together because the time saved comes from the whole group, not only the first image.
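
A small in-memory model makes the idea of retry-safe outputs concrete. Everything here is an assumption for illustration (the `Batch` and `Job` names, the attempt limit, the dictionary standing in for storage). The point is that outputs are keyed by job, not by attempt, so a retry overwrites instead of duplicating.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict


class State(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Job:
    job_id: str
    owner: str
    filename: str
    state: State = State.QUEUED
    attempts: int = 0


@dataclass
class Batch:
    max_attempts: int = 3
    jobs: Dict[str, Job] = field(default_factory=dict)
    outputs: Dict[str, str] = field(default_factory=dict)  # one slot per job

    def admit(self, job_id: str, owner: str, filename: str) -> Job:
        """Idempotent admission: re-submitting a job id returns the existing job."""
        if job_id not in self.jobs:
            self.jobs[job_id] = Job(job_id, owner, filename)
        return self.jobs[job_id]

    def run(self, extract: Callable[[Job], str]) -> None:
        """Process every unfinished job, retrying up to max_attempts."""
        for job in self.jobs.values():
            while job.state is not State.DONE and job.attempts < self.max_attempts:
                job.state = State.RUNNING
                job.attempts += 1
                try:
                    result = extract(job)
                except Exception:  # broad on purpose in this sketch
                    job.state = State.FAILED
                    continue
                self.outputs[job.job_id] = result  # overwrite, never append
                job.state = State.DONE
```

In this sketch a file that fails once and then succeeds ends as `DONE` with two attempts and a single output entry. A file that keeps failing stops at `FAILED` after `max_attempts`, and the rest of the batch still completes.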

Excel review

The comparison view puts the source beside the table so teams can correct cells, mark reviewed files, and download the batch. The target is fewer repeated keystrokes and editable spreadsheets that remain useful after they leave AxLiner.
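
The review actions described above (correct a cell, mark a file reviewed, download) could be modeled roughly as follows. This is a hedged sketch: `ReviewedFile` and `downloadable` are hypothetical names, and CSV from the Python standard library stands in for the workbook export to keep the example dependency-free.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReviewedFile:
    name: str
    rows: List[List[str]]
    reviewed: bool = False
    # (row, col, old_text, new_text) for every correction made in review
    corrections: List[Tuple[int, int, str, str]] = field(default_factory=list)

    def correct(self, row: int, col: int, new_text: str) -> None:
        """Replace one cell and remember what it used to say."""
        old = self.rows[row][col]
        if old != new_text:
            self.corrections.append((row, col, old, new_text))
            self.rows[row][col] = new_text

    def mark_reviewed(self) -> None:
        self.reviewed = True

    def to_csv(self) -> str:
        """Serialize the corrected rows; a stand-in for the workbook download."""
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerows(self.rows)
        return buf.getvalue()


def downloadable(files: List[ReviewedFile]) -> List[ReviewedFile]:
    """Only files a reviewer has signed off on are included in the batch download."""
    return [f for f in files if f.reviewed]
```

Keeping the correction log next to the rows means a corrected download and the original extraction can still be compared later, which is what a side-by-side review view needs.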
