The Amyuni OCR module is based on the Tesseract open source project, with Amyuni PDF technology used to process and create the PDF documents. The Tesseract library provides high reliability at a low cost and spares developers the annoyances of licensing commercial OCR tools, which are often licensed on a per-page basis or at a prohibitively high cost to the developer.
The OCR module enables developers to:
- Convert non-searchable PDF files into searchable PDFs
- Create searchable PDF documents out of various image formats such as multi-page TIFF, JPEG or PNG while applying text recognition on the images
- Compress image-based PDF documents using high-compression JBIG2 or the more standard CCITT, JPEG and PNG compression formats
Features
- Open multi-page TIFF files directly into PDF Creator for OCR (Optical Character Recognition) and save the documents in PDF format
- Convert image-based or non-searchable PDF documents into searchable PDFs
- Apply JBIG2 compression, which heavily reduces the size of scanned documents. Other standard compression formats such as CCITT, JPEG or PNG can also be used
- Support for multiple languages such as English, French, Italian, German, Portuguese, Spanish, Dutch and Vietnamese
- Obtain up to 98% accuracy on English language documents
- Extracted text can be either visible or hidden inside the PDF document. In both cases, the text is positioned as close as possible to the original text
- Extracted text can be saved to a regular text file rather than to a PDF file
- Rasterize any PDF document to convert it into an image-based searchable or non-searchable PDF
- Benefit from a robust PDF library that can create highly optimized and well-structured PDF documents that can be emailed and viewed in any PDF-compatible viewer
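The accuracy figure in the list above is stated without a measurement method. One common way to check OCR accuracy on your own documents is character-level accuracy: compare the recognized text against a hand-verified reference and count the edits needed to turn one into the other. The sketch below is illustrative only and is not part of the Amyuni or Tesseract APIs; the function names `edit_distance` and `character_accuracy` are hypothetical, and the vendor may measure accuracy differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, recognized: str) -> float:
    """Share of reference characters recovered correctly, clamped to [0, 1]."""
    if not reference:
        return 1.0 if not recognized else 0.0
    errors = edit_distance(reference, recognized)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    # Hypothetical sample: the OCR output misreads "l" as "1".
    reference = "searchable PDF"
    recognized = "searchab1e PDF"
    print(f"{character_accuracy(reference, recognized):.1%}")
```

Running this kind of check over a representative sample of scanned pages gives a figure you can compare with the quoted accuracy for your own material.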
Supported platforms
All Windows platforms are supported, including:
- 32-bit versions: Windows 7, Windows 2008, Vista, XP, 2003, 2000
- 64-bit versions: Windows 7, Windows 2008, Vista, XP, 2003