FYP demonstration of document parsing of booking documents
Find similar sentences in text using search query
Convert images with text to searchable documents
Search information in uploaded PDFs
Traditional OCR 1.0 on PDF/image files returning text/PDF
Identify and extract key entities from text
Process text to extract entities and details
Extract handwritten text from images
Extract text from multilingual invoices
Upload images for accurate English / Latin OCR
Extract text from images using OCR
Upload and analyze documents for text extraction and Q&A
Find relevant passages in documents using semantic search
Donut-booking-gradio extracts text from scanned booking documents. It was built as a proof of concept for a Final Year Project (FYP) on document parsing and text extraction. The application uses an AI model to read scanned or image-based booking documents and return their text, so the content is easier to work with digitally.
• Live Interface: Provides a real-time interface for uploading and processing documents.
• Multi-Page Support: Capable of handling documents with multiple pages or scanned images.
• Text Extraction: Uses AI to extract text from scanned booking documents; accuracy depends on scan quality.
• Customizable Settings: Allows users to adjust settings for better extraction accuracy.
• Export Options: Enables users to export extracted text for further use.
• User-Friendly Design: Designed with a simple and intuitive user interface.
• Cross-Platform Compatibility: Works seamlessly across different operating systems.
1. What file formats does donut-booking-gradio support?
Donut-booking-gradio supports common image formats such as JPG and PNG, as well as PDF files.
2. How accurate is the text extraction?
The accuracy depends on the quality of the scanned document. Clear images yield better results, while blurry or distorted scans may reduce accuracy.
3. Can I process multiple pages at once?
Yes, the application supports multi-page documents, so you can upload a document once and extract text from all of its pages in a single run.
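
The format and multi-page answers above can be illustrated with a minimal Python sketch. It is not the application's actual code: the names `SUPPORTED_EXTENSIONS`, `is_supported`, and `extract_document` are hypothetical, the extension list is inferred from the FAQ answer (with `.jpeg` added as an assumed alias), and the page-level extractor is passed in as a placeholder for whatever model the tool really uses.

```python
from pathlib import Path
from typing import Callable, Iterable, List

# Assumption: extensions inferred from the FAQ (JPG, PNG, PDF);
# ".jpeg" is included as a common alias, not confirmed by the source.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension matches a format listed in the FAQ."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def extract_document(
    pages: Iterable[object],
    extract_page: Callable[[object], str],
) -> List[str]:
    """Apply a page-level extractor to every page, keeping page order.

    `extract_page` is a hypothetical callable standing in for the
    model call that turns one page image into text.
    """
    return [extract_page(page) for page in pages]


if __name__ == "__main__":
    # Stand-in extractor for illustration only; it does no real OCR.
    def fake_extract(page: object) -> str:
        return f"text of {page}"

    print(is_supported("booking.PDF"))
    print(extract_document(["page 1", "page 2"], fake_extract))
```

Passing the extractor in as a function keeps the page loop independent of any particular model, which also makes the multi-page logic easy to check with a stub.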