FYP demonstration of document parsing of booking documents
Donut-booking-gradio is a tool for extracting text from scanned booking documents. It was built as a proof of concept for a Final Year Project (FYP) on document parsing and text extraction. The application uses an AI model to read scanned or image-based booking documents and return their text, so the contents can be used as digital data.
• Live Interface: Provides a real-time interface for uploading and processing documents.
• Multi-Page Support: Capable of handling documents with multiple pages or scanned images.
• Text Extraction: Extracts text from scanned booking documents using AI; accuracy depends on scan quality.
• Customizable Settings: Allows users to adjust settings for better extraction accuracy.
• Export Options: Enables users to export extracted text for further use.
• User-Friendly Design: Designed with a simple and intuitive user interface.
• Cross-Platform Compatibility: Works seamlessly across different operating systems.
1. What file formats does donut-booking-gradio support?
Donut-booking-gradio supports common image formats such as JPG and PNG, as well as PDF files.
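As a rough illustration of the format check this answer implies, the sketch below filters uploads by file extension. The helper name `is_supported_upload` and the extension set are assumptions made for illustration (`.jpeg` is included as an assumed alias of JPG); the application itself may validate files differently.

```python
from pathlib import Path

# Assumed extension set, based on the formats named in the answer above.
# ".jpeg" is included as an assumed alias for JPG.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}


def is_supported_upload(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats.

    Hypothetical helper: it checks the extension only, not the file contents.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

Checking the extension alone is cheap but does not confirm the file is readable; a stricter check would open the file before processing.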
2. How accurate is the text extraction?
The accuracy depends on the quality of the scanned document. Clear images yield better results, while blurry or distorted scans may reduce accuracy.
3. Can I process multiple pages at once?
Yes, the application supports multi-page documents, allowing you to process and extract text from all pages simultaneously.
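One way to picture multi-page processing is to run a per-page extraction function over every page and collect the results in page order. The sketch below assumes a hypothetical `extract_page` callable standing in for whatever model the application uses; the thread pool is also an assumption, not a description of the tool's internals.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def extract_document(
    pages: Sequence[bytes],
    extract_page: Callable[[bytes], str],
    max_workers: int = 4,
) -> List[str]:
    """Run extract_page over all pages concurrently; return texts in page order."""
    if not pages:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so page order is preserved.
        return list(pool.map(extract_page, pages))


# Stand-in extractor for demonstration only (hypothetical); a real one
# would pass the page image to the text-extraction model.
def fake_extract(page: bytes) -> str:
    return page.decode("utf-8").upper()
```

Calling `extract_document(pages, fake_extract)` returns one string per page, which can then be joined or exported as needed.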