Generate 3D room layouts from RGB panoramas
Answer questions about images by chatting
Run a dynamic script from an environment variable
Inpaint masks in videos
Generate custom Pepe meme images with text prompts
Image captioning and VQA
A highly consistent video depth model
Generate detailed step-by-step answers to questions
Prompt with images in FLUX.1 [dev]
Track, rank and evaluate open Arabic LLMs and chatbots
Easily remove your video's background!
Media understanding
Interpret and execute code with responses
Ask any question about the IPCC and IPBES reports
Extract text from images using OCR
Remove image backgrounds with a click
Demo for DocLayout-YOLO
Generate summaries for long-form text
Display OCRBench leaderboard for model evaluations
An end-to-end (e2e) Voice Language Model by Fish Audio.