🔵 State of the Art MLearning.ai April 4, 2024
20+ AI Startups Take Center Stage. News, Papers, Spaces, your personal AI art reading list
🚨 MLearning & ART NEWS
AI-Powered Image Editing: ChatGPT & DALL-E Combine Forces
OpenAI has integrated the ability to edit DALL·E images directly within ChatGPT. This update, available across web, iOS, and Android platforms, allows users to modify AI-generated images seamlessly, enhancing user experience and streamlining the creative process.
10+ Best AI Image Generators of April 2024
Apple's Home Robotics: Siri's New Sidekicks?
Apple Inc. is venturing into the development of personal robots for home use, following the termination of its car project. Based in Cupertino, California, the tech giant aims to innovate in the field of robotics, seeing it as a potential new source of revenue. Apple's secret teams are working on a mobile robot that can navigate cluttered spaces and handle chores like washing dishes, as well as an advanced robotic smart display.
Google Mulls AI-Powered Search Fees
Google is considering a significant change to its business model: charging for AI-powered search features to boost revenue in a competitive AI landscape. As of April 4, 2024, this would be the first time any of the company's core products is placed behind a paywall. The move reflects how costly and resource-intensive these features are to run.
AI Voice Cloning: Rapid Advancements
OpenAI and Stability AI have recently launched new AI audio tools, Voice Engine and Stable Audio 2.0 respectively. Voice Engine can clone a voice from a single short sample (15 seconds, according to OpenAI), a significant improvement over earlier technologies that required hours of audio. Stable Audio 2.0 focuses on generating high-quality music and sound.
AI Startups Shine in YC's Winter 2024 Batch
AI startups are the stars of Y Combinator's Winter 2024 batch, despite a downturn in overall startup investment. These startups, which include an AI-powered spreadsheet, have attracted over $225 million in funding. Funding for generative AI in particular grew almost eightfold from 2022 to 2023, a surge that some read as a sign of an AI bubble.
YC's AI Startups List:
Eggnog: AI-generated video with character consistency. https://www.eggnog.ai
Focal: Converts books/screenplays into movies. https://focalml.com
Magic Hour: AI video editing tools (face swap, lip-syncing, etc.). http://magichour.ai
Fluently: AI speaking coach for calls, providing feedback. https://t.co/fvZRqWhRkY
Alai: Create presentations using AI from text prompts.
Pico: Organizes iPhone screenshots using vision models.
Sonia: AI therapist via text and voice. https://t.co/FkT0Ae1pPx
Infinity AI: Generates movies from scripts with AI actors. https://infinity.ai
Lumina: AI automation for scientific literature reviews. https://www.lumina-chat.com
Wuri: AI app creating visual novels from web stories. https://www.wuri.ai
Maia: AI relationship coach offering guidance. https://bit.ly/m/ourmaia
Sonauto: Creates songs with AI from lyrics. https://sonauto.ai
ego: Game engine for creating 3D worlds using natural language. https://www.ego.live
PocketPod: AI-generated podcasts tailored to your interests. https://pocketpod.app
Arcane: Platform to create video games without coding. https://www.arcanelabs.ai
Kopia: Virtual try-on for clothing using AI. https://www.brands.trykopia.com
Aqua Voice: Voice-only text editor with advanced dictation. https://withaqua.com
HeartByte: AI copilot for writing fiction with a community. https://www.heartbyte.ai
Soundry: Platform for AI music generation. https://soundry.ai
MathGPTPro: AI math tutor answering questions. https://info.mathgptpro.com
K-Scale Labs: Open-source humanoid robots for hobbyists. https://kscale.dev
📝 SOTA MLearning & ART Papers
ALOHa: A New Measure for Hallucination in Captioning Models
ALOHa is a new metric for assessing object hallucination in AI-generated image captions. It addresses a limitation of the CHAIR metric, which only considers a fixed set of objects from MS COCO. ALOHa detects 13.6% more hallucinated objects than CHAIR on a specially annotated MS COCO subset and 30.8% more on the more diverse nocaps dataset. Code coming soon!
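To make the fixed-vocabulary limitation concrete, here is a minimal sketch of a CHAIR-style instance score (hallucinated objects divided by mentioned objects). It is not ALOHa and not the official CHAIR code; the vocabulary, the function name `chair_i`, and the example objects are all hypothetical.

```python
from typing import Iterable, Set

# Hypothetical stand-in for the fixed MS COCO object list (assumption for illustration).
FIXED_VOCAB: Set[str] = {"dog", "cat", "frisbee", "bench", "person"}


def chair_i(caption_objects: Iterable[str],
            image_objects: Set[str],
            vocab: Set[str] = FIXED_VOCAB) -> float:
    """CHAIR-style instance score: hallucinated / mentioned, within a fixed vocabulary."""
    # Only objects inside the fixed vocabulary are counted at all.
    mentioned = [obj for obj in caption_objects if obj in vocab]
    if not mentioned:
        return 0.0
    # A mentioned object is hallucinated if it is not actually in the image.
    hallucinated = [obj for obj in mentioned if obj not in image_objects]
    return len(hallucinated) / len(mentioned)


# The caption mentions a dog (present), a frisbee (absent), and a unicorn (absent).
caption = ["dog", "frisbee", "unicorn"]
ground_truth = {"dog"}
score = chair_i(caption, ground_truth)
print(score)
```

In this toy case "unicorn" is hallucinated but falls outside the vocabulary, so the score never sees it. That blind spot is the kind of gap an open-vocabulary metric is meant to close.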
Deep Image Composition Meets Image Forgery
Traditional image forgery detection methods, which rely on handcrafted features, fall short against complex real-life forgeries. This paper uses state-of-the-art image composition models to automate the creation of highly realistic spliced images, giving deep learning models richer training data. In tests with recent image manipulation detection models, images from the new dataset proved harder to identify, a step forward in both generating and detecting sophisticated image forgeries.
Harder, Better, Faster, Stronger: Interactive Visualization for Human-Centered AI Tools
This paper argues for AI tools that augment and enhance human capabilities rather than replace them. Its examples include applications in creative writing, temporal prediction, and user experience analysis. The study underscores the importance of visualization in designing future human-centered AI (HCAI) tools so that they are user-friendly and effective at amplifying human performance.
Many-shot Jailbreaking
Many-shot Jailbreaking (MSJ) attacks fill a long-context prompt with many demonstrations of undesirable behavior. They show no effectiveness with just 5 shots but work consistently with 256 shots, which makes them a scalable way to induce LLMs to produce violent or deceitful content.
🤗 MLearning & ART Spaces
Leonardo-AI-image-Creator-UPDATED
This Space lets users generate images with AI technology from Leonardo.ai. Its interface takes a prompt and exposes settings such as negative prompt, guidance scale, steps, and seed for creating custom AI-generated images.
ai-comic-factory
The AI Comic Factory is a Hugging Face Space created by jbilcke-hf that allows users to generate comics using AI. However, the Space is currently paused by its owner.
CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model
CRM is a feed-forward model that can generate a 3D textured mesh from a single image in 10 seconds.
The pace of AI development is incredible. Can't wait to see what's next!
Tools like ChatGPT with DALL-E integration and the AI startups from Y Combinator reflect a trend toward making AI more accessible and user-friendly for a wider audience.