All posts tagged: Multimodal

Greater Noida: Work set to begin soon on multi-modal logistics hub

Published by Parlour News

GREATER NOIDA: The Integrated Industrial Township Greater Noida Limited (IITGNL) has started the process of commencing work on the Multi-Modal Logistics Hub (MMLH) in the city’s Bodaki region near Dadri. On Friday it issued a tender to finalise an expert agency for the project, officials said, adding that applicants started submitting their bids on Sunday. The Delhi Mumbai Industrial Corridor (DMIC)-IITGNL has earmarked a ₹5,881-crore budget for this Multi-Modal Logistics Hub, which is touted as the largest logistics project in the country. The project will transform freight handling and industrial logistics in this belt, which will also be home to Noida International Airport, scheduled to be operational by 2025-end. “The logistics hub will be developed on 311 hectares in Bodaki located in Dadri block of Greater Noida because this is at the junction of the Eastern and Western Dedicated Freight Corridors (DFC). This location is where both corridor meets, makes …

Hugging Face Expands LeRobot Platform With Multimodal Dataset for AI-Powered Cars

Published by Webster

Hugging Face announced the expansion of its LeRobot platform on Wednesday with a large dataset aimed at automotive automation. The online artificial intelligence (AI) and machine learning (ML) repository said that the dataset was created in collaboration with the AI startup Yaak. Dubbed Learning to Drive (L2D), the dataset was collected from a suite of sensors installed on 60 electric vehicles (EVs) over a period of three years. The open-source dataset is aimed at enabling developers and the robotics community to build spatial intelligence solutions for the automobile industry.

Hugging Face Adds L2D Dataset to LeRobot

In a blog post, the company detailed the new AI dataset, calling it “the world’s largest multimodal dataset aimed at building an open-sourced spatial intelligence for the automotive domain.” The entire dataset is more than 1PB (one petabyte) in size, and was collected using sensor suites installed on 60 EVs operated by driving schools in 30 German cities for three years. Identical sensors were used to ensure consistency in the data collected. The LeRobot platform was launched last year …

Microsoft Announces Magma Foundation Model That Can Complete Multimodal Agentic Tasks

Published by Webster

Microsoft researchers announced a new foundation model on Wednesday that can perform agentic functions. Dubbed Magma, the artificial intelligence (AI) model is pre-trained on a large volume of datasets spanning text, images, videos, and spatial formats. The Redmond-based tech giant said that Magma is an extension of vision-language (VL) models and that it can not only understand multimodal information but also plan and act on it. The AI agent-enabled model can be used in a wide range of tasks including computer vision, user interface (UI) navigation, and robot manipulation.

Microsoft Announces Magma Foundation Model

In a GitHub post, Microsoft researchers detailed the new Magma foundation model. Foundation models are distinctive large language models (LLMs), which are built from scratch and are not distilled from any other model. They often become the baseline for other models in the series. Magma is unique in the sense that the AI model is pre-trained on a wide range of datasets. The researchers stated that the base architecture behind Magma is the Llama 3 AI model. However, Magma …

Samsung Galaxy S25 Ultra with multimodal AI agents, 50MP UWA camera launched in India – India TV

Published by Webster

Samsung recently held its Galaxy Unpacked event, unveiling its latest flagship smartphone lineup, the Galaxy S25 Series. This new lineup succeeds last year’s Galaxy S24 Series and comprises the Galaxy S25 Ultra, Galaxy S25 Plus, and Galaxy S25 models. Building on last year’s AI advancements with features like Circle to Search, Samsung has now integrated AI at a system level, introducing multimodal AI agents that can understand text, speech, images, and videos for a seamless user experience. The Galaxy S25 series is powered by the Snapdragon 8 Elite processor. The premium Galaxy S25 Ultra features One UI 7, an enhanced Circle to Search capability, a 50MP ultrawide camera sensor, and 10-bit HDR recording, and is built with durable titanium and the latest Corning Gorilla Armor 2. Below are the key specifications and features of the Samsung Galaxy S25 Ultra.

Samsung Galaxy S25 Ultra price and availability

The Samsung Galaxy S25 Ultra is priced at USD 1,299 (approximately Rs. 1,12,300) for the base model, which includes 12GB of RAM and …

Frame AI Glasses With Multimodal AI Capabilities Unveiled by Brilliant Labs

Published by Webster

Frame AI Glasses, a new artificial intelligence (AI)-powered wearable gadget, has been unveiled by Brilliant Labs. The device competes with similar wearable AI products such as Humane’s AI Pin and the Rabbit R1. The company’s AI glasses look like standard prescription glasses but are equipped with a micro-OLED display and multimodal AI capabilities. The firm claims that the Frame AI Glasses can feel like “your glasses gave you AI superpowers”. The device is currently available for pre-order and Brilliant Labs expects shipping to start by April. The Brilliant Labs website showcases the product and highlights some of its hardware specifications and features. The company has also posted a video that teases the Frame AI Glasses and some of their functionality. As per the company, the AI glasses weigh under 40g, which is comparable to average prescription glasses. Sunglasses usually weigh a little less, typically between 10g and 20g. Coming to the hardware, the AI glasses have two layers of lenses, with the outer being an augmented reality (AR) lens and …

Google Lumiere Multimodal AI Video Generation Tool Unveiled; Can Create 5-Second Videos From Text, Images

Published by Webster

Google unveiled its latest artificial intelligence (AI) model, Lumiere, last week. The new AI model is a multimodal video generation tool that can generate 5-second-long videos. It supports both text-to-video and image-to-video generation and joins existing AI models such as Runway Gen-2 and Pika 1.0. As per Google, Lumiere uses a Space-Time U-Net (STUNet) architecture that changes how motion is generated in an AI video, making it appear realistic. The platform is not yet open to the public. In an accompanying preprint paper, the research team behind Lumiere explained that the major innovation in motion comes from creating the video in a single process instead of putting together still frames. As a result, both the spatial (the objects in the video) and temporal (how things move around in the video) aspects of the video are generated simultaneously. For the layperson, this means motion appears as it would in nature. To achieve this, Lumiere generates 80 frames, compared with Stable Diffusion’s 25 frames. “By deploying both spatial and (importantly) …

Parlour News India

Curated News from India
