NVIDIA Launches Inference Platforms for Large Language Models and Generative AI Workloads

Mar 22, 2023 by iHash

NVIDIA launched four inference platforms optimized for a diverse set of rapidly emerging generative AI applications — helping developers quickly build specialized, AI-powered applications that can deliver new services and insights.

The platforms combine NVIDIA’s full stack of inference software with the latest NVIDIA Ada, NVIDIA Hopper™ and NVIDIA Grace Hopper™ processors — including the NVIDIA L4 Tensor Core GPU and the NVIDIA H100 NVL GPU, both launched at GTC. Each platform is optimized for in-demand workloads, including AI video, image generation, large language model deployment and recommender inference.

“The rise of generative AI is requiring more powerful inference computing platforms,” said Jensen Huang, founder and CEO of NVIDIA. “The number of applications for generative AI is infinite, limited only by human imagination. Arming developers with the most powerful and flexible inference computing platform will accelerate the creation of new services that will improve our lives in ways not yet imaginable.”

Accelerating Generative AI’s Diverse Set of Inference Workloads
Each of the platforms contains an NVIDIA GPU optimized for specific generative AI inference workloads as well as specialized software:

  • NVIDIA L4 for AI Video can deliver 120x more AI-powered video performance than CPUs, combined with 99% better energy efficiency. Serving as a universal GPU for virtually any workload, it offers enhanced video decoding and transcoding capabilities, video streaming, augmented reality, generative AI video and more.
  • NVIDIA L40 for Image Generation is optimized for graphics and AI-enabled 2D, video and 3D image generation. The L40 platform serves as the engine of NVIDIA Omniverse™, a platform for building and operating metaverse applications in the data center, delivering 7x the inference performance for Stable Diffusion and 12x Omniverse performance over the previous generation.
  • NVIDIA H100 NVL for Large Language Model Deployment is ideal for deploying massive LLMs like ChatGPT at scale. The new H100 NVL, with 94GB of memory and Transformer Engine acceleration, delivers up to 12x faster inference performance on GPT-3 compared to the prior-generation A100 at data center scale.
  • NVIDIA Grace Hopper for Recommendation Models is ideal for graph recommendation models, vector databases and graph neural networks. With the 900 GB/s NVLink®-C2C connection between CPU and GPU, Grace Hopper can deliver 7x faster data transfers and queries compared to PCIe Gen 5.
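
To put the NVLink-C2C figure in the last bullet in perspective, the sketch below compares idealized transfer times over the two links. It assumes a PCIe Gen 5 x16 link peaks at roughly 128 GB/s of bidirectional bandwidth (a figure not given in this article), treats both numbers as sustained peak rates, and uses a hypothetical 64 GB payload. Real transfers would be slower on either link.

```python
# Back-of-the-envelope comparison of peak interconnect bandwidth.
# NVLINK_C2C_GB_PER_S is the figure quoted in the article.
# PCIE_GEN5_X16_GB_PER_S is an assumption: the approximate bidirectional
# peak of a x16 PCIe Gen 5 link.
NVLINK_C2C_GB_PER_S = 900.0
PCIE_GEN5_X16_GB_PER_S = 128.0


def transfer_seconds(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized time to move payload_gb at a sustained peak bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_gb / bandwidth_gb_per_s


def speedup(fast_gb_per_s: float, slow_gb_per_s: float) -> float:
    """Ratio between two link bandwidths."""
    if slow_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return fast_gb_per_s / slow_gb_per_s


if __name__ == "__main__":
    payload_gb = 64.0  # hypothetical embedding-table size, for illustration
    t_nvlink = transfer_seconds(payload_gb, NVLINK_C2C_GB_PER_S)
    t_pcie = transfer_seconds(payload_gb, PCIE_GEN5_X16_GB_PER_S)
    ratio = speedup(NVLINK_C2C_GB_PER_S, PCIE_GEN5_X16_GB_PER_S)
    print(f"NVLink-C2C: {t_nvlink:.3f} s, PCIe Gen 5 x16: {t_pcie:.3f} s")
    print(f"Bandwidth ratio: {ratio:.2f}x")
```

Under these assumptions the bandwidth ratio comes out close to the 7x quoted above.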

The platforms’ software layer features the NVIDIA AI Enterprise software suite, which includes NVIDIA TensorRT™, a software development kit for high-performance deep learning inference, and NVIDIA Triton Inference Server™, open-source inference-serving software that helps standardize model deployment.
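
As a rough illustration of what standardized model deployment looks like from the client side, the sketch below builds a request in the KServe v2 inference protocol, which Triton's HTTP endpoint accepts. The model name `text_encoder`, the input tensor name `INPUT_IDS`, the token values and the host are all hypothetical, and the code only constructs the URL and JSON body. It sends nothing over the network.

```python
import json
import math
from typing import Any, Dict, Sequence


def infer_url(host: str, model_name: str) -> str:
    """URL of the v2 inference endpoint for one model."""
    return f"http://{host}/v2/models/{model_name}/infer"


def build_infer_request(
    input_name: str,
    shape: Sequence[int],
    datatype: str,
    data: Sequence[Any],
) -> Dict[str, Any]:
    """Build a v2-protocol JSON body with a single, flattened input tensor."""
    expected = math.prod(shape)
    if len(data) != expected:
        raise ValueError(f"shape {list(shape)} needs {expected} values, got {len(data)}")
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical model and tensor names; 8000 is Triton's default HTTP port.
    url = infer_url("localhost:8000", "text_encoder")
    body = build_infer_request("INPUT_IDS", [1, 4], "INT64", [101, 2023, 2003, 102])
    print(url)
    print(json.dumps(body, indent=2))
```

Because every model is addressed through the same endpoint shape and tensor description, a client written this way does not need to know which framework or backend serves the model.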

Early Adoption and Support
Google Cloud is a key cloud partner and an early customer of NVIDIA’s inference platforms. It is integrating the L4 platform into its machine learning platform, Vertex AI, and is the first cloud service provider to offer L4 instances, with private preview of its G2 virtual machines launching today.

Two of the first organizations with early access to L4 on Google Cloud are Descript, which uses generative AI to help creators produce videos and podcasts, and WOMBO, which offers an AI-powered text-to-digital-art app called Dream.

Another early adopter, Kuaishou, provides a content community and social platform that leverages GPUs to decode incoming live-streaming video, capture key frames, and optimize audio and video. It then uses a transformer-based large-scale model to understand multimodal content and improve click-through rates for hundreds of millions of users globally.

“Kuaishou recommendation system serves a community having over 360 million daily users who contribute millions of UGC videos every day,” said Yue Yu, senior vice president at Kuaishou. “Compared to CPUs under the same total cost of ownership, NVIDIA GPUs have been increasing the system end-to-end throughputs by 11x and reducing latency by 20%.”

D-ID, a leading generative AI technology platform, elevates video content for professionals by using NVIDIA L40 GPUs to generate photorealistic digital humans from text — giving a face to any content while reducing the cost and hassle of video production at scale.

“L40 performance was simply amazing. With it, we were able to double our inference speed,” said Or Gorodissky, vice president of research and development at D-ID. “D-ID is excited to use this new hardware as part of our offering that enables real-time streaming of AI humans at unprecedented performance and resolution while simultaneously reducing our compute costs.”

Seyhan Lee, a leading AI production studio, uses generative AI to develop immersive experiences and captivating creative content for the film, broadcast and entertainment industries.

“The L40 GPU delivers an incredible boost in performance for our generative AI applications,” said Pinar Demirdag, co-founder of Seyhan Lee. “With the inferencing capability and memory size of the L40, we can deploy state-of-the-art models and deliver innovative services to our customers with incredible speed and accuracy.”

Cohere, a leading pioneer in language AI, runs a platform that empowers developers to build natural language models while keeping data private and secure.

“NVIDIA’s new high-performance H100 inference platform can enable us to provide better and more efficient services to our customers with our state-of-the-art generative models, powering a variety of NLP applications such as conversational AI, multilingual enterprise search and information extraction,” said Aidan Gomez, CEO at Cohere.

Availability
The NVIDIA L4 GPU is available in private preview on Google Cloud Platform and also available from a global network of more than 30 computer makers, including Advantech, ASUS, Atos, Cisco, Dell Technologies, Fujitsu, GIGABYTE, Hewlett Packard Enterprise, Lenovo, QCT and Supermicro.

The NVIDIA L40 GPU is currently available from leading system builders, including ASUS, Dell Technologies, GIGABYTE, Lenovo and Supermicro, with the number of partner platforms set to expand throughout the year.

The Grace Hopper Superchip is sampling now, with full production expected in the second half of the year. The H100 NVL GPU is also expected in the second half of the year.

NVIDIA AI Enterprise is now available on major cloud marketplaces and from dozens of system providers and partners. With NVIDIA AI Enterprise, customers receive NVIDIA Enterprise Support, regular security reviews and API stability for NVIDIA Triton Inference Server, TensorRT and more than 50 pretrained models and frameworks.

Hands-on labs for trying the NVIDIA inference platform for generative AI are available immediately at no cost on NVIDIA LaunchPad. Sample labs include training and deploying a support chatbot, deploying an end-to-end AI workload, tuning and deploying a language model on H100 and deploying a fraud detection model with NVIDIA Triton™.
