You Can Now Run Your Own ChatGPT From Your Nvidia GPU

Chat with RTX is a new AI program from Nvidia. Credit: Nvidia

You’ve probably noticed generative AI tools such as Google Gemini and ChatGPT pushing their way into most of the technology we use every day. These tools are built on large language models, or LLMs: neural networks trained on huge amounts of human-created data so that they can generate realistic text (and, in their multimodal variants, images and video).

You don’t need a cloud app to access these LLMs though—you can run them on your own computer too. You can benefit from everything these models offer while you’re offline, and you don’t have to hand over your prompts and conversations to the likes of Google or OpenAI either.
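If you're curious what "running an LLM on your own computer" actually looks like, here's a minimal Python sketch using the open source Hugging Face transformers library with a small demo model. To be clear, this isn't how Chat with RTX works internally (it uses Nvidia's own inference stack); it's just an illustration that text generation can happen entirely on your own machine.

```python
# A minimal local-inference sketch using Hugging Face's transformers
# library (pip install transformers torch). This is an illustration only,
# not Chat with RTX's internals. After the first download, the model
# weights are cached locally and generation runs offline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="distilgpt2",  # a small demo model; swap in any local checkpoint
)

prompt = "What's a good party game for four adults?"
result = generator(prompt, max_new_tokens=60, do_sample=True)
print(result[0]["generated_text"])
```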

Now Nvidia has launched its own local LLM application, called Chat with RTX, which taps the power of its RTX 30 and RTX 40 series graphics cards. If you have one of these GPUs, you can install a generative AI chatbot right on your computer and tailor it to your own needs.

How to get started

Before you start, make sure you’re running the latest drivers for your Nvidia GPU—the GeForce Experience app on your PC will help you with this—then head to the Chat with RTX download page. To run the tool, you need Windows 11, a GeForce RTX 30 or 40 series GPU (or another RTX Ampere or Ada generation GPU) with at least 8GB of VRAM, and at least 16GB of RAM.
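If you want to confirm your GPU model, VRAM, and driver version before committing to the download, the nvidia-smi command line tool (installed alongside Nvidia's drivers) reports all three. Here's a quick way to read it from Python; the query flags are part of the standard nvidia-smi CLI.

```python
# Read the GPU name, total VRAM, and driver version from nvidia-smi,
# which ships with Nvidia's drivers.
import subprocess

output = subprocess.run(
    ["nvidia-smi",
     "--query-gpu=name,memory.total,driver_version",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()

# One line per GPU; take the first if you have several installed.
name, vram, driver = [f.strip() for f in output.splitlines()[0].split(",")]
print(f"GPU: {name} | VRAM: {vram} | Driver: {driver}")
```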

Bear in mind that Nvidia labels this as a “demo app,” which we’re assuming means it’s not in its fully finished form (and you may come across some bugs and glitches as a result). It’s also a hefty download, some 35GB in size, because it includes a couple of LLMs with it—something to note if you’re limited in terms of disk space or internet speed. The installation tool takes a while to finish all of its tasks too.

Chat with RTX runs from your PC. Credit: Lifehacker

Eventually, you should find the Chat with RTX application added to your Start menu. Run it, and after a few moments of processing, the program interface will pop up in your default web browser. Up in the top left corner, you’re able to select the open source AI model you want to use: Mistral or Llama. With that done, you can start sending prompts as you would with ChatGPT or Google Gemini.
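Under the hood, a chatbot interface like this boils down to a loop: take your message, hand the conversation so far to the model, and print what comes back. Here's a toy version of that loop in Python, assuming the open source llama-cpp-python package and a locally downloaded model in GGUF format (the file path below is hypothetical, and Chat with RTX actually ships its models in Nvidia's own optimized format):

```python
# A toy chat loop, assuming the llama-cpp-python package and a local
# GGUF model file. The path below is hypothetical; Chat with RTX uses
# Nvidia-optimized models instead, so treat this as an analogy.
from llama_cpp import Llama

llm = Llama(model_path="models/mistral-7b-instruct.gguf")

history = []
while True:
    user_input = input("You: ").strip()
    if not user_input:  # press Enter on an empty line to quit
        break
    history.append({"role": "user", "content": user_input})
    response = llm.create_chat_completion(messages=history)
    reply = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    print("Bot:", reply)
```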

If you’ve used an LLM before, you’ll know what these generative AI engines are capable of: Get help composing emails, documents, text messages, and more, or get complex topics simplified, or ask questions that you might otherwise run a web search for (like “what’s a good party game for four adults?”).

The app keeps a command prompt window open too. Credit: Lifehacker

The standard AI bot rules apply—try to be as specific and detailed as you can, and be wary of putting too much trust in the answers you get (especially as this is a “demo”). Chat with RTX can’t look up current information on the web, so it’s not really suitable for questions that need up-to-date answers, but it will always try to give you a response based on the masses of online text it’s been trained on.

Down at the bottom of the interface you’ve got a button for generating a new response from the last prompt (if you’re not all that happy with the current one), an undo button (for going back to the previous prompt), and a delete chat button, which will clear your conversation history so you can start again. At the moment, there’s no way of exporting answers other than copying and pasting the text.

Adding your own data and YouTube videos

Even in this early form, Chat with RTX has a few useful tricks, one of which is the ability to base its answers on documents you provide: Maybe a week’s worth of research, or a series of reports you need to analyze, or even all of the fanfic you’ve been writing. Under the Dataset heading, select Folder Path, then direct the program towards the folder containing the documents you want to use.

The app will scan the folder you’ve pointed it towards—which might take a minute or two, if there are a lot of files in it—and then you’re free to start inputting your queries. The bot will scan the text looking for appropriate responses, then name the file(s) that it’s used at the end of the answer. You can ask for summaries, check facts, or get the bot to generate new text based on the text you’ve fed it.
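This documents feature is a form of what's usually called retrieval-augmented generation (RAG): before answering, the system finds the passages most relevant to your question and hands them to the model alongside your prompt, which is also how it knows which filenames to cite. Here's a deliberately naive Python sketch of the retrieval half, scoring files by simple word overlap where a real system would use embedding similarity; the folder name is just a placeholder.

```python
# A naive retrieval sketch: score each .txt file in a folder by how many
# words it shares with the question, then return the best matches along
# with their filenames (mimicking how Chat with RTX cites its sources).
# A real RAG pipeline would chunk documents and use embedding search.
from pathlib import Path

def retrieve(folder: str, question: str, top_k: int = 2):
    query_words = set(question.lower().split())
    scored = []
    for path in Path(folder).glob("*.txt"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        overlap = len(query_words & set(text.lower().split()))
        scored.append((overlap, path.name, text[:300]))
    scored.sort(reverse=True)
    return scored[:top_k]

# "my_research" is a placeholder folder of plain-text documents.
for score, filename, snippet in retrieve("my_research", "what were the survey results?"):
    print(f"[{filename}] (overlap score: {score})\n{snippet}\n")
```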

Chat with RTX can work with your own files too. Credit: Lifehacker

To reiterate, this is an early version of a technology that is known to be less than 100 percent accurate—it’s not something you want to base boardroom decisions on yet. However, it’s fun to play around with an LLM that can work from documents you give it, whether they’re interview transcripts or volumes of poetry.

Speaking of transcripts, Chat with RTX is also able to analyze YouTube videos and offer responses based on them, via the transcripts linked to the clips. (Based on the testing we’ve done, it can’t automatically generate transcripts for videos that don’t already have them.) This even works with entire YouTube playlists, so you can have the program run through a whole series of clips at the same time.

You’re also able to point the program towards YouTube videos. Credit: Lifehacker

Select YouTube URL as the dataset, then paste the address of the video or the playlist in the box underneath. If you’re working with a playlist, you can specify the number of videos you want to include from it in the box on the right. Finally, click the download button on the far right, and Chat with RTX will download and process the transcript text, ready for whatever prompts you may have.
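If you want to experiment with the same raw material Chat with RTX is working from, the open source youtube-transcript-api Python package can pull a video's caption track by its video ID. A quick sketch (the video ID is a placeholder, and the exact call name can vary between versions of the package):

```python
# Fetch a YouTube transcript with the youtube-transcript-api package
# (pip install youtube-transcript-api). The video ID is a placeholder:
# it's the value after "v=" in a normal YouTube URL.
from youtube_transcript_api import YouTubeTranscriptApi

video_id = "VIDEO_ID_HERE"
segments = YouTubeTranscriptApi.get_transcript(video_id)

# Each segment is a dict with "text", "start", and "duration" keys;
# join the text into one plain transcript, ready to prompt against.
transcript = " ".join(segment["text"] for segment in segments)
print(transcript[:500])
```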

As with the document-scanning feature, this can be useful for getting summaries or picking out important bits of information, but we found it to be the buggiest part of the program to use—a program which, to be fair, is labeled as version 0.2. The app occasionally got confused about which video we were referring to, but if you need quick answers about lengthy videos and playlists you don’t have time to watch, then Chat with RTX can be a useful tool to try.