
Chat with NVIDIA RTX Tech Demo

AI...

All I see here is a glorified search machine, still, I'm sorry. So now you can burn several hundred watts locally to search within a limited dataset and wonder if the answer is actually correct. Yay!

I'm not seeing the magic, honestly.
It's better to see this for what it is: a toy. Install the thing, have some fun with it, then uninstall it when you're bored. I'm still dumbfounded that people really used ChatGPT for "research" when it couldn't even search the web.

I wouldn't be surprised if much of Nvidia's ML software is just their employees having fun and exploring various use cases of the tech. A lot of the RTX stuff is starting to become standard features in various software. Right now people are just trying stuff and seeing what sticks. I wouldn't get my hopes up for an actual "A.I. chatbot", nor do I wish for it. Something highly intelligent but without empathy is dangerous.
 
I have one friend in the entire world and that is my GPU. So I will have a chat with my GPU...
 
Hi,
It's times like this I'm so glad I stopped jumping on new GPUs years ago and stayed with GTX, hehe.
Screw the "just buy RTX" leather jacket man nonsense; I've not had any issues playing the games I have.
 
Bleah! The grapes are sour when you can't reach them.

Recent history: Frame Generation
Before the AMD era: Bleah! Brrrrrrr! Bad! Fail!
After the copy... sorry ... AMD's answer: OoOoO! WooooW! Fantastic! Phenomenal!
 
That's interesting. Instead of getting some cheap AI speaker, get a 30- or 40-series card with 8 GB+ VRAM and have fun with your pricey AI bot :D :love:
 
That's interesting. Instead of getting some cheap AI speaker, get a 30- or 40-series card with 8 GB+ VRAM and have fun with your pricey AI bot :D :love:
How about, if you have an RTX card, you don't need to spend on AI speakers? (Wth is an AI speaker anyway?)
 
Bleah! The grapes are sour when you can't reach them.

Recent history: Frame Generation
Before the AMD era: Bleah! Brrrrrrr! Bad! Fail!
After the copy... sorry ... AMD's answer: OoOoO! WooooW! Fantastic! Phenomenal!
If it helps: I still don't like frame gen.
This is also absolutely nothing new, and not only can you just use ChatGPT or Bing Chat/Copilot, but you could already run LLMs on AMD GPUs.
 
If it helps: I still don't like frame gen.
This is also absolutely nothing new, and not only can you just use ChatGPT or Bing Chat/Copilot, but you could already run LLMs on AMD GPUs.
You can, but running an LLM is not the same as being able to feed it docs or videos as part of your questions.
This isn't meant to be something revolutionary. Just a little bonus for those who would rather tap into an LLM without sending their data to random servers.
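For anyone curious what "feeding it docs" boils down to, here's a rough sketch of the retrieval idea (my own illustration, not NVIDIA's code): embed the documents, pick the chunk most similar to the question, and prepend it to the prompt. The model name and documents are just placeholders.

from sentence_transformers import SentenceTransformer, util

# Illustrative documents; in practice these would be chunks of your own files
docs = [
    "The warranty covers hardware failures for three years.",
    "Frame Generation requires an RTX 40-series GPU.",
]
question = "How long is the warranty?"

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = embedder.encode(docs, convert_to_tensor=True)
q_emb = embedder.encode(question, convert_to_tensor=True)

# Pick the most relevant chunk by cosine similarity
best = util.cos_sim(q_emb, doc_emb).argmax().item()
prompt = f"Context:\n{docs[best]}\n\nQuestion: {question}\nAnswer:"
# 'prompt' would then go to whatever local LLM you are running
print(prompt)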
 
Bleah! The grapes are sour when you can't reach them.

Recent history: Frame Generation
Before the AMD era: Bleah! Brrrrrrr! Bad! Fail!
After the copy... sorry ... AMD's answer: OoOoO! WooooW! Fantastic! Phenomenal!
You really ought to learn the difference between DLSS3 and FG on AMD; it would help you better understand the world, I think.

DLSS3 is limited to a single gen of cards.
FG can be modded in on any recent card and any game.

By extension, DLSS3 is a way to sell cards and force you to buy new cards and/or wait for Nvidia's game ready bullshit. You only get it if they allow you to.
FG is merely there as an incentive.
 
Funny how NVIDIA's website says the OS requirement is Windows 11, but here it says Windows 10 or 11. Can you run it on W10?
 
That's interesting. Instead of getting some cheap AI speaker, get a 30- or 40-series card with 8 GB+ VRAM and have fun with your pricey AI bot :D :love:
8GB is not enough for real A.I. 16GB is where it starts to get good.
 
LLMs are super confident bullshit artists with narrow real world use case potential.

No idea how such a limited technology has got so overhyped.
 
A.I. is just the latest tech PR buzzword to sell more product. It offers little value, and it certainly offers very little in the way of intelligence, as you simply can't trust the output at all.

And love the block on the RTX20x0 series. Bloody nGreedia. :banghead:

I'm surprised @cvaldes hasn't berated you for getting online and mocking the relentless march of technology.
 
I told you RT was made for AI, no one listened. DLSS is just an excuse for the hardware. All you RTX believers have given Nvidia a stranglehold on the AI market and the whole world is going to suffer. Good job. :banghead:
But the RT cores responsible for ray tracing and the Tensor cores responsible for machine learning are different hardware.
 
Thank god it only analyzes your YouTube usage and not what one watches on CornSub...
Why do you think this is a bad thing? It's locally run software.

Assuming you could submit your... user data... in some text-based format to the local LLM, it could analyse it and offer recommendations on how to make your fantasies a reality through a thorough statistical analysis.
 
I imagine this chatbot could shorten development time significantly for future RPGs.

Now every NPC can tell you the entire lore of the game LOL
 
A shame I had to pass over so many salty comments to read the interesting/constructive ones; there are a few broken records here.

I see a lot of potential in quickly extracting meaningful, honed-down data from any local datasets added. In fact, I'll be proposing to my workplace that we add an RTX GPU to a decent tower we already possess, feed it hundreds of gigabytes of policies, reports, communications, etc., and see how far these legs can stretch.
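If it helps anyone planning something similar: a dataset that large has to be split into context-sized pieces before any retrieval can work. A toy sketch of that preprocessing step (my own illustration, not part of Chat with RTX; the folder path is hypothetical):

from pathlib import Path

def chunk_text(text: str, size: int = 1000, overlap: int = 200):
    """Yield overlapping character-based chunks of one document."""
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield text[start:start + size]

# Hypothetical folder of company policies exported as plain text
for path in Path("policies").glob("*.txt"):
    for i, chunk in enumerate(chunk_text(path.read_text(encoding="utf-8"))):
        print(path.name, i, len(chunk))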
 
Funny how NVIDIA's website says the OS requirement is Windows 11, but here it says Windows 10 or 11. Can you run it on W10?
I see no reason why you couldn't run it on Windows 10; it's just a ton of Python stuff with an EXE GUI sitting in front of it. It's all open-source, so you could even compile it on Windows XP if you build the dependencies yourself.
 
I see no reason why you couldn't run it on Windows 10; it's just a ton of Python stuff with an EXE GUI sitting in front of it. It's all open-source, so you could even compile it on Windows XP if you build the dependencies yourself.
Do you need to compile it if it's Python?
 
This Nvidia developer forum has more info and resources for Chat with RTX.
There are other models that can be used (Llama 2 7B and Code Llama 13B), plus I heard they are going to update it with the new Google Gemma models.

These guys made a nice guide to setting up Chat with RTX to run over LAN or WAN, as well as fix a text manipulation bug that was breaking cookies.
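For reference, the general idea behind that LAN/WAN setup: Chat with RTX's UI appears to be a Gradio app, and a Gradio app is exposed beyond localhost by changing its launch() arguments. A minimal standalone sketch (the function and variable names here are mine, not the actual Chat with RTX code, which will differ):

import gradio as gr

def answer(question: str) -> str:
    return f"You asked: {question}"  # placeholder for the real model call

demo = gr.Interface(fn=answer, inputs="text", outputs="text")

# server_name="0.0.0.0" binds to all interfaces so other machines on the LAN can
# reach the app; share=True would instead tunnel it over the internet
demo.launch(server_name="0.0.0.0", server_port=7860)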

Chat with RTX requirements:
GeForce RTX 30 Series GPU or higher with a minimum 8GB of VRAM
Windows 10 or 11, and the latest NVIDIA GPU drivers.
 
How can I install the Llama 2 model? I have a 4060. Please help.
 
How can I install the Llama 2 model? I have a 4060. Please help.
Hello,

There are many roads to getting a model to run locally on your computer. I'll do a brief write-up, but there are plenty of videos on the topic already.

1. Pick your UI

There are a few frontend applications for running LLMs:
a. StableLM
b. text-generation-webui by oobabooga

2. Pick your model

Here are a few things you should know about models:
a. They come in several parameter sizes. Common ones are 7B (B as in billion), 13B... all the way up to 70B and maybe even more! Generally, more parameters mean more accurate output, but at the cost of greater computational requirements.
b. Because even 7B-parameter models are difficult to run without beastly hardware, there are groups and individuals who quantise the models. This reduces their computational requirements with minimal loss in output quality. (https://huggingface.co/TheBloke)
c. When picking a model, match the VRAM requirement to what's listed on the model page. Try going up or down a quantisation level depending on output quality/performance.
d. The base models like GPT, LLaMa, etc. get modified, optimised and uploaded to Hugging Face by various users. You will likely find these modified versions are more popular than the base models.

For example, since you're specifically interested in LLaMa 2, here's a list of quantised models based on LLaMa 2: https://huggingface.co/TheBloke?search_models=llama2&sort_models=downloads#models

3. Running your model

Now that you've picked your UI and model, it's time to run it. Note that there are many sliders you can modify to tweak your output. Consult the documentation or a tutorial to understand what these sliders do.
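As a concrete (and heavily simplified) example of what this step can look like, assuming you went the quantised-GGUF route and installed llama-cpp-python; the model file name and prompt are placeholders:

from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-7b-chat.Q4_K_M.gguf",  # a quantised 7B model downloaded beforehand
    n_gpu_layers=-1,  # offload all layers to the GPU if VRAM allows
    n_ctx=4096,       # context window size
)

output = llm(
    "Q: What does quantisation do to an LLM? A:",
    max_tokens=128,
    stop=["Q:"],
)
print(output["choices"][0]["text"])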

I'm abstracting a lot of steps. Here's a video you can follow that's more in depth:

Sorry if I'm off topic. I found myself to be of the appropriate skill level to respond to this type of request.
 
I looked into that, but it is too slow, like one word per minute. On the TechPowerUp website, it says that I can install the Llama 2 model even if I only have an 8 GB card: https://www.techpowerup.com/review/nvidia-chat-with-rtx-tech-demo/2.html How do I do it?
 
I looked into that, but it is too slow, like one word per minute. On the TechPowerUp website, it says that I can install the Llama 2 model even if I only have an 8 GB card: https://www.techpowerup.com/review/nvidia-chat-with-rtx-tech-demo/2.html How do I do it?
The installed package has several .cfg files; these are text files that you can open with Notepad. Look through them: one of them has an entry that defines how much VRAM is required to install Llama 2. Lower that value and you should be able to install it.
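If you don't want to hunt through the files by hand, a small helper like this (my own sketch, not an official tool; the installer path is hypothetical) will list every config line that mentions VRAM so you can see which file and entry to edit:

from pathlib import Path

installer_dir = Path(r"C:\Users\you\Downloads\ChatWithRTX")  # adjust to your unpacked installer

for cfg in installer_dir.rglob("*"):
    if cfg.is_file() and cfg.suffix.lower() in {".cfg", ".nvi", ".json", ".xml"}:
        for lineno, line in enumerate(cfg.read_text(errors="ignore").splitlines(), 1):
            if "vram" in line.lower():
                print(f"{cfg}:{lineno}: {line.strip()}")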
 
The installed package has several .cfg files; these are text files that you can open with Notepad. Look through them: one of them has an entry that defines how much VRAM is required to install Llama 2. Lower that value and you should be able to install it.
After it is installed, or in the installer package folder? Where is the directory?
 