Llama 2 7B/13B are now available in Web LLM! Try it out in our chat demo. If you have an Apple Silicon Mac with 64GB or more memory, you can follow the instructions below to download and launch Chrome Canary and try out the 70B model in Web LLM.

This project brings large language models and LLM-based chatbots to web browsers. Everything runs inside the browser with no server support and is accelerated with WebGPU. This opens up a lot of fun opportunities to build AI assistants for everyone and to preserve privacy while enjoying GPU acceleration. Please check out our GitHub repo to see how we did it.

We have been seeing amazing progress in generative AI and LLMs recently. Thanks to open-source efforts like LLaMA, Alpaca, Vicuna and Dolly, we are starting to see an exciting future of building our own open-source language models and personal AI assistants.

These models are usually big and compute-heavy. To build a chat service, we need a large cluster to run an inference server, while clients send requests to the servers and retrieve the inference output. We also usually have to run on specific types of GPUs where popular deep-learning frameworks are readily available.

This project is our step toward bringing more diversity to the ecosystem. Specifically, can we simply bake LLMs directly into the client side and run them inside a browser? If that can be realized, we could offer support for client-side personal AI models, with the benefits of cost reduction, better personalization and privacy protection. The client side is getting pretty powerful.

Wouldn't it be even more amazing if we could simply open up a browser and bring AI natively to your browser tab? There is some level of readiness in the ecosystem.
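Because everything here relies on WebGPU being exposed to the page, it can help to check for it before loading a model. The snippet below is a minimal sketch of such a check, not code from the Web LLM repo. The names `NavigatorLike`, `GPULike`, `hasWebGPU` and `canUseWebGPU` are hypothetical helpers introduced for illustration; the only real browser API assumed is `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable adapter is available.

```typescript
// Minimal structural types (hypothetical) so the sketch compiles
// without pulling in the @webgpu/types package.
interface GPULike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GPULike;
}

// Synchronous check: does this environment expose the WebGPU entry point at all?
function hasWebGPU(nav: NavigatorLike | undefined): boolean {
  return nav !== undefined && nav.gpu !== undefined;
}

// Asynchronous check: the entry point can exist while no usable
// adapter is available, in which case requestAdapter() resolves to null.
async function canUseWebGPU(nav: NavigatorLike | undefined): Promise<boolean> {
  if (nav === undefined || nav.gpu === undefined) {
    return false;
  }
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null;
}
```

In a page, one would pass the browser's `navigator` object (cast to `NavigatorLike` if the TypeScript setup does not know about `gpu`) and only start downloading model weights when `canUseWebGPU` resolves to `true`.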