@ John Dee
2025-02-04 22:24:08
I've been using AI tools to help with coding for a while now, but it's always been copy/pasting into ChatGPT and DeepSeek. Faster iteration is better, and privacy is best. So it's time to figure out how to use these tools integrated into [VSCodium](https://github.com/VSCodium/vscodium) and running locally. After a quick review of the most popular extensions I've heard about, I settled on [Tabby](https://github.com/TabbyML/tabby).
## What the heck does this actually DO?
* Code completion - autocomplete-style suggestions: Tabby looks at your code and suggests the next thing to type at the cursor. The suggestion appears in gray text; press Tab to accept it.
* Chat - talk to an LLM about your code inside your editor; faster than copy/pasting into a website.
* Apparently it can write docs and tests too.
## Can I run it locally? Without a GPU? YES!
Tabby is optimized for small models that only need a few GB of RAM. I'm using a fairly old [System76](https://system76.com/) Galago Pro with a Core i5-10210U and 16 GB of RAM. The fans spin up, and Tabby sometimes shows a warning about slow response time, but it seems usable.
## How do I get it set up?
This is the overview; step-by-step instructions are at the end of this article.
1. Install the Tabby server, which runs locally.
1. Start the Tabby server and tell it to serve some models, which it will download automatically. (There's a quick health check sketched after this list.)
1. Open the Tabby server web interface and create an admin account.
1. Install the Tabby extension in your editor.
1. Connect the Tabby extension to your Tabby server.
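Once the server is running, you can sanity-check it from a terminal before touching the editor. A minimal sketch, assuming the default port and a `/v1/health` endpoint (the endpoint name is an assumption from Tabby's API; verify against your version's docs):

```
# quick check that the Tabby server is answering (assumes default port 8080
# and the /v1/health endpoint; some versions may want an auth token)
curl -s http://localhost:8080/v1/health
```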
## How do I actually use it?
1. Click **Tabby** in the bottom-right corner to open the Tabby command palette.
1. Select **Chat** to open the Chat pane. Drag it to the right side if you want.
1. Or, press *Ctrl-Shift-P* and type "tabby" to see some Tabby commands.
1. Select some code first, then *Ctrl-Shift-P* and "tabby" to see more Tabby commands.
1. **Explain this** seems like a useful one.
Other than that, I don't know! I just started using it today.
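If you're curious what the Chat pane does under the hood, you can talk to the chat model directly from a terminal. A hedged sketch, assuming Tabby's OpenAI-compatible `/v1/chat/completions` endpoint, with `TOKEN` holding the auth token from the web interface (both are assumptions to verify against your server's API docs):

```
# chat with the model from the shell (assumes an OpenAI-compatible
# /v1/chat/completions endpoint; TOKEN is the auth token from the
# Tabby web interface)
TOKEN="paste-your-token-here"
curl -s http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What does a mutex do?"}]}'
```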
## What model should I use?
As usual, it depends. If you're using CPU instead of GPU, start with the ones recommended by Tabby and shown in the Step-by-Step below. Tabby has a [registry of models](https://tabby.tabbyml.com/docs/models/) you can choose from, and a [leaderboard](https://leaderboard.tabbyml.com/) to compare them.
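Swapping models is just a matter of restarting the server with different flags; Tabby downloads anything it doesn't have yet. A sketch (the model name here is an assumption, confirm it against the registry linked above):

```
# try a different completion model (name is an assumption; check the
# registry for current model ids before running this)
./target/debug/tabby serve --model DeepseekCoder-1.3B --chat-model Qwen2-1.5B-Instruct
```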
## Step-by-Step
This is for Ubuntu 24.04, and no GPU.
```
# install pre-reqs
sudo apt install build-essential cmake
sudo apt install protobuf-compiler libopenblas-dev
sudo apt install make sqlite3 graphviz
# install rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
# build tabby
git clone --recurse-submodules https://github.com/TabbyML/tabby
cd tabby
cargo build
# run tabby (downloads ~3 GB of models)
./target/debug/tabby serve --model StarCoder-1B --chat-model Qwen2-1.5B-Instruct
# browse to http://localhost:8080 and create an admin user
```
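A note on the build: plain `cargo build` produces an unoptimized debug binary, which is why the path above is `target/debug`. If completions feel sluggish, a release build is worth the longer compile:

```
# optional: optimized build (slower to compile, faster at inference)
cargo build --release
./target/release/tabby serve --model StarCoder-1B --chat-model Qwen2-1.5B-Instruct
```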
* Install the Tabby extension in your editor. VSCodium has it in the extension marketplace; search for "tabby", or try `ext install TabbyML.vscode-tabby` from Quick Open (*Ctrl-P*).
* *Ctrl-Shift-P* and look for "Tabby: Connect to server"
* Use the default of `http://localhost:8080`
* Switch over to the Tabby web interface, click your profile picture, and copy the auth token.
* Paste that token in when the extension asks for it. I'm sure you'll figure it out if you got this far; if not, there's a config-file fallback sketched below.
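If the token prompt never shows up, the connection can also be written straight into the Tabby client's config file. A sketch, assuming the `~/.tabby-client/agent/config.toml` path and `[server]` keys from the Tabby client docs (verify for your extension version):

```
# write the endpoint and token into the Tabby agent config
# (path and key names are assumptions from the Tabby client docs)
mkdir -p ~/.tabby-client/agent
cat > ~/.tabby-client/agent/config.toml <<'EOF'
[server]
endpoint = "http://localhost:8080"
token = "paste-your-token-here"
EOF
```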