# AI Microservices - Introduction

*Building AI-powered applications at the edge has never been easier!*

**Jetson AI Lab now offers a collection of pre-built containers, each functioning as an AI microservice**, designed to bring flexibility, efficiency, and scalability to your projects.

A **microservice** is a small, independent, and loosely coupled software component that performs a specific function.<br>
In the [**Models**](../models.html) section of Jetson AI Lab, you'll find AI inference services accessible through a standardized REST API.

These AI microservices are powerful building blocks that enable you to create cutting-edge edge AI applications with ease.
Whether you're working on robotics, vision, or intelligent automation, you now have the tools to accelerate innovation.

Let’s build something amazing together! 💡✨

## Launch the Microservice Server

### Walk-through video

<video controls>
  <source src="https://github.com/user-attachments/assets/585a6cc0-f434-4b87-87ad-bd8f12ad01aa" type="video/mp4">
  Your browser does not support the video tag.
</video>

### Steps

1. Go to the [**Models**](../models.html) section of Jetson AI Lab
2. Click the model of your interest (specifically, one of the small green boxes representing the different Orin modules) to open the model card
3. Check the parameters, change them as needed, and click the **:octicons-copy-16: ("Copy to clipboard")** icon in the code snippet under the "**Docker Run**" section
4. Paste the `docker run` command into a Jetson terminal and execute it
5. Once you see a line like the following (in the case of an MLC-based service), the server is up and ready (a quick readiness check is shown after this list)

    ``` { .text .no-copy }
    INFO:     Uvicorn running on http://0.0.0.0:9000 (Press CTRL+C to quit)
    ```
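
You can also verify from another terminal that the service is answering requests. A minimal sketch, assuming the server was launched on the default port `9000`:

```bash
# Probe the OpenAI-compatible endpoint: -s silences progress output, and
# -f makes curl exit non-zero on HTTP errors, so this doubles as a readiness check.
curl -sf http://0.0.0.0:9000/v1/models > /dev/null \
  && echo "Microservice is up" \
  || echo "Microservice is not responding on port 9000"
```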

## API Endpoints

| Method | Endpoint | Description |
| ------ | -------- | ----------- |
| `GET` | `/v1/models` | Get a list of available models |
| `POST` | `/v1/chat/completions` | Get a response from the model using a prompt |

## Example Use of LLM Microservices with Curl

### `/v1/models`

=== ":material-list-box: Step-by-step Instruction"

    1. Execute the following on a Jetson terminal

        ```bash
        curl http://0.0.0.0:9000/v1/models
        ```

    2. Check the output. It should show something like the following.

        ``` { .json .no-copy }
        {
          "object": "list",
          "data": [
            {
              "id": "DeepSeek-R1-Distill-Qwen-1.5B-q4f16_ft-MLC",
              "created": 1741991907,
              "object": "model",
              "owned_by": "MLC-LLM"
            }
          ]
        }
        ```

    !!! note

        For the usage of the `/v1/models` endpoint, you can refer to the OpenAI API documentation [here](https://platform.openai.com/docs/api-reference/models).

        > `get https://api.openai.com/v1/models`

        Note that you need to substitute the base URL (`https://api.openai.com`) with `http://0.0.0.0:9000`, and you don't need to provide the authorization field.
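
    Since later requests may need the exact model id, you can capture it programmatically. A minimal sketch, assuming `python3` is available on the Jetson:

    ```bash
    # Grab the id of the first (and typically only) model served on port 9000.
    MODEL_ID=$(curl -s http://0.0.0.0:9000/v1/models \
      | python3 -c "import sys, json; print(json.load(sys.stdin)['data'][0]['id'])")
    echo "$MODEL_ID"   # e.g. DeepSeek-R1-Distill-Qwen-1.5B-q4f16_ft-MLC
    ```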

=== ":octicons-video-16: Walk-through video"

    <video controls>
      <source src="https://github.com/user-attachments/assets/ee160d65-dad5-4eb1-8341-e178cfc53c78" type="video/mp4">
      Your browser does not support the video tag.
    </video>

### `/v1/chat/completions`

=== ":material-list-box: Step-by-step Instruction"

    1. Execute the following on a Jetson terminal

        ```bash
        curl http://0.0.0.0:9000/v1/chat/completions \
          -H "Content-Type: application/json" \
          -d '{
            "messages": [
              {
                "role": "system",
                "content": "You are a helpful assistant."
              },
              {
                "role": "user",
                "content": "Hello!"
              }
            ]
          }'
        ```

    2. Check the output. It should show something like the following.

        ``` { .json .no-copy }
        {
          "id": "chatcmpl-9439e77a205a4ef3bc2d050a73a6e30b",
          "choices": [
            {
              "finish_reason": "stop",
              "index": 0,
              "message": {
                "content": "<think>\nAlright, the user greeted me with \"Hello!\" and then added \"hi\". I should respond politely and clearly. I want to make sure they feel comfortable and open to any further conversation.\n\nI'll start with a friendly greeting, maybe \"Hello!\" or \"Hi there?\" to keep it consistent. Then, I'll ask how I can assist them, which is important to build trust. I should mention that I'm here to help with any questions, comments, or suggestions they might have.\n\nI also want to invite them to ask anything, so I'll make sure to keep the door open for future interaction. I'll keep the tone friendly and supportive, avoiding any abrupt requests.\n\nSo, putting it all together, I'll have a clear and concise response that's helpful and inviting.\n</think>\n\nHello! I'm here to help with any questions, comments, or suggestions you have. Keep asking anything you like, and I'll do my best to assist!",
                "role": "assistant",
                "name": null,
                "tool_calls": null,
                "tool_call_id": null
              },
              "logprobs": null
            }
          ],
          "created": 1741993253,
          "model": null,
          "system_fingerprint": "",
          "object": "chat.completion",
          "usage": {
            "prompt_tokens": 11,
            "completion_tokens": 196,
            "total_tokens": 207,
            "extra": null
          }
        }
        ```
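
    The same endpoint can also stream tokens as they are generated, which is useful for interactive applications. A minimal sketch, assuming the service implements the OpenAI-style `stream` parameter (OpenAI-compatible servers such as MLC generally do):

    ```bash
    # -N disables curl's output buffering so the server-sent "data:" chunks
    # are printed as soon as they arrive.
    curl -N http://0.0.0.0:9000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "messages": [
          {"role": "user", "content": "Hello!"}
        ],
        "stream": true
      }'
    ```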

=== ":octicons-video-16: Walk-through video"

    <video controls>
      <source src="https://github.com/user-attachments/assets/d0d64ffb-2147-4e8b-a887-8d89ef091ee1" type="video/mp4">
      Your browser does not support the video tag.
    </video>

## Example Use of LLM Microservices with Open WebUI

=== ":material-list-box: Step-by-step Instruction"

    1. Go to the [**Models**](../models.html) section of Jetson AI Lab
    2. Go to the **Web UI** section, and click the "**Open WebUI**" card
    3. Check the parameters, change them as needed, and click the **:octicons-copy-16: ("Copy to clipboard")** icon in the code snippet under the "**Docker Run**" section
        - Note the "**Server IP / Port**" section. The default is `0.0.0.0:8080`.
    4. Paste the `docker run` command into a Jetson terminal and execute it

        ```bash
        docker run -it --rm \
          --name open-webui \
          --network=host \
          -e PORT=8080 \
          -e ENABLE_OPENAI_API=True \
          -e ENABLE_OLLAMA_API=False \
          -e OPENAI_API_BASE_URL=http://0.0.0.0:9000/v1 \
          -e OPENAI_API_KEY=foo \
          -e AUDIO_STT_ENGINE=openai \
          -e AUDIO_TTS_ENGINE=openai \
          -e AUDIO_STT_OPENAI_API_BASE_URL=http://0.0.0.0:8990/v1 \
          -e AUDIO_TTS_OPENAI_API_BASE_URL=http://0.0.0.0:8995/v1 \
          -v /mnt/nvme/cache/open-webui:/app/backend/data \
          -e DOCKER_PULL=always --pull always \
          -e HF_HUB_CACHE=/root/.cache/huggingface \
          -v /mnt/nvme/cache:/root/.cache \
          ghcr.io/open-webui/open-webui:main
        ```

    5. Once you see lines like the following, the Open WebUI server should be ready

        ``` { .text .no-copy }
        INFO:     Started server process [1]
        INFO:     Waiting for application startup.
        ```

    6. On a web browser on a PC (that is on the same network as the Jetson), access `http://<JETSON_IP>:8080/` (a quick way to look up the Jetson's IP address is shown after this list)
    7. Sign in (if you have not already, create an account first)
        - For details, read this note from our [Open WebUI tutorial](../tutorial_openwebui.md#step-2-complete-the-account-creation-process).
    8. Check the selected model
    9. Type your query in the chat box and check the response
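
    To find `<JETSON_IP>` for step 6, you can query the device's network address directly on the Jetson. A minimal sketch (the first address reported is typically the LAN-facing one):

    ```bash
    # Print the Jetson's IP addresses and keep the first one
    # for use in the browser URL, e.g. http://192.168.1.42:8080/
    hostname -I | awk '{print $1}'
    ```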

    !!! tip

        You can check out the walk-through video (in the [next tab](#__tabbed_3_2)) for details.

=== ":octicons-video-16: Walk-through video"

    <video controls>
      <source src="https://github.com/user-attachments/assets/1166a26c-e8db-4952-9a99-b1802b5d39e4" type="video/mp4">
      Your browser does not support the video tag.
    </video>