- JDK 21
- Gradle does not need to be installed: the project ships with the Gradle wrapper (gradlew)
- Docker (used by the unit tests and as the preferred way to start Ollama)
- Ollama (install it from the official site or run it inside a Docker container)
- NVIDIA GPU (recommended; tested on a GeForce RTX 3060 12 GB)
- On Linux, using the GPU requires installing nvidia-container-toolkit as described in the article
./gradlew clean build
docker compose -f docker-compose-prod.yml up
or use the run-all.bat script
Send a POST request to the /api/generate endpoint exposed by the service, with the question in the prompt field of the request body.
For example:
curl -i -H 'Content-Type: application/json' \
-d '{ "prompt": "Tell me about Belarus" }' \
-X POST http://localhost:8090/api/generate
curl -i -H 'Content-Type: application/json' \
-d '{ "prompt": "Describe primitive types in Java" }' \
-X POST http://localhost:8090/api/generate
curl -i -H 'Content-Type: application/json' \
-d '{ "prompt": "Write code of bubble sort using Java" }' \
-X POST http://localhost:8090/api/generate
Alternatively, use the prepared collection of Postman requests in the postman folder: import it into Postman and send the requests from there.
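The same request can also be sent from Java using the JDK's built-in HTTP client. The following is a minimal sketch, not part of the project: the class name GenerateClient and its helper methods are hypothetical, the URL is copied from the curl examples, and the hand-written JSON escaping only stands in for a real JSON library.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class; not part of the project sources.
class GenerateClient {
    // Assumed URL, copied from the curl examples.
    static final String ENDPOINT = "http://localhost:8090/api/generate";

    // Builds {"prompt": "..."} with minimal escaping of quotes,
    // backslashes and control characters.
    static String jsonBody(String prompt) {
        StringBuilder sb = new StringBuilder("{\"prompt\": \"");
        for (char c : prompt.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
                }
            }
        }
        return sb.append("\"}").toString();
    }

    // Builds the POST request without sending it.
    static HttpRequest buildRequest(String prompt) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody(prompt)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        String prompt = args.length > 0 ? args[0] : "Tell me about Belarus";
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(prompt), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

With JDK 21 the file can be run directly, for example with java GenerateClient.java "Describe primitive types in Java", provided the service is already listening on port 8090.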
To run Ollama in Docker, follow the instructions:
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama_container ollama/ollama
or, without GPU access:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama_container ollama/ollama
docker exec -it ollama_container ollama pull gemma3:4b
docker exec -it ollama_container ollama list
docker stop ollama_container && docker rm ollama_container
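The ollama list check can also be done from Java against Ollama's HTTP API on the published port 11434. This is a rough sketch under stated assumptions: the class OllamaModelCheck and the method hasModel are hypothetical names, the code relies on Ollama's GET /api/tags endpoint returning a JSON list of local models with a name field, and it matches that field with a regular expression rather than parsing the JSON properly.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the project sources.
class OllamaModelCheck {
    // Port 11434 is the one published by the docker run commands.
    static final String TAGS_URL = "http://localhost:11434/api/tags";

    // Naive check: searches the JSON text for "name":"<model>"
    // instead of parsing it (an assumption made to stay dependency-free).
    static boolean hasModel(String tagsJson, String model) {
        Pattern p = Pattern.compile(
                "\"name\"\\s*:\\s*\"" + Pattern.quote(model) + "\"");
        return p.matcher(tagsJson).find();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(TAGS_URL)).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(hasModel(response.body(), "gemma3:4b")
                ? "gemma3:4b is available"
                : "gemma3:4b is missing; pull it first");
    }
}
```

Run it while the container is still up, that is, before the docker stop and docker rm step.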