feat: allow to run parallel requests (#1290)

* feat: allow to run parallel requests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto
2023-11-16 08:20:05 +01:00
committed by GitHub
parent 66a558ff41
commit fdd95d1d86
9 changed files with 91 additions and 44 deletions

5
.env

@@ -69,4 +69,7 @@ MODELS_PATH=/models
# PYTHON_GRPC_MAX_WORKERS=1
### Define the number of parallel LLAMA.cpp workers (Defaults to 1)
# LLAMACPP_PARALLEL=1
### Enable to run parallel requests
# PARALLEL_REQUESTS=true
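
As a minimal sketch of how the two settings touched by this hunk might be used together, the fragment below uncomments both. The worker count of 4 is an illustrative assumption, not a value recommended by this commit. That PARALLEL_REQUESTS should be enabled alongside LLAMACPP_PARALLEL is inferred from the comments in the hunk and is not stated by the diff itself.

```shell
# Hypothetical .env fragment; the worker count 4 is an illustrative assumption.
### Define the number of parallel LLAMA.cpp workers (Defaults to 1)
LLAMACPP_PARALLEL=4
### Enable to run parallel requests
PARALLEL_REQUESTS=true
```

Both variables ship commented out in the file, so the defaults apply until the lines are uncommented.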