Server Fault Asked by Abhishek Divekar on January 9, 2021
I am not running a webapp, but rather a Machine Learning model which needs to provide real-time predictions.
I am using Nginx with Gunicorn, both of which run in a single Docker container. The setup uses 4 Gunicorn workers with 1 thread each (hosting 4 copies of my model) and Nginx with 1 worker process.
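For concreteness, the Gunicorn side of this setup could be written as a gunicorn.conf.py roughly like the sketch below. The worker and thread counts come from the description above; the bind address and the explicit backlog line are assumptions added for illustration, not the actual production config.

```python
# gunicorn.conf.py -- a minimal sketch of the setup described above.
# Gunicorn config files are plain Python; module-level names are settings.

# ASSUMPTION: the real bind target (TCP port or unix socket) is not stated
# in the question; this address is only a placeholder.
bind = "127.0.0.1:8000"

# From the question: 4 worker processes, each holding one copy of the model.
workers = 4

# From the question: 1 thread per worker, so at most 4 requests run at once.
threads = 1

# Number of pending connections requested for the listening socket.
# 2048 is also Gunicorn's default. On Linux the kernel silently caps this
# at net.core.somaxconn, so with somaxconn=128 only 128 are actually queued.
backlog = 2048
```

With only 4 requests in flight at a time, everything else in a burst has to wait in the listen queue, which is why the kernel-side cap matters more here than the value Gunicorn asks for.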
At the moment, this setup returns 502 errors when my client sends a burst of requests to my server. I want to avoid this, even if it means longer response times for each request.
Things I have tried:
Increasing net.core.somaxconn from 128 to 2048: this alleviates the issue of 502s. However, I cannot change sysctl.conf in a production environment because my Docker container runs in non-privileged mode (I have no control over this, since it is controlled by another team).

Would some of the folks here be able to help out?
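To confirm which limit actually applies inside the container, the value can be read from /proc without any privileges. The snippet below is a rough sketch assuming a Linux container; the helper names read_somaxconn and effective_backlog are made up for illustration. It relies on the Linux behaviour that a listen() backlog larger than net.core.somaxconn is silently truncated to that value.

```python
from pathlib import Path

# Standard Linux location of the sysctl value; readable without privileges.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path=SOMAXCONN_PATH):
    """Return net.core.somaxconn as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested, somaxconn):
    """Backlog the kernel will honour: the request capped at somaxconn."""
    if somaxconn is None:
        # Limit unknown (for example, not running on Linux).
        return requested
    return min(requested, somaxconn)


if __name__ == "__main__":
    limit = read_somaxconn()
    print("net.core.somaxconn:", limit)
    print("effective backlog for a request of 2048:",
          effective_backlog(2048, limit))
```

If this reports 128 in production, a backlog of 2048 requested by Gunicorn or Nginx would still be cut down to 128 queued connections.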