TransWikia.com

Scalable Inference Server for Object Detection

Data Science Asked on April 17, 2021

I have created a Django service (nginx + Gunicorn) for object detection models.

In my case, I have 50+ models, each with a ResNet-50 backbone.

Server machine specification:

  1. 16 CPUs
  2. 64 GB RAM

I have pre-loaded all the models in my service. I am running 20 inference requests in parallel.
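If each worker process loads its own copy of the models (the default with Gunicorn's fork-per-worker model, unless the app is preloaded), memory multiplies quickly. A rough back-of-the-envelope check, assuming ~100 MB of FP32 weights per ResNet-50-based detector (an assumed figure, not from the question), suggests the 64 GB RAM limit is within reach before activations and framework overhead are even counted:

```python
# Rough memory estimate for pre-loading every model in every worker.
# The ~100 MB per-model figure is an assumption (plain FP32 ResNet-50
# weights are ~98 MB; detection heads and buffers add more on top).
models = 50
workers = 8
mb_per_model = 100  # assumed average size of one loaded model

total_gb = models * workers * mb_per_model / 1024
print(f"~{total_gb:.0f} GB just for model weights")  # ~39 GB
```

On top of this baseline, 20 parallel inference requests allocate intermediate activations, so peak usage can exceed 64 GB and trigger the kernel OOM killer, which would look exactly like Gunicorn "intermittently restarting" a worker with no timeout involved.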

The issue I am facing is that Gunicorn intermittently restarts one of the 8 workers while inference is running, and it is not due to a timeout. The restarted worker then reloads all the models, and my in-flight inference requests fail.

Can you please suggest a solution, or another way to run inference as a service?
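One direction worth checking (a sketch, not a confirmed fix: it assumes the restarts are OOM kills and that the models are read-only after loading) is Gunicorn's `preload_app` setting. It loads the application, and therefore the models, once in the master process, so forked workers share the weight pages copy-on-write instead of each holding a full copy:

```python
# gunicorn.conf.py : a minimal sketch; the bind address, worker count,
# and timeout below are illustrative assumptions, not from the question.
bind = "0.0.0.0:8000"
workers = 8
preload_app = True   # load the Django app (and its models) once in the
                     # master; forked workers share weights copy-on-write
timeout = 120        # generous timeout for long-running inference
```

One caveat: frameworks that initialize thread pools or GPU contexts at import time are not always fork-safe, so this pattern should be verified for the specific framework in use. If copy-on-write sharing is not enough, moving the models out of the web workers into a dedicated model server (for example TorchServe or NVIDIA Triton) is another common way to run inference as a service.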
