TransWikia.com

Scalable Inference Server for Object Detection

Data Science Asked on April 17, 2021

I have created a Django service (nginx + Gunicorn) for object detection models.

In my case, I have 50+ models, each with a ResNet-50 backbone.

Server machine specification:

  1. 16 CPUs
  2. 64 GB RAM

I have pre-loaded all the models in my service. I am running 20 inference requests in parallel.
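If each worker process loads its own copy of the models (the default with Gunicorn's fork-per-worker model, unless the app is preloaded), memory multiplies quickly. A rough back-of-the-envelope check, assuming ~100 MB of FP32 weights per ResNet-50-based detector (an assumed figure, not from the question), suggests the 64 GB RAM limit is within reach before activations and framework overhead are even counted:

```python
# Rough memory estimate for pre-loading every model in every worker.
# The ~100 MB per-model figure is an assumption (plain FP32 ResNet-50
# weights are ~98 MB; detection heads and buffers add more on top).
models = 50
workers = 8
mb_per_model = 100  # assumed average size of one loaded model

total_gb = models * workers * mb_per_model / 1024
print(f"~{total_gb:.0f} GB just for model weights")  # ~39 GB
```

On top of this baseline, 20 parallel inference requests allocate intermediate activations, so peak usage can exceed 64 GB and trigger the kernel OOM killer, which would look exactly like Gunicorn "intermittently restarting" a worker with no timeout involved.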

The issue I am facing is that Gunicorn intermittently restarts one of the 8 workers while inference is running, and it is not due to a timeout. The restarted worker then reloads all the models, and my in-flight inference requests fail.

Can you please suggest a solution, or another way to run inference as a service?
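One direction worth checking (a sketch, not a confirmed fix: it assumes the restarts are OOM kills and that the models are read-only after loading) is Gunicorn's `preload_app` setting. It loads the application, and therefore the models, once in the master process, so forked workers share the weight pages copy-on-write instead of each holding a full copy:

```python
# gunicorn.conf.py : a minimal sketch; the bind address, worker count,
# and timeout below are illustrative assumptions, not from the question.
bind = "0.0.0.0:8000"
workers = 8
preload_app = True   # load the Django app (and its models) once in the
                     # master; forked workers share weights copy-on-write
timeout = 120        # generous timeout for long-running inference
```

One caveat: frameworks that initialize thread pools or GPU contexts at import time are not always fork-safe, so this pattern should be verified for the specific framework in use. If copy-on-write sharing is not enough, moving the models out of the web workers into a dedicated model server (for example TorchServe or NVIDIA Triton) is another common way to run inference as a service.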
