
CNN memory consumption

Data Science Asked on August 17, 2021

I’d like to be able to estimate whether a proposed model is small enough to be trained on a GPU with a given amount of memory.

If I have a simple CNN architecture like this:

  • Input: 50x50x3
  • C1: 32 3×3 kernels, with padding (I guess in reality they're actually 3x3x3 given the input depth?)
  • P1: 2×2 with stride 2
  • C2: 64 3×3 kernels, with padding
  • P2: 2×2 with stride 2
  • FC: 500 neurons
  • Output: softmax 10 classes
  • Mini batch size of 64

Assuming 32-bit floating point values, how do you calculate the memory cost of each layer of the network during training? And then the total memory required to train such a model?

3 Answers

Maybe this link will give you an explanation of how to compute the memory usage of an arbitrary neural network. Further down the linked page, the memory usage of the VGGNet model is worked out as an example. Click here and scroll down a bit.

Answered by Alexandru Burlacu on August 17, 2021

I will assume that by C1, C2, etc., you mean convolutional layers, by P1, P2 you mean pooling layers, and by FC you mean fully connected layers.

We can calculate the memory required for a forward pass like this:

One image

If you're working with float32 values, then following the link provided above by @Alexandru Burlacu you have:

Input: 50x50x3 = 7,500 = 7.5K

C1: 50x50x32 = 80,000 = 80K

P1: 25x25x32 = 20,000 = 20K

C2: 25x25x64 = 40,000 = 40K

P2: 12x12x64 = 9,216 = 9.2K <- This one is awkward (my 12 is a hand-wavy guess): 2x2 pooling with stride 2 doesn't divide a 25x25 map evenly, so instead of working with 50, 25, '12.5', it would make more sense to choose sizes that are multiples of 32, which I've also heard is more efficient from a memory standpoint. Feel free to correct me if I'm wrong.

FC: 1x500 = 500 = 0.5K

Output: 1 x 10 = 10 = 0.01K (next to nothing)

Total memory: 7.5K + 80K + 20K + 40K + 9.2K + 0.5K + 0.01K ≈ 157.2K values * 4 bytes ≈ 628.8 KB

That's for one image.
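
If it helps, here is a minimal Python sketch of that arithmetic (the dict of shapes and the helper names are mine, not from the linked page), printing each layer's activation count and size for one image:

    # Sketch (my own helper, assuming float32 activations): per-layer
    # activation sizes for a single 50x50x3 image through this network.
    layer_shapes = {
        "input":  (50, 50, 3),
        "C1":     (50, 50, 32),   # 3x3 conv, 'same' padding
        "P1":     (25, 25, 32),   # 2x2 pool, stride 2
        "C2":     (25, 25, 64),
        "P2":     (12, 12, 64),   # 25 floored to 12 by the pooling
        "FC":     (500,),
        "output": (10,),
    }

    BYTES_PER_VALUE = 4  # float32
    total_values = 0
    for name, shape in layer_shapes.items():
        n = 1
        for dim in shape:
            n *= dim
        total_values += n
        print(f"{name:>6}: {n:>7,} values = {n * BYTES_PER_VALUE / 1e3:.1f} KB")

    print(f" total: {total_values:,} values = {total_values * BYTES_PER_VALUE / 1e3:.1f} KB")
    # total: 157,226 values = 628.9 KB (matching the ~628.8 KB above)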

Minibatch

If you're working with a minibatch size of 64, then you're reading 64 of these into memory at once and performing the operations all together, scaling everything up like this:

Input: 64x50x50x3 = 480,000 = 480K = 0.48M

C1: 64x50x50x32 = 5,120,000 = 5.12M

P1: 64x25x25x32 = 1,280,000 = 1.28M

C2: 64x25x25x64 = 2,560,000 = 2.56M

P2: 64x12x12x64 = 589,824 = 590K = 0.59M

FC: 64x500 = 32,000 = 32K = 0.032M

Output: 1x10x64 = 640 = 0.64K = 0.00064M (we don't care, this is tiny)

Total memory: ~10M values x 4 bytes ≈ 40 MB (approximate, since the website also only gives an approximate figure)
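
A short continuation of the sketch above (assumed variable names), scaling the single-image count by the batch size:

    # Sketch: scale the single-image activation count by the minibatch size.
    single_image_values = 157_226   # sum of the per-layer activations above
    batch_size = 64
    BYTES_PER_VALUE = 4             # float32

    batch_bytes = single_image_values * batch_size * BYTES_PER_VALUE
    print(f"forward activations for a batch of {batch_size}: {batch_bytes / 1e6:.1f} MB")
    # -> about 40.2 MB, in line with the ~40 MB estimate above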

EDIT: I misread the website, sorry.

According to the website, a backward pass requires about triple this, because of the need to store:

  • the activations and associated gradients for each neuron - these are of equal size;

  • the gradients of the weights (parameters), which are the same size as the parameters;

  • the value of the momentum, if you're using it;

  • some kind of miscellaneous memory (I don't understand this part)
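
Putting rough numbers on that list (my own assumed figures: the ~40 MB of batch activations from above, and roughly 4.6M weights as worked out in the answer below), a back-of-the-envelope training footprint looks like this:

    # Hedged back-of-the-envelope sum of the buffers listed above.
    activations_mb      = 40.2            # forward activations for the batch
    activation_grads_mb = activations_mb  # gradients w.r.t. activations, same size
    params_mb           = 4_632_902 * 4 / 1e6   # ~18.5 MB of float32 weights
    param_grads_mb      = params_mb       # gradients of the weights
    momentum_mb         = params_mb       # one extra buffer if momentum is used

    total_mb = (activations_mb + activation_grads_mb
                + params_mb + param_grads_mb + momentum_mb)
    print(f"~{total_mb:.0f} MB, before any miscellaneous/workspace overhead")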

Answered by StatsSorceress on August 17, 2021

While training a ConvNet, the total memory required includes the following:

  • Memory for parameters
  • Memory for the output of intermediate layers
  • Memory for the gradient of each parameter
  • Extra memory needed if you are using an optimizer like momentum, RMSProp, Adam, etc.
  • Miscellaneous memory for implementation

A good rough approximation is: number of parameters x 3 x 4 bytes (if you are using 32-bit floats).

Well, now this is how you calculate the number of parameters:

  • Conv layer: (kernel width x kernel height x number of input channels) x depth + depth, where depth is the number of filters (add the final depth term only if there is a bias)
  • FC layer: number of inputs x number of outputs + number of outputs (the last term accounts for the biases)
  • Max pool layer: no parameters

Now just sum the number of all the parameters and use the formula I mentioned.
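
A minimal sketch of those rules applied to the architecture in the question (the helper functions are mine; biases are included):

    # Parameter count per the rules above, for the network in the question.
    def conv_params(kernel_w, kernel_h, in_channels, n_filters, bias=True):
        return kernel_w * kernel_h * in_channels * n_filters + (n_filters if bias else 0)

    def fc_params(n_in, n_out, bias=True):
        return n_in * n_out + (n_out if bias else 0)

    total = (
        conv_params(3, 3, 3, 32)        # C1:      896
        + conv_params(3, 3, 32, 64)     # C2:   18,496
        + fc_params(12 * 12 * 64, 500)  # FC: 4,608,500 (P2 flattened to 9,216)
        + fc_params(500, 10)            # output: 5,010
    )                                   # pooling layers contribute nothing

    print(f"{total:,} parameters")                                      # 4,632,902
    print(f"~{total * 3 * 4 / 1e6:.1f} MB by the 'x3 x 4 bytes' rule")  # ~55.6 MB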

Answered by Vijendra1125 on August 17, 2021
