TransWikia.com

We are migrating our WordPress site to a static site, which will create over 400,000 folders inside one folder. Is there any limit to the number of subfolders?

Server Fault Asked by Muhammad Ebrahym on November 4, 2021

Our WordPress web site is several years old and has many posts indexed and ranking well on Google. With any serious traffic my WordPress server tanks, and this happens even after several rounds of WordPress optimization. We have had enough of WordPress issues and have decided to migrate.

We are migrating from WordPress to a static site for better performance, so that pages are not rendered for every request and static HTML, CSS, JS and image files can be served directly by the nginx web server instead of hitting another server at the back end.

The issue is that we have over 400,000 posts. Every post will have a static page, and hence its own folder in which we will store the relevant files, such as the HTML and image files for that post. So our main web folder will have over 400,000 subfolders. Will that be an issue on Linux? Or will that be an issue for my web server performance? Is there anything on the hosting side that I should care about in this situation?

Has anyone here tried using ext4 with nginx with a large number of subfolders in one folder? Does it really affect performance? There are conflicting reports about how well ext4 handles a large number of folders. We do not want to add complexity to the migration unless it is really necessary. The migration is a big exercise already for us 🙂 and we would like to keep it as simple as possible unless there is a real risk of performance degradation. Has anyone used the nginx web server with a large number of subfolders or files in a single folder?

Thank you in advance.

3 Answers

Here is a way to reduce the number of directories per folder, as suggested in Artem S. Tashkinov's answer, while configuring nginx to keep the original URL structure.

Create a directory structure for each URL with the first two characters of each URL being a directory under the document root. Place the static content beginning with those two characters under that directory.

The nginx location that makes this possible is pretty simple:

    location ~ ^/(..) {
            # $1 holds the first two characters after the leading slash
            root /srv/www/example.com/$1;
    }

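This takes the first two characters of the URL after the initial / and appends them to the document root. nginx then appends the full request URI to that root as usual, so everything below the two-character directory keeps its original name.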

Note that this requires everything to be moved into two character subdirectories. That includes the top level /index.html, which must be placed at $root/in/index.html. As another example, a top level URL path /images must be moved to $root/im/images. The original document root will contain nothing but these two-character directory names.

Your document URLs will remain unchanged. For example, a blog post accessible at /15-things-to-do-when-visiting-dubai will be on your filesystem at $root/15/15-things-to-do-when-visiting-dubai/index.html, but still accessible at the original URL. (Note that if your original URLs did not have a trailing slash, one will be added, and 301 redirects are generated for SEO preservation.)

In the end the document root directory will have only a few thousand directories at most, and each of them will probably have at most a few hundred directories or files. This is very easily handled by any Linux filesystem.
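
The relocation described above could be scripted roughly as follows. This is only a sketch: the function names `shard_dir` and `shard_document_root` and the idea of moving from an old root into a separate, new root are assumptions for illustration, not part of the original answer. It also assumes plain ASCII names and skips any top-level entry shorter than two characters, which would need handling by hand.

```python
import shutil
from pathlib import Path


def shard_dir(name: str) -> str:
    """Return the two-character directory name for a top-level entry."""
    return name[:2]


def shard_document_root(old_root: Path, new_root: Path) -> int:
    """Move every top-level entry of old_root into new_root/<first two chars>/.

    Assumes old_root and new_root are different directories.
    Returns the number of entries moved.
    """
    moved = 0
    # sorted() materialises the listing before anything is moved.
    for entry in sorted(old_root.iterdir()):
        if len(entry.name) < 2:
            # Too short for the two-character scheme; left in place.
            continue
        target_dir = new_root / shard_dir(entry.name)
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(entry), str(target_dir / entry.name))
        moved += 1
    return moved
```

For example, `shard_document_root(Path("/srv/www/old"), Path("/srv/www/example.com"))` would place `index.html` under `in/` and a post folder starting with `15` under `15/`. The paths here are placeholders; try it on a copy first.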

Answered by Michael Hampton on November 4, 2021

Since the site is static how about hosting it on AWS S3 and making it AWS's problem?

S3 can host websites directly, and each bucket can store a virtually unlimited number of files (which it calls objects). You used to have to be very careful about file naming, but that has largely been solved and isn't a big issue now. Read the performance guidelines though, and test well.

S3 isn't always cheap for storage or bandwidth, so you should use the AWS calculator to work out your costs (the new calculator doesn't seem to do S3 pricing). You can mitigate traffic costs somewhat by adding caching headers to every object when you upload it to S3, then putting your S3 bucket behind CloudFlare CDN (see this question). CloudFlare have free and paid plans, but with this much traffic and content I expect you'd want a paid plan.
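
Setting caching headers at upload time might look something like the sketch below. It assumes the third-party `boto3` SDK is installed and credentials are configured. The helper names `upload_args` and `upload_site`, the bucket name, and the one-day `max_age` are illustrative assumptions, not recommendations from the answer.

```python
import mimetypes
from pathlib import Path


def upload_args(path: Path, max_age: int = 86400) -> dict:
    """Build per-object headers: a Cache-Control value and a guessed content type."""
    content_type, _ = mimetypes.guess_type(path.name)
    return {
        "CacheControl": f"public, max-age={max_age}",  # max_age is an assumed value
        "ContentType": content_type or "application/octet-stream",
    }


def upload_site(site_root: Path, bucket: str) -> None:
    """Upload every file under site_root, using its relative path as the object key."""
    import boto3  # third-party AWS SDK, imported lazily so the helper above works without it

    s3 = boto3.client("s3")
    for path in site_root.rglob("*"):
        if path.is_file():
            key = path.relative_to(site_root).as_posix()
            s3.upload_file(str(path), bucket, key, ExtraArgs=upload_args(path))
```

A call such as `upload_site(Path("public"), "my-static-site")` would walk the generated site and attach the headers to each object; both names are placeholders.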

Answered by Tim on November 4, 2021

Ideally, you should avoid having more than a few thousand files per directory on most file systems, because otherwise traversing the directory will take too much time and resources.

You could create a directory structure such as:

  • 00
  • 01
  • 02

...

  • FE
  • FF

That will give you 256 directories, and you can nest them as deeply as you need.
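
The answer does not say how a post is assigned to one of the 256 directories. One possible approach, shown here purely as an assumption, is to take the leading hex digits of a hash of the post's slug. The function name `bucket_path` and the choice of SHA-256 are illustrative.

```python
import hashlib
from pathlib import PurePosixPath


def bucket_path(slug: str, levels: int = 2) -> PurePosixPath:
    """Return a nested 00..FF bucket path for a slug, e.g. <XX>/<YY>/<slug>."""
    digest = hashlib.sha256(slug.encode("utf-8")).hexdigest().upper()
    # Each level consumes two hex characters, i.e. one of 256 directories.
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return PurePosixPath(*parts, slug)
```

With one level, 400,000 posts average about 1,560 per directory. With two levels there are 65,536 leaf directories, so each holds about six posts on average.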

Or you could try organizing posts by date, as /YYYY/MM/DD/$UID-post-title.
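
As a small sketch of the date-based layout, the path could be built as follows. The function name `dated_path` and its parameters are hypothetical, and the post's ID and title slug are assumed to be available from the export.

```python
from datetime import date


def dated_path(published: date, uid: int, title_slug: str) -> str:
    """Build /YYYY/MM/DD/<uid>-<title-slug> from a post's publication date."""
    return f"/{published:%Y/%m/%d}/{uid}-{title_slug}"
```

For a post published on 4 November 2021 with ID 123, `dated_path(date(2021, 11, 4), 123, "post-title")` gives `/2021/11/04/123-post-title`.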

Answered by Artem S. Tashkinov on November 4, 2021
