
How to build a database of image data for machine learning?

Data Science Asked on August 3, 2021

I want to build a database of image data for machine learning. But how should this be done? I’m assuming people don’t just dump all of their image data into a folder? Do they use a relational database management system, like MySQL? Or do they use a NoSQL database, like MongoDB? Is there a textbook that explores this part of machine learning in particular? Is this what "data warehouse" refers to?

One Answer

There are several approaches. You need to store the inputs (the images) and, if your problem is a classification one, you also need to store the labels reliably. You might also have additional information about the images that could be useful for your problem:

  • you can store the images so that all of the information lives in the permanent store itself, for instance one folder per label that you want to learn, with all the images of a given class inside that folder. Keras has a function that builds a dataset from such a directory: tf.keras.preprocessing.image_dataset_from_directory (exposed as tf.keras.utils.image_dataset_from_directory in recent TensorFlow releases).

  • another way (which I prefer) is to store all of the metadata in a (SQL) database, for instance a table holding the label and the image URL for each image. This is more flexible because you can change a label or add a new category without having to move images around. It also lets you change the format and attach additional data to each image.
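
The metadata-table approach could be sketched roughly as follows. This is only an illustration under stated assumptions: SQLite (through Python's standard sqlite3 module) stands in for whichever SQL database you use, the images are assumed to start out in a class-per-folder layout, and the table name images, its columns, and the helper functions are hypothetical names chosen for the example rather than anything prescribed by the answer.

```python
import sqlite3
from pathlib import Path

# Assumption: these are the image file types present in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def create_schema(conn):
    """Create a hypothetical metadata table: one row per image."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS images (
            id    INTEGER PRIMARY KEY,
            path  TEXT NOT NULL UNIQUE,  -- location relative to the dataset root
            label TEXT NOT NULL          -- class label used for training
        )
        """
    )


def index_directory(conn, root):
    """Scan root/<label>/<file> and record each image with its folder name as label.

    Returns the number of image files found. Paths that are already in the
    table are left untouched, so earlier relabelling is not overwritten.
    """
    root = Path(root)
    rows = [
        (p.relative_to(root).as_posix(), p.parent.name)
        for p in sorted(root.glob("*/*"))
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR IGNORE INTO images (path, label) VALUES (?, ?)", rows
        )
    return len(rows)


def relabel(conn, path, new_label):
    """Change a label in the database only; the image file does not move."""
    with conn:
        conn.execute(
            "UPDATE images SET label = ? WHERE path = ?", (new_label, path)
        )


def load_pairs(conn):
    """Return (path, label) pairs, e.g. to feed a data-loading pipeline."""
    return conn.execute(
        "SELECT path, label FROM images ORDER BY path"
    ).fetchall()
```

With this layout, correcting a mislabelled image is a single call to relabel, and extra per-image information (source, capture date, train/test split) can be added later as new columns without touching the files on disk.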

Correct answer by RonsenbergVI on August 3, 2021
