Data Science Asked on March 29, 2021
I have a Python Flask application that connects to an Azure Cloud SQL Database, and uses the Pandas read_sql method with SQLAlchemy to perform a select operation on a table and load it into a dataframe.
recordsdf = pd.read_sql(recordstable.select(), connection)
The recordstable has around 5000 records, and the function is taking around 10 seconds to execute (I have to pull all records every time). However, the exact same operation with the same data takes around 0.5 seconds when I’m selecting from a local SQL Server database.
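For reference, here is a minimal, self-contained sketch of the select-into-DataFrame pattern described above. It uses an in-memory SQLite engine as a stand-in so it runs anywhere; the Azure connection string shown in the comment is a hypothetical placeholder, not the asker's actual configuration:

```python
import pandas as pd
import sqlalchemy as sa

# Stand-in engine: in-memory SQLite. For Azure SQL the URL would instead
# look something like (hypothetical placeholders, not real credentials):
#   "mssql+pyodbc://user:password@server.database.windows.net/db?driver=ODBC+Driver+17+for+SQL+Server"
engine = sa.create_engine("sqlite://")

# A small table standing in for `recordstable`.
metadata = sa.MetaData()
recordstable = sa.Table(
    "records", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("value", sa.String),
)
metadata.create_all(engine)
with engine.begin() as conn:
    conn.execute(
        recordstable.insert(),
        [{"id": i, "value": f"row{i}"} for i in range(5)],
    )

# The select-into-DataFrame pattern from the question.
with engine.connect() as connection:
    recordsdf = pd.read_sql(recordstable.select(), connection)

print(len(recordsdf))  # number of rows fetched
```

With a remote database, most of the 10 seconds is usually spent on network round trips and row transfer rather than in pandas itself, which is why the same code against a local SQL Server is much faster.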
What can I do to reduce the time it takes to load data from Azure to a dataframe? Would moving the entire Python application to Azure serverless help? Thanks
One Answer
The data retrieval process has several phases: connection time, download time, and database processing time.
If you instead store the data as a CSV file in blob storage, the database processing time disappears (it is essentially zero). So every day you could export the table from the database to a CSV file, and the application then reads that file whenever it needs the data.
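The daily export-and-read pattern suggested above can be sketched as follows. This uses a local file path as a stand-in for the blob; with Azure Blob Storage you could instead pass a blob URL to `pd.read_csv` or use the `azure-storage-blob` SDK (both are assumptions about the deployment, not part of the original answer):

```python
import pandas as pd
from pathlib import Path

# Hypothetical local path standing in for a blob. In Azure you might read
# from a URL like "https://<account>.blob.core.windows.net/<container>/records.csv".
cache_path = Path("records_cache.csv")

def refresh_cache(df: pd.DataFrame, path: Path) -> None:
    """Daily job: dump the latest table snapshot to CSV."""
    df.to_csv(path, index=False)

def load_cached(path: Path) -> pd.DataFrame:
    """Fast path: read the snapshot instead of querying the database."""
    return pd.read_csv(path)

# Simulated snapshot of the table.
snapshot = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})
refresh_cache(snapshot, cache_path)
cached = load_cached(cache_path)
print(cached.equals(snapshot))
```

The trade-off is freshness: the application sees data as of the last export rather than live rows, which is acceptable only if a daily (or similar) refresh cadence fits the use case.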
Moving the application to Azure serverless would reduce the connection and download times (since the app would sit next to the database instead of going over your internet connection), but it would not reduce the database's own processing time.
Answered by keiv.fly on March 29, 2021