MySQL self-join query with LIMIT never terminates

Question

I'm running an experiment with the following query:
SELECT DISTINCT table_1.id1, table_2.id1
    FROM image AS table_1, image AS table_2
    WHERE table_1.id2 = table_2.id2
    LIMIT K;

When I run the query without a LIMIT, it finishes after 8 hours. With LIMIT 80000, however, it never terminates.
I'm running this on CloudLab with 100 GB of RAM. Below is an image of my.cnf, followed by the query plan. I'm not sure what the bottleneck for my query is. How should I solve this problem?
Query plan (text):
-> Limit: 80000 row(s)
    -> Table scan on <temporary>
        -> Temporary table with deduplication
            -> Limit table size: 80000 unique row(s)
                -> Inner hash join (table_2.id2 = table_1.id2)  (cost=48913990425076.15 rows=48913988916980)
                    -> Table scan on table_2  (cost=0.01 rows=22116507)
                    -> Hash
                        -> Table scan on table_1  (cost=2224194.70 rows=22116507)

nbk · Answer

The comma between the two tables is the old implicit join syntax: logically it is a cross join from which the WHERE clause then selects only the matching rows.
Write a proper JOIN with an ON clause instead, like
SELECT DISTINCT t1.id1, t2.id1
FROM image AS t1 INNER JOIN image AS t2
ON t1.id2 = t2.id2
LIMIT K;
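
To see what the rewrite changes, the plan can be inspected before actually running the query. A minimal sketch, assuming MySQL 8.0.16 or later (where EXPLAIN FORMAT=TREE is available; it is the format of the plan posted in the question) and substituting the question's 80000 for K:

```sql
-- Sketch: assumes MySQL 8.0.16+ and uses the LIMIT value from the question.
-- EXPLAIN only prints the plan; it does not execute the join.
EXPLAIN FORMAT=TREE
SELECT DISTINCT t1.id1, t2.id1
FROM image AS t1
INNER JOIN image AS t2
    ON t1.id2 = t2.id2
LIMIT 80000;
```

If the output still shows an inner hash join over two full table scans, the join syntax alone has not changed the plan.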

Also add an index on id2 in the image table.
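
A sketch of what that index could look like. The index names are hypothetical, and the second statement is an optional variant that is my assumption rather than part of the advice above; the table and column names come from the question:

```sql
-- Index on the join column (hypothetical index name).
CREATE INDEX idx_image_id2 ON image (id2);

-- Optional variant (assumption): also include id1, the only other column
-- the query reads, so the join may be served from the index alone.
CREATE INDEX idx_image_id2_id1 ON image (id2, id1);
```

The plan in the question estimates about 22 million rows in image, so building either index may itself take a while.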
