Database Administrators Asked by mikkergp on December 24, 2021
I’m using sqlalchemy and trying to do a query on an m2m table. I have a tag and want find all the events that match that tag:
SELECT * FROM events
WHERE EXISTS ( SELECT 1 FROM events_tags, tags
WHERE events.id = events_tags.event_id
AND tags.id = events_tags.tag_id
AND tags.id = 617)
LIMIT 50;
This is taking 2-3 seconds on my web server. On my laptop, it’s faster(though my laptop is more powerful and has fewer rows.) The web server has about 300,000 rows in this table.
Here is an explain:
+------+--------------------+-------------+--------+----------------+---------+---------+--------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+-------------+--------+----------------+---------+---------+--------------------------+--------+-------------+
| 1 | PRIMARY | events | ALL | NULL | NULL | NULL | NULL | 310172 | Using where |
| 2 | DEPENDENT SUBQUERY | tags | const | PRIMARY | PRIMARY | 8 | const | 1 | Using index |
| 2 | DEPENDENT SUBQUERY | events_tags | eq_ref | PRIMARY,tag_id | PRIMARY | 16 | timeline.events.id,const | 1 | Using index |
+------+--------------------+-------------+--------+----------------+---------+---------+--------------------------+--------+-------------+
I have a feeling this is simple, but my google-fu is failing me.
Create Table Syntax:
| events | CREATE TABLE `events` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`title` varchar(255) DEFAULT NULL,
`context` varchar(255) DEFAULT NULL,
`text` mediumtext,
`wikidata_id` bigint(20) DEFAULT NULL,
`start_day` smallint(6) DEFAULT NULL,
`start_month` smallint(6) DEFAULT NULL,
`start_year` bigint(20) DEFAULT NULL,
`end_day` smallint(6) DEFAULT NULL,
`end_month` smallint(6) DEFAULT NULL,
`end_year` bigint(20) DEFAULT NULL,
`is_number` tinyint(1) DEFAULT NULL,
`version` int(11) DEFAULT NULL,
`number` bigint(20) DEFAULT NULL,
`start_name` varchar(255) DEFAULT NULL,
`end_name` varchar(255) DEFAULT NULL,
`subject_title` varchar(255) DEFAULT NULL,
`object_title` varchar(255) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=327132 DEFAULT CHARSET=utf8 |
edited: my bad, it was missing the “from events” I guess I’m wondering where I would want an index, I would think on the “id” column, which should have an index.
just as one more variant, for common case when need extract not only events, but for example tag name:
SELECT
events.*,
tags.tag_name
FROM
events FORCE INDEX (PRIMARY)
INNER JOIN events_tags ON events.id = events_tags.event_id
INNER JOIN tags ON tags.id = events_tags.tag_id
WHERE tags.id = 617
LIMIT 50;
In some cases MySQL do not want use index, so we little help him (if we sure we request small part of whole table):
FROM
events FORCE INDEX (PRIMARY)
and plan will be:
1 SIMPLE tags const PRIMARY PRIMARY 4 const 1 100.00
1 SIMPLE events_tags ref event_id,tag_id tag_id 5 const 3 100.00 Using where
1 SIMPLE events eq_ref PRIMARY PRIMARY 4 test_db.events_tags.event_id 1 100.00
indexes - must be, all comparable columns must be same type, such as events.id
bigint(20) == events_tags.event_id
bigint(20)
Answered by a_vlad on December 24, 2021
Lots of simplification and speedup:
SELECT e.*
FROM ( SELECT DISTINCT event_id
FROM events_tags
WHERE tag_id = 617 ) AS et
JOIN events AS e ON e.id = et.event_id
ORDER BY ??? -- else get random 50??
LIMIT 50;
events_tags
smells like a many:many mapping table. It can possibly be improved by following these tips.
Answered by Rick James on December 24, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP