Sikandar: when we analyze queries from Zabbix/PMM, we see a lot of queries waiting for lock,
10%~30%. System is very write-intensive, and referential integrity is enforced.
Reads and writes are probably contending with each other.
Sikandar: we're not using FKs except for cascades.
Alan: cascade operations are in place / used for deletes
Alan: deleting contacts => lots of cascade operations
Marco: is there a reproducer?
Alan: for deletes, yes, for FKs not really
Sikandar: no direct relationship between FK cascades and locks
Marco: was replication attempted?
Sikandar: replication made it worse
Marco: replication shouldn't affect it, as it operates on binlogs?
Marco: is high consistency really needed for reads? For example segments?
Sikandar: depends on use-case. For example seeing new contact.
Marco: is high consistency really a business scenario though?
Alan: replication was abnormally slow - took minutes to catch up, and would be hours behind
Alan: we shifted focus elsewhere
Alan: we don't want to create duplication/loops for information that could be processed twice, such as
broadcast email, like eligible contacts after a broadcast went out.
Alan, Marco: discussing Doctrine's MasterSlaveConnection
Sikandar: largest DB is probably 350Gb of leads
Alan: that's mostly metadata - related data around segmentation leads to large sizes
Alan: information about what was sent and such is in related tables / denormalized
Marco: 350Gb is "tiny" - why is it having so much write noise that we get locks?
Alan: creating a contact -> added to segments -> written to table that maps to segment -> log entry (immutable)
Alan: segment mapping table is kept up to date
Alan: log is an append-only data structure
Alan: campaigns also produce records for every step in a campaign, deciding which path a lead has taken
Alan: we generate records when emails go out, and when links are clicked, when users are redirected, etc.
Alan: landing page also leads to writes
Marco: are writes caused by frontend (HTTP) of background processes?
Alan: more background processes - they're real-time
Marco: is there a message queue for this?
Alan: we do, but community uses cronjobs
Marco: perhaps use messenger component to abstract this, and send to RMQ?
Sikandar: that shifts the problem though
Marco: perhaps a good idea to avoid flushing every queue operation to disk?
Marco/Sikandar: discussion about NoSQL use-case
Marco: probably relevant if we have frequent ALTER TABLE, but probably not the issue
Marco: are the logs in FK and transactional coupling with the active lead data?
Alan: no direct biz logic, primary use-case is warehousing
Alan: data only used for visualization/API
Marco: what if we remove strong integrity on logs? Is there PII in there? Could it go to a column storage/time-series DB?
Alan: some info in logs could contain PII
Marco: perhaps we could UDP-out the logs?