2021-08-17 Custom Fields

Main takeaway:

Doctrine ORM should not be used for custom fields. Leads and companies should be refactored into DTOs, split into separate read/write objects, or managed outside of Doctrine, so that double querying is no longer needed.

Marco: Notes

Marco proposes that notes be put in writing. Where to store them is still to be decided.

Alan: community project - make them open. Confluence pages for wrap-up.
Alan: introduce Marco to project lead to get Confluence access.

Marco: recordings OK.

Alan: what can we do better with Doctrine?

  • tips on scaling

  • tips on workarounds

Marco: need code reference
Alan: running into problems with custom fields on the "lead" object
Alan: segmentation is based on filters; we want to decrease complexity. We use dynamic properties, done via magic outside Doctrine
Alan: sorting by custom fields runs multiple queries: one to fetch the data, then a DQL operation after that.
Alan: when we query for content, we do twice the operations, and sorting is unoptimized
Alan: see Mautic\LeadBundle\Entity\LeadRepository - legacy decisions have piled up on top of this implementation

Marco: digging through LeadRepository, finding SQL to fetch metadata
Alan: explaining rough behavior of the LeadRepository
Marco: questioning use-case for sorting/filtering/etc.
Marco: perhaps split internal API and external API based on use-cases?
Marco: asking for demo on staging env
Alan: showing https://mautibox.com/ - old staging, but experience is clear
Alan: showing contact listing UI
Marco: trying to understand if the problem is with internal workflows, or with API to the outside

Alan: showing email template builder -> custom fields available here
Alan: when rendering a template with a contact, the custom fields must be there
Alan: when "firing" a campaign on a segment, contact data is validated against certain rules to decide whether they are applicable
Alan: segmentation is then about creating a query builder, and storing it

Marco: is segmentation internal-only, or external?
Alan: internal mostly

Marco: how is a Lead built in the DB?
Alan: mostly custom columns added to the table

Marco: discussed the following approaches to scaling Lead entity loading:

Generating entities

Discussed an approach that requires generating entities based on runtime metadata.

Based on metadata, we would generate the code that is part of the Lead entity:

  • simple to use: DQL "just works", and ORM can understand custom fields

  • should work well with extensions and static analysis

  • generated code can be isolated in a trait

  • requires re-generating proxies per-configuration

  • problematic for multi-tenancy and long-running background tasks

  • problematic for caching (need one app cache per tenant)
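
The idea above can be sketched outside PHP. The following is a minimal Python illustration, not Mautic or Doctrine code: it generates the source of a mixin (standing in for the trait) from per-tenant field metadata and loads it at runtime. All names (`generate_trait_source`, `LeadBase`, the field aliases) are hypothetical. It also shows why multi-tenancy hurts: each tenant's metadata yields a different generated class.

```python
import keyword


def generate_trait_source(class_name, fields):
    """Return Python source for a mixin with one attribute per custom field."""
    lines = [f"class {class_name}:"]
    for name, kind in sorted(fields.items()):
        # Field aliases end up in generated code, so reject anything unsafe.
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError(f"invalid field alias: {name!r}")
        lines.append(f"    {name} = None  # custom field of type {kind}")
    if len(lines) == 1:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


def load_class(source, class_name):
    """Execute generated source and return the class it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace[class_name]


class LeadBase:
    """Hand-written part of the entity (hypothetical)."""

    def __init__(self, email):
        self.email = email


# Hypothetical runtime metadata for two tenants.
tenant_a_fields = {"favorite_color": "text", "shoe_size": "number"}
tenant_b_fields = {"vat_number": "text"}

source_a = generate_trait_source("LeadCustomFieldsA", tenant_a_fields)
source_b = generate_trait_source("LeadCustomFieldsB", tenant_b_fields)
TraitA = load_class(source_a, "LeadCustomFieldsA")
TraitB = load_class(source_b, "LeadCustomFieldsB")

# One entity class per tenant: the generated part differs, the base does not.
LeadA = type("LeadA", (TraitA, LeadBase), {})
LeadB = type("LeadB", (TraitB, LeadBase), {})

lead = LeadA("a@example.com")
lead.favorite_color = "red"
```

Because the generated part is isolated, the hand-written class stays stable, but anything derived from the class (proxies, caches) has to be rebuilt whenever the metadata changes.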

Changing DB structure

Discussed an approach that requires storing fields in a JSON structure. This
means that custom fields are stored in a JSON blob along with other default fields.

  • massive BC break - won't work with existing plugins and customizations

  • MySQL JSON indexing is quite bad for now

  • requires changes in how DQL queries are assembled
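
To illustrate the last point, here is a minimal Python sketch of how the field expression in a filter query would differ between the two layouts. It only assembles SQL strings; the table alias `l`, the `custom_fields` column, and the function names are assumptions made for this example. `JSON_EXTRACT` and `JSON_UNQUOTE` are MySQL functions.

```python
import re

# Only plain lowercase aliases may be interpolated into SQL text.
_ALIAS = re.compile(r"^[a-z_][a-z0-9_]*$")


def field_expression(alias, layout):
    """Return the SQL expression that reads one custom field."""
    if not _ALIAS.match(alias):
        raise ValueError(f"invalid field alias: {alias!r}")
    if layout == "columns":
        # Current layout: one table column per custom field.
        return f"l.{alias}"
    if layout == "json":
        # JSON layout: every read goes through a JSON path lookup.
        return f"JSON_UNQUOTE(JSON_EXTRACT(l.custom_fields, '$.{alias}'))"
    raise ValueError(f"unknown layout: {layout!r}")


def build_filter_query(alias, layout):
    """Assemble a parameterized query that filters leads on one field."""
    expr = field_expression(alias, layout)
    return f"SELECT l.id FROM leads l WHERE {expr} = ? ORDER BY l.id"


by_column = build_filter_query("favorite_color", "columns")
by_json = build_filter_query("favorite_color", "json")
```

Every place that currently references a custom field as a column would need the second form, which is where the BC break for plugins comes from.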

Making the current repository more lightweight

Discussed an approach that uses only doctrine/dbal for querying, instead of
going through the ORM QueryBuilder (slow / hard to use at scale):

  • manual record hydration (done by repository)

  • can call UnitOfWork#registerManaged() manually, if needed

  • good to avoid double-querying when doing search operations

  • not invasive (only affects repository)

  • not a big win: complexity stays
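
A rough Python sketch of this approach follows, using `sqlite3` in place of doctrine/dbal. It is an illustration under stated assumptions, not the LeadRepository code: the schema, the `Lead` class, and the `identity_map` (a stand-in for registering objects with the UnitOfWork) are all hypothetical. The point is that sorting by a custom field happens in the single SQL query, and the repository hydrates the rows itself.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Lead:
    id: int
    email: str
    # Custom field values keyed by column name (hypothetical layout).
    custom: dict = field(default_factory=dict)


class LeadRepository:
    """Hypothetical repository: plain SQL plus manual hydration."""

    CORE_COLUMNS = ("id", "email")

    def __init__(self, conn, custom_columns):
        self.conn = conn
        self.custom_columns = tuple(custom_columns)
        self.identity_map = {}  # id -> Lead, stands in for UnitOfWork tracking

    def find_sorted_by(self, column, limit=10):
        # Identifiers cannot be bound as parameters, so allow known columns only.
        allowed = self.CORE_COLUMNS + self.custom_columns
        if column not in allowed:
            raise ValueError(f"unknown column: {column}")
        sql = f"SELECT {', '.join(allowed)} FROM leads ORDER BY {column} LIMIT ?"
        rows = self.conn.execute(sql, (limit,)).fetchall()
        return [self._hydrate(dict(zip(allowed, row))) for row in rows]

    def _hydrate(self, row):
        # Reuse an already-loaded object so each id maps to one instance.
        lead = self.identity_map.get(row["id"])
        if lead is None:
            lead = Lead(row["id"], row["email"])
            self.identity_map[lead.id] = lead
        lead.custom = {c: row[c] for c in self.custom_columns}
        return lead


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leads (id INTEGER PRIMARY KEY, email TEXT, favorite_color TEXT)"
)
conn.executemany(
    "INSERT INTO leads VALUES (?, ?, ?)",
    [(1, "a@example.com", "red"), (2, "b@example.com", "blue")],
)

repo = LeadRepository(conn, custom_columns=["favorite_color"])
leads = repo.find_sorted_by("favorite_color")  # one query: fetch and sort together
```

The change stays inside the repository, which matches the "not invasive" point, but the column whitelist and hydration logic show that the complexity is moved rather than removed.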