A simplistic explanation of the entity services anti-pattern


Lately, I’ve been participating in discussions where microservices patterns are the hottest topic.

People coming from a Java background tend to go for a software design composed of entity services. Spring even has a tutorial about creating entity services. Microsoft does too. That largely explains why, I presume.

In this post, I will explain my view on this anti-pattern.

Background

Once upon a time, there was a huge monolith. It started as a great initiative and helped the dev team release features easily and, frankly, quite fast, especially in the beginning. More and more features got shipped into it and everyone was happy about it. Until the day things started going south, and the moaning of developers could be heard from outer space: “If I have to change one more thing in this monolith, I’ll go sell carpets instead.”

See? That was the time to move to [micro]-services! And the transition happened, so now instead of a big fat monolith we have a bunch of services. Each entity (e.g. each model or app inside the monolith) was chopped off as a separate service, offering mainly a set of CRUD endpoints. And we have plenty of them.

Back to our story: we transitioned from a poorly structured monolith to a poorly structured set of mostly entity services. Nah.

The problem

An often-cited example to explain this situation is the shopping cart. Take an e-commerce application where the user wants to buy some products. In an entity-service architecture we’d have at least the following services:

  • products service, to keep information about the products, their prices, availability, etc.
  • cart service, which would be responsible for assembling an order and letting the customer proceed to checkout

The cart service would need to fetch information from the products service (e.g. to get the product prices), so we’ve introduced coupling between the two. But the entire point of microservices is to reduce coupling and increase autonomy between the various services, so the design contradicts its own goal.

We don’t need all of the information that can be found in the products service DB in order to compile a cart and eventually an order. We need only a few fields: the product name, the price, the stock and probably a couple of others, but definitely not all of them.

The alternatives

1. True DB independence

From a high-level view, a viable alternative is to build a truly independent service DB. For example, the cart service DB can keep only the product information that is relevant to it, along with all the details it keeps anyway for the cart objects themselves.

Here is a simplistic view of how this would look:

  1. cart service has a DB with two tables: carts and products
  2. products table keeps only the information necessary to be used for carts
  3. When a change occurs on a specific product, the products service emits an event about it and shoves it into some kind of message exchange, from which the cart service consumes it. This keeps the cart service’s product information up-to-date.
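The three steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: it uses an in-memory SQLite database, and the table columns, the event shape and the function names (`create_cart_db`, `handle_product_changed`, `cart_total_cents`) are all assumptions made up for the example. The message exchange itself is left out; the handler is simply whatever the consumer would call per message.

```python
import sqlite3


def create_cart_db() -> sqlite3.Connection:
    """Create the cart service's own DB (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    # Local copy of products: only the columns the cart service needs.
    db.execute(
        "CREATE TABLE products ("
        " id TEXT PRIMARY KEY, name TEXT NOT NULL,"
        " price_cents INTEGER NOT NULL, stock INTEGER NOT NULL)"
    )
    db.execute(
        "CREATE TABLE carts ("
        " cart_id TEXT NOT NULL, product_id TEXT NOT NULL,"
        " quantity INTEGER NOT NULL, PRIMARY KEY (cart_id, product_id))"
    )
    return db


def handle_product_changed(db: sqlite3.Connection, event: dict) -> None:
    """Consume a 'product changed' event published by the products service.

    Extra fields in the event (description, images, ...) are ignored.
    """
    wanted = {k: event[k] for k in ("id", "name", "price_cents", "stock")}
    with db:  # commit on success, roll back on error
        db.execute(
            "INSERT OR REPLACE INTO products (id, name, price_cents, stock)"
            " VALUES (:id, :name, :price_cents, :stock)",
            wanted,
        )


def cart_total_cents(db: sqlite3.Connection, cart_id: str) -> int:
    """Price a cart from local data only; no call to the products service."""
    row = db.execute(
        "SELECT COALESCE(SUM(p.price_cents * c.quantity), 0)"
        " FROM carts c JOIN products p ON p.id = c.product_id"
        " WHERE c.cart_id = ?",
        (cart_id,),
    ).fetchone()
    return row[0]
```

The point of the sketch is the last function: the cart can be priced without a synchronous request to another service, at the cost of the local copy being only eventually up-to-date.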

This will reduce the coupling between services and eventually lead to a message-based architecture.

What could be a challenge in this approach is the introduction of a new service into the system. If, for example, we had to introduce the cart service long after the products service went live, the new service would start with none of the existing product data, which poses an interesting problem to solve.

An idea here would be to establish a repeatable way to backfill data for operations like this. I’ll gather some thoughts on this one in a future blog post - maybe.

2. Saga pattern

In complex environments where DB independence is not a realistic option, it could be practical to follow the saga pattern. In this case, each distributed transaction would be executed inside a saga. A saga is a sequence of local transactions.

As an example, the saga would a) execute a local transaction inside serviceA and then b) trigger an event/message that would be consumed by serviceB. That would be repeated until the whole distributed transaction was complete.

However, it is not all roses on this road. The software model becomes significantly more complex. Also, because of the combo “local transaction / publish event” in every step of the process, there is a reliability risk involved (a step may commit locally and then fail to publish its event), which needs to be addressed in the architecture phase.
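To make the “local transaction, then publish an event” chain a bit more tangible, here is a toy sketch. Everything in it is an assumption for illustration: the `InMemoryBus` stands in for a real message broker, and the service classes and event names (`CartCheckedOut`, `StockReserved`, `StockRejected`) are invented. It also includes a compensating step on failure, which is a usual part of the saga pattern even though it is not discussed above.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class InMemoryBus:
    """Toy stand-in for a message broker (illustration only)."""

    def __init__(self) -> None:
        self.handlers: Dict[str, List[Callable[[dict], None]]] = {}
        self.queue: Deque[dict] = deque()
        self.log: List[str] = []  # event types, in publish order

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self.handlers.setdefault(event_type, []).append(handler)

    def publish(self, event: dict) -> None:
        self.log.append(event["type"])
        self.queue.append(event)

    def run(self) -> None:
        # Deliver events until no step publishes a follow-up event.
        while self.queue:
            event = self.queue.popleft()
            for handler in self.handlers.get(event["type"], []):
                handler(event)


class ProductsService:
    def __init__(self, bus: InMemoryBus, stock: Dict[str, int]) -> None:
        self.bus = bus
        self.stock = dict(stock)
        bus.subscribe("CartCheckedOut", self.on_cart_checked_out)

    def on_cart_checked_out(self, event: dict) -> None:
        # Local transaction: reserve stock only if every item is available.
        items = event["items"]
        if all(self.stock.get(pid, 0) >= qty for pid, qty in items.items()):
            for pid, qty in items.items():
                self.stock[pid] -= qty
            outcome = "StockReserved"
        else:
            outcome = "StockRejected"
        self.bus.publish({"type": outcome, "cart_id": event["cart_id"]})


class CartService:
    def __init__(self, bus: InMemoryBus) -> None:
        self.bus = bus
        self.status: Dict[str, str] = {}
        bus.subscribe("StockReserved", self.on_stock_reserved)
        bus.subscribe("StockRejected", self.on_stock_rejected)

    def checkout(self, cart_id: str, items: Dict[str, int]) -> None:
        # Step a) local transaction, step b) publish an event for the next service.
        self.status[cart_id] = "pending"
        self.bus.publish({"type": "CartCheckedOut", "cart_id": cart_id, "items": items})

    def on_stock_reserved(self, event: dict) -> None:
        self.status[event["cart_id"]] = "confirmed"

    def on_stock_rejected(self, event: dict) -> None:
        # Compensating step: undo the effect of the earlier local transaction.
        self.status[event["cart_id"]] = "cancelled"
```

Note that the sketch does nothing about the reliability risk: the state change and the `publish` call are two separate operations, so a crash between them would leave the saga stuck. A real design has to decide how to handle that.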

3. Event sourcing

Event sourcing is about persisting the state of all business entities in a shared “event store” as a sequence of state-changing events. For example, a customer or a cart is stored as a set of such events. A new event is generated whenever a state-changing action is made. Each entity is then reconstructed by accumulating its events to form the actual object. The shared store has an API for publishing events to it, as well as for subscribing a service to consume certain events.

In our previous example, a cart instance (e.g. cart ID1234) would be composed of a sequence of events:

  1. item1 added
  2. item2 added
  3. promo code1 applied
  4. sent to checkout

In this approach, it can become challenging to fetch events at a high frequency, especially as you scale, because every read has to replay the entity’s events. Thus it is common to periodically store snapshots of objects. When someone attempts to retrieve an object, the event store finds the latest snapshot and queries only for the events added after the snapshot’s timestamp.
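A rough sketch of the cart example with snapshots could look like the following. It is only an illustration under a few assumptions: the store is an in-memory dictionary, the event type names and the `Cart` fields are made up, and the snapshot is keyed by the number of events already applied rather than by a timestamp, simply because that is easier to show in a few lines.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Cart:
    items: List[str] = field(default_factory=list)
    promo_codes: List[str] = field(default_factory=list)
    checked_out: bool = False


def apply(cart: Cart, event: dict) -> Cart:
    """Return the cart state after one state-changing event."""
    kind = event["type"]
    if kind == "item_added":
        return replace(cart, items=cart.items + [event["item"]])
    if kind == "promo_applied":
        return replace(cart, promo_codes=cart.promo_codes + [event["code"]])
    if kind == "sent_to_checkout":
        return replace(cart, checked_out=True)
    raise ValueError(f"unknown event type: {kind}")


class EventStore:
    """Toy in-memory event store (hypothetical API)."""

    def __init__(self) -> None:
        self.events: Dict[str, List[dict]] = {}
        # entity id -> (number of events covered by the snapshot, state)
        self.snapshots: Dict[str, Tuple[int, Cart]] = {}

    def append(self, entity_id: str, event: dict) -> None:
        self.events.setdefault(entity_id, []).append(event)

    def snapshot(self, entity_id: str) -> None:
        covered = len(self.events.get(entity_id, []))
        self.snapshots[entity_id] = (covered, self.load(entity_id))

    def load(self, entity_id: str) -> Cart:
        # Start from the latest snapshot, then replay only the newer events.
        covered, cart = self.snapshots.get(entity_id, (0, Cart()))
        for event in self.events.get(entity_id, [])[covered:]:
            cart = apply(cart, event)
        return cart


store = EventStore()
store.append("ID1234", {"type": "item_added", "item": "item1"})
store.append("ID1234", {"type": "item_added", "item": "item2"})
store.snapshot("ID1234")  # later loads skip the first two events
store.append("ID1234", {"type": "promo_applied", "code": "code1"})
store.append("ID1234", {"type": "sent_to_checkout"})
cart = store.load("ID1234")
```

Here `apply` never mutates its input, so a stored snapshot cannot be changed by accident while newer events are replayed on top of it.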

To wrap this up, be careful before opting for this methodology. A steep learning curve lies ahead, especially if you have never used it before. It also complicates the software model a bit, but, well, trade-offs ¯_(ツ)_/¯.

Epilogue

If you ask for my personal favorite, I honestly do not have a definitive answer. Database independence is probably what I’d prefer in most use-cases, although there are drawbacks to consider and it heavily depends on the use-case. Have a different approach to recommend? I’d love to learn more about it!

