Event-Driven Microservices in Data Science

Mar 25, 2022

Event-driven microservices communicate with each other through event messages. When a business event occurs, a producer publishes it as a message, and other services consume it through event listeners.
The main benefits of event-driven systems are asynchronous behavior and loose coupling. For example, instead of requesting data at the moment it is needed, a service receives it through events ahead of time, which can improve overall application performance. Loose coupling, in turn, is one of the key goals of a microservice environment.

Problem Statement — Information Extraction

So where do events come into play when we talk about microservices for data-driven projects? I will try to explain the use case in the simplest words possible.

Consider the following scenario:

Digital PDFs are generated in real time, and you need to run the following steps: PDF parsing -> text classification -> information extraction -> dumping the extracted information. The results may or may not be required in real time.

I could wrap the three functionalities in three different microservices, wired together in a monolithic way, and they would serve the purpose. It looks fairly easy.

Now consider this:

Out of the nG documents generated, processing failed for nF of them.
The failure need not happen at the final stage of information extraction; it can happen at any point after the generator service has generated the PDF.

For the time being, let's say the failure is caused by API downtime: my Text-Classification-API is down. What happens if I have Architecture-1 in place? During the downtime, the output generated by the PDF-preprocessor-API will pile up unprocessed in a large batch. We can certainly apply a quick fix after the Text-Classification-API downtime is resolved, but is there a better way around this?
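
To make the failure mode concrete, here is a minimal sketch of a directly chained setup. Everything in it is a hypothetical stand-in rather than the actual services: the functions `parse_pdf`, `classify_text`, and `extract_info`, the `CLASSIFIER_UP` flag that simulates downtime, and the document names are all assumptions made for illustration.

```python
# Hypothetical stand-ins for the three services; real ones would be remote APIs.
CLASSIFIER_UP = False  # simulate downtime of the text-classification service


def parse_pdf(pdf_name: str) -> str:
    return f"text of {pdf_name}"


def classify_text(text: str) -> str:
    if not CLASSIFIER_UP:
        raise ConnectionError("Text-Classification-API is down")
    return "invoice"


def extract_info(text: str, label: str) -> dict:
    return {"label": label, "length": len(text)}


def process(pdf_name: str) -> dict:
    # Each step calls the next one directly, so one outage breaks the chain.
    text = parse_pdf(pdf_name)
    label = classify_text(text)
    return extract_info(text, label)


processed, unprocessed = [], []
for pdf in ["a.pdf", "b.pdf", "c.pdf"]:
    try:
        processed.append(process(pdf))
    except ConnectionError:
        # The parsed text is discarded; someone has to remember to retry later.
        unprocessed.append(pdf)

print(f"processed={len(processed)} unprocessed={len(unprocessed)}")
```

In this sketch every document generated during the outage ends up in `unprocessed`, and the already parsed text is lost unless the caller adds its own retry bookkeeping.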

Event-Driven Architecture as a Solution

You have numerous microservices that must interact with one another asynchronously. You want them to be scalable, decoupled from one another, and independently maintainable. You may also want them to generate events that other services can consume. This is where event-driven microservices come into play.
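
One way to picture the asynchronous part is a queue sitting between two services. The following is a minimal in-process sketch, assuming Python's standard `queue.Queue` as a stand-in for a real message broker; the function names `preprocessor_publish` and `classifier_consume` and the event fields are hypothetical.

```python
import queue

# Stand-in for a message broker; a real system would use a durable one.
events = queue.Queue()


def preprocessor_publish(pdf_name: str) -> None:
    # The producer only puts an event on the queue and moves on.
    events.put({"event": "pdf_parsed", "pdf": pdf_name, "text": f"text of {pdf_name}"})


def classifier_consume() -> list:
    # The consumer drains whatever has accumulated since it last ran.
    handled = []
    while not events.empty():
        event = events.get()
        handled.append(event["pdf"])
        events.task_done()
    return handled


# The classifier is "down" here: nothing consumes, but nothing is lost either.
for pdf in ["a.pdf", "b.pdf", "c.pdf"]:
    preprocessor_publish(pdf)
backlog = events.qsize()

# The classifier comes back up and works through the backlog in order.
recovered = classifier_consume()
print(backlog, recovered)
```

The producer never waits for the consumer, so downtime on one side shows up as a backlog rather than as failed requests. An in-memory queue like this one does not survive a process restart, which is one reason a real deployment would likely rely on a dedicated broker.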

Breaking down the components named in the above architecture:

Event Bus: Think of this as a post box. All messages generated by the services are published to it, but each message can be consumed only by the services whose address is written on the message payload.
Publisher: Each microservice has a publisher controller, whose job is to emit a message with a predefined payload containing the information that other services' consumers need.
Consumer: As the name says, a consumer consumes the messages published on the event bus. Each consumer listens for specific events and waits for the corresponding publishers to emit them.
Event Messages: Once a service finishes its job, it publishes a message on the event channel/bus. The message has a specific, predefined payload structure and contains the information its consumer needs to carry out the rest of the task.
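
The four components above can be sketched in a few lines. This is a minimal in-process illustration, not a production bus: the `EventBus` class, the topic names, and the payload fields are all assumptions made for the example, and routing is done by topic name rather than by any particular broker's mechanism.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Toy event bus: routes each message to the consumers of its topic."""

    def __init__(self) -> None:
        self._consumers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._consumers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        # Only consumers registered for this topic receive the message.
        for handler in self._consumers[topic]:
            handler(payload)


bus = EventBus()
results: List[dict] = []


def on_pdf_parsed(payload: dict) -> None:
    # Consumer of "pdf.parsed" that also acts as publisher of "text.classified".
    bus.publish("text.classified", {**payload, "label": "invoice"})


def on_text_classified(payload: dict) -> None:
    # Consumer of "text.classified"; stores the extracted information.
    results.append({"pdf": payload["pdf"], "label": payload["label"]})


bus.subscribe("pdf.parsed", on_pdf_parsed)
bus.subscribe("text.classified", on_text_classified)

# The PDF pre-processor publishes an event with a predefined payload.
bus.publish("pdf.parsed", {"pdf": "a.pdf", "text": "text of a.pdf"})
print(results)
```

Dispatch here is synchronous to keep the sketch short; a real event bus would typically deliver messages asynchronously, so that a publisher does not block on its consumers.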

See the following example from ZeroMQ