SAP -> Data Lake Ingestion Challenge

In the digital age, timely access to accurate, accessible and actionable data is a primary competitive advantage for firms.

Strategically, most firms want to push master and transactional data assets into Data Lakes for analysis and interrogation by AI and Machine Learning robots, in order to drive further process automation.

The typical integration pattern for ingesting or syndicating data from SAP systems into Data Lakes relies on superfluous middleware technologies to move the data.

An example integration pattern is:

SAP-ECC -> SAP-BI/BW -> ETL (BODS/SDS) -> Data Lake (Azure/Google/Oracle)

The BW and ETL layers add little or no transformational value to the dataset and simply act as a forwarding/relay mechanism.

There are many problems with this approach, some of which include:


  • Inflated administration costs
  • Multiple skill sets required to support and maintain
  • Excessive infrastructure required to support data movement
  • Lots of development required to add just one additional table to the Lake


  • The incremental addition of just one more table requires design, build and testing in three separate technologies
  • Excessive moving parts - more things to go wrong and impact data movement process; multiple points of failure


  • Data has aged by the time it reaches the Lake and could be more than 24 hours old
  • Inability to perform near real-time analytics
  • AI/ML processing against out-of-date data

Simplified Integration Pattern

The Kagool SAP -> Data Lake Universal Adapter enables a direct connection between SAP System(s) and Data Lake technologies.

SAP-ECC -> Data Lake

The Kagool Adapter is fast to deploy and involves installation on both ends of the connection (SAP side and Data Lake side). Once installed, the movement of data is fully parameter controlled and managed by your administrators.

New tables can be added to the Lake in minutes, with no development required.
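To make the idea of parameter-controlled data movement concrete, here is a minimal sketch in which adding a table means adding one configuration entry rather than writing new code. This is purely illustrative and is not the Kagool Adapter's actual configuration format: `TableConfig`, `TableRegistry` and the mode names are hypothetical. MARA and VBAK are standard SAP tables, used here only as examples.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical mode names, chosen for this sketch only.
VALID_MODES = {"kill_and_fill", "delta", "stream"}


@dataclass(frozen=True)
class TableConfig:
    """One configuration entry describing how a source table is transferred."""
    source_table: str
    key_fields: Tuple[str, ...]
    mode: str = "kill_and_fill"

    def __post_init__(self) -> None:
        # Validate the entry so that a bad parameter fails at registration time.
        if self.mode not in VALID_MODES:
            raise ValueError(f"unknown transfer mode: {self.mode}")
        if not self.key_fields:
            raise ValueError("at least one key field is required")


class TableRegistry:
    """Holds the set of tables an administrator has chosen to replicate."""

    def __init__(self) -> None:
        self._tables: Dict[str, TableConfig] = {}

    def register(self, config: TableConfig) -> None:
        self._tables[config.source_table] = config

    def get(self, source_table: str) -> TableConfig:
        return self._tables[source_table]

    def tables(self) -> List[str]:
        return sorted(self._tables)


registry = TableRegistry()
# Adding a table is a single parameter entry, with no new pipeline code.
registry.register(TableConfig("MARA", ("MANDT", "MATNR"), mode="delta"))
registry.register(TableConfig("VBAK", ("MANDT", "VBELN")))
```

In a design like this, the cost of one more table is a single `register` call, which is the contrast with the three-technology build described earlier.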

The Adapter is fully generic and can be applied to any SAP System and any Data Lake technology (Google, Azure, Oracle, etc.).

The process allows data to be transferred in the following ways:

  • Batch Kill & Fill (mass overwrite)
  • Batch Change Data Capture (delta processing)
  • Real-time streaming (controllable real-time data syndication)
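The difference between these modes can be sketched with a small in-memory example. This is a simplified illustration under stated assumptions, not the Adapter's implementation: the target "lake" is modelled as a Python dictionary keyed by the table key, the function names `kill_and_fill` and `apply_delta` are hypothetical, and the `DESC` field is invented for the example.

```python
from typing import Dict, Iterable, Sequence, Tuple

Row = Dict[str, object]
Key = Tuple[object, ...]


def row_key(row: Row, key_fields: Sequence[str]) -> Key:
    """Build the key tuple that identifies a row in the target."""
    return tuple(row[f] for f in key_fields)


def kill_and_fill(target: Dict[Key, Row], source_rows: Iterable[Row],
                  key_fields: Sequence[str]) -> None:
    """Mass overwrite: discard everything in the target, then reload it."""
    target.clear()
    for row in source_rows:
        target[row_key(row, key_fields)] = dict(row)


def apply_delta(target: Dict[Key, Row],
                changes: Iterable[Tuple[str, Row]],
                key_fields: Sequence[str]) -> None:
    """Delta processing: apply only the captured changes, in order."""
    for op, row in changes:
        key = row_key(row, key_fields)
        if op == "delete":
            target.pop(key, None)
        elif op in ("insert", "update"):
            target[key] = dict(row)
        else:
            raise ValueError(f"unknown change operation: {op}")


lake: Dict[Key, Row] = {}

# Initial load: the full table is copied across.
kill_and_fill(lake, [{"MATNR": "A1", "DESC": "Bolt"},
                     {"MATNR": "A2", "DESC": "Nut"}], ("MATNR",))

# Later run: only the rows that changed are sent.
apply_delta(lake, [("update", {"MATNR": "A1", "DESC": "Bolt M8"}),
                   ("delete", {"MATNR": "A2"}),
                   ("insert", {"MATNR": "A3", "DESC": "Washer"})], ("MATNR",))
```

Under the same assumptions, real-time streaming can be thought of as calling `apply_delta` with each change record as it arrives, rather than collecting changes into a batch.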

The data transfer process is highly sophisticated, enabling optimised movement of very high data volumes quickly and securely. This means your Data Lake can hold a near real-time dataset, which can be critical for management decision making and for automated AI and Machine Learning (ML) processing.

Overnight batch processing in the UK timezone is no longer acceptable for global organisations, especially when transactional processing is now triggered from Data Lakes using AI or ML. Can your company afford this kind of delay?
