Monday 11 September 2023

Delta loading in SAP Datasphere based on the ABAP CDS view

In applications where there is a high volume of data creation, changes, and deletion happens, such as sales orders, it is crucial for the application to have a mechanism that provides delta records and for the extractor that needs to provide delta loading. This means that the application should be able to identify and extract only the new, modified or deleted data, ensuring efficient and accurate data loading.

In some of the cases nightly full uploads are not what we exactly want, as time windows for data extractions are limited. Solution for that is coming with ODP framework.

The ODP framework for CDS extraction provides delta capabilities. We have actually two options for delta handling that we can chose from:

◉ Change Data Capture (CDC) Delta
◉ Generic Timestamp / Date based Delta

Let’s check the two delta mechanisms in detail below.

Change Data Capture (CDC) Delta


The easiest way to implement delta loading is to use CDC-enabled CDS views by importing them as Remote tables in Datasphere and enabling Real-time access to them.

Let us see how it is done:

@AbapCatalog.sqlViewName: 'ZSQL_SALES_DELTA'
@AbapCatalog.compiler.compareFilter: true
@AbapCatalog.preserveKey: true
@AccessControl.authorizationCheck: #NOT_REQUIRED
@EndUserText.label: 'CDS View - Delta loading on Sales Document'

@Analytics.dataExtraction.enabled: true
@Analytics.dataCategory: #FACT
@Analytics.dataExtraction.delta.changeDataCapture.automatic: true
define view ZCDS_SALES_DELTA as select from vbap 
inner join vbak on vbap.mandt = vbak.mandt 
               and vbap.vbeln = vbak.vbeln
{
    
    key vbak.vbeln as SalesDocument,
    key vbap.posnr as SalesDocumentItem,
        vbap.matnr as MaterialNumber,    
        vbap.kwmeng as CumulativeOrderQuantity,
        vbap.vrkme as SalesUnit,
        vbap.netwr as NetValue,
        vbap.waerk as SalesDocumentCurrency,
        vbap.arktx as TextforSalesOrderItem
}

This is our CDS view for delta loading on Sales Documents for the purpose of an example.

If we want to declare a Dimension, Fact, Aggregation Level or Cube we must to include the classification of the CDS view in the header of our CDS view with the following annotation:

@Analytics.dataCategory: #VALUE

Replace #VALUE by one of the categories commented before:

#CUBE
#AGGREGATIONLEVEL
#DIMENSION
#FACT

A CDS view can be enabled for data extraction by just adding the following annotation:

@Analytics.dataExtraction.enabled: true

After adding this annotation, the CDS view is available for extraction by the ODP framework and visible for its data consumers.

The Change Data Capture (CDC) recording mechanism uses database triggers to record any changes to the tables that belong to an ABAP CDS view. For example CDS views such as CDS projection views, this can be done automatically by using the following annotation:

Analytics.dataExtraction.delta.changeDataCapture.automatic

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Adding the CDS view to the Datasphere as a Remote table

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Choosing the SQL view name of CDS view

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Import and Deploy as a Remote table of CDS view

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Deployment has been done as a Remote table in Datasphere

To be able to load initial data to the remote table we have to run first Snapshot for it.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Running Snapshot

Let’s take a closer look at some of the data that we initially loaded first, because after using the Real-Time Access feature, we will see that the data that changes in the source system will also change in the Datasphere without reloading or scheduling the data.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
The data we are looking at as an example

We will see that the description in the “Short text for sales order item” field of the Sales Documents changes in the source system and is updated in Datasphere using the Real-Time Access feature.

In order to load the changes (delta records) created in the source system into Datasphere, we need to enable the data replication feature for the remote table as “Real-Time Access“.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Enabling the Real-Time Access feature

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Enabled the Real-Time Access feature

When we look at the details from the Data Integration Monitor menu, it should be as follows:

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Data Integration Monitor – general

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Data Integration Monitor – details

  • Switching replication type for remote table ‘ZSQL_SALES_DELTA’ from batch replication to real-time replication.
  • Subscribed to source system data changes.
  • Successful initial data transfer for real-time replication of remote table ‘ZSQL_SALES_DELTA’. Now real-time replication will automatically add delta data to remote table ‘ZSQL_SALES_DELTA’.

Now let’s change the “Short text for sales order item” field description of the Sales Documents in the source system and see how it is reflected in the Datasphere.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
In the source system we change the data we specified as an example above

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
All data updated in the source system are as above

The refresh frequency cannot be adjusted but my observation is it takes place within 1 hour to get updated records in Datasphere.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
All data updates that we have as an example

Generic Timestamp / Date based Delta


In order to be able to load data on a date-based basis, it is necessary to have date/time information in the relevant application tables that is updated based on changes in the application data.

“upd_tmstmp” is the field in the VBAK table which will trigger delta records in our example.

To define which field will trigger delta, we need to use annotation

@Analytics.dataExtraction.delta.byElement.name

This element can be date (ABAP type DATS) or a time stamp (UTC).

Lets update CDS view code with delta element annotation.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
Updated CDS view

Please be aware of that: When converting a time stamp to a date, we can use the tstmp_to_dats function as it up or we can use the cast keyword as below.

cast(substring(cast(upd_tmstmp as abap.char(32)),1,8) as abap.dats) as LastChangeDate

Both coding structures will give the same result.

After we have fulfilled our need, we will create a Data Flow and limiting LastChangeDate = CurrentDate-1 at Projection node in Datasphere.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
This is our Data Flow – general view

The operation performed at the TO_DATE node: Since the “LastChangeDate” field we created in the CDS view is of string type, need to be converted it to date type here in node.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
TO_DATE node

The operation performed at the FILTER node: The code applied to obtain the changes (delta records) from a day ago.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
FILTER node

There are two methods we can apply to run the Data Flow we created. They are:

  • The first one is: to run directly the Data Flow by pressing the run button manually or
  • The second one is: to schedule the Data Flow to run daily basis at specified time automatically.

In both of methods, all data that changed (delta records) the day before in the source system will be uploaded to the Datasphere.

As we have chosen to use the second method in our example, the delta records obtained as a result of running this data flow in the target table node one day later are shown below.

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
The result of changes (delta records) in Datasphere

SAP ABAP CDS View, SAP ABAP Certification, SAP ABAP Career, SAP ABAP Skills, SAP ABAP Jobs, SAP ABAP Preparation
The result of Scheduled Data Flow in Datasphere

No comments:

Post a Comment