Friday, 30 November 2018

Delete duplicate entries in ABAP CDS views using Table Function and SQL Window Function

With the Code Push-Down strategy introduced by SAP HANA some of the functionalities used by ABAP developers haven’t been easily translated to the newest technologies, for example, CDS views.

In this blog post we are going to explore one of this scenarios trying to delete duplicate entries using ABAP CDS Views.

SAP ABAP CDS, SAP ABAP Guides, SAP ABAP Tutorial and Materials, SAP ABAP Guides

As I demonstrated before in one of my previous articles, ABAP CDS Views don’t offer full flexibility for all the development scenarios and an alternative solution for the exceptions is to adapt our logic using ABAP CDS Table Functions. If you still didn’t check this content I advise to have a look in the following article before you proceed this reading.


Controlling the granularity of your data sets and deleting duplicate records is quite a difficult task in ABAP CDS development because the use of GROUP BY is not enough to solve this requirement in all of the cases (e.g. join with composite keys). For these particular scenarios there is a simple solution based on ABAP CDS Table Function and one of the SAP HANA SQL Window Functions called RANK.

Study case

Let’s create a simple example and compare both approaches discussed so far:

◈ ABAP CDS view with GROUP BY
◈ ABAP CDS table function with RANK

Create a simple table called ZDEMOCDSRANK based on the following structure:

SAP ABAP CDS, SAP ABAP Guides, SAP ABAP Tutorial and Materials, SAP ABAP Guides

Populate some data inside this table, you can use the following entries as example:

SAP ABAP CDS, SAP ABAP Guides, SAP ABAP Tutorial and Materials, SAP ABAP Guides

The requirement is quite simple, we need to extract the latest Reference Code for each Customer based on the latest Reference Date.

Notice the Reference Codes are not following the same order as the Reference Dates, if we try to solve this requirement with GROUP BY and MAX( ) the output wouldn’t be correct because of the different order between both fields.

First, let’s try an ABAP CDS view based on this erroneous approach and analyse the output.

ABAP CDS view with GROUP BY

@AbapCatalog.sqlViewName: 'ZCDSGROUPBY'
@AbapCatalog.compiler.compareFilter: true
@AccessControl.authorizationCheck: #CHECK
@EndUserText.label: 'Group By (Demo)'

define view ZCDS_GROUPBY 
  as select from zdemocdsrank
{
    key customer     as Customer, 
    
        max(refcode) as ReferenceCode, 
    
        max(refdate) as ReferenceDate
}
group by
  customer

Run the Data Preview and check the result provided by this CDS view.

SAP ABAP CDS, SAP ABAP Guides, SAP ABAP Tutorial and Materials, SAP ABAP Guides

As explained before the result is incorrect because the strategy with GROUP BY doesn’t maintain the integrity of the row. MAX( ) brings the highest values for each one of the columns creating some inconsistencies between ReferenceCode and ReferenceDate.

In this case, the approach with Table Function and RANK( ) can easily solve our requirement, let’s check the second approach now.

ABAP CDS Table Function

@EndUserText.label: 'Rank (Demo)'

define table function ZCDS_RANK
returns {
  Client        : abap.clnt;
  Customer      : kunnr;
  ReferenceCode : abap.char(10);
  ReferenceDate : abap.dats;
}
implemented by method zcl_cds_rank=>exec_method;

AMDP Class

CLASS zcl_cds_rank DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC .

  PUBLIC SECTION.
    INTERFACES if_amdp_marker_hdb.
    CLASS-METHODS exec_method FOR TABLE FUNCTION ZCDS_RANK.

  PROTECTED SECTION.
  PRIVATE SECTION.
ENDCLASS.

CLASS zcl_cds_rank IMPLEMENTATION.

  METHOD exec_method
    BY DATABASE FUNCTION FOR HDB
    LANGUAGE SQLSCRIPT
    OPTIONS READ-ONLY
    USING zdemocdsrank.

    rank_table =
      SELECT mandt    AS Client,
             customer AS Customer,
             refcode  AS ReferenceCode,
             refdate  AS ReferenceDate,
             RANK ( ) OVER ( PARTITION BY mandt, customer
                                 ORDER BY refdate DESC ) AS rank
        FROM zdemocdsrank;

    RETURN
      SELECT Client,
             Customer,
             ReferenceCode,
             ReferenceDate
        FROM :rank_table
       WHERE rank = 1;

  ENDMETHOD.

ENDCLASS.

Run the Data Preview again and this time observe the correct output.

SAP ABAP CDS, SAP ABAP Guides, SAP ABAP Tutorial and Materials, SAP ABAP Guides

Important note:

The function RANK( ) creates a numeric field at the end of the selection list holding the position of the record based on the specified PARTITION and ORDER:

◈ Partition by Client and Customer;
◈ Order by Reference Date descending.

This logic creates a ranking for each group of Customers including the latest records on the top, this is the result generated by the first SELECT statement in our example:

Client Customer  ReferenceCode  ReferenceDate   rank 
10 1000000001  1205 08.10.2018 2
10  1000000001  2110  21.11.2018  
10  1000000001  3281  01.09.2018  
10  1000000001  6503  19.08.2018  
10 1000000002  40290  19.12.2018  
10  1000000002  53210  07.11.2018  
10  1000000002 90001  16.09.2018  

The second SELECT removes the duplicate entries looking only for records with the rank position no. 1, in other words, the most up to date record available for each Customer.

No comments:

Post a Comment