With the Code Push-Down strategy introduced by SAP HANA some of the functionalities used by ABAP developers haven’t been easily translated to the newest technologies, for example, CDS views.
In this blog post we are going to explore one of this scenarios trying to delete duplicate entries using ABAP CDS Views.
In this blog post we are going to explore one of this scenarios trying to delete duplicate entries using ABAP CDS Views.
As I demonstrated before in one of my previous articles, ABAP CDS Views don’t offer full flexibility for all the development scenarios and an alternative solution for the exceptions is to adapt our logic using ABAP CDS Table Functions. If you still didn’t check this content I advise to have a look in the following article before you proceed this reading.
Controlling the granularity of your data sets and deleting duplicate records is quite a difficult task in ABAP CDS development because the use of GROUP BY is not enough to solve this requirement in all of the cases (e.g. join with composite keys). For these particular scenarios there is a simple solution based on ABAP CDS Table Function and one of the SAP HANA SQL Window Functions called RANK.
Study case
Let’s create a simple example and compare both approaches discussed so far:
◈ ABAP CDS view with GROUP BY
◈ ABAP CDS table function with RANK
Create a simple table called ZDEMOCDSRANK based on the following structure:
Populate some data inside this table, you can use the following entries as example:
The requirement is quite simple, we need to extract the latest Reference Code for each Customer based on the latest Reference Date.
Notice the Reference Codes are not following the same order as the Reference Dates, if we try to solve this requirement with GROUP BY and MAX( ) the output wouldn’t be correct because of the different order between both fields.
First, let’s try an ABAP CDS view based on this erroneous approach and analyse the output.
ABAP CDS view with GROUP BY
@AbapCatalog.sqlViewName: 'ZCDSGROUPBY'
@AbapCatalog.compiler.compareFilter: true
@AccessControl.authorizationCheck: #CHECK
@EndUserText.label: 'Group By (Demo)'
define view ZCDS_GROUPBY
as select from zdemocdsrank
{
key customer as Customer,
max(refcode) as ReferenceCode,
max(refdate) as ReferenceDate
}
group by
customer
Run the Data Preview and check the result provided by this CDS view.
As explained before the result is incorrect because the strategy with GROUP BY doesn’t maintain the integrity of the row. MAX( ) brings the highest values for each one of the columns creating some inconsistencies between ReferenceCode and ReferenceDate.
In this case, the approach with Table Function and RANK( ) can easily solve our requirement, let’s check the second approach now.
ABAP CDS Table Function
@EndUserText.label: 'Rank (Demo)'
define table function ZCDS_RANK
returns {
Client : abap.clnt;
Customer : kunnr;
ReferenceCode : abap.char(10);
ReferenceDate : abap.dats;
}
implemented by method zcl_cds_rank=>exec_method;
AMDP Class
CLASS zcl_cds_rank DEFINITION
PUBLIC
FINAL
CREATE PUBLIC .
PUBLIC SECTION.
INTERFACES if_amdp_marker_hdb.
CLASS-METHODS exec_method FOR TABLE FUNCTION ZCDS_RANK.
PROTECTED SECTION.
PRIVATE SECTION.
ENDCLASS.
CLASS zcl_cds_rank IMPLEMENTATION.
METHOD exec_method
BY DATABASE FUNCTION FOR HDB
LANGUAGE SQLSCRIPT
OPTIONS READ-ONLY
USING zdemocdsrank.
rank_table =
SELECT mandt AS Client,
customer AS Customer,
refcode AS ReferenceCode,
refdate AS ReferenceDate,
RANK ( ) OVER ( PARTITION BY mandt, customer
ORDER BY refdate DESC ) AS rank
FROM zdemocdsrank;
RETURN
SELECT Client,
Customer,
ReferenceCode,
ReferenceDate
FROM :rank_table
WHERE rank = 1;
ENDMETHOD.
ENDCLASS.
Run the Data Preview again and this time observe the correct output.
Important note:
The function RANK( ) creates a numeric field at the end of the selection list holding the position of the record based on the specified PARTITION and ORDER:
◈ Partition by Client and Customer;
◈ Order by Reference Date descending.
This logic creates a ranking for each group of Customers including the latest records on the top, this is the result generated by the first SELECT statement in our example:
Client | Customer | ReferenceCode | ReferenceDate | rank |
10 | 1000000001 | 1205 | 08.10.2018 | 2 |
10 | 1000000001 | 2110 | 21.11.2018 | 1 |
10 | 1000000001 | 3281 | 01.09.2018 | 3 |
10 | 1000000001 | 6503 | 19.08.2018 | 4 |
10 | 1000000002 | 40290 | 19.12.2018 | 1 |
10 | 1000000002 | 53210 | 07.11.2018 | 2 |
10 | 1000000002 | 90001 | 16.09.2018 | 3 |
The second SELECT removes the duplicate entries looking only for records with the rank position no. 1, in other words, the most up to date record available for each Customer.
No comments:
Post a Comment