Introduction
Many SAP queries and ABAP reports do not perform well when retrieving large volume of data from traditional databases using complex SELECT – JOIN SQL statements often on 10 or more tables containing millions of records.
If these ABAP reports are standard SAP programs or ABAP programs generated by SAP in applications like, e.g., SAP SQ01 Query, they should not be modified by programmers. However, using them is frustrating because of poor performance and timeouts they cause for broad selection screen criteria.
There are options in the developers’ hands to make them working faster – sometimes an order of magnitude faster if not more.
This blog discusses a high-level approach on how to turbocharge/optimize for performance these programs making them more enjoyable, i.e., working lightning fast, for SAP customers.
Small is Beautiful – 10 * 3,000 Is Less Than 1 * 30,000
When executing a complex SELECT – JOIN SQL statement, it usually works inefficiently for broad WHERE clause selection criteria that makes it retrieve thousands of records from multiple database tables. In many cases you might get results faster when executing SELECT – JOIN SQL statement multiple times for a subset of WHERE clause selection criteria and each time retrieving a smaller number of records. The result would be the same, but the execution time could be shorter.
Additionally, when calling SELECT – JOIN SQL statement multiple times each time for a unique subset of the selection screen criteria, parallel processing techniques could be implemented to get results even faster. The difference in execution time could be even an order of magnitude if not more.
For example, executing SELECT – JOIN SQL statement on 20+ database tables retrieving 30,000 records could take 1,000 seconds to run. Calling the SELECT – JOIN SQL statement 10 times in sequence on 10 subsets of the selection screen criteria could take 700 seconds. Calling it in parallel in 10 parallel processes, could take 100 seconds or less – an order of magnitude improvement in performance.
How to do it in Standard ABAP Program
When a poorly performing program is a standard ABAP report or one generated by a standard SAP Application, e.g., SAP SQ01 Query, it is not advisable to modify it nor create a copy of it and modify a copy.
The other options are User Exits, BAdIs and Explicit Enhancements Options. Since these hooks to standard SAP functionality are explicitly predefined by SAP, you might be out of luck and not able to find one in the program area where you need to add your ABAP code to make the program working faster.
The other option is to use Implicit Enhancement Points that are available at the beginning and at the end of ABAP Forms, Function Modules and few other places in standard ABAP programs.
Typical Structure of ABAP Report
Typical structure of ABAP report program consists of 3 sections:
◉ Getting data selection parameters on the program initial screen to choose records from database tables and views
◉ Retrieving data from database tables and views for the selection parameters
◉ Displaying results, i.e., data retrieved from database tables and views
The above sections of the ABAP report could be quite simple or extremely complex containing multiple SELECT – JOIN SQL statements and plenty of conditional ABAP logic. Nevertheless, the structure of the report remains the same; i.e., getting data selection parameters, retrieving data from database and displaying results.
In case of interactive reports, the user has the further options to drill down and select additional data for specific value from the originally retrieved data.
The following sections of this blog show how to embed the implicit enhancements into the typical structure of standard ABAP programs to make it running faster.
From the end user point of view, nothing will change except significant performance improvements. When before, after clicking on F8 – Execute button, there was plenty of time to take lunch or coffee break, now with the optimized version of the report, it could be hardly any time to do brief starching.
The user would run the same program with the same selection screen and results will be presented as before in standard text-based output or ALV Grid.
Pseudo-Code of Long Running ABAP Program
Let us assume, that we have a long running ABAP Program as shown in the following table:
REPORT rxyz_long_running_report.
SELECT-OPTIONS:
so_f01 FOR …,
so_f02 FOR …,
…,
so_f20 FOR ….
…
START-OF-SELECTION.
PERFORM data_get.
END-OF-SELECTION.
PERFORM results_display.
The DATA_GET subroutine implements a complex SELECT – JOIN statement that retrieves thousands of records from multiple database tables and views containing millions of records.
The ABAP Pseudo-Code of DATA_GET subroutine might look as shown in the following table:
FORM data_get.
…
*-PSEUDO-CODE: Slow Running SELECT Statement---------------------------
SELECT f01 f02 f03 f04 f05 f06 f99
INTO TABLE gt_output_tab
FROM vtab1
INNER JOIN vtab2 ON vtab1-f01 = vtab2-f02
…
INNER JOIN vtab8 ON vtab6-f11 = vtab8-f24
WHERE vtab1-f01 IN so_f01
AND vtab1-f02 IN so_f02
…
AND vtab6-f04 IN so_f15.
ENDFORM.
The ABAP Pseudo-Code of RESULTS_DISPLAY subroutine might look as shown in the following table:
FORM results_display.
DATA:
lo_table TYPE REF TO cl_salv_table.
cl_salv_table=>factory( IMPORTING r_salv_table = lo_table
CHANGING t_table = gt_output_tab ).
lo_table->display( ).
ENDFORM.
How to Make it Faster with Implicit Enhancement Points
Let us try to implement Implicit Enhancement Points that would allow us to execute the above program multiple times in parallel, each time for a subset of selection screen criteria. Once all parallel processes are completed, we could concatenate results and display them to the end user. From the end user point of view, nothing would change, except that program would run significantly faster.
Additionally, we would like to be able to run the original version of the program and one with enhancement and compare results to make sure that our implementation did not cause any problem and that the results are exactly the same.
We could implement 2 enhancements – one at the beginning of DATA_GET subroutine and another one at the beginning of RESULTS_DISPLAY subroutine.
DATA_GET Subroutine with ENHANCEMENT ABAP Pseudo-Code
The ABAP Pseudo-Code for the enhancement at the beginning of DATA_GET subroutine might look as follows:
FORM data_get.
ENHANCEMENT….
CALL FUNCTION 'ZXYZ_GET_RUN_TYPE'
IMPORTING
xc_run_type = gc_run_type.
IF gc_run_type = 'CUSTOM'.
IMPORT run_mode = gc_run_mode FROM MEMORY ID sy-uname.
IF gc_run_mode IS INITIAL.
gc_run_mode = 'PARENT'.
*-----Call FM to split input & call this program for data subsets--------
CALL FUNCTION 'ZXYZ_RUN'
TABLES
xt_output_tab = gt_output_tab.
*-----Export Output Table for display in RESULTS_DISPLAY subroutine------
EXPORT gt_output_tab …
*-----EXIT, do not execute SELECT statement------------------------------
EXIT.
ENDIF.
ENDIF.
ENDENHANCEMENT.
…
*-PSEUDO-CODE: Slow Running SELECT Statement---------------------------
SELECT f01 f02 f03 f04 f05 f06 f99
INTO TABLE gt_output_tab
FROM vtab1
INNER JOIN vtab2 ON vtab1-f01 = vtab2-f02
…
INNER JOIN vtab8 ON vtab6-f11 = vtab8-f24
WHERE vtab1-f01 IN so_f01
AND vtab1-f02 IN so_f02
…
AND vtab6-f04 IN so_f15.
ENDFORM.
The above enhancement will:
◉ Check the run type for the program. If it is not a CUSTOM run type, the enhancement is ignored, and the original program is executed.
◉ If it is a CUSTOM run type, it would check the run mode. 2 run modes are possible, i.e., PARENT when the user executes it and CHILD when it is executed in parallel process from custom ABAP code in ZXYZ_RUN function module. The PARENT run mode is set in enhancement if the IMPORT statement returns no value.
◉ If it is PARENT run mode, a function module ZXYZ_RUN is executed followed by the EXIT statement to end START-OF-SELECTION event without executing SELECT – JOIN SQL statement
ZXYZ_RUN Function Module ABAP Pseudo-Code
The ZXYZ_RUN function module ABAP Pseudo-Code might look as shown in the following table:
FUNCTION zxyz_run.
*" TABLES
*" XT_OUTPUT_TAB STRUCTURE …
*-Get list of parameters-----------------------------------------------
CALL FUNCTION 'RS_REFRESH_FROM_SELECTOPTIONS'
EXPORTING
curr_report = 'rxyz_long_running_report'
TABLES
selection_table = lt_selparam_all[].
…
*-Execute program in parallel for subset of selection screen parameters
DO.
…
*---Get subset of selection screen parameters--------------------------
CALL FUNCTION 'ZXYZ_SELPARAM_SUBSET_GET'
TABLES
sel_param_all = lt_selparam_all[]
sel_param_subset = lt_selparam_subset[].
…
*---Start parallel process with subset of selection screen parameters--
CALL FUNCTION 'ZXYZ_RUN_FOR_SUBSET'
STARTING NEW TASK lc_run_id
DESTINATION IN GROUP 'parallel_generators'
PERFORMING zxyz_process_completed ON END OF TASK
EXPORTING
ut_parameters = lt_selparam_subset
TABLES
xt_output = xt_output
EXCEPTIONS
system_failure = 1
communication_failure = 2
resource_failure = 3.
…
IF gc_all_done IS NOT INITIAL. EXIT. ENDIF.
…
ENDDO.
…
*-Copy GLOBAL results table to Output Paramater Table------------------
xt_output_tab[] = gt_output_tab[].
ENDFUNCTION.
FORM zxyz_process_completed USING uc_run_id.
REFRESH lt_output_this.
RECEIVE RESULTS FROM FUNCTION 'ZXYZ_RUN_FOR_SUBSET'
TABLES
xt_output = lt_output_this.
APPEND LINES OF lt_output_this TO gt_output_tab.
ENDFORM.
First the calling program selection screen parameters are retrieved. Then in the DO-ENDDO loop until all selection screen subsets are retrieved, the ZXYZ_RUN_FOR_SUBSET function module is called in NEW TASK as ASYNC RFC. The subset of selection screen parameters is passed to it.
ZXYZ_RUN_FOR_SUBSET Function Module ABAP Pseudo-Code
Inside the ZXYZ_RUN_FOR_SUBSET function module, the original program is SUBMITTED with subsets of the selection screen parameters as shown in the following ABAP Pseudo-Code:
FUNCTION zxyz_run_for_subset.
*" IMPORTING
*" UT_PARAMETERS TYPE rsparams_tt.
*" TABLES
*" XT_OUTPUT_TAB STRUCTURE …
*-Set RUN MODE to CHILD--------------------------------------------------
EXPORT run_mode = 'CHILD' TO MEMORY ID sy-uname.
*-Submit the original program--------------------------------------------
SUBMIT rxyz_long_running_report
WITH SELECTION-TABLE ut_parameters
AND RETURN.
*-Import results table from original program-----------------------------
IMPORT gt_output_tab = xt_output_tab …
ENDFUNCTION.
Note that the run mode is set to CHILD so that the original ABAP program could process it accordingly.
The enhancement in DATA_GET in the original program does nothing if run mode is set to CHILD. It allows execution of the SELECT – JOIN SQL statement for subset of selection screen parameters as shown again below:
FORM data_get.
ENHANCEMENT….
CALL FUNCTION 'ZXYZ_GET_RUN_TYPE'
IMPORTING
xc_run_type = gc_run_type.
IF gc_run_type = 'CUSTOM'.
IMPORT run_mode = gc_run_mode FROM MEMORY ID sy-uname.
IF gc_run_mode IS INITIAL.
gc_run_mode = 'PARENT'.
*-----Call FM to split input & call this program for data subsets--------
CALL FUNCTION 'ZXYZ_RUN'
TABLES
xt_output_tab = gt_output_tab.
*-----Export Output Table for display in RESULTS_DISPLAY subroutine------
EXPORT gt_output_tab …
*-----EXIT, do not execute SELECT statement------------------------------
EXIT.
ENDIF.
ENDIF.
ENDENHANCEMENT.
…
*-PSEUDO-CODE: Slow Running SELECT Statement---------------------------
SELECT f01 f02 f03 f04 f05 f06 f99
INTO TABLE gt_output_tab
FROM vtab1
INNER JOIN vtab2 ON vtab1-f01 = vtab2-f02
…
INNER JOIN vtab8 ON vtab6-f11 = vtab8-f24
WHERE vtab1-f01 IN so_f01
AND vtab1-f02 IN so_f02
…
AND vtab6-f04 IN so_f15.
ENDFORM.
Note that the call to the ZXYZ_RUN function module that starts all parallel processes, waits until they are completed. Then, it returns the output table that contains concatenated results from all processes.
This table is exported to be picked up in RESULTS_DISPLAY subroutine implicit enhancement point and the EXIT statement prevents the execution of the SELECT – JOIN SQL statement. We already have results coming from the ZXYZ_RUN function module call.
RESULTS_DISPLAY Subroutine with ENHANCEMENT ABAP Pseudo-Code
Let us now look at the enhancement ABAP Pseudo-Code in RESULTS_DISPLAY subroutine shown in the following table:
FORM results_display.
ENHANCEMENT.
CALL FUNCTION 'ZXYZ_GET_RUN_TYPE'
IMPORTING
xc_run_type = gc_run_type.
IF gc_run_type = 'CUSTOM'.
IMPORT run_mode = gc_run_mode FROM MEMORY ID sy-uname.
*-We are in PARENT mode – bring output table from DATA_GET Enhancement--
IF gc_run_mode = 'PARENT'.
IMPORT gt_output_tab …
*-We are in CHILD Mode – do not display output. Export it to calling FM-
ELSEIF gc_run_mode = 'CHILD'.
EXPORT gt_output_tab …
EXIT.
ENDIF.
ENDIF.
ENDENHANCEMENT.
…
DATA:
lo_table TYPE REF TO cl_salv_table.
cl_salv_table=>factory( IMPORTING r_salv_table = lo_table
CHANGING t_table = gt_output_tab ).
lo_table->display( ).
ENDFORM.
For the PARENT run mode, the output table is imported from the implicit enhancement point in DATA_GET subroutine and it is displayed with ALV Grid factory method call.
For the CHILD run mode, the output table is exported to calling ASYNC RFC call. It is further passed for collection of all output tables returned by ASYNC RFC calls that are concatenated and the complete output table is returned in DATA_GET enhancement by call to ZXYZ_RUN function module.
This closes the loop and rather than running ABAP Report once for a large selection screen parameter set, the implicit enhancement point in DATA_GET subroutine in the original ABAP code calls ZXYZ_RUN function module that dissects selection screen parameters to multiple subsets and calls the original program many times in ASYNC mode collecting subset of results to the final output table in receiving ABAP subroutine executed on END OF TASK event.
Using this technique, you could often improve performance of standard ABAP reports by order of magnitude.
Normalizing Selection Screen Subsets
To make this approach optimal, it is important to split selection screen criteria, the way that each parallel call will produce a similar number of output records and this could be a tricky proposition. We do not want one parallel execution to return 100 records while another will return 20,000 records. This would caus ethe first process execute very fast and the second one very slow.
If input contains a long list of documents, it might be easy to do it, e.g., splitting list of 10,000 documents in the selection screen to 10 runs each with 1,000 documents.
If there is no obvious division into normalized subsets of input parameters, you would have to implement some pre-processing logic to be able to normalize input parameters the way that input subsets would produce similar number of output records.
For instance, if date range parameters are used, you might be able to evaluate how to split the date range to produce similar number of output records. Splitting one year date range into 12 month periods might not produce balanced outputs. One month might produce 2% of output records and another one might produce 40% of output records and performance will not improve dramatically. To get similar number of output records in all parallel runs, you could apply binary search logic to approximate date ranges yielding similar number of output records and use them when starting parallel processes.
Verifying Results
Even so, the same SELECT – JOIN SQL statements are executed, it is important to verify results between standard and custom runs. You could build a workbench that would collect input and output information as well as execution time so you could verify not only that results match but also get performance gain as shown in STD/CUST column below.
A sample Workbench Screen shows the results for Standard and Custom run of SAP generated ABAP program for user defined SAP Query:
The custom run as shown above was over 85 times faster than the standard run; i.e., 4 seconds versus 342 seconds – each one producing 1,621 output records.
Clicking on hotspot Compare Output icon, could allow comparing output from 2 different runs to the cell level:
What if It Does Not Work Faster
The solution presented in the preceding sections of this blog, describes how to improve the performance of standard SAP programs preserving, its selection screen, data retrieval logic and results display. For the user, nothing changes except getting results faster. If you upgrade to a new SAP version, your enhancement-based solution should continue working.
Sometimes, you might find that this approach does not improve performance and SELECT – JOIN logic continues causing performance problems.
Then you could implement SELECT – JOIN logic on your own in ZXYZ_RUN_FOR_SUBSET function module, rather than executing SUBMIT call to the original program (commented below), as shown in the following ABAP Pseudo-Code:
FUNCTION zxyz_run_for_subset.
*" IMPORTING
*" UT_PARAMETERS TYPE rsparams_tt.
*" TABLES
*" XT_OUTPUT_TAB STRUCTURE …
*-Set RUN MODE to CHILD--------------------------------------------------
EXPORT run_mode = 'CHILD' TO MEMORY ID sy-uname.
*-Submit the original program--------------------------------------------
* SUBMIT rxyz_long_running_report
* WITH SELECTION-TABLE ut_parameters
* AND RETURN.
*-PSEUDO-CODE: Your Own MUCH FASTER SELECT – JOIN Logic------------------
SELECT f01 f02 f03 f04 f05 f06 f99
INTO TABLE gt_output_tab
FROM vtab1
INNER JOIN vtab2 ON vtab1-f01 = vtab2-f02
…
INNER JOIN vtab8 ON vtab6-f11 = vtab8-f24
WHERE vtab1-f01 IN so_f01
AND vtab1-f02 IN so_f02
…
AND vtab6-f04 IN so_f15.
*-Import results table from original program-----------------------------
IMPORT gt_output_tab = xt_output_tab …
ENDFUNCTION.
Feel free to divide SELECT – JOIN logic to several statements on a few tables each rather than execute one big SELECT – JOIN statement on 20 plus tables. This could improve dramatically the performance.
Beside change in the ZXYZ_RUN_FOR_SUBSET function module, everything else stays the same. The CHILD logic in enhancements will not be executed because there will be no SUBMIT calls to the original program.
In case of SAP upgrade, it is more likely that you would have to modify the above function module.
Both approaches use the original application’s selection screen, and output list. Both approaches partition selection screen parameters to subsets and execute data retrieval logic multiple times on subsets of parameters.
The first approach executes original data retrieval procedure, the second one implements its own data retrieval procedure.
With one of the above discussed approaches you could improve performance by order of magnitude if not more and in some cases even over 100 times.
No comments:
Post a Comment