Wednesday, 24 April 2019

How to scan ABAP code?

Scanning the ABAP code in an SAP system is a useful way to gather many kinds of information.

For example:

◈ To detect security vulnerabilities at the ABAP level
◈ To detect hard-coded values in ABAP code
◈ To get a list of external RFC calls used in custom (Z) developments
◈ To get a list of the database tables and fields used (before an S/4HANA transformation, for example)

At my company, Novaline, we’ve built a tool named “ABAP Optimizer”, which scans ABAP code for performance problems and then modifies the code automatically to optimize performance.

In this blog I’m going to share some coding details about scanning ABAP code.

Basics: Reading ABAP source code


Basically, we can read the source code of an ABAP include with the ABAP statement “READ REPORT”.

Below is a simple report:

[Image: source code of the simple report “ZTEST”]

To get the source code of this report, we can use the “READ REPORT” statement as below:

[Image: report that reads “ZTEST” with READ REPORT]

The READ REPORT statement fills the string internal table “gt_source” with the source code of the report “ZTEST”.

By doing this, we get the source code only as a plain table of strings. The code is not interpreted in any way.
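The READ REPORT call described above can be sketched roughly as follows. The names “gt_source” and “ZTEST” come from the text. The report name ZREAD_SOURCE and the output logic are assumptions added for illustration.

```abap
REPORT zread_source. " hypothetical report name

DATA: gt_source TYPE TABLE OF string,
      gv_line   TYPE string.

" Read the source code of program ZTEST into a table of strings
READ REPORT 'ZTEST' INTO gt_source.

IF sy-subrc <> 0.
  " The program does not exist or its source cannot be read
  WRITE / 'Source code could not be read.'.
ELSE.
  " Print each source line exactly as it is stored
  LOOP AT gt_source INTO gv_line.
    WRITE / gv_line.
  ENDLOOP.
ENDIF.
```

Checking sy-subrc after READ REPORT matters because the table is not filled when the program cannot be read.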

Still, this is enough to code a simple ABAP scanner that only searches for specific texts in ABAP code: read the list of custom reports from the SAP view TRDIR, read the source of each one with “READ REPORT”, and then run simple text searches on the code.

The SAP program “RS_ABAP_SOURCE_SCAN” already performs this kind of search, so you can refer to it as an example.
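A minimal sketch of such a text-search scanner might look like this. The report name, the search text and the “Z%” filter are assumptions chosen for illustration, and this is not how RS_ABAP_SOURCE_SCAN itself is implemented.

```abap
REPORT zsimple_text_scan. " hypothetical report name

CONSTANTS gc_text TYPE string VALUE 'CALL FUNCTION'. " example search text

DATA: gt_progs  TYPE TABLE OF trdir-name,
      gv_prog   TYPE trdir-name,
      gt_source TYPE TABLE OF string,
      gv_line   TYPE i.

START-OF-SELECTION.
  " Assumption: custom programs are those whose names start with Z
  SELECT name FROM trdir INTO TABLE gt_progs WHERE name LIKE 'Z%'.

  LOOP AT gt_progs INTO gv_prog.
    READ REPORT gv_prog INTO gt_source.
    CHECK sy-subrc = 0. " skip programs whose source cannot be read

    " Plain text search: comments and literals are matched as well
    FIND FIRST OCCURRENCE OF gc_text IN TABLE gt_source
         IGNORING CASE MATCH LINE gv_line.
    IF sy-subrc = 0.
      WRITE: / gv_prog, 'line', gv_line.
    ENDIF.
  ENDLOOP.
```

Because the search works on raw text, it also reports hits inside comments and string literals, which is the limitation that motivates interpreting the code.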

So what about interpreting the code? Let’s go into more detail.

Interpreting ABAP Code


Many years ago, when I was first trying to build an SAP security scanner, I spent time with the “SAP Code Inspector” tool and tried to understand how it analyzes ABAP code and makes its detections.

Below are some important concepts to know before we go further:

Concepts:

Tokenization: parsing ABAP code from a plain string into meaningful structures. If you like reading theory as much as I do, check out this page.

◈ Statement: Every ABAP command that we end with a period
◈ Token: Every word in an ABAP statement is a token, whether it is an ABAP keyword, a literal or something else such as a variable name
◈ Structure: Some statements are bound to each other. For example, an ABAP LOOP statement is closed by an ENDLOOP statement somewhere in the code, so together they form a structure

So imagine a piece of ABAP code as below:

[Image: sample ABAP code]

And let’s apply the concepts to it:

[Image: the same code annotated with statements, tokens and structures]
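As a hypothetical illustration of these concepts (the snippet is an assumption, not the example from the original figure), each statement below is annotated with the tokens it consists of and the structure it belongs to:

```abap
" Statement 1, tokens: DATA LT_ITEMS TYPE TABLE OF STRING
DATA lt_items TYPE TABLE OF string.

" Statement 2, tokens: DATA LV_ITEM TYPE STRING
DATA lv_item TYPE string.

" Statement 3, tokens: LOOP AT LT_ITEMS INTO LV_ITEM
" It opens a structure that ends at the matching ENDLOOP
LOOP AT lt_items INTO lv_item.

  " Statement 4, tokens: WRITE LV_ITEM (inside the LOOP structure)
  WRITE lv_item.

" Statement 5, token: ENDLOOP (closes the structure opened by LOOP)
ENDLOOP.
```

The period ends a statement and is not itself one of its tokens.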

After getting the code with READ REPORT, you can parse the strings into these structures yourself, or you can use the existing classes of the SAP Code Inspector to make it simpler. Check out the standard class “CL_CI_SCAN” for this.

So, using this logic and information, how do we code an ABAP interpreter?

After tokenizing the code as above, the second step is to analyze the commands or statements you are interested in. The interpretation depends on what you are trying to detect.

Let’s continue with an example scenario as below.

Basic Example:

Let’s code an ABAP scanner that detects SELECT commands that use “*” to read all fields of a database table.

The steps are as follows:

1. Tokenize the code
2. Loop on all the statements and detect SELECT commands
3. Parse SELECT commands and find the ones used with star “*”

Imagine a small ABAP program as below:

[Image: small ABAP program containing two SELECT statements]

The second SELECT statement, which reads the VBAP table, uses a star “*”. These are the SELECT statements we are trying to detect.
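A program of this shape could look like the sketch below. Only the second SELECT on VBAP with “*” is taken from the text. The first SELECT on VBAK, the field list and all variable names are assumptions made up for illustration.

```abap
REPORT ztest.

" Hypothetical structure for the explicit field list of the first SELECT
TYPES: BEGIN OF ty_vbak,
         vbeln TYPE vbak-vbeln,
         erdat TYPE vbak-erdat,
       END OF ty_vbak.

DATA: gt_vbak TYPE TABLE OF ty_vbak,
      gt_vbap TYPE TABLE OF vbap.

" First SELECT: explicit field list, should not be reported
SELECT vbeln erdat FROM vbak INTO TABLE gt_vbak UP TO 10 ROWS.

" Second SELECT: reads all fields with a star, should be reported
SELECT * FROM vbap INTO TABLE gt_vbap UP TO 10 ROWS.
```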

Now let’s code the scanner in ABAP:

(I’m sharing the code as images to make it easier to follow, but on request I can also share it as text.)

Tokenize the ABAP code:


In the report code below, the parameter “p_prog” is the name of an existing ABAP program in the system.

[Image: report code that tokenizes the program given in “p_prog”]
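The tokenization step can be sketched as below, using the Code Inspector classes mentioned earlier. The names “p_prog” and “gr_scan” follow the text. The report name, “gr_source” and the exact call signatures are assumptions based on how these classes are commonly used, so check them in your release.

```abap
REPORT zscan_tokenize. " hypothetical report name

PARAMETERS p_prog TYPE sy-repid.

DATA: gr_source TYPE REF TO cl_ci_source_include,
      gr_scan   TYPE REF TO cl_ci_scan.

START-OF-SELECTION.
  " Wrap the source of the program in a Code Inspector source object
  gr_source = cl_ci_source_include=>create( p_name = p_prog ).

  " Creating the scan object tokenizes the source. The results are
  " available in public attributes such as gr_scan->statements
  " and gr_scan->tokens
  CREATE OBJECT gr_scan
    EXPORTING
      p_include = gr_source.
```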

Now let’s run it for the program “ZTEST” above and display the object “gr_scan” in the debugger. The “statements” and “tokens” tables are visible on this object:

[Image: debugger view of the object “gr_scan”]

Let’s display the “tokens” table:

[Image: contents of the “tokens” table]

And display the “statements” table, noticing the “from” and “to” fields:

[Image: contents of the “statements” table]

For every statement, the “from” and “to” fields in the “statements” table hold the indexes of its first and last tokens in the “tokens” table.

Find SELECT statements and check whether they use “*”:


Basically, the “statements” table holds every command in the selected ABAP program, and we can access every token of a statement by using its from / to fields as indexes into the “tokens” table.

To decide whether a statement is a SELECT or not, we can just check its first token.

By checking the second token, we can also see whether the SELECT is used with a star ( * ).

The remaining part of the scanner code is below:

[Image: scanner code that loops over the statements and checks for SELECT with “*”]
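The check described above can be sketched as a subroutine that receives an already tokenized CL_CI_SCAN object. The subroutine name, the variable names and the output are assumptions for illustration. The types SSTMNT and STOKESX are assumed to be the line types of the “statements” and “tokens” tables.

```abap
" Hypothetical subroutine: expects a CL_CI_SCAN object that has
" already tokenized the program
FORM find_select_star USING ir_scan TYPE REF TO cl_ci_scan.
  DATA: ls_statement TYPE sstmnt,
        ls_token     TYPE stokesx,
        lv_index     TYPE i.

  LOOP AT ir_scan->statements INTO ls_statement.
    " First token of the statement: is it the keyword SELECT?
    READ TABLE ir_scan->tokens INTO ls_token INDEX ls_statement-from.
    CHECK sy-subrc = 0 AND ls_token-str = 'SELECT'.

    " Second token, if the statement has one: is it the star?
    lv_index = ls_statement-from + 1.
    CHECK lv_index <= ls_statement-to.
    READ TABLE ir_scan->tokens INTO ls_token INDEX lv_index.
    IF sy-subrc = 0 AND ls_token-str = '*'.
      " row is the source line of the token within its include
      WRITE: / 'SELECT * found in source row', ls_token-row.
    ENDIF.
  ENDLOOP.
ENDFORM.
```

Because only the second token is checked, a statement such as SELECT SINGLE * is not reported by this sketch. Catching it would require looking at further tokens.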
