How to Perform a Comprehensive Code Security Audit
Preface
For many security researchers, fuzzing is an important way to discover vulnerabilities, but every method has its limits. First, effective fuzzing depends on accurately identifying the attack surface and automating tests against it, and identifying the attack surface is inseparable from code auditing. Second, once automated testing produces a crash, analyzing it and writing the report require manual auditing to understand the code logic and the root cause of the vulnerability. Finally, for memory-safe languages such as Java or Go, fuzzing applies to even fewer scenarios, and complex logic and design flaws can only be found through code auditing.
This article is a methodological summary of code auditing. It is not tied to any particular language: the code under audit can be C/C++, Java, or PHP. Language-specific sources, sinks, and automated analysis tools are outside the scope of this article. Interested readers can look out for the follow-up article on code security auditing techniques.
Make plans before you act
Sun Tzu's Art of War says: Make plans before you act. There is also a famous saying in modern management: "If you can't measure it, you can't manage it."
This does not mean that we should ignore complex security issues and only look at simple vulnerabilities. The key is to clarify your audit objectives in advance. An audit task has two dimensions: the time the auditor can invest and the complexity of the application to be audited. Taken together, they let you estimate the expected output of the audit.
Applications to be audited generally fall into the following categories, depending on the level of access available:
- Source code: We have only the target source code, usually without a complete build and test environment. Because key dependencies are missing, it is often impossible to build a runnable program, so the audit is limited to static analysis.
- Binary programs: We have only the binary files of the target application, such as an APK, EXE, JAR package, or IoT system firmware. In this case, auditing is usually done through dynamic analysis and reverse engineering.
- Source code and binary programs: We have both the source code of the target application and a runnable binary, which is the most favorable level of access for a security audit. The target is usually open source software that ships with a complete build environment and dependencies.
- Black box: We have neither the target source code nor an executable binary, so we can only test blindly through external interfaces. This is most common with Web applications.
This article focuses on security audits where the source code is available, although some of the strategies and methods also apply to the other categories. When the source code is accessible, an obvious way to estimate the workload is to count the lines of code. This metric does not perfectly represent the complexity of the application: 1,000 lines of business code are not as complex as 1,000 lines of compiler code.
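As a rough illustration of the line-count metric, the following Python sketch totals non-blank lines per file extension under a directory. The function names `count_code_lines` and `loc_by_extension` and the default extension list are assumptions made for this example. The count does not exclude comments, so treat the result as an upper bound.

```python
from collections import Counter
from pathlib import Path


def count_code_lines(text: str) -> int:
    """Count non-blank lines. Comments are NOT excluded in this sketch."""
    return sum(1 for line in text.splitlines() if line.strip())


def loc_by_extension(root: str, extensions=(".c", ".h", ".java", ".php")) -> Counter:
    """Sum non-blank lines per file extension for all files under `root`."""
    totals = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in extensions:
            # Replace undecodable bytes so one odd file does not abort the count.
            text = path.read_text(encoding="utf-8", errors="replace")
            totals[path.suffix] += count_code_lines(text)
    return totals
```

Calling `loc_by_extension("path/to/project")` on a checkout returns a counter keyed by extension, which can then be split by module when planning the audit.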
A code auditor can audit about 100 to 1,000 lines of code per hour, depending on experience and familiarity with the code. For an individual, the best way to measure audit efficiency is to record the time spent auditing each component. This helps you understand your own pace and gives you a time reference for planning later audits.
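One possible way to keep such a record is sketched below in Python. The log format `(component, lines_reviewed, minutes_spent)`, the function names, and the sample numbers are all hypothetical and only show how a personal lines-per-hour rate could be derived and reused for planning.

```python
from collections import defaultdict


def lines_per_hour(sessions):
    """Aggregate (component, lines_reviewed, minutes_spent) into lines/hour per component."""
    lines = defaultdict(int)
    minutes = defaultdict(int)
    for component, lines_reviewed, minutes_spent in sessions:
        lines[component] += lines_reviewed
        minutes[component] += minutes_spent
    # Skip components with no recorded time to avoid division by zero.
    return {c: lines[c] * 60 / minutes[c] for c in lines if minutes[c] > 0}


def estimate_hours(total_lines, rate_lines_per_hour):
    """Naive linear estimate; it ignores the speed-up gained from familiarity."""
    return total_lines / rate_lines_per_hour


# Hypothetical audit log for two components.
sessions = [
    ("parser", 300, 60),
    ("parser", 500, 120),
    ("network", 900, 90),
]
rates = lines_per_hour(sessions)
print(rates)
print(estimate_hours(3000, rates["network"]))
```

The estimate is deliberately linear, so it is best read as a starting point that you adjust as your understanding of the project grows.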
Many factors affect the speed of a code audit, for example:
- Language: Memory-unsafe languages such as C/C++ demand more attention to low-level details, while audits of memory-safe languages such as Java and Python focus more on the higher-level logic.
- Coding style: Projects with clean, well-commented code usually take less time to audit than other projects.
Although audit time grows with the amount of code, audit efficiency also improves as you cover more of a project, because your understanding of it deepens. Auditing 100,000 lines of code therefore usually takes less than twice as long as auditing 50,000 lines. Take these factors into account when drawing up the audit plan and the time budget.
Know when to stop and you will gain something
The next section introduces some specific strategies and techniques for code security auditing. In practice, some methods may be more effective than others, but experience shows that it is better to combine several audit strategies and switch between them periodically, for several reasons:
- You can only maintain a high level of mental focus for a limited time.
- Variety helps you maintain discipline and passion.
- Different vulnerability types may be easier to spot from other perspectives.
- Different people have different ways of thinking.
From a global perspective, the code security audit approach is a simple three-step cycle:
- Plan: Based on the information available, decide which audit strategy to use in this phase and set small, concrete goals, such as finishing the audit of a module, file, or directory, or understanding what a particular structure does.
- Work: Execute the audit according to the audit strategy formulated earlier, with the focus on keeping good audit records during the process.
- Reflect: After completing the audit of this phase, reflect on whether you have used your time effectively and whether you have deviated from the direction. Then adjust the audit plan for the next phase based on the experience learned from the previous audit, such as re-dividing the structure, focusing on security-related sub-modules, etc.
Everything has its own law
Each audit strategy below is described by the following attributes:
- Starting point: Where code tracing begins.
- End point: The goal of the strategy, or the point where code tracing ends.
- Method: How the code is traced: by data flow or control flow, and in the forward or reverse direction.
- Objective: The type of vulnerability the audit strategy targets.
- Difficulty: How hard the audit strategy is to carry out, from 1 star (easiest) to 5 stars (hardest).
- Speed: How quickly the audit strategy can be carried out, from 1 star (slowest) to 5 stars (fastest).
- Understanding: How much understanding of the code the audit strategy yields. Strategies that yield more understanding are generally harder, but they also help researchers find more complex vulnerabilities.
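To make these attributes concrete, here is a minimal Python sketch that stores a strategy as a record and picks the fastest one. The class name `AuditStrategy` and its field names are assumptions for illustration. The two sample entries copy their values from the strategy tables in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditStrategy:
    name: str
    starting_point: str
    end_point: str
    method: str
    objective: str
    difficulty: int      # 1 (easiest) to 5 (hardest)
    speed: int           # 1 (slowest) to 5 (fastest)
    understanding: int   # 1 (least) to 5 (most)

    def __post_init__(self):
        # Reject ratings outside the 1-to-5 star scale.
        for field_name in ("difficulty", "speed", "understanding"):
            value = getattr(self, field_name)
            if not 1 <= value <= 5:
                raise ValueError(f"{field_name} must be 1..5, got {value}")


strategies = [
    AuditStrategy(
        "Data analysis", "Data entry points", "Vulnerability trigger point",
        "Forward", "Vulnerabilities triggered by malicious input", 4, 1, 4,
    ),
    AuditStrategy(
        "Sensitive calls", "Potential vulnerability points", "User-controllable input",
        "Reverse", "Check whether candidates can be triggered", 2, 4, 2,
    ),
]

# When time is short, one option is to start with the fastest strategy.
fastest = max(strategies, key=lambda s: s.speed)
print(fastest.name)
```

Keeping the ratings as integers makes it easy to sort or filter strategies when planning each phase of the audit.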
Strategy 1: Top-down
Data analysis
Attribute | Description |
---|---|
Starting point | Data entry points, such as function parameters, environment variables, etc. |
End point | The final vulnerability trigger point, such as privilege escalation, injection, memory corruption, etc. |
Method | Forward analysis, data-flow sensitive, control-flow sensitive |
Objective | Discover security vulnerabilities that can be triggered by malicious input |
Difficulty | ★★★★ |
Speed | ★ |
Understanding | ★★★★ |
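The forward direction of this strategy can be sketched as a reachability search over a data-flow graph. The Python example below is a toy under stated assumptions: the graph, the source names, and the sink names are hand-written and hypothetical, and the search models only data-flow edges, not control flow.

```python
from collections import deque


def reachable_sinks(flows, sources, sinks):
    """Trace forward from each source and return {source: sorted sinks reached}."""
    sink_set = set(sinks)
    result = {}
    for source in sources:
        seen = {source}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in flows.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        result[source] = sorted(seen & sink_set)
    return result


# Hypothetical data-flow edges: "value on the left flows into values on the right".
flows = {
    "argv[1]": ["filename"],
    "filename": ["path"],
    "path": ["system()"],
    "getenv(HOME)": ["home_dir"],
    "home_dir": ["log_message"],
}
sources = ["argv[1]", "getenv(HOME)"]
sinks = ["system()", "memcpy()"]

print(reachable_sinks(flows, sources, sinks))
```

In a real audit the edges come from reading the code. The value of the exercise is the record of which entry points can reach which trigger points.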
Module Analysis
Attribute | Description |
---|---|
Starting point | Beginning of the file |
End point | End of the file |
Method | Forward analysis, data-flow insensitive, control-flow insensitive |
Objective | Read every function in the module and only document potential problems |
Difficulty | ★★★★★ |
Speed | ★★ |
Understanding | ★★★★★ |
Reference Analysis
Attribute | Description |
---|---|
Starting point | The implementation of an object |
End point | All references to this object (xrefs) |
Method | Forward analysis, data-flow insensitive, control-flow insensitive |
Objective | Learn the interfaces and implementations of important objects, and find errors caused by misuse of those interfaces |
Difficulty | ★★★★ |
Speed | ★★ |
Understanding | ★★★★★ |
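Collecting every reference to an object is usually done with an IDE or a cross-reference tool. As a small stand-in, the Python sketch below uses the standard `ast` module to list the lines where a given function name is called in Python source. The helper name `find_call_sites` and the sample code are hypothetical, and matching by name alone will also report unrelated methods that share the name.

```python
import ast


def find_call_sites(source: str, function_name: str):
    """Return sorted line numbers where `function_name` is called by name or as a method."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # Plain call: load(x). Attribute call: cache.load(x).
            called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if called == function_name:
                lines.append(node.lineno)
    return sorted(lines)


# Hypothetical module under audit (only parsed, never executed).
sample = '''
def load(path):
    return open(path).read()

def main(user_path):
    data = load(user_path)
    cache.load("static.cfg")
    return data
'''

print(find_call_sites(sample, "load"))
```

Each reported line is a place to check how the interface is actually used, for example whether callers pass user-controlled paths.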
Algorithm Analysis
Attribute | Description |
---|---|
Starting point | Beginning of the algorithm |
End point | End of the algorithm |
Method | Forward analysis, data-flow insensitive, control-flow insensitive |
Objective | Analyze algorithm implementations and identify potential design and implementation issues |
Difficulty | ★★★★★ |
Speed | ★★ |
Understanding | ★★★★★ |
Strategy 2: Bottom-up
Sensitive calls
Attribute | Description |
---|---|
Starting point | Potential vulnerability points |
End point | Arbitrary user-controllable input |
Method | Reverse analysis, data-flow sensitive, control-flow sensitive |
Objective | Given a list of potential vulnerabilities, analyze whether they can be triggered and exploited |
Difficulty | ★★ |
Speed | ★★★★ |
Understanding | ★★ |
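A list of potential vulnerability points is often built by searching for calls that are frequently misused. The following Python sketch does this with a regular expression over C source text. The call list in `SENSITIVE_CALLS` and the sample function are assumptions for illustration. A plain text search also matches calls inside comments and string literals, so every hit still needs to be traced back manually to user-controllable input.

```python
import re

# Hypothetical starting list; extend it for the language and project at hand.
SENSITIVE_CALLS = ("strcpy", "sprintf", "system", "memcpy")
CALL_PATTERN = re.compile(r"\b(" + "|".join(SENSITIVE_CALLS) + r")\s*\(")


def find_sensitive_calls(source: str):
    """Return (line_number, function_name, stripped_line) for each candidate call."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            hits.append((number, match.group(1), line.strip()))
    return hits


c_source = """\
void handle(char *input) {
    char buf[64];
    strcpy(buf, input);
    my_strcpy(buf, input);
    system(buf);
}
"""

for hit in find_sensitive_calls(c_source):
    print(hit)
```

The word boundary in the pattern keeps wrappers such as `my_strcpy` out of the results, which may or may not be what you want. If the project wraps dangerous calls, add the wrapper names to the list.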
Vulnerability Scanning Tools
Attribute | Description |
---|---|
Starting point | Potential vulnerability points |
End point | Arbitrary user-controllable input |
Method | Reverse analysis, data-flow sensitive, control-flow sensitive |
Objective | Given a list of potential vulnerabilities, analyze whether they can be triggered and exploited |
Difficulty | ★ |
Speed | ★★★★ |
Understanding | ★ |
Interface Analysis
Attribute | Description |
---|---|
Starting point | Application object interface or function call |
End point | Arbitrary user-controllable input |
Method | Reverse analysis, data-flow sensitive, control-flow sensitive |
Objective | Given a list of potential vulnerabilities, analyze whether they can be triggered and exploited |
Difficulty | ★ |
Speed | ★★★★ |
Understanding | ★ |
Strategy 3: See the big picture from the small details
System Modeling
Attribute | Description |
---|---|
Starting point | Starting point of the module to be audited |
End point | Security vulnerabilities |
Method | Flexible; adapt to the situation |
Objective | Reconstruct the abstract behavior of the module through behavioral modeling and find potential logic and functional flaws |
Difficulty | ★★★★ |
Speed | ★★ |
Understanding | ★★★★★ |
Security Boundary
Attribute | Description |
---|---|
Starting point | All security-related checks and validation code |
End point | Security vulnerabilities |
Method | Flexible; adapt to the situation |
Objective | Use the known security-related code to infer the security boundaries intended by the design |
Difficulty | ★★★★ |
Speed | ★★★ |
Understanding | ★★★★★ |
Design Verification
Attribute | Description |
---|---|
Starting point | Starting point of the module |
End point | End of the module |
Method | Forward analysis, control-flow sensitive, data-flow sensitive |
Objective | Discover vulnerabilities where the implementation deviates from the design |
Difficulty | ★★★ |
Speed | ★★★ |
Understanding | ★★★ |