Language-based security
In computer science, language-based security (LBS) is a set of techniques that may be used to strengthen the security of applications at a high level by using the properties of programming languages. LBS is considered to enforce computer security at the application level, making it possible to prevent vulnerabilities that traditional operating system security is unable to handle.
Software applications are typically specified and implemented in particular programming languages, and protecting against the attacks, flaws and bugs to which an application's source code may be vulnerable requires application-level security: security that evaluates the application's behavior with respect to its programming language. This area is generally known as language-based security.
Motivation
Large software systems, such as SCADA, are in use all around the world,[1] and computer systems constitute the core of many infrastructures. Society relies heavily on infrastructure such as water, energy, communication and transportation, all of which in turn rely on fully functioning computer systems. There are several well-known examples of critical systems failing because of bugs or errors in software, such as when a shortage of computer memory caused LAX computers to crash and hundreds of flights to be delayed (April 30, 2014).[2][3]
Traditionally, the mechanisms used to control the correct behavior of software are implemented at the operating system level. The operating system handles several possible security violations, such as memory access violations, stack overflow violations and access control violations. This is a crucial part of security in computer systems; however, even stronger security can be achieved by securing the behavior of software at a more specific level. Because many properties and much of the behavior of the software are lost in compilation, it is significantly more difficult to detect vulnerabilities in machine code. By evaluating the source code before compilation, the theory and implementation of the programming language can also be taken into account, and more vulnerabilities can be uncovered.
"So why do developers keep making the same mistakes? Instead of relying on programmers’ memories, we should strive to produce tools that codify what is known about common security vulnerabilities and integrate it directly into the development process."
— D. Evans and D. Larochelle, 2002
Objective of LBS
By using LBS, the security of software can be increased in several areas, depending on the techniques used. Common programming errors, such as allowing buffer overflows and illegal information flows, can be detected and disallowed in the software used by the consumer. It is also desirable to provide some proof to the consumer about the security properties of the software, enabling the consumer to trust the software without having to receive the source code and check it for errors themselves.
A compiler, taking source code as input, performs several language-specific operations on the code in order to translate it into machine-readable code. Lexical analysis, preprocessing, parsing, semantic analysis, code generation and code optimization are all commonly used operations in compilers. By analyzing the source code and using the theory and implementation of the language, the compiler attempts to correctly translate the high-level code into low-level code while preserving the behavior of the program.

During compilation of programs written in a type-safe language, such as Java, the source code must type-check successfully before it can be compiled. If the type check fails, compilation is not performed and the source code must be modified. This means that, given a correct compiler, any code compiled from a successfully type-checked source program is free of invalid-assignment errors. This information can be valuable to the code consumer, as it provides some degree of guarantee that the program will not crash because of this specific class of error.
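As a minimal illustration, the following Java fragment is intentionally ill-typed. The compiler rejects it with an incompatible-types error, so no bytecode is ever produced and the erroneous program can never reach the consumer:

```java
public class TypeCheckDemo {
    public static void main(String[] args) {
        // Rejected at compile time: a String cannot be assigned to an int.
        // javac reports "incompatible types" and produces no class file,
        // so this invalid assignment can never occur at run time.
        int count = "forty-two";
    }
}
```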
A goal of LBS is to ensure the presence of certain properties in the source code that correspond to the safety policy of the software. Information gathered during compilation can be used to create a certificate that is provided to the consumer as a proof of safety of the given program. Such a proof requires that the consumer can trust the compiler used by the supplier and that the certificate, that is, the information about the source code, can be verified.
The figure illustrates how certification and verification of low-level code can be established through the use of a certifying compiler. The software supplier gains the advantage of not having to reveal its source code, and the consumer is left with the task of verifying the certificate, which is an easy task compared to evaluating and compiling the source code itself. Verifying the certificate requires only a limited trusted code base containing the compiler and the verifier.
Techniques
Security type system
- Main article: Security type system
A security type system is a kind of type system that software developers can use to control the information flow in their code. A security type system consists of a set of rules used to verify that a computer program complies with a given information-flow policy, usually at compile time. This can reveal violations of confidentiality or integrity in the program.
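A real security type system enforces its rules statically, at compile time. As a rough, hypothetical sketch of the underlying rule, the following Java program attaches confidentiality labels to values and rejects any flow from a high (secret) source to a low (public) sink, albeit at run time; all class and method names here are illustrative, not part of any standard library:

```java
public class FlowDemo {
    // Confidentiality labels: LOW is public, HIGH is secret.
    enum Label { LOW, HIGH }

    // A value tagged with its confidentiality label.
    record Labeled<T>(T value, Label label) {}

    // A LOW (public) sink: only LOW-labeled data may be written here.
    // A security type system would reject violations at compile time;
    // this sketch checks the same rule at run time instead.
    static void publicOutput(Labeled<String> data) {
        if (data.label() == Label.HIGH) {
            throw new SecurityException("illegal flow: HIGH data to LOW sink");
        }
        System.out.println(data.value());
    }

    public static void main(String[] args) {
        Labeled<String> greeting = new Labeled<>("hello", Label.LOW);
        Labeled<String> password = new Labeled<>("s3cret", Label.HIGH);

        publicOutput(greeting);  // accepted: LOW data to a LOW sink
        publicOutput(password);  // rejected: confidentiality violation
    }
}
```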
Securing low-level code
Vulnerabilities in low-level code are bugs or flaws that lead the program into a state in which its further behavior is undefined by the source programming language. The behavior of the low-level program then depends on compiler, runtime system or operating system details. This allows an attacker to drive the program into such an undefined state and exploit the resulting behavior of the system.
Common exploits of insecure low-level code let an attacker perform unauthorized reads from or writes to memory addresses, which may be either random or chosen by the attacker.
Using safe languages
One approach to achieving secure low-level code is to use a safe high-level language. A safe language is considered to be completely defined by its programmer's manual.[4] Any bug that could lead to implementation-dependent behavior in a safe language is either detected at compile time or leads to well-defined error behavior at run time. In Java, for example, accessing an array out of bounds throws an exception. Other examples of safe languages are C#, Haskell and Scala.
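The following small Java program demonstrates this well-defined error behavior: the out-of-bounds access raises an ArrayIndexOutOfBoundsException rather than silently reading adjacent memory, as unchecked low-level code might:

```java
public class SafeLanguageDemo {
    public static void main(String[] args) {
        int[] numbers = new int[3];  // valid indices are 0, 1 and 2
        try {
            int x = numbers[3];      // out of bounds: a well-defined error
            System.out.println(x);
        } catch (ArrayIndexOutOfBoundsException e) {
            // The violation is caught and reported instead of producing
            // the undefined behavior an unsafe language would allow.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```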
Defensive execution of unsafe languages
During compilation of an unsafe language, run-time checks are added to the low-level code in order to detect source-level undefined behavior. An example is the use of canaries, which can terminate a program upon discovering a bounds violation. A downside of using run-time checks such as bounds checking is that they impose considerable performance overhead.
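As a loose sketch of the idea, the following Java program simulates a stack canary with a plain array: a known guard value is placed after a buffer, a buggy copy overruns into it, and a check inserted after the copy terminates the program instead of letting it continue in a corrupted state. This is illustrative only; real canaries are placed in stack frames by the compiler, not written by hand:

```java
import java.util.Arrays;

public class CanaryDemo {
    static final int CANARY = 0xDEADBEEF;  // known guard value

    public static void main(String[] args) {
        // Eight slots of user data followed by one canary slot.
        int[] frame = new int[9];
        frame[8] = CANARY;

        // A buggy copy that writes one element too many, as an
        // unchecked C-style copy might.
        int[] input = new int[9];
        Arrays.fill(input, 42);
        for (int i = 0; i < input.length; i++) {
            frame[i] = input[i];  // the last iteration clobbers the canary
        }

        // The inserted run-time check: if the canary was overwritten,
        // terminate immediately rather than continue in a corrupted state.
        if (frame[8] != CANARY) {
            throw new IllegalStateException("canary overwritten: aborting");
        }
    }
}
```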
Memory protection, such as a non-executable stack and/or heap, can also be seen as an additional run-time check and is used by many modern operating systems.
Obfuscating execution platform details
This principle makes the addresses used by a process inconsistent across runs and harder for an attacker to predict. The most popular instantiation of this principle is address space layout randomization (ASLR). With ASLR, a process's stack, heap and libraries are loaded at different memory offsets on each execution.
Isolated execution of modules
The general idea is to separate sensitive code and data from the rest of the application by placing them in isolated modules. Assuming that each module has total control over the sensitive information it contains, it is possible to specify when and how that information may leave the module. An example is a cryptographic module that prevents keys from ever leaving the module unencrypted.
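The following is a minimal sketch of such a module in Java, using the standard javax.crypto API; the class and method names are hypothetical. The key is generated inside the module and no accessor exposes it, so only the results of using the key cross the module boundary:

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public final class CryptoModule {
    private final SecretKey key;  // sensitive data confined to this module

    public CryptoModule() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        this.key = gen.generateKey();  // the key never leaves this object
    }

    // Only the result of using the key crosses the module boundary.
    public byte[] encrypt(byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return cipher.doFinal(plaintext);
    }

    // Deliberately no getKey(): the raw key cannot leave unencrypted.

    public static void main(String[] args) throws GeneralSecurityException {
        CryptoModule module = new CryptoModule();
        byte[] ciphertext = module.encrypt("top secret".getBytes());
        System.out.println("ciphertext length: " + ciphertext.length);
    }
}
```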
Certifying compilation
Certifying compilation is the idea of producing a certificate during the compilation of source code, using information from the semantics of the high-level programming language. This certificate is shipped along with the compiled code as a form of proof to the consumer that the source code was compiled according to a certain set of rules. The certificate can be produced in different ways, e.g. through Proof Carrying Code (PCC) or Typed Assembly Language (TAL).
Proof Carrying Code
The main aspects of PCC can be summarized in the following steps:
- The supplier provides a program with various annotations produced by a certifying compiler.
- The consumer provides a verification condition, based on a security policy. This is sent to the supplier.
- The supplier runs the verification condition in a theorem prover to produce a proof to the consumer that the program in fact satisfies the security policy.
- The consumer then runs the proof through a proof checker to verify its validity.
An example of a certifying compiler is the Touchstone compiler, which produces PCC proofs of type and memory safety for programs written in a type-safe subset of C.
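The division of labor can be illustrated with a deliberately tiny sketch in Java; all names here are hypothetical, and a real PCC certificate is a logical proof checked against a verification condition, not a single number. The supplier ships a stack-machine program together with a certificate (its claimed maximum stack depth), and the consumer's small, trusted verifier checks the claim against its own safety policy in one linear pass, without having to trust the supplier:

```java
import java.util.List;

public class ToyCertifiedCode {
    enum Op { PUSH, POP, ADD }  // ADD pops two operands and pushes one result

    // Code shipped together with its certificate.
    record Certified(List<Op> code, int claimedMaxDepth) {}

    // Consumer-side verifier: a single linear pass over untrusted code.
    static boolean verify(Certified c, int policyLimit) {
        int depth = 0, max = 0;
        for (Op op : c.code()) {
            switch (op) {
                case PUSH -> depth += 1;
                case POP  -> depth -= 1;
                case ADD  -> depth -= 1;  // net effect: two pops, one push
            }
            if (depth < 0) return false;  // stack underflow: reject the code
            max = Math.max(max, depth);
        }
        // Accept only if the certificate is honest and within the policy.
        return max <= c.claimedMaxDepth() && c.claimedMaxDepth() <= policyLimit;
    }

    public static void main(String[] args) {
        Certified ok  = new Certified(List.of(Op.PUSH, Op.PUSH, Op.ADD), 2);
        Certified bad = new Certified(List.of(Op.POP), 0);
        System.out.println(verify(ok, 4));   // true: certified and safe
        System.out.println(verify(bad, 4));  // false: underflows the stack
    }
}
```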
Typed Assembly Language
TAL is applicable to programming languages that make use of a type system. After compilation, the object code carries type annotations that can be checked by an ordinary type checker. The annotations produced are in many ways similar to those provided by PCC, with some limitations. Nevertheless, TAL can handle any security policy that can be expressed through the restrictions of the type system, which includes memory safety and control flow, among others.
Papers
- "Language based security", Dexter Kozen, Cornell University
- "Language-based security - A research manifesto for a Dagstuhl Seminar, 2003"
Books
- "Secure information flow: definition, enforcement, and preservation through compilation", G. Barthe, B. Grégoire, A. Matos,T. Rezk, 2011
References
- ^ "Can we learn from SCADA security incidents?" (PDF). www.oas.org. enisa.
- ^ "Air Traffic Control System Failed". www.computerworld.com. Retrieved 12 May 2014.
- ^ "Software Bug Contributed to Blackout". www.securityfocus.com. Retrieved 11 February 2004.
- ^ Pierce, Benjamin C. (2002). Types and Programming Languages. The MIT Press.