Source-to-source compiler
A source-to-source compiler is a type of compiler that takes a program written in a high-level programming language as its input and outputs a program in a high-level language. For example, an automatic parallelizing compiler will frequently take a high-level language program as input, transform the code, and annotate it with parallel code annotations (e.g., OpenMP) or language constructs (e.g., Fortran's DOALL statements).[1]
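The annotation step of such a compiler can be illustrated with a toy example. The following Python sketch is not how any real parallelizing compiler works: it assumes that a prior dependence analysis (not shown) has already decided which loops are safe to run in parallel, and the names `annotate_parallel_loops` and `safe_loop_lines` are hypothetical.

```python
from typing import Set


def annotate_parallel_loops(c_source: str, safe_loop_lines: Set[int]) -> str:
    """Insert an OpenMP pragma above each C `for` loop marked as safe.

    `safe_loop_lines` holds 1-based line numbers that an earlier
    (hypothetical) dependence analysis judged free of loop-carried
    dependences. This function performs no analysis of its own.
    """
    output = []
    for lineno, line in enumerate(c_source.splitlines(), start=1):
        stripped = line.lstrip()
        if lineno in safe_loop_lines and stripped.startswith("for"):
            # Reuse the loop's indentation so the output stays readable.
            indent = line[: len(line) - len(stripped)]
            output.append(indent + "#pragma omp parallel for")
        output.append(line)
    return "\n".join(output)


if __name__ == "__main__":
    source = (
        "void scale(int *a, int n) {\n"
        "    for (int i = 0; i < n; i++)\n"
        "        a[i] *= 2;\n"
        "}"
    )
    print(annotate_parallel_loops(source, {2}))
```

Both the input and the output are C source text, which is what distinguishes this kind of tool from a compiler that emits machine code.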
Another purpose of source-to-source compiling is translating legacy code to use the next version of the underlying programming language or an API that breaks backward compatibility. Such a compiler performs automatic code refactoring, which is useful when the programs to refactor are outside the control of the original implementer (for example, converting programs from Python 2 to Python 3, or from an old API to a new one) or when the size of the program makes refactoring by hand impractical or time-consuming.
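As a rough illustration of API migration, the sketch below uses Python's standard `ast` module to rename calls to a function throughout a program. The function names `old_fetch` and `new_get` are hypothetical, and the approach assumes Python 3.9 or later, where `ast.unparse` is available. Note that `ast.unparse` discards comments and original formatting, which production refactoring tools usually take care to preserve.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rename every call to `old_name(...)` into `new_name(...)`."""

    def __init__(self, old_name, new_name):
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node):
        self.generic_visit(node)  # handle calls nested in the arguments
        func = node.func
        if isinstance(func, ast.Name) and func.id == self.old_name:
            renamed = ast.Name(id=self.new_name, ctx=ast.Load())
            node.func = ast.copy_location(renamed, func)
        return node


def migrate(source, old_name, new_name):
    """Return `source` with calls to `old_name` redirected to `new_name`."""
    tree = ast.parse(source)
    tree = RenameCall(old_name, new_name).visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    print(migrate("page = old_fetch(url, timeout=3)", "old_fetch", "new_get"))
```

Only call sites are rewritten; other uses of the old name, such as assignments to it, are left unchanged.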
Examples
DMS Software Reengineering Toolkit
DMS Software Reengineering Toolkit is a source-to-source program transformation tool, parameterized by explicit source and target computer language definitions (which may be the same). It can be used for translating from one computer language to another, for compiling domain-specific languages to a general-purpose language, or for carrying out optimizations or massive modifications within a specific language. DMS has a library of language definitions for most widely used computer languages (including full C++) and a means for defining other languages that it does not presently know.
LLVM
Low Level Virtual Machine (LLVM) can translate from any language supported by gcc 4.2.1 (Ada, C, C++, Fortran, Java, Objective-C, or Objective-C++) to any of C, C++, or MSIL by way of the -march option of the llc tool, applied to bitcode produced by llvm-gcc.
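    # C++ to C: emit LLVM bitcode, translate it to C, then build the C file
    % llvm-g++ -emit-llvm -c x.cpp -o program.bc
    % llc -march=c program.bc -o program.c
    % cc program.c

    # C++ to MSIL: emit LLVM bitcode, then translate it to MSIL
    % llvm-g++ -emit-llvm -c x.cpp -o program.bc
    % llc -march=msil program.bc -o program.msil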
The -march option can also emit assembly language code for these architectures:
- x86: 32-bit X86 for Pentium Pro CPUs and above
- x86-64: 64-bit X86 for EM64T and AMD64 CPUs
- sparc: SPARC
- ppc32: PowerPC 32
- ppc64: PowerPC 64
- alpha: Alpha (experimental)
- ia64: IA-64 aka Itanium (experimental)
- arm: ARM
- thumb: Thumb
- mips: MIPS architecture CPUs
- mipsel: MIPSel
- cellspu: STI CBEA Cell SPU (experimental)
- pic16: PIC16 14-bit (experimental)
- cooper: PIC16 Cooper (experimental)
- xcore: XCore
Refactoring tools
Refactoring tools automate the transformation of source code from one form into another:
- The Python 3000 2to3 tool transforms non-forward-compatible Python 2 code into Python 3 code.
- Qt's qt3to4 tool converts non-forward-compatible usage of the Qt3 API into Qt4 API usage.
- Coccinelle uses semantic patches to describe refactorings to apply to C code. It has been applied successfully to refactor Linux kernel drivers in response to kernel API changes.[2]
- RefactoringNG is a NetBeans module for refactoring Java code, in which transformation rules are written against a program's abstract syntax tree.
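A transformation rule over an abstract syntax tree can be sketched as a pattern plus a replacement. The example below, written with Python's standard `ast` module (Python 3.9 or later for `ast.unparse`), rewrites the Python 2 idiom `d.has_key(k)` into `k in d`. It is only an illustration of the idea: it is not the implementation used by 2to3, Coccinelle, or RefactoringNG, and the names `HasKeyToIn` and `rewrite_has_key` are hypothetical.

```python
import ast


class HasKeyToIn(ast.NodeTransformer):
    """Rule: a call `X.has_key(K)` becomes the comparison `K in X`."""

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        matches = (
            isinstance(node.func, ast.Attribute)
            and node.func.attr == "has_key"
            and len(node.args) == 1
            and not isinstance(node.args[0], ast.Starred)
            and not node.keywords
        )
        if not matches:
            return node  # pattern did not match; leave the call alone
        replacement = ast.Compare(
            left=node.args[0],
            ops=[ast.In()],
            comparators=[node.func.value],
        )
        return ast.copy_location(replacement, node)


def rewrite_has_key(source):
    """Apply the rule to every matching call in `source`."""
    tree = HasKeyToIn().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    print(rewrite_has_key("found = d.has_key(k)"))
```

Because the rule works on the tree and not on text, the unparser adds parentheses where operator precedence requires them. One caveat of this sketch is that the rewritten expression evaluates the key before the container, the reverse of the original call.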
See also
- Program transformation
- ROSE compiler framework, a source-to-source compiler framework
References
- "Types of compilers". compilers.net. 1997–2005. Retrieved 28 October 2010.
- Valerie Henson (January 20, 2009). "Semantic patching with Coccinelle". lwn.net. Retrieved 28 October 2010.