
C++ syntax

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 129.97.124.174 (talk) at 04:29, 27 February 2025 (References). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
A snippet of C++ code

The syntax of C++ is the set of rules defining how a C++ program is written and compiled.

C++ syntax is largely inherited from the syntax of its ancestor language C, and has influenced the syntax of several later languages including but not limited to Java, C#, and Rust.

Basics

Much of C++'s syntax aligns with C syntax, as C++ provides backwards compatibility with C.

Identifier

An identifier is the name of an element in the code. There are certain standard naming conventions to follow when selecting names for elements. Identifiers in C++ are case-sensitive.

An identifier can contain:

  • Any Unicode character that is a letter (including numeric letters like Roman numerals) or digit.
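  • Connecting punctuation character (such as _).

Currency signs (such as ¥) are not permitted by the standard, although several compilers accept the dollar sign ($) as an extension.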

An identifier cannot:

  • Start with a digit.
  • Be equal to a keyword, which includes the null pointer literal nullptr and the Boolean literals true and false.

The word nullptr is a keyword (since C++11) that denotes the null pointer literal, a value of type std::nullptr_t. It is not a global constant.
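The identifier rules can be illustrated with a short sketch. All names below are hypothetical and chosen only for illustration; the commented-out lines show declarations that a conforming compiler is expected to reject.

```cpp
#include <cassert>

// Identifiers are case-sensitive: these are three distinct variables.
int total = 1;
int Total = 2;
int TOTAL = 3;

// After the first character, letters, digits and underscores may be mixed.
int item_2_count = 4;

// int 2nd_item = 0;  // ill-formed: an identifier cannot start with a digit
// int while = 0;     // ill-formed: while is a keyword

// Adds the three case-distinct variables to show that they do not alias.
int sumOfDistinctNames() {
    assert(&total != &Total && &Total != &TOTAL);  // separate objects
    return total + Total + TOTAL;
}
```

Calling sumOfDistinctNames() from main adds the three separate values, which would not be possible if the names referred to the same variable.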

Keywords

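The following words are reserved and may not be used as identifier names or redefined.[1] The list also includes import and module, which the standard classifies as identifiers with special meaning rather than keywords.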

  • alignas
  • alignof
  • and
  • and_eq
  • asm
  • auto
  • bitand
  • bitor
  • bool
  • break
  • case
  • catch
  • char
  • char8_t
  • char16_t
  • char32_t
  • class
  • compl
  • concept
  • const
  • consteval
  • constexpr
  • constinit
  • const_cast
  • continue
  • contract_assert
  • co_await
  • co_return
  • co_yield
  • decltype
  • default
  • delete
  • do
  • double
  • dynamic_cast
  • else
  • enum
  • explicit
  • export
  • extern
  • false
  • float
  • for
  • friend
  • goto
  • if
  • import
  • inline
  • int
  • long
  • module
  • mutable
  • namespace
  • new
  • noexcept
  • not
  • not_eq
  • nullptr
  • operator
  • or
  • or_eq
  • private
  • protected
  • public
  • register
  • reinterpret_cast
  • requires
  • return
  • short
  • signed
  • sizeof
  • static
  • static_assert
  • static_cast
  • struct
  • switch
  • template
  • this
  • thread_local
  • throw
  • true
  • try
  • typedef
  • typeid
  • typename
  • union
  • unsigned
  • using
  • virtual
  • void
  • volatile
  • wchar_t
  • while
  • xor
  • xor_eq
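Several of the words in this list are alternative spellings of operators rather than statements or type names. The following is a minimal sketch of how they read in practice; the function names are hypothetical.

```cpp
// The alternative tokens and, or, not are exact synonyms for &&, || and !.
bool inRange(int value, int low, int high) {
    return value >= low and value <= high;
}

bool outOfRange(int value, int low, int high) {
    return not inRange(value, low, high);
}

// bitand, bitor, xor and compl stand for &, |, ^ and ~.
unsigned maskLowNibble(unsigned bits) {
    return bits bitand 0x0Fu;
}

// constexpr and static_assert are keywords too; this check runs at compile time.
constexpr int square(int x) { return x * x; }
static_assert(square(4) == 16, "square is evaluated during compilation");
```

Because these spellings are reserved, a declaration such as int and = 0; is rejected in the same way as int while = 0; would be.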

Identifiers with special meaning

The following words may be used as identifier names, but bear special meanings in certain contexts.

  • final
  • override
  • pre
  • post
  • trivially_relocatable_if_eligible
  • replaceable_if_eligible
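As a sketch of the difference between these words and true keywords, the example below uses final and override first as specifiers and then as ordinary variable names. The class and function names are hypothetical.

```cpp
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};

// In these positions, final and override carry their special meaning.
struct Triangle final : Shape {
    int sides() const override { return 3; }
};

// Elsewhere the same words are ordinary identifiers.
int countWithOrdinaryNames() {
    int final = 1;     // a local variable named final
    int override = 2;  // a local variable named override
    return final + override;
}
```

Naming variables this way is legal but rarely advisable, since it makes the code harder to read.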

Preprocessor directives

The following tokens are recognised by the preprocessor in the context of preprocessor directives.

  • #if
  • #elif
  • #else
  • #endif
  • #ifdef
  • #ifndef
  • #elifdef
  • #elifndef
  • #define
  • #undef
  • #include
  • #embed
  • #line
  • #error
  • #warning
  • #pragma
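
The following operators are written without a # prefix and are recognised only inside the condition of a conditional directive such as #if or #elif:

  • defined
  • __has_include
  • __has_cpp_attribute
  • __has_embed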
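A few of these directives can be seen working together in the sketch below. The macro names BUFFER_SIZE and SQUARE and the function squaredBufferSize are hypothetical; the optional inclusion of the version header assumes a compiler that supports __has_include (C++17 or later).

```cpp
#define BUFFER_SIZE 64         // object-like macro
#define SQUARE(x) ((x) * (x))  // function-like macro

// defined appears inside the #if condition to test whether a macro exists.
#if defined(BUFFER_SIZE) && BUFFER_SIZE > 0
constexpr int bufferSize = BUFFER_SIZE;
#else
#  error "BUFFER_SIZE must be a positive integer"
#endif

// __has_include checks whether a header can be found before including it.
#ifdef __has_include
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

#undef BUFFER_SIZE  // the macro is removed; bufferSize keeps the value 64

int squaredBufferSize() {
    return SQUARE(bufferSize);  // expands to ((bufferSize) * (bufferSize))
}
```

The parentheses in SQUARE guard against precedence surprises when the argument is an expression such as a + 1.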

Code blocks

The separators { and } delimit a code block and introduce a new scope. Class members and the body of a function are examples of what can appear inside these braces in various contexts.

Inside function bodies, braces may be used to create nested scopes, as follows:

void doSomething() {
    int a;

    {
        int b;
        a = 1;
    }

    a = 2;
    b = 3; // Illegal because the variable b is declared in an inner scope.
}
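A compilable variation on the same idea is sketched below. It shows two consequences of block scope: an inner declaration may shadow an outer one, and objects are destroyed when the block that declares them ends. ScopeLogger and traceScopes are hypothetical names introduced for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: appends a message to a log when it is destroyed.
struct ScopeLogger {
    std::vector<std::string>& log;
    std::string name;
    ~ScopeLogger() { log.push_back("leave " + name); }
};

std::vector<std::string> traceScopes() {
    std::vector<std::string> log;
    int value = 1;
    {
        ScopeLogger outer{log, "outer"};
        int value = 2;  // shadows the outer value inside this block only
        {
            ScopeLogger inner{log, "inner"};
            log.push_back("inner value " + std::to_string(value));
        }  // inner is destroyed here
    }      // outer is destroyed here
    log.push_back("outer value " + std::to_string(value));
    return log;
}
```

The returned log lists the entries in the order "inner value 2", "leave inner", "leave outer", "outer value 1", which reflects that destructors run in reverse order of construction as each block closes.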

Classes

Templates and concepts

Modules

Since C++20, C++ has offered modules as a modern alternative to header files and precompiled headers.[2] Unlike headers, modules are not brought in with the preprocessor directive #include; they are accessed with an import declaration. A file is marked as belonging to a module by a module declaration, which uses the word module.

Like precompiled headers, modules compile much faster than traditional headers brought in with #include, and they are processed much faster during the linking phase.[3] They also greatly reduce boilerplate, because code can be implemented in a single file instead of being split across an interface file and an implementation file, as was typical before modules were introduced. Modules also remove the need for #include guards or #pragma once: an #include directive pastes the text of the specified header into the source during preprocessing, whereas importing a module does not modify the source text. Importing a module is therefore handled during the compilation phase rather than by the preprocessor. Unlike headers, modules do not have to be processed multiple times during compilation.[3]
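For contrast, the #include guard idiom that modules make unnecessary can be sketched in a single file. The text of a hypothetical header point.h is pasted twice below, which is effectively what happens when two #include directives reach the same header; the guard macro POINT_H causes the second copy to be skipped.

```cpp
// --- first textual copy of the hypothetical header point.h ---
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif  // POINT_H

// --- second copy, as if another header had included point.h again ---
#ifndef POINT_H  // already defined, so everything up to #endif is skipped
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif  // POINT_H

// Uses the single surviving definition of Point.
int manhattanLength(Point p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
```

Without the guard, the second copy would redefine Point and the translation unit would fail to compile.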

C++ module files often use the extension .cppm; alternatives include .ixx and .mxx.[4] Every symbol in a module that should be accessible outside the module must be marked export.

Modules do not allow granular imports of specific namespaces, classes, or symbols, unlike Java or Rust, which do. Importing a module imports every symbol marked with export, which makes it akin to a wildcard import in Java or Rust.

Since C++23, the C++ standard library has also been available as a module, though at present it must be imported in its entirety (using import std;). The module names std and std.* are reserved by the C++ standard,[5] although most compilers provide a flag to override this.[6]

Modules may not export or leak macros, and because of this the order of imports does not matter (by convention, however, standard library imports come first, then project imports, then external dependency imports, in alphabetical order).[3] If a module must re-export an imported module, it can do so with export import, which imports the module and then exports it from the importing module.[2]

Modules may have partitions, which split the implementation of a module across several files. A partition is declared with the syntax A:B, meaning that the module A has the partition B. A partition cannot be imported from outside the module that owns it, so code that needs something defined in a partition must import the entire owning module.[2]

To link the module partition B back to the owning module A, write import :B; inside the file containing the declaration of module A. These import declarations may themselves be exported by the owning module (export import :B;), even though the partition cannot be imported directly from outside.

C++ modules have no hierarchical system, but typically follow a hierarchical naming convention, like Java's packages. In other words, C++ has no "submodules": the . symbol that may appear in a module name has no syntactic meaning and only suggests an association between modules.[2] For example, the modules A and A.B are formally unrelated and need not have any connection, although such a naming scheme is often used to suggest that A.B is somehow related to A.

A simple example of using C++ modules is as follows:

Hello.cppm

export module Hello;

import std;

export namespace Hello {
    void printHello() {
        std::println("Hello world!");
    }
}

Main.cpp

import Hello;

int main() {
    Hello::printHello();
}

Everything above the line export module Hello; in the file Hello.cppm is said to be "outside the module purview", meaning outside the scope of the module. If headers must be included, the #include directives are typically placed outside the module purview, between a line containing only module; and the export module declaration. This region is known as the global module fragment:

module;

#include <print>

#include "MyHeader.h"

export module MyModule;

The name of a module is not tied to the name of the file, unlike Java in which the name of a file must match the name of the class it declares, and the package it belongs to must match the path it is located in.

The file containing main() cannot be a module.

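All code that does not belong to any named module belongs to the global module (sometimes called the "unnamed module"), and thus cannot be imported by any module. The global module should not be confused with the global module fragment, which is the part of a module file between module; and the module declaration.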

Headers may also be imported using import, even if they are not declared as modules; these are called "header units".[7] The syntax resembles that of #include, except that #include is replaced with import and the statement ends with a semicolon. As with #include, the choice between quotation marks and angle brackets determines how the file is searched for.

See also

References

  1. cppreference.com (2025). "C++ keywords". Retrieved 2025-02-26.
  2. cppreference.com (2025). "Modules (since C++20)". Retrieved 2025-02-20.
  3. "Compare header units, modules, and precompiled headers". Microsoft. 12 February 2022.
  4. "Overview of modules in C++". Microsoft. 24 April 2023.
  5. cppreference.com (2025). "C++ Standard Library". Retrieved 2025-02-20.
  6. "Standard C++ modules".
  7. "Walkthrough: Build and import header units in Microsoft Visual C++". Microsoft. 12 April 2022.