Talk:Thread (computing)

This is the talk page for discussing improvements to the Thread (computing) article.
This is not a forum for general discussion of the article's subject.

Put new text under old text. Click here to start a new topic.
New to Wikipedia? Welcome! Learn to edit; get help.

Article policies

Find sources: Google (books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL

Archives: 1: 2 months

Template:Vital article

This article has not yet been rated on Wikipedia's content assessment scale.
It is of interest to the following WikiProjects:

Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.

Computing: Software / CompSci C‑class Mid‑importance

This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.ComputingWikipedia:WikiProject ComputingTemplate:WikiProject ComputingComputing

C

This article has been rated as C-class on Wikipedia's content assessment scale.

Mid

This article has been rated as Mid-importance on the project's importance scale.

This article is supported by WikiProject Software (assessed as High-importance).

This article is supported by WikiProject Computer science (assessed as High-importance).

Things you can help WikiProject Computer science with:

Here are some tasks awaiting attention:

Article requests :
- Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup :
- Computer science articles needing attention
- Computer science articles needing expert attention
Copyedit :
- Computing
Expand :
- Computer science
Infobox :
- Computer science articles without infoboxes
Maintain :
- Timeline of computing 2020–present
Photo :
- Find pictures for the biographies of computer scientists (see List of computer scientists)
- Computing articles needing images
Stubs :
- Computer science stubs
Unreferenced :
- WikiProject Computer science/Unreferenced BLPs
Project-related :
- Tag all relevant articles in Category:Computer science and sub-categories with {{WikiProject Computer science}}

Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.

Computer science C‑class High‑importance

This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.Computer scienceWikipedia:WikiProject Computer scienceTemplate:WikiProject Computer scienceComputer science

C

This article has been rated as C-class on Wikipedia's content assessment scale.

High

This article has been rated as High-importance on the project's importance scale.

Things you can help WikiProject Computer science with:

Here are some tasks awaiting attention:

Article requests :
- Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup :
- Computer science articles needing attention
- Computer science articles needing expert attention
Copyedit :
- Computing
Expand :
- Computer science
Infobox :
- Computer science articles without infoboxes
Maintain :
- Timeline of computing 2020–present
Photo :
- Find pictures for the biographies of computer scientists (see List of computer scientists)
- Computing articles needing images
Stubs :
- Computer science stubs
Unreferenced :
- WikiProject Computer science/Unreferenced BLPs
Project-related :
- Tag all relevant articles in Category:Computer science and sub-categories with {{WikiProject Computer science}}

Untitled

Thanks for fleshing out the stub, Lee. I knew this stuff, but it wasn't on the tip of my tongue (one step further buried in the memory banks). User:Ed Poor

Renaming

Just for note. I renamed the article to thread (computer science) because threading not necessarily only in software enginnering. -- Taku 16:14, Mar 28, 2004 (UTC)

System call for a context switch?

The article says "Typically fibers are implemented entirely in userspace. As a result, context switching between fibers in a process is extremely efficient: because the kernel is oblivious to the existence of fibers, a context switch does not require a system call." Why would a context switch require a system call in any situation? Is it talking about non-preemptive multitasking? If so, wouldn't it be better if the article mentioned it? If the article isn't intended to say that a system call is required for a context switch, then it would be better if it were cleared up in the article. It's misleading. It would help if the article (that section) is clearer anyway.

Yeah, it would be more clear to say "a context switch does not require a kernel entry" or some such. I'm not sure that it matters whether or not the thread implementation is preemptively scheduled or not: in a cooperatively scheduled implementation of kernel threads, there is still some kernel state associated with the currently-running thread, and so a kernel entry is some kind is required to switch that state, right? The point is somewhat academic, anyway, as cooperatively-scheduled kernel thread implementations are rare AFAIK. Neilc 04:23, 20 Jun 2005 (UTC)

I made that change. I was trying to figure out how to get something about pre-emptive scheduling of threads involving the kernel programming a hardware interrupt in order to stop the current thread, but couldn't figure out how to word that decently. :) Dianne Hackborn 05:51, 22 Jun 2005 (UTC)

Books

I've added Three .NET threading resources - two of these are admittedly my own published works. I've added them only because there were no other .NET programming resources listed. Feel free to remove them if you feel they are inappropriate. Both of these books are out of print but you can still find them quite often at book stores and from secondary book stores. I have a fourth book published that is still on the market that I didn't list because it covers two other areas as well as threading. The title is Pro .Net 1. 1 Remoting, Reflection, and Threading and the ISBN is 1-590-59452-5. While it is more "current", the fact that it covers two other topics doesn't seem to make it appropriate for this article. Once again, feel free to add it if you feel otherwise. - Sleepnomore 15:46, August 26, 2005 (UTC)

The process with threads image

The potentially leads to misinterpretations.

As it is, unwary readers may think processes are always made up of threads, which is not true. Actually, in many contexts, threads break processes into tasks.

I propose the image, as it is now, to be removed. M. B., Jr. (talk) 23:58, 18 December 2022 (UTC)[reply]

And in the context of the Mach kernel, a task has multiple threads - and a task corresponds somewhat to a process; Darwin's XNU kernel constructs each UN*X process atop a Mach task (and Darwin's pthread library constructs pthreads atop Mach threads).

So what are your defintions of process, thread, and task here? Guy Harris (talk) 01:05, 19 December 2022 (UTC)[reply]

Hi, Guy Harris. I'm glad we agree, since I did not state the opposite whatsoever. As for definitions, I'm not sure what you mean by "your". This is an article about the concept, not about particular implementations. In Modern Operating Systems, Tanenbaum writes that "several different models are possible". Notwithstanding, implementations may be provided in a proper context-wise manner, e.g., precisely explaining the case in the image's subtitle. This way, readers are able to understand the Mach way is not a general rule. Regards, M. B., Jr. (talk) 20:14, 19 December 2022 (UTC)[reply]

You said "many contexts, threads break processes into tasks". I noted that, in Mach, one task, in Mach terminology, can have multiple threads, in Mach terminology, so, in that context, a thread doesn't "break processes into tasks" - there is, at least in Darwin (and probably other systems with Mach-based kernels), there's a one-to-one correspondence between Mach tasks and processes.

And what I mean by "your defintions of process, thread, and task here" is "If you mean by "process" that which is described in process (computing), and if you mean by "thread" that which is described in thread (computing), what do you mean by "task" when you say that "threads break processes into tasks"? And if that isn't what you mean by "process" or "thread", what do you mean?"

Where, and in which edition, does Tanenbaum write that "several different models are possible"? I need to know precesly what he was referring to. He could be referring to the different types of threads mentioned in thread (computing) § Processes, kernel threads, user threads, and fibers, or to the threading models mentioned in thread (computing) § Threading models, or to something else?

And the image is pretty straightforward - a process has multiple threads of control, so that the process can be doing multiple things in parallel. By your use of the word "task", are you referring to the "things" being performed by the threads?

I only brought up Mach because it uses the term "task" as a technical term; OS/360 and its successors use "task" in a different sense, where it seems to cover both what we might think of as processes and what we might think of as threads. The point is that "task" doesn't seem to have a commonly-used technical definition in the fashion that "process" and "thread" appear to; if you wan tto be clearly understood here, a good first step might be to indicate what you mean by "task", so we can determine how "processes are made up of threads" and "threads break processes into tasks" mean something different? Guy Harris (talk) 22:55, 19 December 2022 (UTC)[reply]

Hi, Guy Harris. Thanks for your feedback. Discussions like this one make Wikipedia awesome. Indeed, the concept of task is context-subordinated. I'm referring to the OS concept, not the application one. And yes, the process to task conversion constitutes a thread hijack technique (and maybe a thread rescue technique as well?). Because processes themselves live in the main memory. Despite the name, the CPU deals with threads and tasks, not processes. If one wants processes' data to be actually processed, such data should be somehow turned into threads and/or tasks. Anyway, the point is: it is probably in the best interest of Wikipedia's readers to understand concepts, and not stick to particular definitions and/or implementations. The book is ISBN 0-13-595752-4, chapter 12 Processes and Processors in Distributed Systems. Regards, M. B., Jr. (talk) 17:11, 20 December 2022 (UTC)[reply]

So where is the OS concept of a "task", as distinct from a "thread" or a "process", defined?

And what is a "process to task conversion", and how does it "constitute a thread hijack technique (and maybe a thread rescue technique as well?)"

Processes "live in the main memory" in a number of senses: 1) in many operating systems, one resource that a process has and that is shared by all the threads in the process is its address space - not all of which necessiarly resides in main memory - and 2) the OS maintains state about the process, typically in wired main memory. Point 2), however, applies to threads and, in the case of OSes that use the term "task", tasks.

And most CPUs deal with instructions, not threads (in the thread (computing) sense) or tasks. The notion that a CPU is working on a particular "thread", "process", or "task" is, on most CPUs and operating systems, defined by the operating system, not by the CPU. Guy Harris (talk) 20:50, 20 December 2022 (UTC)[reply]

Hi, Guy Harris. Although fun, I guess this is the wrong talk spot for digressions. The article's content should be improved. You have mentioned pthreads (IEEE's POSIX documents for threads) and Mach stuff earlier. Their definitions of task seem to converge. Regards, M. B., Jr. (talk) 21:32, 20 December 2022 (UTC)[reply]

For IEEE's POSIX documents:

The current Single UNIX Standard doesn't define "task" in its Definitions section. It uses the word "task" in its definitions of "batch job", "batch submission", "command" ("A directive to the shell to perform a particular task."), "program", and "Utility"; that uses the term in the common English sense of "something to be done", in the sense that shopping for groceries, for example, is a task.
The SUS's page for pthread_create() only uses the word "task" in the Rationale, where it refers to the Ada notion of a "task". That's discussed in Ada (programming language) § Concurrency, where it says that "Depending on the implementation, Ada tasks are either mapped to operating system threads or processes, or are scheduled internally by the Ada runtime." The Rationale talks about how to implement Ada tasks atop threads.
The SUS's page for pthread_cond_broadcast() uses the word "task" only in the common English sense ("The pthread_cond_broadcast() function is used whenever the shared-variable state has been changed in a way that more than one thread can proceed with its task.").

So I see no "definition of task" in POSIX to converg with anything else.

As for Mach, the Mach 3 Kernel Principles document says on page 7:

The Mach kernel provides an environment consisting of the following elements:

thread — An execution point of control. A thread is a light-weight entity; most of the state pertinent to a thread is associated with its containing task.
task — A container to hold references to resources in the form of a port name space, a virtual address space and a set of threads.

so that's Mach's "definition of task".

That's different from, say, the Ada definition; the beginning of Section 9 of the 1995 Ada Reference Manual says that "Each task represents a separate thread of control that proceeds independently and concurrently between the points where it interacts with other tasks." and that "In addition, tasks can communicate indirectly by reading and updating (unprotected) shared variables, presuming the access is properly synchronized through some other kind of task interaction.". That could be implemented atop multiple processes, each with its own address space, with parts of those address spaces being mapped to shared memory; however, it could also be implemented atop multiple threads in a single process, with all the threads in a single address space (and, as the quote above from Ada (programming language) § Concurrency, those threads could be known to the OS or could be implemented as user-mode threads in an Ada support library).

In an early specification for PL/I, "A task is an identifiable execution of a set of instructions. A task is dynamic, and only exists during the execution of a program or part of a program." The "Asynchronous Operations and Tasks" section of IBM System/360 Operating System PL/I Language Specifications from 1966, which begins on page 77, describes tasks in a fashion that sounds similar to Ada - for example, on page 79, it says that "If storage is allocated for a variable in the attaching task, this allocation may apply to the attached task, so that the variable may appear as a reference in the attached task."

Tasks in OS/360, as per section 6 "Task Management" of OS/360 Concepts and Facilities, also seem somewhat thread-like, although, as OS/360 ran everything in a single non-virtual address space, there's no notion of "process" in the sense of an entity corresponding to an address space. A "job step" might somewhat correspond to a process, but that's not an exact match. OS/VS1 added virtual memory support, but only had a single address space in which it ran all jobs, the same way OS/360 MFT ran all jobs in physical memory, and OS/VS2 (SVS) added virtual memory support, but only had a single address space in which it ran all jobs, the same way OS/360 MVT ran all jobs in physical memory. OS/VS2 (MVS), eventually called just MVS, was the first OS in the OS/360 line to support multiple virtual address spaces; address spaces somewhat corresponded to processes, with each address space starting out with a single task that could create subtasks running in the same address space, which somewhat correspond to threads.

So "task" does not have the same technical meaning for all operating systems and programming languages in which the term is used. The Task (computing) page explicitly acknowledges that:

In computing, a task is a unit of execution or a unit of work. The term is ambiguous; precise alternative terms include process, light-weight process, thread (for execution), step, request, or query (for work).

As such, it's probably best if the term "task" is used only when talking about systems where it's used in the same, or similar, senses; it shouldn't be used when talking about processes and threads in general. Guy Harris (talk) 07:23, 21 December 2022 (UTC)[reply]

Hi, Guy Harris. Thank you for interacting for the benefit of knowledge. I have searched some of POSIX active standards' documents. In IEEE/ISO/IEC 9945-2009 (not the 2017 Technical Corrigendum 2, I mean the original one), in page 3523, line 118726, it seems that process and task are supposed to be interchangeable concepts. The same in page 3532, line 119135. Also, for the definition of task in the POSIX domain, there is an Ottawa Linux Symposium 2002 paper, which is made available by kernel.org. In its very first page, the author puts it like this:

"... In Linux, the basic unit is a task. In a program that only calls fork() and/or exec(), a Linux task is identical to a POSIX process. The difference arises when a task uses the clone() system call to implement multithreading. The program then becomes a cooperating set of tasks which share some resources. We will use the term task to mean a Linux task...".

A notable Mach resemblance seems to be reasonable then. Regards, M. B., Jr. (talk) 21:04, 21 December 2022 (UTC)[reply]

The POSIX stuff is from the Rationale, not the Standard, so it's not a formal definition of "task" in the POSIX context.

As for the Linux paper, that's a definition of task in the Linux domain, not in the POSIX domain, just as any use of the Mach term "task" in Darwin is a defintion of task in the Darwin domain, not in the POSIX domain. Furthermore, while, in Darwin, a UNIX process has one Mach task associated with it, a Linux task does not always correspond to a single process - as the author notes, when clone() is used, a task doesn't necessarily correspond to a process; the paper goes on to note that multiple tasks are used to implement multiple threads within a Linux process.

So, again, there's no Mach resemblance; "task" means different things in different operating systems and programming languages, and the term "task" should not be used in a general discussion, so as to avoid confusion. Guy Harris (talk) 22:05, 21 December 2022 (UTC)[reply]

> The POSIX stuff is from the Rationale, not the Standard, so it's not a formal definition of "task" in the POSIX context.

Hi, Guy Harris. You're looking for bullets inside documents. Not every specification defines ideas in the atomic perspective a reader wants. Regards, M. B., Jr. (talk) 00:37, 22 December 2022 (UTC)[reply]

The POSIX Rationale is not a specification, it's an explanation of why some choices were made when developing the specification. The POSIX Standard is the specification, and it does not define "task". Guy Harris (talk) 00:59, 22 December 2022 (UTC)[reply]

Hi, Guy Harris. The document I provided you with is a part of the current POSIX standard. Regards, M. B., Jr. (talk) 01:16, 22 December 2022 (UTC)[reply]

Not all parts of the standard specify things. Many standards indicate that some parts are "normative", meaning that they standardize things, and other parts are "informative", meaning that they provide additional information but do not standardize anything. The Rationales in various versions of POSIX are informative, not normative. Guy Harris (talk) 01:40, 22 December 2022 (UTC)[reply]

As for the quote from Tanenbaum, at least one instance of it is on page 507 of the first edition of Modern Operating Systems, in the beginning of chapter 12 "Processes and Processors in Distributed Systems":

In many distributed systems, it is possible to have multiple threads of control within a process. This ability provides some important advantages, but also introduces various problems. We will study those issues first. Then we come to the subject of how the processors and processes are organized, and see that several different models are possible. Finally we look at processor allocation and scheduling in distributed systems.

Presumably the "several different models" are discussed in section 12.2 "System Models". Two models he describes are the "workstation model" and the "processor pool model".

The examples he gives of the first model involve several workstations on a network. possibly with one or more file server on the same network, with each user doing their own work on their workstation and, if there are file servers, using them for files shared between users. In addition, he describes various schemes for sending jobs to idle workstations, to use those workstations' resources when they're not being used by the user of the workstation.

The second model involves a collection of processors in a machine room, with users having X terminals on their desks, with work assigned to the processors as appropriate. (Think of it as timesharing with a GUI and with the timesharing machine being a cluster.)

Neither of these models have to do with how threads exist within a process.

So if you're using Tanenbaum's quote to say something about the organization of processes and threads, you must be referring to another place where he speaks of "several different models"; if so, could you give the page number, so I can find it in archive.org version of the book? There's a little bit about theading models in section 12.1.4 "Implementing a Threads Package". Guy Harris (talk) 08:39, 21 December 2022 (UTC)[reply]