Talk:String (computer science)
Other related topics
Anyone want to tackle - string (small version of rope), string (general chain of various things), string (music), etc. - not just the computer version.
Also, string theory of cosmology/physics.
Computing theory
The first paragraph seems too "busy" to me. What about replacing it with something like this?
- A string (or string of characters) is a data type used in most programming languages to represent text, and is the focus of this article.
- The computing term string is also used in a broader sense to group a sequence of entities; for example, tokens in a language grammar, or a sequence of states in automata. See the theory of computation.
This is a lot better. Also, I think the usage in computing theory could be expanded in its own paragraph: one starts with a finite alphabet, then considers all finite sequences consisting of letters from that alphabet (including the empty string) and defines concatentation of strings. The set of string with concatentation is then a monoid.
I think I wrote most the current paragraph and I agree your rewrite is better. Just do it! --drj
Ok, I'll move my text to the main article. I won't try to expand the second paragraph; I'm inclined to leave that to the computation article, or to whoever can concisely expand it without detracting from the rest of the page. --loh
I won't try to expand the second paragraph; I'm inclined to leave that to the computation article, or to whoever can concisely expand it without detracting from the rest of the page. --Hornlo
Other meaninngs
I think something definitely needs to be added about the other meanings of string. Ukulele is already linked to this page, which is somewhat confusing. Although, I'm not sure how much content can actually be provided for the other meanings. Maybe this article should be moved to String (computer science) or something, and this page be turned into a disambiguation page. B4hand
- I propose renaming this article to Character string (computer science). Comments? - Bevo 17:03, 13 Jun 2004 (UTC)
I oppose. A string doesn't need to be literal string in general. -- Taku 18:44, Sep 25, 2004 (UTC)
Lexicographical order
The lexicographical order on Σ* is not a well-ordering (for example, what is the least element in a*b?), but only a total order. fudo 13:53, 2 August 2005 (UTC)
- The least element of a*b is b. The lexicographical order on Σ* is indeed a well-ordering. --Pexatus 00:22, 19 March 2006 (UTC)
- No. There is no least element of a*b if you use the alphabetical ordering . Assume there is some least element m; this means that for some non-negative integer k. Note that is also in a*b. n agrees m in k positions, but the k + 1th position of n is less than the k + 1th position of m, so , which contradicts the assumption that there exists a least element of a*b. Therefore, there is no least element of a*b. --Bkkbrad 16:28, 13 September 2006 (UTC)
strings, not characters
There are lots of references to this page throughout wikipedia for a "string" that is not a set of characters; a string of bits or bytes, for example. I think it is important that the article be cleaned-up to make it clear that "character strings" are the most common uses of the type, but the term might apply to vectors of data not representable by a string in a particular language. I've made a couple of edits in this direction, but I think some more effort needs to go into the issue. -- Mikeblas 22:39, 29 January 2006 (UTC)
- Just dont forget that a vector has a fixed length, per definition (as an element in an N-dimensional space), while strings often has (chronologically) variable length. 83.255.35.89 (talk) 11:23, 4 March 2011 (UTC)
- But (unfortunately?) the word "vector" in computer science is most often used to refer to a variable-sized storage (see std::vector), while the word "array" is used to refer to the fixed-sized objects you are talking about. (In linear algebra libraries the word "vector" is used for fixed-sized objects, but these also demonstrate unusual (for programming) properties such as addition doing a component-wise action rather than an append operation).Spitzak (talk) 17:09, 4 March 2011 (UTC)
- Yes, I know, it's a disaster. The designers of the C++ STL must have been rather ignorant, or why would you create that kind of mess otherwise, with concepts turned upside down? They could have called it dynamic array, string of type, or whatever; the well defined term vector was really the worst possible choice and the worst kind of hijacking of words. If they actually knew what they did, I guess they must have been inspired by the arrogant "C-syntax" (B→C→C++→Java etc), which, when spread to the world of webb languages some 15 years ago, caused the equality symbol to suddenly lose its meaning in large circles, a symbol that has been established in both mathematics and everyday use since hundreds of years. Too many young (or uneducated) people are now using == and != instead of = and ≠ in any everyday context (mathematics next...?) and they would interpret the equation a=b as a definition/mutation/initializing of a...
- I can't see why Wikipedia shouldn't do what it can to clarify backgrounds like this. I belive it is crucical to illustrate "misunderstandings" and unessescary discrepancy in terminology among branches of science, so more people can see that there are other conventions than the most vulgar ones that one may want to adhere to. It does not conflict with the goal of describing actual usage and terminology or with "following the sources", as there are many kinds of sources, and plenty of room for elaborations on Wikipedia.
- (As a side note, while "addition" may mean several things (as you wrote), only algebraic superposition should use the + sign really; concatenation may use &, &&, ::, |, concat, or whatnot, at least in my world ;) Regards 83.255.32.149 (talk) 04:37, 5 March 2011 (UTC)
- I agree, but you probably ought not to take it too far. "Character string" is usually implied, and the article shouldn't give the impression that "string" on its own is incorrect. For the most part, I think your edits were fine, although perhaps you could revert the edits to the string oriented languages section. More effort doesn't need to go into the issue, as that would just confuse the article. --StuartBrady 22:59, 29 January 2006 (UTC)
- I was confused by this too. "String" always refers to a string of characters. Vectors of other things are lists, arrays, vectors, ... I'm curious as to what language it is where the word "string" is used in reference to lists of objects. Richard W.M. Jones 09:04, 1 May 2006 (UTC)
- Yes. And keep in mind that WP article titles generally reflect the most common usage for a given term (unless it needs its own disambiguation page). In computer science, "string" most common refers to a string of characters. Other not-so-common meanings (e.g., bitstring) can be linked to in the "See also" section. — Loadmaster 18:35, 1 May 2007 (UTC)
- My impression is that "string" virtually always means "string of characters". The only counter-example I know is that C++ std::string is based on a template and can be made to use any object. However I am almost certain this was done only to support bytes and "wchar" (16 bits, often mislabled "Unicode"). If it was not for "wchar" then they probably would not have made it a template. Since wchar is intended to store characters (or UTF-16) then the string is still a "string of characters". I would be interested if anybody has any real examples of usage of a std::basic_string template with any object for any purpose other than storing something that would be considered "characters".Spitzak (talk) 17:19, 4 March 2011 (UTC)
- Yes. And keep in mind that WP article titles generally reflect the most common usage for a given term (unless it needs its own disambiguation page). In computer science, "string" most common refers to a string of characters. Other not-so-common meanings (e.g., bitstring) can be linked to in the "See also" section. — Loadmaster 18:35, 1 May 2007 (UTC)
Null and NUL
We should decide on just one spelling and capitalization of null. I vote for two L's.24.186.138.188 02:03, 2 May 2006 (UTC)
- "Null" generally refers to the "null character" (or a "null pointer"). "NUL" (all caps) is the mnemonic name (ASCII, EBCDIC, Unicode, etc.) of the "null character" code. — Loadmaster 18:35, 1 May 2007 (UTC)
Origin of the Term ?
Anyone know the history behind using the term "string" to mean a sequence of characters? I mean its not like a series of characters looks much like a ball of string... I assume it's origins are as a mathematical term, but it would be interesting to know how it came to such common use in computing.
- Presumably it comes from the rather obvious expression "a string of characters" (as in "these go some characters stringing by"), equivalent to "a string of pearls" or "a sequence of characters" or other similar phrases. — Loadmaster (talk) 16:27, 9 February 2008 (UTC)
Trying to stop misuse of character encodings
The edits I keep trying are to stop a whole army of uninformed but well-meaning programmers who think strlen() should parse and count the Unicode code points in a string. This is a totally useless definition and causing no end of grief when systems do this. For some reason otherwise intelligent programmers turn into these complete morons when presented with UTF-8 and this is actually seriously damaging any ability to do internationalization. If anybody can think of a wording that says these answers are different and that the fixed-size one is "better" it would help a lot. The reverted-to wording implies that the number of characters is the more important attribute, which is wrong.Spitzak (talk) 04:10, 29 May 2009 (UTC)
StringBuilder .Net type
As well as the standard String type, .Net has a StringBuilder type. I think this is implemented as a linked list, but I'm not too sure. This would be worth adding to the Implementation section. N4m3 (talk) 09:35, 28 May 2011 (UTC)
Only finite strings?
I noticed this statement:
- Although formal strings can have an arbitrary (but finite) length, the length of strings in real languages is often constrained to an artificial maximum. [emphasis mine]
I would like a citation on that, what about languages with lazy evaluation like Clojure and Haskell?
cycle "Is this finite? "
⇒ "Is this finite? Is this finite? Is this finite? Is this finite? ..."
let shouting = 'a' : shouting in putStr shouting
⇒ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa..."