
Talk:C++ string handling

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 76.66.202.139 (talk) at 05:32, 18 May 2009. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
WikiProject Computing: Software (Stub-class)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia.
This article has been rated as Stub-class on Wikipedia's content assessment scale. It has not yet received a rating on the project's importance scale.
This article is supported by WikiProject Software.
This article has been automatically rated by a bot or other tool as Stub-class because it uses a stub template. Please ensure the assessment is correct before removing the |auto= parameter.
WikiProject C/C++ (Unassessed)
This article is within the scope of WikiProject C/C++, a collaborative effort to improve the coverage of C and C++ topics on Wikipedia.
This article has not yet received a rating on Wikipedia's content assessment scale or on the importance scale.

incorrect

"when two c-strings are compared, it is implementation defined as to whether the contents or addresses are compared."

Huh? No, it's not implementation defined, it's definitely an address compare. The only freedom is that in:

 const char *p1 = "hello";
 const char *p2 = "hello";

... the compiler is allowed to share storage for the two literals, so p1 == p2 may or may not be true.
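To make the distinction concrete, here is a minimal sketch. The helper names (same_address, same_contents, demo_compare) are made up for illustration. It uses two separate char arrays rather than two literals, so the addresses are guaranteed to differ, whereas with literals the result of == would depend on whether the compiler shares them.

```cpp
#include <cassert>
#include <cstring>

// Compares only the pointer values: true if both point at the same address.
bool same_address(const char* a, const char* b) {
    return a == b;
}

// Compares the characters up to the terminating '\0'.
bool same_contents(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Two distinct arrays holding equal text: the addresses differ, the contents match.
void demo_compare() {
    char a[] = "hello";
    char b[] = "hello";
    assert(!same_address(a, b));
    assert(same_contents(a, b));
}
```

So operator== on two char pointers never looks at the characters. Anyone who wants a content comparison has to call std::strcmp (or use std::string).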

mem usage?

I just had a young programmer tell me that an uninitialized std::string uses less memory than an initialized one. Is that true? (I guess that would depend on the implementation; but consider, for example gcc) The code I found suspect was:

#include <string>

class blah {
  private:
     std::string name;
  public:
     blah (std::string in) {
        if (!in.empty()) name = in;  // claimed savings of memory
     }
};

linas (talk) 03:31, 27 January 2008 (UTC)

That sounds bogus to me. Even if you don't touch name, it gets initialized at the beginning of the constructor. You can always look at the source, though. --Ben FrantzDale (talk) 06:10, 27 January 2008 (UTC)
Looking at glibc source is easier said than done. But I did run an experiment with sbrk(0) and the result was no effect. Wonder why he thought that ... linas (talk) 21:07, 27 January 2008 (UTC)
A std::string basically looks like this:
struct string { size_t length; char* contents; };
With an uninitialized instance the char* is just 0. Otherwise it points to a memory block (allocated with new char[] or malloc). A string of length 0 can be represented with no memory block, or with a memory block containing just the terminating zero. Since heap management operates with some granularity (e.g. in units of 16 bytes), you will waste 16 bytes in the latter case.
Note that there is no direct mapping between malloc and sbrk. The run-time library typically acquires memory in huge chunks from the OS.
--Alba7 (talk) 19:48, 30 August 2008 (UTC)
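For anyone who wants to poke at this without sbrk, a rough sketch follows. Guarded and Unguarded are hypothetical stand-ins for the class in the question, and check_empty_input is a made-up helper. The only things it checks are things the language guarantees: the object size is fixed at compile time regardless of the string's contents, and both variants end up with an empty member when given an empty string. Whether an empty string owns a heap block is up to the implementation and is not tested here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical version with the guard from the question.
struct Guarded {
    std::string name;
    explicit Guarded(const std::string& in) {
        if (!in.empty()) name = in;  // name was already default-constructed here
    }
};

// Hypothetical version without the guard.
struct Unguarded {
    std::string name;
    explicit Unguarded(const std::string& in) : name(in) {}
};

// The size of the object itself does not depend on what the string holds.
bool same_object_size() {
    return sizeof(Guarded) == sizeof(Unguarded);
}

// With an empty input, both variants hold an empty string.
void check_empty_input() {
    const std::string empty;
    Guarded g(empty);
    Unguarded u(empty);
    assert(g.name.empty());
    assert(u.name.empty());
}
```

Comparing g.name.capacity() with u.name.capacity() on a given compiler would show whether that implementation reserves anything different for the two cases, but the numbers are implementation-specific.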

null characters

Just curious whether the string class accepts null characters. I would assume it does. --Preceding unsigned comment added by 66.102.196.17 (talk) 00:56, 28 February 2008 (UTC)

I dug around in the gcc header files and found the following comment in basic_string.h: "1. String really contains _M_length + 1 characters: due to 21.3.4 must be kept null-terminated." But I am still not sure what that fully means. I guess I will have to test it. That is a lot to go through for a curiosity. I am starting to think it would have to be possible, though, or how else would someone do binary file I/O? --Preceding unsigned comment added by 66.102.196.44 (talk) 03:03, 7 March 2008 (UTC)
It appears to. It's not easy to add them, though, because string foo = "asdf\0asdf"; just sets foo to "asdf": the null terminator means the string constructor never sees the second half of the string. But you can do str.push_back('\0') and the length will increase, and you can put non-null characters after the null terminator. --Ben FrantzDale (talk) 03:19, 8 March 2008 (UTC)
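A small sketch of the behaviour described above, with made-up function names. It also uses the (pointer, length) constructor of std::string, which is not mentioned in the thread but is a standard way to get the whole buffer in, null characters included.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Built from a plain C string: the constructor stops at the first '\0'.
std::string from_c_string() {
    return std::string("asdf\0asdf");
}

// Built with an explicit length: all 9 characters are copied, including the '\0'.
std::string with_length() {
    return std::string("asdf\0asdf", 9);
}

// Built up step by step, appending the '\0' with push_back.
std::string built_by_push_back() {
    std::string s = "asdf";
    s.push_back('\0');
    s += "asdf";
    return s;
}

// The embedded '\0' counts as an ordinary character.
void check_embedded_null() {
    assert(from_c_string().size() == 4);
    assert(with_length().size() == 9);
    assert(built_by_push_back() == with_length());
}
```

Note that c_str() on such a string still returns a pointer to all the characters, but any C function reading it will stop at the embedded null.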

character sets

Does the C++ standard define what character sets the string class stores? I would assume it only does ASCII (or perhaps you can do UTF-8, but it won't guarantee correct operation with some types of manipulation), but I can't recall ever seeing any mention of this in the docs. I was just looking at GLib and I was wondering why they bothered reimplementing a lot of STL, then I figured proper UTF-8 support might be the reason. If it is a major difference, perhaps the article should be expanded to compare/contrast std::string with other libraries' string classes. Yanroy (talk) 20:17, 18 July 2008 (UTC)

Class std::string is actually just an instantiation of a template.
typedef basic_string<char> string;
You can also use wchar_t instead of char (that is, std::wstring) to store wide characters; whether that gives you UTF-16 or UTF-32 code units depends on the platform's wchar_t.
--Alba7 (talk) 16:54, 23 October 2008 (UTC)
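A rough sketch to go with this, assuming a compiler with static_assert and <type_traits> (C++11 or later); code_units, e_acute_utf8 and check_utf8_length are made-up names. It checks the typedef relationship and shows that std::string itself is encoding-agnostic: it stores char elements, so size() reports bytes when the content is UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// std::string and std::wstring are instantiations of the same template.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "std::string is basic_string<char>");
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "std::wstring is basic_string<wchar_t>");

// size() counts stored elements (char here), not user-visible characters.
std::size_t code_units(const std::string& s) {
    return s.size();
}

// The single character U+00E9 (e with acute accent), spelled out as its two UTF-8 bytes.
std::string e_acute_utf8() {
    return std::string("\xC3\xA9");
}

// One character on screen, two elements in the string.
void check_utf8_length() {
    assert(code_units(e_acute_utf8()) == 2);
}
```

This is why operations such as substr or indexing can split a multi-byte UTF-8 sequence: the class has no notion of where a character starts or ends.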

Renaming this article to follow a consistent convention

Hi, I am currently considering renaming this article to conform to a common convention for C++ Standard Library components. The full discussion can be found here. decltype 09:47, 6 March 2009 (UTC)