1-8) Strings

Revisions:
May 22, 2008 creation date
June 3, 2008 added section on <string> objects

 


Strings In C++

Strings are used heavily in programs. C++ programs use either low-level C-style strings or string objects. C-style strings are simple arrays of characters, whereas string objects are objects with useful methods. Although string objects are the more modern, object-oriented approach, a lot of existing code uses C-style strings, so you should be familiar with both approaches.



C-style Strings

C-style strings are arrays of characters in which the special null character '\0' marks the end of the string. That is, the array memory can be larger than the string, and the null character denotes where the string ends within the array.

Literal strings like "Saturday" (denoted by double quotes) are string constants and are represented as arrays of characters. C-style strings can be declared and initialized from string constants as follows.

const char * name = "Lou";   //pointer to the characters of a string constant
char day[] = "Saturday";     //array initialized with a copy of the characters

or filled in one character at a time

char name[15];  //notice array is bigger than the string placed in it
name[0] = 'L';
name[1] = 'o';
name[2] = 'u';
name[3] = '\0';   //signifies the end of the characters of the string


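String constants are not modifiable. The following example compiles on older compilers, which accept a plain char * pointing at a literal string (current C++ standards reject that conversion, and many compilers at least warn about it). It typically crashes at run time because of the attempt to modify a literal string; formally the behaviour is undefined. (Literal strings are stored where the compiler puts static constants.)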

#include <iostream>
using namespace std;

int main() {
     char * name = "saturday"; //pointer to a literal string, not declared const
     cout << name << endl;
     cout << name[0] << endl;
     name[0] = 'S'; //ERROR -code crashes at run time

     cout << name << endl;
     return 0;
}

The following, more correct, program produces a compile error instead, because the literal string is referenced through a pointer to a constant.

#include <iostream>
using namespace std;

int main() {
     const char * name = "saturday";
     cout << name << endl;
     cout << name[0] << endl;
     name[0] = 'S'; //ERROR -compile time error this time

     cout << name << endl;
     return 0;
}

In the following example the compiler allocates memory for the array and initializes it with a copy of the literal's characters, so the string is modifiable.

#include <iostream>
using namespace std;

int main() {
     char name[] = "saturday";
     cout << name << endl;
     cout << name[0] << endl;
     name[0] = 'S'; //OK

     cout << name << endl;
     return 0;
}

Printing C-style strings. C++ provides a << operator for cout which prints a char * as though it were a string. That is, it treats the pointer as the start of a character array and prints the characters in succession until it finds a '\0' character. If the pointer is not in fact pointing at a null-terminated array of characters, the operation will "wander" through memory, trying to print characters until it happens to encounter a '\0' character. Also, if you want to print the actual address value of the pointer with the << operator, you need to cast it to a void * so that the << operator does not treat it as a null-terminated string of characters.

 

#include <iostream>
using namespace std;

int main() {
     const char * name = "saturday";
     cout << name << endl; //prints name as a sequence of characters until '\0' is found
     cout << (void *) name << endl; //prints the memory location that name points to
     return 0;
}

/* OUTPUT
saturday
004177FC

*/


<cstring> or <string.h> Standard library

C++ provides the cstring (or older string.h) standard library for manipulating C-style strings. Like those of iostream, the names in cstring are declared in the namespace std. The following table lists some of its operations that are used throughout these notes.

procedure / parameter types / comment

strcat(char * dest, const char * source)
    parameters: dest and source are null-terminated char arrays.
    returns address dest.
    The characters of source are concatenated to the end of the characters in dest, and the new string is null terminated.

strcmp(const char * str1, const char * str2)
    parameters: str1 and str2 are null-terminated char arrays.
    returns an int representing a character-by-character comparison of the strings as follows:
    returns <0 if str1 < str2
    returns 0 if str1 = str2
    returns >0 if str1 > str2

strcpy(char * dest, const char * source)
    parameters: dest and source are null-terminated char arrays.
    returns address dest.
    The characters of source, including the null character, are copied into the memory addressed by dest.
    This operation is UNSAFE in that strcpy assumes dest is pointing at enough memory.

strlen(const char * str)
    parameters: str is a null-terminated char array.
    returns an integer value >= 0 specifying the length of the string represented by str, NOT including the '\0' null character.

Using C-style Strings (Example)
 

#include <iostream>
#include <iomanip>   //for setw()
#include <cstring>   //for strcmp(), strcpy(), strlen(), strcat()

using namespace std;

int main() {
     char input[100];
     char * myString;

     cout << "enter an input string" << endl;
     cin >> setw(100) >> input; //setw limits the read to 99 characters plus '\0'
     cout << "you said: ";
     cout << input << endl;
     cout << "the length of your string is " << strlen(input) << endl;

     //allocate exactly enough memory: the characters plus the '\0'
     myString = new char[strlen(input) + 1];
     strcpy(myString, input);

     cout << "the length of myString is " << strlen(myString) << endl;
     cout << myString << endl;

     //comparing strings (only the sign of the strcmp result is guaranteed)
     cout << strcmp("abba", myString) << endl;
     cout << strcmp(myString, myString) << endl;
     cout << strcmp(myString, "abba") << endl;

     delete [] myString; //release the memory obtained with new
     return 0;
}

/* Program output

enter an input string
Saturday
you said: Saturday
the length of your string is 8
the length of myString is 8
Saturday
1
0
-1

*/


Example of Dynamic Memory for C-style string variables in a class
 

Notice in this example the consequences of using a const char * to refer to the name. The idea is that we don't want the name to be modifiable after the person object is created, but we do want to allow the email address to be changed. Because name is a const char *, the constructor must copy the characters through a temporary non-const pointer before assigning that pointer to name. (Be aware that the name is already protected anyway, since we did not provide a public "set" method to change the name.)

#include <iostream>
#include <cstring>   //for strcmp(), strcpy(), strlen(), strcat()

using namespace std;

class Person {
public:
   Person(const char * their_name = "unknown", const char * email = "unknown") {
      char * temp = new char[strlen(their_name) + 1];
      strcpy(temp, their_name); //copy through a non-const pointer first
      name = temp;
      email_address = new char[strlen(email) + 1];
      strcpy(email_address, email);
   }

   ~Person(void) {
      //the cast was needed by some older compilers;
      //standard C++ also accepts delete [] name
      delete [] (char *) name;
      delete [] email_address;
   }

   char * getEmailAddress() {return email_address;}
   const char * getName() {return name;}

   void setEmailAddress(const char * new_email_address) {
      if(strlen(new_email_address) > strlen(email_address)){
           //allocate new memory if necessary
           delete [] email_address; //destroy old email address memory
           email_address = new char[strlen(new_email_address) + 1];
      }
      strcpy(email_address, new_email_address);
   }

private:
   const char * name; //characters of name should not be modifiable
   char * email_address;  //characters of email_address should be modifiable
};

int main(){
      Person lou("Lou", "lou@hotmail.com");
      Person sue("Sue", "sue@scs.carleton.ca");
      Person p;
      cout << lou.getName() << " email: " << lou.getEmailAddress() << endl;
      cout << sue.getName() << " email: " << sue.getEmailAddress() << endl;
      cout << p.getName() << " email: " << p.getEmailAddress() << endl;

      lou.setEmailAddress("louis@connect.carleton.ca");
      cout << lou.getName() << " email: " << lou.getEmailAddress() << endl;
      return 0;
}

/* output
Lou email: lou@hotmail.com
Sue email: sue@scs.carleton.ca
unknown email: unknown
Lou email: louis@connect.carleton.ca
*/



String Objects

C++ provides string objects, as opposed to C-style character arrays. String objects are defined in the <string> standard library. Do not confuse <string> with <string.h>. The newer form of <string.h> is <cstring>; <string>, on the other hand, defines the string objects.

The following table gives some of the constructors and methods associated with <string> objects. Unless otherwise indicated, str, str1 and str2 are string objects in the table below.

method / parameters / comment

constructors:
    string()                                        //empty string
    string(const char * s)                          //string version of char * s
    string(const string & str)                      //copy of str
    string(size_t n, char c)                        //n repeated characters c
    string(const char * s, size_t n)                //first n characters of s
    string(const string & str, size_t i)            //characters of str from index i to the end
    string(const char * s, size_t i, size_t n)      //up to n characters starting at i
    string(const string & str, size_t i, size_t n)  //up to n characters starting at i
    Note: size_t is an unsigned integer type (often an alias for unsigned int or unsigned long, depending on the platform)

The string class also defines a type and a constant:
    string::size_type //unsigned integer type
    string::npos      //maximum value of type string::size_type

conversion methods:

str.c_str()
    returns base address of a null-terminated c-string containing the characters of str.

inspection:

str.empty()
    returns true if str is empty, false otherwise.

str.length()
    returns number of characters in str as a string::size_type.

str.size()
    same as .length().

str.find(strExp)
    parameters: strExp is an expression that evaluates to a string object or a character.
    searches str for the first occurrence of the string or character specified by strExp. Returns the position where strExp was found, or the special value string::npos if not found.

str.substr(pos, len)
    parameters: unsigned ints pos and len.
    returns a temporary string object containing a substring of str starting at position pos with at most len characters. If len is too large the substring only goes to the end of str.

str1.swap(str2)
    swaps the contents of str1 and str2.

comparison:

str.compare(anyString)
    parameters: anyString is a string object or c-style string.
    compares str with the characters of anyString. The result is positive, negative, or zero, indicating str is bigger than, less than, or equal to anyString.

str.compare(i,n,anyString)
    parameters: anyString is a string object or c-style string.
    compares the n characters of str starting at index i with anyString.

operators: <, >, >=, <=, ==, !=
    compare string objects in alphabetical (character by character) order based on the characters they represent.

modification:

str1 + str2
    returns a temporary string object that is str1 concatenated with str2.

str1 += str2
    concatenates str2 to the end of str1. Modifies str1.

str.clear()
    removes all the characters from str.

str.erase()
    same as str.clear().

str.erase(m)
    parameters: unsigned int m.
    removes all the characters of str starting at index m.

str.erase(m,n)
    parameters: unsigned int m and n.
    removes up to n characters from str starting at position m.

str1.insert(m,n,c)
    parameters: unsigned int m, n; char c.
    inserts into str1 n occurrences of c at index m.

str1.insert(m, str2)
    inserts into str1 at position m all the characters of str2.

str1.append(str2)
    appends the characters of str2 to str1.

str1.replace(i,n,anyString)
    parameters: unsigned int i, n; string or c-string anyString.
    starting at i, replaces up to n characters in str1 with the characters of anyString.

input-output:

getline(stream, str)
    parameters: stream is an istream or ifstream object.
    characters up to the newline character are read from stream and stored in str. The newline character is read but not stored in str.
     

Using string objects Examples
 

Example 1

#include <iostream>
#include <string>   //for string objects

using namespace std;

int main() {
   string myString;
   cout << "enter an input string" << endl;
   cin >> myString;
   cout << "you said: ";
   cout << myString << endl;
   cout << "the length of your string is " << myString.length() << endl;

   //comparing strings
   cout << myString.compare("abba") << endl;
   cout << myString.compare(myString) << endl;
   string abba = "abba";
   cout << abba.compare(myString) << endl;
   return 0;
}

/* Program output
enter an input string
Saturday
you said: Saturday
the length of your string is 8
-1
0
1
*/


Example 2

#include <iostream>
#include <string>   //for string object

using namespace std;

class Person {
public:
   //the string members are initialized in the member initializer list
   Person(const char * their_name = "unknown", const char * email = "unknown")
      : name(their_name), email_address(email) {
   }

   ~Person(void) {
      //nothing to do here
   }

   const char * getEmailAddress() {return email_address.c_str();}
   const char * getName() {return name.c_str();}

   void setEmailAddress(const char * new_email_address) {
      email_address = new_email_address;
   }

private:
   string name;
   string email_address;
};

int main(){
   Person lou("Lou", "lou@hotmail.com");
   Person sue("Sue", "sue@scs.carleton.ca");
   Person p;

   cout << lou.getName() << " email: " << lou.getEmailAddress() << endl;
   cout << sue.getName() << " email: " << sue.getEmailAddress() << endl;
   cout << p.getName() << " email: " << p.getEmailAddress() << endl;

   lou.setEmailAddress("louis@connect.carleton.ca");
   cout << lou.getName() << " email: " << lou.getEmailAddress() << endl;
   return 0;
}

/* output
Lou email: lou@hotmail.com
Sue email: sue@scs.carleton.ca
unknown email: unknown
Lou email: louis@connect.carleton.ca
*/