Objects First |
| Memory address | 1000 | 1001 | 1002 | 1003 | 1004 |
|---|---|---|---|---|---|
| Character | a | b | c | d | \0 |
| Value (decimal) | 97 | 98 | 99 | 100 | 0 |
Syntactically, "abcd" is a pointer to the memory location in which the first character of the string is stored.
In conventional (or von Neumann) machines, both the program and the data on which it operates are stored in addressable memory. Here, we aren't concerned with the program addresses (although later, when we look at functions as parameters, we shall be concerned with the address of the code of a function), but the data addresses (regrettably!) need to be considered in almost all C programs. Thus the value of the symbol "abcd" in your program is actually 1000 - the address of its first character!
Thus, if we write:
char *p;
p = "abcd";
printf("p = %p\n", (void *)p);   /* %p prints an address; %d expects an int */
The declaration char * means that we are defining a
pointer to a char rather than a char.
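The idea that a string is reached through the address of its first character can be sketched as follows. The helper names char_at and show_string are hypothetical, introduced only for illustration, and the address actually printed depends on the machine and compiler (it will not normally be 1000).

```c
#include <stdio.h>

/* Hypothetical helper: returns the character stored `offset` bytes
   past the address held in p. */
char char_at(const char *p, int offset)
{
    return *(p + offset);   /* same as p[offset] */
}

/* Hypothetical helper: prints the address held in p and what is stored there. */
void show_string(const char *p)
{
    /* %p expects a void *; the printed form is implementation-defined. */
    printf("p = %p\n", (void *)p);
    printf("first character = %c (decimal %d)\n", p[0], p[0]);
    printf("whole string = %s\n", p);
}
```

Calling show_string("abcd") prints the address first and then the characters found at it; char_at("abcd", 4) returns the terminating null character.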
void some_function( .. ) {
char s[5] = "abcd";
char t[5] = { 'a', 'b', 'c', 'd', '\0' };
}
These declarations will produce arrays, s and t, with identical contents, but at different addresses - there will be two strings, each containing "abcd", stored in the program's memory.
We can use s and t as the names for the strings, because an array name used in an expression yields the address of its first character:
printf("s is [%s], t is [%s]\n", s, t );
which will produce:
s is [abcd], t is [abcd]
A string literal can also be passed directly:
printf("Today is %s\n", "Monday" );
which will produce:
Today is Monday
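To make the point about the arrays s and t concrete, here is a minimal sketch that checks both claims: the contents are identical, but the addresses differ. The helper names same_contents and same_address are hypothetical.

```c
#include <string.h>

/* Returns 1 if the two 5-byte arrays hold the same bytes, else 0. */
int same_contents(const char a[5], const char b[5])
{
    return memcmp(a, b, 5) == 0;
}

/* Returns 1 if a and b refer to the same memory location, else 0. */
int same_address(const char *a, const char *b)
{
    return a == b;
}
```

With s and t declared as in some_function above, same_contents(s, t) should return 1 and same_address(s, t) should return 0, since each array occupies its own five bytes.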
char s[4] = "abcd";
will not work as a string, because the declaration provides space for only four characters, leaving no room for the terminating null character. (A C compiler accepts the declaration but stores no terminator; a C++ compiler rejects it.) Always allow one extra character for the terminating null character! This is a major source of error for new C programmers. When strings are manipulated in programs, few C functions check the size of the area into which strings are being copied (the structure of C doesn't allow them to do it reliably anyway!) and thus there is considerable potential for error! Later in this section, we will see some examples of functions which can cause errors when insufficient space is allocated for the original array.
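The "one extra character" rule can be expressed as a small check. This is only a sketch; bytes_needed and fits are hypothetical helper names, not library functions.

```c
#include <string.h>

/* Hypothetical helper: number of bytes an array must have to hold
   the string str, including the terminating null character. */
size_t bytes_needed(const char *str)
{
    return strlen(str) + 1;
}

/* Returns 1 if a buffer of `capacity` bytes can hold str, else 0. */
int fits(const char *str, size_t capacity)
{
    return bytes_needed(str) <= capacity;
}
```

For "abcd", bytes_needed returns 5, so fits("abcd", 4) is 0 and fits("abcd", 5) is 1. Declaring the array without a size, as in char s[] = "abcd";, lets the compiler count the terminator for you.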
char s[10];
s = "0123456789";
produces a compiler error, and would have had an unexpected side-effect if the intent had been to copy "0123456789" into the array, s (which, in any case, is one character too small to hold the ten digits plus the terminating null character). C doesn't allow us to re-assign the name of an array to point to another area of memory.
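The difference between an array name and a pointer variable can be sketched as below. The function names fill_digits and pick_digits are hypothetical, and the array is assumed to have 11 elements so that the terminating null character fits.

```c
#include <string.h>

/* Fills the caller's 11-byte array with the ten digits plus the
   terminating null character, then returns it. */
char *fill_digits(char dest[11])
{
    /* With `char s[11];` in the caller, `s = "0123456789";` is rejected
       by the compiler; the characters have to be copied instead. */
    strcpy(dest, "0123456789");
    return dest;
}

/* A pointer variable, unlike an array name, may be re-assigned. */
const char *pick_digits(void)
{
    const char *p = "abc";
    p = "0123456789";   /* legal: p now holds a different address */
    return p;
}
```

In the first function the characters move into the caller's array; in the second, nothing is copied and only the address stored in p changes.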
By not providing an in-built string data type, the designers of C were able to produce a somewhat smaller language (with fewer formal syntactic and semantic rules), but the penalty has been much grief in programs that don't run correctly because insufficient space was allocated for strings!
To copy strings,
you must use the library function, strcpy,
or write your own copying function:
#include <string.h> /* Include the string function prototypes */
char s[N] = "test";
char t[N];
strcpy( t, s );
(You can view the file string.h as defining a class of strings - because it defines a set of operations which will work on C-style null-terminated strings.)
The formal specification for strcpy (and one of its relatives) found in string.h is:
char *strcpy(char *dest, const char *src);
char *strncpy(char *dest, const char *src, size_t n);
strcpy copies characters from the source (src) to the destination (dest) until it finds a null character in src. (The null character is also copied, so that the destination string is also a null-terminated string.)
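If you do write your own copying function, it might look like the sketch below. The name my_strcpy is hypothetical; the loop follows the behaviour described above for strcpy and, like strcpy, assumes that dest is large enough and that the two areas do not overlap.

```c
#include <stddef.h>

/* Hypothetical my_strcpy: copies src, including its terminating null
   character, into dest and returns dest. No size checking is done. */
char *my_strcpy(char *dest, const char *src)
{
    size_t i = 0;
    while (src[i] != '\0') {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';   /* copy the terminator as well */
    return dest;
}
```

Note that nothing in the function's parameters tells it how big dest is, which is why it cannot protect itself against a destination that is too small.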
A significant problem with strcpy is that it has no way of knowing whether enough space was allowed in dest to receive all the characters in src. It simply keeps copying until it finds the terminating null character. If the source has no null character to terminate it, then strcpy will continue to copy characters until, by luck (rarely by design!), it finds a null character somewhere in memory. Even if the source has a null character, there is the possibility that the destination doesn't have enough space.
To make programs a little more robust, the standard library provides strncpy, which copies at most n characters. This enables you to avoid over-writing memory by setting n to be the size of the destination. However, strncpy will not copy the terminating null character if the source string has n or more characters, so the destination copy could be left without its terminating null character.
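One common way of guarding against the missing terminator is to copy at most one character fewer than the destination holds and then write the null character explicitly. The following is a sketch under that assumption; bounded_copy is a hypothetical helper name, and long sources are silently truncated.

```c
#include <string.h>

/* Hypothetical helper: copies src into dest, which has room for
   dest_size bytes, truncating if necessary. The result is always
   null-terminated provided dest_size is at least 1. */
void bounded_copy(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0)
        return;                       /* no room even for the terminator */
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';       /* strncpy may not have written one */
}
```

For a five-character destination and the source "0123456789", the destination ends up holding "0123". Whether silent truncation is acceptable depends on the program; some callers would prefer to report an error instead.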
| Caution is required in all string manipulation code written in C! |
char s[N], t[N];
.... /* Copy strings into s and t */
if( s == t ) { .. }
else { ... };
In this fragment, the else branch will always be taken, because s and t are memory addresses, not the strings themselves! You must use a function:
#include <string.h> /* Include the string function prototypes */
..
char s[N], t[N];
.... /* Copy strings into s and t */
if( strcmp( s, t ) == 0 ) { .. }
else { ... };
strcmp is also defined in string.h. For a call strcmp(s, t), the return value is:
| Return value | Meaning |
|---|---|
| < 0 | s is lexicographically less than t |
| 0 | s is equal to t |
| > 0 | s is lexicographically greater than t |

| Function Call | Returns |
|---|---|
| strcmp("abc","bcd") | < 0 |
| strcmp("xyz","bcd") | > 0 |
| strcmp("xyz","xyz") | 0 |
| strcmp("XYZ","xyz") | < 0 |
| strcmp("123","1234") | < 0 |
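Because only the sign of the result is specified, code should test for < 0, 0 or > 0 rather than for particular values such as -1 or 1. The sketch below reduces the result to its sign; compare_sign is a hypothetical helper, and the "XYZ" versus "xyz" row of the table assumes a character set such as ASCII in which upper-case letters have smaller codes than lower-case ones.

```c
#include <string.h>

/* Hypothetical helper: reduces strcmp's result to -1, 0 or 1, since
   only the sign of the return value is specified. */
int compare_sign(const char *s, const char *t)
{
    int r = strcmp(s, t);
    return (r > 0) - (r < 0);
}
```

With this helper, compare_sign("abc", "bcd") gives -1, compare_sign("xyz", "bcd") gives 1 and compare_sign("xyz", "xyz") gives 0, matching the rows of the table.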
strcmp has the same problems as strcpy if the strings aren't null-terminated - it continues through memory comparing characters until it finds a difference or a null character. strncmp is also defined; it compares at most n characters. strncmp is "safe" as long as a sensible value of n is supplied; if the first n characters match, it simply stops comparing after the nth character and returns 0. If the strings differ in the first n characters, it will, of course, return as soon as it finds the difference.
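One use of strncmp's limit is to test whether a string begins with a given prefix. This is a minimal sketch; has_prefix is a hypothetical helper name, and both arguments are assumed to be null-terminated.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if str begins with prefix, else 0.
   Only the first strlen(prefix) characters are compared. */
int has_prefix(const char *str, const char *prefix)
{
    return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

Here strncmp("123", "1234", 3) returns 0 because the first three characters match, whereas with n set to 4 the null character at the end of "123" is compared with '4' and the result is negative.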