OverIQ.com

The strcmp() Function in C

Last updated on July 27, 2020


The syntax of the strcmp() function is:

Syntax: int strcmp (const char* str1, const char* str2);

The strcmp() function is used to compare two strings, str1 and str2. If the two strings are the same, strcmp() returns 0; otherwise, it returns a non-zero value.

This function compares strings character by character using the ASCII values of the characters. The comparison stops when the end of either string is reached or the corresponding characters differ. On many systems, the non-zero value returned on a mismatch is the difference of the ASCII values of the mismatched characters (the character from str1 minus the character from str2).

Let's see how the strcmp() function compares strings using an example.

strcmp("jkl", "jkq");

Here we have two strings, str1 = "jkl" and str2 = "jkq". The comparison starts with the first character of each string, i.e. 'j' from "jkl" and 'j' from "jkq". As they are equal, the next two characters are compared, i.e. 'k' from "jkl" and 'k' from "jkq". As they are also equal, the third pair is compared, i.e. 'l' from "jkl" and 'q' from "jkq". The ASCII value of 'l' (108) is less than that of 'q' (113), so str1 is less than str2 and strcmp() returns a negative value: -5 (i.e. 108 - 113 = -5) on systems that return the difference.

It is important to note that not all systems return the difference of the ASCII values of the characters. The C standard only guarantees the sign of the result: positive if str1 is greater than str2, negative if str1 is smaller than str2. On some systems, 1 is returned if str1 is greater than str2 and -1 is returned if str1 is smaller than str2. You may well encounter this behaviour on your system.

Let's take some examples:

strcmp("a", "a"); // returns 0 as ASCII value of "a" and "a" are same i.e 97

strcmp("a", "b"); // returns -1 as ASCII value of "a" (97) is less than "b" (98)

strcmp("a", "c"); // returns -1 as ASCII value of "a" (97) is less than "c" (99)

strcmp("z", "d"); // returns 1 as ASCII value of "z" (122) is greater than "d" (100)

strcmp("abc", "abe"); // returns -1 as ASCII value of "c" (99) is less than "e" (101)

strcmp("apples", "apple"); // returns 1 as ASCII value of "s" (115) is greater than "\0" (0)

The following program compares two strings entered by the user.

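#include <stdio.h>
#include <string.h>

int main(void)
{
    char strg1[50], strg2[50];

    printf("Enter first string: ");
    // fgets() never writes past the buffer, unlike gets(), which was removed in C11
    if (fgets(strg1, sizeof strg1, stdin) == NULL)
    {
        return 1; // no input could be read
    }
    strg1[strcspn(strg1, "\n")] = '\0'; // strip the trailing newline, if any

    printf("Enter second string: ");
    if (fgets(strg2, sizeof strg2, stdin) == NULL)
    {
        return 1; // no input could be read
    }
    strg2[strcspn(strg2, "\n")] = '\0'; // strip the trailing newline, if any

    if (strcmp(strg1, strg2) == 0)
    {
        printf("\nYou entered the same string two times");
    }
    else
    {
        printf("\nEntered strings are not same!");
    }

    // signal to operating system program ran fine
    return 0;
}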

Expected Output:

1st run:

Enter first string: compare
Enter second string: compare

You entered the same string two times

2nd run:

Enter first string: abc
Enter second string: xyz

Entered strings are not same!

Relational operators with strings

When relational operators (>, <, >=, <=, ==, !=) are used with strings, they do not compare the contents of the strings. Consider the following example:

char *s1 = "hello";
char *s2 = "yello";

Can you guess what the following expression does?

s1 == s2

This expression compares the addresses stored in s1 and s2, not the contents of the string literals they point to.

The following example demonstrates this behaviour.

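#include <stdio.h>
#include <string.h>

int main(void)
{
    char *s1 = "hello";
    char *s2 = "world";

    // %p expects a void pointer and prints the address in an implementation-defined format
    printf("Address of string pointed by s1 = %p\n", (void *)s1);
    printf("Address of string pointed by s2 = %p\n\n", (void *)s2);

    // the result of a comparison is an int, so it is printed with %d
    // note: the C standard does not define < and > for pointers to unrelated objects;
    // they are used here only to show that addresses, not contents, are compared
    printf("Is s1 == s2 ? %d\n", s1 == s2);
    printf("Is s1 > s2 ? %d\n", s1 > s2);
    printf("Is s1 < s2 ? %d\n", s1 < s2);

    // signal to operating system program ran fine
    return 0;
}

Expected Output (the addresses and the way they are printed vary between systems):

Address of string pointed by s1 = 0x403000
Address of string pointed by s2 = 0x403006

Is s1 == s2 ? 0
Is s1 > s2 ? 0
Is s1 < s2 ? 1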

Let's get back to our original discussion and try creating our own version of the strcmp() function.

int my_strcmp(char *strg1, char *strg2)
{
    while( ( *strg1 != '\0' && *strg2 != '\0' ) && *strg1 == *strg2 )
    {
        strg1++;
        strg2++;
    }

    if(*strg1 == *strg2)
    {
        return 0; // strings are identical
    }

    else
    {
        return *strg1 - *strg2;
    }
}

How it works:

The my_strcmp() function accepts two arguments of type pointer to char and returns an integer value. The condition in the while loop may look a little intimidating, so let me explain it.

( *strg1 != '\0' && *strg2 != '\0' ) && (*strg1 == *strg2)

The condition simply says: keep looping as long as the end of neither string has been reached and the corresponding characters are the same.

Let's say my_strcmp() is called with two arguments "abc" (strg1) and "abz" (strg2), where strg1 points to the address 2000 and strg2 points to the address 3000.

1st Iteration

In the first iteration both strg1 and strg2 point to the address of the character 'a'. So

*strg1 returns 'a'
*strg2 returns 'a'

while condition is tested:

( 'a' != '\0' && 'a' != '\0' ) && ('a' == 'a')

As the condition is true, the statements inside the body of the loop are executed. Now strg1 points to address 2001 and strg2 points to address 3001. This ends the 1st iteration.

2nd Iteration

In the second iteration both strg1 and strg2 point to the address of the character 'b'. So

*strg1 returns 'b'
*strg2 returns 'b'

while condition is tested again:

( 'b' != '\0' && 'b' != '\0' ) && ('b' == 'b')

As the condition is true, the statements inside the body of the loop are executed once more. Now strg1 points to address 2002 and strg2 points to address 3002. This ends the 2nd iteration.

3rd Iteration

In the third iteration strg1 and strg2 point to the addresses of the characters 'c' and 'z' respectively. So

*strg1 returns 'c'
*strg2 returns 'z'

while condition is tested again:

( 'c' != '\0' && 'z' != '\0' ) && ('c' == 'z')

The while condition becomes false and control breaks out of the while loop. The if condition following the while loop is then checked.

if( *strg1 == *strg2)
{
   return 0;  // strings are identical
}

Since

*strg1 returns 'c'
*strg2 returns 'z'

the condition 'c' == 'z' is false. Control passes to the else block and the following statement is executed.

return *strg1 - *strg2;

The expression *strg1 - *strg2 evaluates to the difference of the ASCII values of the two characters.

*strg1 - *strg2
=> 'c' - 'z'
=> 99 - 122
=> -23

Finally, -23 is returned to the calling function.

The following program demonstrates our new string comparison function my_strcmp().

#include<stdio.h>
int my_strcmp(char *strg1, char *strg2);

int main()
{

    printf("strcmp(\"a\", \"a\") = %d\n", my_strcmp("a", "a") );
    printf("strcmp(\"a\", \"b\") = %d\n", my_strcmp("a", "b") );
    printf("strcmp(\"a\", \"c\") = %d\n", my_strcmp("a", "c") );
    printf("strcmp(\"z\", \"d\") = %d\n", my_strcmp("z", "d") );
    printf("strcmp(\"abc\", \"abe\") = %d\n", my_strcmp("abc", "abe") );
    printf("strcmp(\"apples\", \"apple\") = %d\n", my_strcmp("apples", "apple") );

    // signal to operating system program ran fine
    return 0;
}

int my_strcmp(char *strg1, char *strg2)
{

    while( ( *strg1 != '\0' && *strg2 != '\0' ) && *strg1 == *strg2 )
    {
        strg1++;
        strg2++;
    }

    if(*strg1 == *strg2)
    {
        return 0; // strings are identical
    }

    else
    {
        return *strg1 - *strg2;
    }
}

Expected Output:

strcmp("a", "a") = 0
strcmp("a", "b") = -1
strcmp("a", "c") = -2
strcmp("z", "d") = 22
strcmp("abc", "abe") = -2
strcmp("apples", "apple") = 115

As you can see, my_strcmp() returns the difference between the ASCII values of the mismatched characters. As homework, modify this function so that it returns 1 if strg1 is greater than strg2 and -1 if strg1 is smaller than strg2.