TransWikia.com

How does this string comparison program work for same sized strings?

Stack Overflow Asked on December 31, 2020

Our professor gave us the following C code for comparing two strings of the same size, and I can’t understand how it works.

int compare(char *a, char *b)
{
 int *pa = (int *)a;
 int *pb = (int *)b;
 return *pa == *pb;
}

When I have the following two strings:

char *a = "hello";
char *b = "hello";

The program outputs 1, nothing unusual.

char *a = "helo";
char *b = "help";

The program outputs 0. How is this happening? I tried to print out the following:

 printf("%c,  %c\n%d,  %d\n", *a, *b, *pa, *pb);
 printf("%d\n", *pa == *pb);

and I get this:

h,  h
1869374824,  1886152040
0

Why does the value change after typecasting the char pointer and dereferencing it?

One Answer

The given function compares only the first sizeof(int) bytes of each string, so it works only for strings that occupy exactly as many bytes as an int does on your target system. (There are more subtleties to this, however.)

The meaning of memory contents depends entirely on context. If your program says that some cells form a character array, they are interpreted as characters. If it says that the same cells form an integer, they are interpreted as an integer.

Now let's look at your example:

"helo" is a string of 5 characters, 'h', 'e', 'l', 'o', and '\0'.

"help" is a string of 5 characters, 'h', 'e', 'l', 'p', and '\0'.

In hex bytes these are, assuming ASCII:

  • "helo": 0x68 0x65 0x6C 0x6F 0x00
  • "help": 0x68 0x65 0x6C 0x70 0x00

Assuming that your system uses 32-bit integers and little endianness, the first four bytes of each string, interpreted as an integer, give:

  • "helo": 0x68 0x65 0x6C 0x6F 0x00: 0x6F6C6568 = 1869374824
  • "help": 0x68 0x65 0x6C 0x70 0x00: 0x706C6568 = 1886152040

To see the limits, try to compare strings like:

  • "hello dear" and "hello pal" => will be reported as equal on 32-bit integer systems
  • "a" and "a" => might be reported as not equal depending on the system
  • "really, really long" and "really, really long and even longer" => will be reported as equal on any common system

To answer your specific question

Why does the value change after typecasting the char pointer and dereferencing it?

The type cast does not change the value. It changes how the value is interpreted. That is one of the purposes of type casts.

EDIT:

Why does compare("a", "a") return 0 (false) on some systems or under some circumstances?

Try this little example:

#include <stdio.h>

int compare(char *a, char *b)
{
    int *pa = (int *)a;
    int *pb = (int *)b;
    return *pa == *pb;
}

int main(void)
{
    static char a[] = "a";
    char b[2];
    
    b[0] = 'a';
    b[1] = '\0';
    
    printf("a = \"%s\", b = \"%s\", compare gives %d\n", a, b, compare(a, b));
    
    return 0;
}

And the result will be:

a = "a", b = "a", compare gives 0

This is because 4 or 8 bytes are compared (depending on the size of int), but strings a and b are each just 2 bytes long. The function therefore reads past the end of both arrays. Since the memory after the strings happens to contain different values, the strings are reported as not equal. Reading past the end of an array is undefined behavior in C, so the result is not guaranteed either way.

Correct answer by the busybee on December 31, 2020
