Discussion:
Pointer to a char
Randi Botse
2012-09-18 09:29:32 UTC
Hi, I have been coding in C for 3 years, but I'm still not clear on this one.
Consider this code.

...
char *p;
unsigned int i = 0xcccccccc;
unsigned int j;

p = (char *) &i;
printf("%.2x %.2x %.2x %.2x\n", *p, p[1], p[2], p[3]);

memcpy(&j, p, sizeof(unsigned int));
printf("%x\n", j);
...

Output:

ffffffcc ffffffcc ffffffcc ffffffcc
0xcccccccc


My questions are:

1. Why does it print "ffffffcc ffffffcc ffffffcc ffffffcc"? (If p is
unsigned char*, then it correctly prints "cc cc cc cc".)
2. Why is the data that p points to copied to j correctly? Why doesn't
every element of p overflow, since it is a signed char?

Regards.
--
To unsubscribe from this list: send the line "unsubscribe linux-c-programming" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Phil Sutter
2012-09-18 10:29:56 UTC
Permalink
Hi,
Post by Randi Botse
...
char *p;
unsigned int i = 0xcccccccc;
unsigned int j;
p = (char *) &i;
printf("%.2x %.2x %.2x %.2x\n", *p, p[1], p[2], p[3]);
memcpy(&j, p, sizeof(unsigned int));
printf("%x\n", j);
...
ffffffcc ffffffcc ffffffcc ffffffcc
0xcccccccc
1. Why does it print "ffffffcc ffffffcc ffffffcc ffffffcc"? (If p is
unsigned char*, then it correctly prints "cc cc cc cc".)
This is because of the two's complement representation in which signed
values are stored internally. Since %x converts an integer, the passed
char is promoted to int and sign-extended, which in two's complement
means that the leading bit is replicated to fill the upper bits (0xC is
1100 in binary, so the leading bit is set).
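The sign extension can be looked at in isolation with a small sketch. The
helper names below are made up for illustration, and the values in the
comments assume 8-bit bytes, two's complement and a 32-bit int, as on
typical Linux targets.

```c
/* Hypothetical helper: the bit pattern that results once a signed char
 * has been promoted to int. Assumes a 32-bit int. */
unsigned int promoted_from_signed(signed char c)
{
	int promoted = c;               /* value-preserving, so the sign is extended */
	return (unsigned int) promoted; /* well-defined conversion modulo 2^32 */
}

/* Same thing for unsigned char: the value stays in 0..255. */
unsigned int promoted_from_unsigned(unsigned char c)
{
	int promoted = c;               /* upper bits are filled with zeros */
	return (unsigned int) promoted;
}
```

With these assumptions, promoted_from_signed(-52) gives 0xffffffcc, while
promoted_from_unsigned(0xcc) gives 0xcc, which matches the two outputs in
the question.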
Post by Randi Botse
2. Why is the data that p points to copied to j correctly? Why doesn't
every element of p overflow, since it is a signed char?
I am not quite sure about what the question is here (maybe caused by the
lack of verbs in your sentence). Keep in mind that memcpy() only copies
the memory, irrespective of the pointer type passed. Also,
sizeof(unsigned int) == sizeof(int).
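A minimal sketch of that point, with a made-up helper name
(copy_via_char_pointer): the bytes go through a plain char pointer and
come out unchanged, because memcpy() never interprets them as char
values.

```c
#include <string.h>

/* Hypothetical helper: copy an unsigned int out through a plain char
 * pointer, the same way the snippet in the question does. */
unsigned int copy_via_char_pointer(unsigned int value)
{
	unsigned int copy;
	char *p = (char *) &value;     /* only the pointer type changes */

	memcpy(&copy, p, sizeof copy); /* raw bytes are copied untouched */
	return copy;
}
```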

HTH, Phil
Duan Fugang-B38611
2012-09-18 10:33:01 UTC
Thanks, Phil,

Thanks for the detailed explanation.


Best Regards,
Andy

Jon Mayo
2012-09-19 01:04:02 UTC
Post by Randi Botse
Hi, I have been coding in C for 3 years, but I'm still not clear on this one.
Consider this code.
...
char *p;
unsigned int i = 0xcccccccc;
unsigned int j;
p = (char *) &i;
printf("%.2x %.2x %.2x %.2x\n", *p, p[1], p[2], p[3]);
printf (and other var arg functions) don't take char, short or float.
they take int or double and a few other types.
those [signed] chars are going to get sign extended when they are
converted to signed int. (0xcc = -52 )
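one way to get the "cc cc cc cc" output is to read the bytes through
unsigned char before they reach printf. a rough sketch (format_bytes is
a made-up name, and it assumes the object is at least 4 bytes long):

```c
#include <stdio.h>

/* Hypothetical helper: format the first four bytes of an object as hex.
 * Reading through unsigned char keeps every value in 0..255, so the
 * promotion to int cannot produce a negative number. */
void format_bytes(const void *obj, char *out, size_t outlen)
{
	const unsigned char *p = obj;

	snprintf(out, outlen, "%.2x %.2x %.2x %.2x",
	         (unsigned int) p[0], (unsigned int) p[1],
	         (unsigned int) p[2], (unsigned int) p[3]);
}
```

called on an unsigned int holding 0xcccccccc, this fills the buffer with
"cc cc cc cc" regardless of byte order, since all four bytes are equal.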
Post by Randi Botse
memcpy(&j, p, sizeof(unsigned int));
the data at i, pointed to by p has not changed, so this memcpy works.
The only thing that is weird is how you interpreted the data (in your
printf above).
Post by Randi Botse
printf("%x\n", j);
...
ffffffcc ffffffcc ffffffcc ffffffcc
0xcccccccc
1. Why does it print "ffffffcc ffffffcc ffffffcc ffffffcc"? (If p is
unsigned char*, then it correctly prints "cc cc cc cc".)
2. Why is the data that p points to copied to j correctly? Why doesn't
every element of p overflow, since it is a signed char?
Regards.
Randi Botse
2012-09-19 07:59:40 UTC
Hi Phil, Jon

Thanks, now I'm clear on this: assignment doesn't care about the type modifier.

Code such as

unsigned int j = 0xffeeddcc;
int i = j;

Both hold the same value, depending on how they are interpreted (is this
assumption correct?)

Because,

printf("%u", i) will be different from printf("%i", i)
- but -
printf("%u", i) will be the same as printf("%u", j)


Actually, I am asking this because I often see pointers cast to char*.

Let me show you with this example.
Consider some structures...

struct a_data {
unsigned char f1[4];
unsigned char f2[6];
unsigned short f3[2];
};

and other structs named b_data, c_data, etc.

Then there is a general function to process all types of structures,
maybe something like this:

int process_data(char *buffer, size_t len);

Then we cast, for example, a pointer to an a_data struct to char* as follows:

struct a_data a;
process_data((char*) &a, sizeof(a));

I thought that since it was cast to char*, the cast would be a "problem"
because every signed char in the buffer has a range of CHAR_MIN to
CHAR_MAX, so values from CHAR_MAX + 1 to UCHAR_MAX would be broken
(signed char overflow).

I think process_data() should be declared with

int process_data(unsigned char *buffer, size_t len)

This declaration seems correct and works for me.

However, now I conceptually understand why the char* version works.

Thanks.
Leon Shaw
2012-09-19 08:47:55 UTC
Post by Randi Botse
Hi Phil, Jon
Thanks, now I'm clear on this: assignment doesn't care about the type modifier.
Code such as
unsigned int j = 0xffeeddcc;
int i = j;
Both hold the same value, depending on how they are interpreted (is this
assumption correct?)
According to C99, when applying integer conversion, "if the new type
is signed and the value cannot be represented in it, either the result
is implementation-defined or an implementation-defined signal is
raised". But most implementations keep the same memory representation.
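If the goal is only to view the same bits as a signed value, the
implementation-defined conversion can be sidestepped with memcpy(). This
is a sketch with made-up helper names, using the fixed-width types from
<stdint.h> (int32_t is required to be two's complement without padding).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the 32 bits of an unsigned value as
 * signed by copying the object representation, not by converting. */
int32_t reinterpret_as_signed(uint32_t u)
{
	int32_t s;

	memcpy(&s, &u, sizeof s);
	return s;
}

/* The opposite direction needs no trick: conversion to an unsigned
 * type is fully defined as reduction modulo 2^32. */
uint32_t convert_to_unsigned(int32_t s)
{
	return (uint32_t) s;
}
```

For 0xffeeddcc the signed view is -1122868, and converting that back to
uint32_t yields 0xffeeddcc again.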
Post by Randi Botse
Because,
printf("%u", i) will be different from printf("%i", i)
- but -
printf("%u", i) will be the same as printf("%u", j)
Actually, I am asking this because I often see pointers cast to char*.
Let me show you with this example.
Consider some structures...
struct a_data {
unsigned char f1[4];
unsigned char f2[6];
unsigned short f3[2];
};
and other structs named b_data, c_data, etc.
Then there is a general function to process all types of structures,
int process_data(char *buffer, size_t len);
struct a_data a;
process_data((char*) &a, sizeof(a));
I thought that since it was cast to char*, the cast would be a "problem"
because every signed char in the buffer has a range of CHAR_MIN to
CHAR_MAX, so values from CHAR_MAX + 1 to UCHAR_MAX would be broken
(signed char overflow).
Actually, whether char is signed or unsigned is
implementation-defined, though, normally, it is signed. SCHAR_MAX+1 ~
UCHAR_MAX can be mapped to SCHAR_MIN ~ -1.
For a pointer that denotes a memory region, what type it points to
doesn't cause much problem as long as you don't simply dereference it.
In such cases, void * might be less confusing.

Regards,
Leon
Post by Randi Botse
I think process_data() should be declared with
int process_data(unsigned char *buffer, size_t len)
This declaration seems correct and works for me.
However, now I conceptually understand why the char* version works.
Thanks.
Jon Mayo
2012-09-19 18:09:49 UTC
Post by Randi Botse
Hi Phil, Jon
Thanks, now I'm clear on this: assignment doesn't care about the type modifier.
Code such as
unsigned int j = 0xffeeddcc;
int i = j;
Both hold the same value, depending on how they are interpreted (is this
assumption correct?)
Because,
printf("%u", i) will be different from printf("%i", i)
- but -
printf("%u", i) will be the same as printf("%u", j)
most architectures will work that way. some are a little nutty, but
standard C allows for implementation defined behavior when you
interpret a data type the wrong way. (it gets pretty specific about
signed versus unsigned representations)

I will readily admit that years of FORTH programming has warped my
mind and I no longer worry too much about signed int and unsigned int.
I tend to think more in terms of how big a data type is. The 'union'
keyword is especially useful for dealing with different ways to
interpret the same sized piece of memory.

float is often the same size as int. so this potentially works on some
platforms:

float f = 1;
int i = *(int*)&f;
printf("%u", i);

it would print some weird number that shows you how dramatically an
internal representation can differ if you manage to interpret it
incorrectly. (this trick is often used to dump float values in
hexadecimal "%x" for debugging purposes)
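a more portable way to get at the same bits is memcpy, since reading a
float object through an int lvalue breaks the aliasing rules that
optimizing compilers rely on. a sketch, with a made-up function name and
assuming float is 32 bits wide:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bits of a float.
 * Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
	uint32_t bits;

	memcpy(&bits, &f, sizeof bits); /* copy the representation, no conversion */
	return bits;
}
```

on a platform with IEEE 754 single precision floats, float_bits(1.0f)
is 0x3f800000.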
Post by Randi Botse
Actually, I am asking this because I often see pointers cast to char*.
Let me show you with this example.
Consider some structures...
struct a_data {
unsigned char f1[4];
unsigned char f2[6];
unsigned short f3[2];
};
and other structs named b_data, c_data, etc.
Then there is a general function to process all types of structures,
int process_data(char *buffer, size_t len);
I would have made process_data take a void * instead, so people
wouldn't have to hack around C's simple type checking with casts.

casting struct a_data* to char* doesn't change the value of the
pointer. if you ignore compiler warnings it will work without the
cast.

now inside process_data, the char* type is useful, because the pointer
math will use sizeof(char) [which is always 1] for calculating
offsets. while your sizeof(struct a_data) will be around 14 bytes.
Some people don't like to use void* here, because the compiler will
not like pointer math done on a void* as sizeof(void) doesn't make
sense. Old compilers hacked around this by treating it as 1. New
compilers will prefer that you cast or load the void* into a char*
(which is how i usually implement these sorts of functions)
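roughly what i mean, as a sketch. the body is invented for illustration
(it just sums the bytes), and the signature differs from the original
process_data by taking const void * so callers need no cast:

```c
#include <stddef.h>

/* Hypothetical body for process_data(): add up every byte of an object. */
int process_data(const void *buffer, size_t len)
{
	const unsigned char *p = buffer; /* void * converts without a cast in C */
	int sum = 0;
	size_t n;

	for (n = 0; n < len; n++)
		sum += p[n];             /* pointer math in steps of 1 byte */
	return sum;
}
```

a caller can then write process_data(&a, sizeof(a)) for any struct.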
Post by Randi Botse
struct a_data a;
process_data((char*) &a, sizeof(a));
I thought that since it was cast to char*, the cast would be a "problem"
because every signed char in the buffer has a range of CHAR_MIN to
CHAR_MAX, so values from CHAR_MAX + 1 to UCHAR_MAX would be broken
(signed char overflow).
casting to a pointer won't alter the data. it just changes how you
would interpret the data when dereferencing it. if process_data
doesn't dereference, then there is probably not a problem.

(also char can be signed or unsigned. in gcc you could use something
like -funsigned-char to override the default setting. which can
potentially break a lot of assumptions in your system and library
headers)
Post by Randi Botse
I think process_data() should be declared with
int process_data(unsigned char *buffer, size_t len)
you should use:
signed char * - if you need signed
unsigned char * - if you need unsigned
char * - if you don't care either way. as long as the pointer points
to something char-sized.
void * - if you don't even care about what type it points to. (maybe a struct)

note- this rule is different than signed/unsigned int. int is always signed.

I use char* when dealing with strings, because I won't be using them
in situations where negative values could be a problem. but one
terrible issue you can run into is a simple function like this:

int isupper(char c)
{
	const int upper_table[256] = { ... }; /* UCHAR_MAX + 1 is more appropriate here. */
	return upper_table[c]; /* oops, what if c is negative? that would be a
	                          terrible array index. */
	/* we would actually want to cast c to unsigned char, or at least
	   check c >= 0 && c < upper_table_len */
}
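a version with the cast applied could look like the sketch below. the
name is_upper_ascii and the way the table is filled are made up for
illustration, and it only knows about 'A' to 'Z' in an ASCII-compatible
character set.

```c
#include <limits.h>

/* Hypothetical safe variant: index the table through unsigned char so a
 * negative char can never become a negative array index. */
int is_upper_ascii(char c)
{
	static int upper_table[UCHAR_MAX + 1]; /* static, so zero-initialized */
	static int ready;
	int ch;

	if (!ready) {
		for (ch = 'A'; ch <= 'Z'; ch++) /* contiguous in ASCII */
			upper_table[ch] = 1;
		ready = 1;
	}
	return upper_table[(unsigned char) c];
}
```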
Post by Randi Botse
This declaration seems correct and works for me.
However, now I conceptually understand why the char* version works.
Thanks.