Wednesday, 24 October 2007

How do I output numbers as binary strings?

We had some help from the standard C library in answering the question "How do I input numbers as binary strings?" The standard library function strtoul() was available to do the work for us.
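
As a reminder of the input direction, here is a minimal sketch of parsing a string of 0's and 1's with strtoul() and a base of 2. The helper name binary_text_to_ulong() and its return convention are my own assumptions for illustration; only strtoul() itself comes from the standard library.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts text such as "1011" to a number.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int binary_text_to_ulong(const char *text, unsigned long *out)
{
    char *end;
    unsigned long value;

    if (text == NULL || out == NULL)
        return 0;
    errno = 0;
    value = strtoul(text, &end, 2);   /* base 2: only '0' and '1' are digits */
    if (end == text || *end != '\0' || errno == ERANGE)
        return 0;                     /* no digits, trailing junk, or overflow */
    *out = value;
    return 1;
}
```

Keep in mind that strtoul() also skips leading white space and accepts an optional sign, so a stricter caller may want to check the first character itself.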

When it comes to outputting a numeric value of one of the integer types as a string of just 0's and 1's, however, we are strictly on our own. The code to do this is not very hard. Here is the code for displaying an unsigned char:

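#include <stdio.h>
#include <limits.h>

char *unsigned_char_to_text(char val)
{
    unsigned char usc = (unsigned char)val;
    /* mask starts with only the most significant bit set */
    unsigned char mask = UCHAR_MAX - (UCHAR_MAX >> 1);
    /* one character per bit plus the terminating '\0' */
    static char result[CHAR_BIT + 1];
    char *store = result;

    while (mask) {
        /* store '1' if the masked bit is set, '0' otherwise */
        *store++ = (char)('0' + ((usc & mask) != 0));
        mask >>= 1;   /* move on to the next lower bit */
    }
    *store = '\0';
    return result;
}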

To generate a function to output an unsigned int as a binary string, just make the following changes to this function:

Change val to int and usc and mask to unsigned int.
Change the two references to UCHAR_MAX to UINT_MAX.
Change the expression "CHAR_BIT + 1" for the array size to "sizeof(int) * CHAR_BIT + 1".

To make a function to do the same for unsigned long, change val to long and usc and mask to unsigned long, change UCHAR_MAX to ULONG_MAX, and use "sizeof(long) * CHAR_BIT + 1".
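
Applying the unsigned int changes listed above might give something like the following sketch. The name unsigned_int_to_text() and the variable name usi are my own choices for illustration. Note that the macro for the number of bits in a byte, defined in <limits.h>, is CHAR_BIT.

```c
#include <limits.h>

/* Hypothetical name: the unsigned int counterpart of unsigned_char_to_text(). */
char *unsigned_int_to_text(int val)
{
    unsigned int usi = (unsigned int)val;
    /* mask starts with only the most significant bit set */
    unsigned int mask = UINT_MAX - (UINT_MAX >> 1);
    /* one character per bit plus the terminating '\0' */
    static char result[sizeof(int) * CHAR_BIT + 1];
    char *store = result;

    while (mask) {
        *store++ = (char)('0' + ((usi & mask) != 0));
        mask >>= 1;
    }
    *store = '\0';
    return result;
}
```

The unsigned long version follows the same pattern with long, unsigned long, and ULONG_MAX.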

Warning about the returned string. The array used to store the characters in unsigned_char_to_text() is defined as a static array of characters inside the function. It must be static because non-static local variables defined inside a function cease to exist once the function returns, so it is a bug to return a pointer to one: using that pointer afterwards is undefined behavior.

The fact that the string is static means that each call to the function uses the same string space and overwrites the string produced by a previous call. So if you use this function more than once in a program you should be careful to do the following:

Use the string immediately to display or write to a file, before calling the function again.
If you do need to keep the generated string for some time, and the function might be called again, copy the string.
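
The two rules above can be sketched as follows. The helper two_values_to_text() is hypothetical and exists only to show the copy; the conversion function is repeated from the text so the sketch stands on its own.

```c
#include <limits.h>
#include <string.h>

/* The function from the text, repeated so the sketch is self-contained. */
char *unsigned_char_to_text(char val)
{
    unsigned char usc = (unsigned char)val;
    unsigned char mask = UCHAR_MAX - (UCHAR_MAX >> 1);
    static char result[CHAR_BIT + 1];
    char *store = result;

    while (mask) {
        *store++ = (char)('0' + ((usc & mask) != 0));
        mask >>= 1;
    }
    *store = '\0';
    return result;
}

/* Hypothetical helper: writes "first second" into out, which must
   have room for at least 2 * CHAR_BIT + 2 characters. */
void two_values_to_text(char first, char second, char *out)
{
    /* Copy the first result out of the shared static buffer right away... */
    strcpy(out, unsigned_char_to_text(first));
    strcat(out, " ");
    /* ...so this second call can safely overwrite that buffer. */
    strcat(out, unsigned_char_to_text(second));
}
```

Without the copy, both results would point at the same static array, and only the second string would survive.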

Monday, 22 October 2007

What should I use instead of gets()?

You might have been told this in a C programming newsgroup already, but it is important enough that I will repeat it here:

Never, never, NEVER use the function gets(). It is the most dangerous function in the entire C standard library because there is no way to use it safely!
Consider this example:

#include <stdio.h>

int main(void)
{
    char name[25];

    printf("Enter your name: ");
    fflush(stdout);
    if (gets(name) != NULL)   /* DANGEROUS: gets() cannot know that name holds only 25 chars */
        printf("Hello and Goodbye %s\n", name);
    return 0;
}

What do you think will happen if the user types fifty characters into your twenty-five character array? What if the user types one hundred characters? Two hundred??

The answer is that gets() will fill up your array and then keep on going, trying to write to memory past the end of the array which your program does not have the right to access. A program crash is likely. Some notorious computer viruses have based their attack on deliberately overflowing buffers that were filled by calling gets().

You might also have heard that you should use the fgets() function, with stdin as the FILE * parameter, instead of gets(). Most people stop after saying that, but that doesn't actually give you the same result. gets() removes the '\n' character from the input but fgets() does not. That means you must manually remove the '\n' before passing the string to fopen(), or for many other uses.
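
One common way to do that manual removal is sketched below. The helper name strip_newline() is hypothetical; strcspn() is the standard <string.h> function that returns the number of leading characters containing none of the given characters.

```c
#include <string.h>

/* Hypothetical helper: cuts the string at the first '\n', if any.
   If there is no '\n', strcspn() returns the string's length, so the
   assignment simply rewrites the existing terminator. */
void strip_newline(char *line)
{
    line[strcspn(line, "\n")] = '\0';
}
```

After a successful call such as fgets(line, sizeof line, stdin), calling strip_newline(line) leaves the text as gets() would have delivered it.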

Here is my getsafe() function. Like gets() and fgets() both, it returns a pointer to char. This is either the pointer which was passed to it, or NULL if end of file or an error occurred. Like gets(), it removes the '\n' at the end of the string, if there is one. The prototype is:

char *getsafe(char *buffer, int count);

Here is the function:

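#include <stdio.h>
#include <string.h>

char *getsafe(char *buffer, int count)
{
    char *result = buffer, *np;

    if ((buffer == NULL) || (count < 1))
        result = NULL;              /* nothing usable to write into */
    else if (count == 1)
        *result = '\0';             /* room for the terminator only */
    else if ((result = fgets(buffer, count, stdin)) != NULL)
        if ((np = strchr(buffer, '\n')) != NULL)
            *np = '\0';             /* remove the newline, as gets() would */
    return result;
}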
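
Here is a sketch of how the name prompt from the gets() example might be rewritten around getsafe(). The helper read_name() is hypothetical, and getsafe() is repeated so the sketch stands on its own.

```c
#include <stdio.h>
#include <string.h>

/* getsafe() as described in the text, repeated for completeness. */
char *getsafe(char *buffer, int count)
{
    char *result = buffer, *np;

    if ((buffer == NULL) || (count < 1))
        result = NULL;
    else if (count == 1)
        *result = '\0';
    else if ((result = fgets(buffer, count, stdin)) != NULL)
        if ((np = strchr(buffer, '\n')) != NULL)
            *np = '\0';
    return result;
}

/* Hypothetical helper: prompts for a name and reads it into name[].
   Returns 1 on success, 0 at end of file or on error. */
int read_name(char *name, int size)
{
    printf("Enter your name: ");
    fflush(stdout);
    return getsafe(name, size) != NULL;
}
```

A caller would declare char name[25] and call read_name(name, (int)sizeof name). If the user types more than 24 characters, the array is not overrun; the extra characters simply stay in stdin for the next read.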

What should my "int main()" return?

As pointed out in the section above, it is extremely common for a program to return a result indication to the operating system. Some operating systems require a result code. And the return value from main(), or the equivalent value passed in a call to the exit() function, is translated by your compiler into an appropriate code.

There are three and only three completely standard and portable values to return from main() or pass to exit():

  • The plain old ordinary integer value 0.
  • The constant EXIT_SUCCESS defined in <stdlib.h> or <cstdlib>.
  • The constant EXIT_FAILURE defined in <stdlib.h> or <cstdlib>.

If you use 0 or EXIT_SUCCESS your compiler's run time library is guaranteed to translate this into a result code which your operating system considers as successful.

If you use EXIT_FAILURE your compiler's run time library is guaranteed to translate this into a result code which your operating system considers as unsuccessful.
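
As a small illustration of the portable values, the sketch below computes a status that main() could return directly. The function name status_for_file() and the idea of checking whether a file can be opened are assumptions made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a portable status for main() to return,
   based on whether the named file can be opened for reading. */
int status_for_file(const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        fprintf(stderr, "cannot open %s\n", path);
        return EXIT_FAILURE;   /* portable "unsuccessful" status */
    }
    fclose(fp);
    return EXIT_SUCCESS;       /* portable "successful" status; 0 also works */
}
```

A main() built on this could end with return status_for_file(argv[1]); or pass the same value to exit().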

Note: Some operating systems, such as Unix, MS-DOS, and Windows, truncate the integer passed to exit() or returned from main() to an unsigned char and make this available to the shell script, batch file, or parent process which invoked the program. On these systems programmers sometimes use different positive numbers to indicate different reasons for the failure of the program to execute successfully. Such usage is not portable and may not work correctly on all implementations and operating systems. Only the values 0 and the constants EXIT_SUCCESS and EXIT_FAILURE are portable and guaranteed to work correctly on all hosted implementations.

C++ Note

In a C++ program you do not have to return anything from int main()! The language standard guarantees that if your int main() function "falls off the end" by reaching the closing brace, the compiler will automatically return 0 for you indicating success.

Warnings
  • It is not good programming practice to take advantage of this C++ feature. Programs should always specifically indicate a return status.
  • C++ does not provide this automatic return for any function except int main().
  • C as defined by the 1990 standard does not provide an automatic return value for main() or any other function. It is up to the program to specify a return value or the status returned to the operating system is undefined. The 1999 C standard adopted the C++ rule, again for main() only.

Please don't void my main() !!!

A commonly asked question:


Why do people say that main() has to return an int and void main(void) is wrong? I think it is all right because:

  • I have a book by /* somebody */ and all of the example programs are written this way.
  • My instructor tells us to write it that way.
  • The examples in my online help show it that way.
  • It works on my compiler.

First let's talk about the items above.

  • Many books are written by people who don't even bother to actually learn their subject. They are good at writing books and think they know everything. Some of them test their code by compiling and running it with their favorite compiler, and never notice that they are using compiler specific extensions. Some authors have written books containing code which they just wrote off the top of their head. They never compiled and ran it because it would not compile with ANY compiler.
  • Sadly, many instructors of computer programming classes do not know the actual standard for the language themselves. What they know they learned from reading the same books written by careless authors described above. This kind of ignorance of their subject matter would not be allowed in a subject like mathematics, chemistry, or physics, but it is all too common in computer programming.
  • Microsoft is particularly bad at this. Almost all of the examples in their online help start with void main(void). Yet if you look up main in the online help, it displays a proper definition. The example programs in the online help for Borland compilers always show main() properly returning an int.
  • The standards for the C and C++ programming languages specifically allow compiler writers to include extensions. In addition, the usual tradition is for a compiler to process your program and generate an executable if at all possible, even if your source code is questionable. The International Standard for each programming language defines what the language is, not whatever any particular compiler happens to accept. Most C compilers have some type of option to compile in strict standard mode. Again Microsoft is one of the worst offenders here. There is no way that I know of to get their compilers to issue an error or even a warning message if you define a void main() function in your code.

Some computer languages are proprietary, that is they belong to a particular company, and that company has final authority to determine what the language is and isn't.

One example is Java. This language is a product of Sun Microsystems, and it is defined by them. Any other company which uses the Java language signs a contract with Sun and agrees that their version will be 100% compatible with Sun's standard.

Another language is Microsoft's Visual Basic. This is a programming package that only they sell, and only for their Windows operating systems. Since this language is completely theirs, the definition of what the language is is completely theirs.

This is not the case with most general purpose programming languages, and this includes C and C++. Nobody owns these languages or their definitions. Instead the languages are defined by International Standards, which are generated and maintained by ISO, the International Organization for Standardization.

ANSI, which is the American National Standards Institute, is the American national standards body and is one of the member bodies of ISO. You may have heard the term ANSI C or ANSI C++. These terms are in common use because it was ANSI that issued the first standard for C. When the ISO International Standard was adopted, ANSI adopted the ISO version and it became the ANSI version as well. The current ANSI standards for C and C++ are the ISO International standards.

The ANSI/ISO standards for C and C++ define the languages, not Microsoft or Borland or any other compiler vendor. Here is what the standards have to say:

ANSI/ISO/IEC 9899:1990 International Standard For C

The function called at program startup is named main. The implementation declares no prototype for this function. It can be defined with no parameters:

int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

The newly ratified update to the C standard in 1999 will make this even clearer, perhaps because of all the clueless who could not understand that only the two formats above, both of which define main to return an int, are valid.

The draft of the new standard expands on the two definitions above by modifying the wording "It can be defined" to "It shall be defined with a return type of int".

ANSI/ISO/IEC 14882:1998 International Standard For C++

An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a return type of type int but otherwise its type is implementation defined. All implementations shall allow both of the following definitions of main:

The two definitions which follow are identical to those in the C standard.

Practical Reasons To Return An int from main()

  • On many operating systems, the value returned by main() is used to return an exit status to the environment. On Unix, MS-DOS, and Windows systems, the low eight bits of the value returned by main() are passed to the command shell or calling program. This is often used to change the course of a program, batch file, or shell script.
  • Many compilers will refuse to compile a source code file containing a definition of main() which does not return an int.
  • On some platforms a program starting with void main() may crash on startup, or when it ends.

A program which contains a main() function that is not defined to return an int is just plain not real C or C++.