Zeyuan Hu's Page

Modify char in another function


Almost two years ago, I wrote a post on how to modify an array in one function through another function in C. I did a fairly detailed study with GDB there, but I find that illustration lengthy to read. In this post, I try to show the same concept using char *. Hopefully, I do a better job this time.

Problem

We are given the following program:

#include <stdio.h>

// modify this function
void function()
{

}

int main()
{
    char* s;
    function(); // modify here
    puts(s);
    return 0;
}

We want to implement function() so that the program prints Hello World! to the screen. The modified program looks like this:

#include <stdio.h>

// modify this function
void function(char** c)
{
    *c = "Hello World!";
}

int main()
{
    char* s;
    function(&s); // modify here
    puts(s);
    return 0;
}

The question we want to answer is: why does this work?

Explanation

We collect the key data from GDB as follows:

GDB command    Result
p s            0x7fff5fbff360 ""
p &s           (char **) 0x7fff5fbff340
p c            (char **) 0x7fff5fbff340
p *c           0x7fff5fbff360 ""
p c            (char **) 0x7fff5fbff340
p *c           0x100000f8e "Hello World!"

Note that the last two commands are executed after *c = "Hello World!";. The state of the variables on the stack is shown below:

[Figure: state of variables on the stack]

One can think of a variable in C as an alias for a virtual memory address. In other words, the variable s and the address 0x7fff5fbff340 refer to the same location, and we use the variable name as a shortcut for that address. For a given variable, we get its address with & (i.e., when & is applied, the address of the variable is returned instead of the variable itself). In our case, &s is 0x7fff5fbff340. Since s itself is a pointer, what it holds is a memory address rather than, say, a character. In our case, the address stored in s is 0x7fff5fbff360, which GDB displays as "" (note that s is uninitialized at this point, so this address is indeterminate; it could be anything).

We pass &s into the function because if we modify the content at address 0x7fff5fbff340 (i.e., the location that &s points to) inside the function, the change is still visible once the function returns: we can still access s, and s and 0x7fff5fbff340 name the same location. Whatever change is made to the content at 0x7fff5fbff340 is visible through s as well. Since s has type char*, &s naturally has type char**. Another way to understand char** is that we want to change the value of the argument passed in, and C always passes arguments by copying their values. Thus, we need to pass a pointer to that value, not just the value itself.
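To make the copy semantics concrete, here is a minimal sketch that contrasts the two ways of passing the pointer. The function names set_by_value and set_by_address are hypothetical and not part of the original program, and const is added on the assumption that we never write to the string literal.

```c
#include <stddef.h>

/* Hypothetical name: receives a COPY of the caller's pointer. */
void set_by_value(const char *c)
{
    c = "Hello World!"; /* changes only the local copy */
    (void)c;            /* silence "set but not used" warnings */
}

/* Hypothetical name: receives the ADDRESS of the caller's pointer. */
void set_by_address(const char **c)
{
    *c = "Hello World!"; /* writes through c into the caller's variable */
}
```

If main declares const char *s = NULL;, then set_by_value(s) leaves s untouched, while set_by_address(&s) makes s point to the string.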

Inside the function, we modify the content at address 0x7fff5fbff340 by dereferencing c (i.e., *c), which holds a copy of 0x7fff5fbff340. After *c = "Hello World!";, the content at address 0x7fff5fbff340 becomes 0x100000f8e, the address where the string "Hello World!" is stored. Once we are done with the function and back in main, since s is the alias for 0x7fff5fbff340 and 0x7fff5fbff340 now contains the address of "Hello World!", our task is accomplished.
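The same relationships can be observed without a debugger by printing the pointers with %p. The sketch below is one way to do it: run_demo is a hypothetical helper standing in for main, and s is initialized to NULL (unlike in the original program) so that reading it before the call is well defined. The addresses printed will differ from the GDB session above and from run to run.

```c
#include <stdio.h>

void function(const char **c)
{
    /* c holds a copy of &s, so this prints the address of s in the caller. */
    printf("c  = %p\n", (void *)c);
    *c = "Hello World!";
    /* *c is now the address of the string literal. */
    printf("*c = %p\n", (const void *)*c);
}

/* Hypothetical helper that plays the role of main. */
const char *run_demo(void)
{
    const char *s = NULL; /* assumption: initialized for a well-defined demo */
    printf("&s = %p\n", (void *)&s);
    function(&s);
    printf("s  = %p\n", (const void *)s);
    puts(s);
    return s;
}
```

The value printed for c should match the one printed for &s, and the value printed for *c should match the one printed for s after the call.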

Note

One thought I had when I finished this post was: why can't I pass s instead of &s? If s contains some address (say 0xab), we could modify the content at that address (0xab) to be "Hello World!", and since s still contains that address, it seems that there is another option we can use. However, as pointed out by others, the problem is that s is uninitialized: whatever we do with the address contained in s is undefined behavior. Undefined behavior means the program has no predictability: anything can happen. Thus, even if we manage to print out the string, we still consider doing so wrong.
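For comparison, passing s itself is well defined once s points to writable storage that the caller owns. The following is a sketch under that assumption; the helper name fill and the size parameter are hypothetical additions, not part of the original problem.

```c
#include <stdio.h>

/* Hypothetical helper: copies the text into the buffer that c points to.
 * It never writes more than size bytes, including the terminating '\0'. */
void fill(char *c, size_t size)
{
    snprintf(c, size, "%s", "Hello World!");
}
```

In main, this would be used as char buf[32]; char *s = buf; fill(s, sizeof buf); puts(s);. Here the function changes the bytes that s points to, not s itself, which is why a plain char* is enough.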

Hope this short writeup helps!

--- 10/15/19 UPDATE ---

An additional perspective on why &s works: a pointer is just a regular variable that holds the memory address of another object. Instead of holding the memory address of some random content (e.g., 0x7fff5fbff360), we want s to hold the memory address of the string "Hello World!" (e.g., 0x100000f8e). A natural choice is to pass in the memory address of the variable whose value we want to modify, which in this case is &s.
