CS50 Week4 Memory
Hexadecimal ("hex" for short) means base 16.
Hexadecimal digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
Why do scientists like to use hexadecimal?
One hexadecimal digit can represent 4 binary digits: the largest digit, F, is 1111 in binary, and 4 bits give 2 * 2 * 2 * 2 = 16 possible values. With just two hexadecimal digits, we can express any number from 0 to 255. To do the same in binary, we need 8 digits. As numbers get larger and larger, we need more digits to represent them, and hexadecimal keeps them short.
People often use the prefix 0x to indicate hexadecimal; otherwise the number might get mixed up with decimal.
0x48 (the hexadecimal number 48, which is 72 in decimal)
RGB is a system that uses hexadecimal to represent different colors.
000000 = black,
FFFFFF = white
Memory, today’s topic, is another example that uses hexadecimal.
Memory addresses are written in hexadecimal, counting up from 0x0, for example all the way up to 0xFF.
Pointer: a variable that contains the address of another value. A pointer to an int has type int *, and a pointer takes up 8 bytes on most modern (64-bit) systems.
&: address operator, gets the address. When you put an ampersand before a variable, C gives you the address at which the variable is stored.
*: dereference operator, goes to the address. When you put a star before a pointer, C goes to that address and returns the value stored there.
What does the memory look like?
The pointer p is also a variable stored somewhere in memory, but we don’t need to care where p is stored. All we need to know is that the value of p is the address of n, and that n, in this case, holds 50.
p is just a pointer that points to the address of int n.
String in memory
string s = "HI!"; What does this line of code look like in memory?
We know a string is an array of chars stored back to back in memory, that is, one byte apart. That’s why the memory addresses are consecutive.
What about s?
The value of s is the address of the first character in the string. In the above example, it would be
How does the computer know the end of the string?
When we hit \0, the null character, we know we are at the end of the string. The variable s is just a pointer to the address of the first character.
The data type string does not exist in C. More precisely, it exists in C as char * ("char star").
<cs50.h> just defines a custom data type called string.
This line of code:
char *s = "HI!";
is equivalent to the second line below, given the typedef from <cs50.h>:
typedef char *string;
string s = "HI!";
Remember we can use s[1] to get the second element in string s? It is actually syntactic sugar for *(s + 1). We know a pointer in C is an address, which is a numeric value. By adding 1 to the pointer, the program gives us access to each of the succeeding elements in the array: *s is the first character, *(s + 1) the second, and so on.
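Suppose we want to compare two strings by comparing the variables s and t themselves. Even when we type in two identical strings, the result still says they are different. Why?
s and t each hold the address of the first character of a string, not the characters themselves. Two strings stored in different places have different addresses even when their characters match.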
Now let’s look at another example. We want to create t, which copies the value of s, and then change a character in t without modifying s. However, it turns out that both s and t got changed afterwards.
That’s because in copy.c, s and t both refer to the same address. When t got changed, s also got changed.
How to copy string values rather than memory addresses?
malloc() and free()
malloc(size): memory allocation; size is the number of bytes you want
char *t = malloc(4); (allocates 4 bytes and stores the address of the first byte in t)
The malloc function returns the address of the first byte of the chunk of memory that it has allocated for you. But it may return NULL, the null pointer, when there’s no memory available. Thus, always check the return value when using this function.
NULL is not the same as NUL. NULL is the null pointer, an empty address.
\0 (NUL) is the null character.
The strcpy function copies a string, including its terminating \0, into memory you have already allocated.
The free function returns memory allocated by malloc() or realloc() back to the computer, so that you won’t run out of memory. Always give memory back once you are done with it. Just pass the address you want to release as the input, and the function will free that memory.
valgrind: another useful debugging tool for detecting potential memory problems
It helps detect memory-related problems. When you forget to call free() to release memory (a memory leak), or when you touch memory that you should not (which can cause a segmentation fault), valgrind reports the problem to you.
How to use valgrind? First write a problematic program, then run it under valgrind to detect problems:
Invalid write of size 1
⇒ A write refers to changing a value. In the above example, we only ask for three bytes, and are not supposed to change the value in the fourth byte.
Invalid read of size 1
⇒ A read refers to reading, using or printing a value. In the above example, we print out the whole string, which includes a memory address that we should not have touched.
LEAK SUMMARY: ==705== definitely lost: 3 bytes in 1 blocks
⇒ We did not free the memory we got from malloc() at the end of the program.
Say we write a function called swap that should swap the values of two variables:
Why does the above program not swap the values of x and y successfully?
It turns out that the computer treats different portions of its memory in different ways. Each kind of data is stored in a standard location.
Machine code: the 0s and 1s that make up your compiled program
global variables: variables declared outside your functions
heap: a chunk of memory that malloc() draws from when you ask for extra space. You need to call free() when you are done to release that memory.
stack: when you call a function, like strcmp(), it uses stack space to store its local variables and parameters. You don’t need to call free() for these, because when a function returns, its space is popped off the stack.
The stack and the heap grow toward one another. We typically have enough memory, so these two spaces won’t collide.
In the above program, we call function swap. It does swap values, but the values are just the copies of x and y. The real x and y in the main function remain unchanged.
How to fix the problem?
Pass pointers instead. This time we are not copying the values of x and y and passing the copies to swap(), but passing their addresses, so swap() can go to those addresses and change the real x and y.
stack overflow: calling so many functions (for example, through recursion that never stops) that the stack uses up its available memory and runs into the heap
heap overflow: a form of buffer overflow. It happens when a chunk of memory is allocated on the heap and data is written into it without any bounds checking, so that critical data structures in the heap are overwritten.
buffer overflow: Buffer is a chunk of memory, an array of memory that temporarily holds data while it is being transferred from one location to another. A buffer overflow (or buffer overrun) occurs when the volume of data exceeds the storage capacity of the memory buffer and thus the program attempts to overwrite adjacent memory locations.
Some of the useful functions in the CS50 library, like get_string() and so forth, can actually be implemented using scanf from C’s standard library.
scanf("%i", &x); (format of the input, pointer to a variable in which to store the input)
scanf() won’t complain about the data type of the user’s input. For example, if you type in a string in the above example, nothing is stored in x, so x keeps whatever value it already had (it may print as 0). No error pops up unless you check the return value of scanf().
scanf() won’t check how many bytes can be stored in the variable. If you type in a number too large for an int in the above example, the final x will not be what you expected, and no error is shown.
Reading input from files and writing output to files
FILE is a data type that represents a file.
fopen(filename, mode) opens the file in a certain mode. Modes include read ("r"), write ("w") and append ("a"). Append mode adds new text to the end of the file.
jpeg.c helps determine if a file is probably a jpeg. And
copy.c is similar to the command
cp we use in the Linux system. Both of them use the data type
CS50 Week 4 lesson: https://cs50.harvard.edu/x/2021/weeks/4/