CS50 Week4 Memory

De_
7 min read · Apr 12, 2022

hexadecimal

Hexadecimal means base 16.

Hexadecimal digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F

Why do scientists like to use hexadecimal?

One hexadecimal digit stands for exactly 4 binary digits, because 2 * 2 * 2 * 2 = 16; the largest digit, F, is 1111 in binary. With just two hexadecimal digits, we can express any number from 0 to 255 (00 to FF). To do the same in binary, we need 8 digits. As numbers get larger, binary needs four digits for every one hexadecimal digit.

People often use the prefix 0x to indicate hexadecimal; otherwise a number might get mixed up with decimal.

e.g. 0x48 (the hexadecimal number 48, which is 72 in decimal)

RGB color codes are often written in hexadecimal, with two digits each for red, green, and blue.

e.g. 000000 = black, FFFFFF = white

Memory, today’s topic, is also an example that uses hexadecimal.

Memory addresses start at 0 and, in this example, go all the way up to FF like this:

Hexadecimal memory address

Pointer

A variable that contains the address of another value. A pointer to an int has the type int *, and on most modern (64-bit) systems a pointer takes up 8 bytes.

&: address operator, gets the address. When you put an ampersand before a variable, C gives you the address that the variable is stored at.

*: dereference operator, goes to the address. When you put a star before a pointer, C goes to the address and returns the value stored there.

What does the memory look like?

pointer in the memory

The pointer p is also a variable stored somewhere in memory, but we don’t need to care where. All we need to know is that the value of p is the address of n, the int that holds 50 in this case. p is just a pointer that points to int n.

String in memory

string s = "HI!"; What does this line of code look like in memory?

string in the memory

We know a string is an array of chars stored back to back in memory, that is, one byte apart. That’s why the memory addresses you see are consecutive.

What about s?

The value of s is the address of the first character in the string. In the above example, it would be 0x123.

How does the computer know the end of the string?

When it reaches \0, the null character, the computer knows it is at the end of the string. The variable s is just a pointer to the address of the first character.

The data type string does not actually exist in C. More precisely, what exists in C is char *. <cs50.h> just defines a custom data type called string.

This line of code:

char *s = "HI!";

is equivalent to these lines, where the typedef comes from <cs50.h>:

typedef char *string;
string s = "HI!";

Pointer arithmetic

Remember that we can use s[1] to get the second element of string s? It is actually syntactic sugar for *(s + 1). A pointer in C is an address, which is a numeric value, so adding 1 to the pointer gives us the address of the next element in the array, as shown below:

pointer arithmetic

String comparison

Suppose we want to compare two strings. With the program below, even when we type in two identical strings, the program still reports them as different. Why?

That’s because s and t each hold the address of the first character of their string, not the characters themselves. Comparing s and t directly compares two addresses, and two separate strings live at two different addresses.

String copy

Now let’s look at another example. We want to create t as a copy of s, and then change a character in t without modifying s. However, it turns out that both s and t are changed afterwards.

That’s because in copy.c, s and t both hold the same address. When t[0] is changed, s[0] changes too, since they are the same byte in memory.

copy.c

How to copy string values rather than memory addresses?

correct way to copy a string

Malloc() and free()

function malloc(size_t size): memory allocation

syntax: char *t = malloc(4); (allocates 4 bytes and stores the address of the first byte in t)

The malloc function returns the address of the first byte of the chunk of memory that it has allocated for you. When there’s no memory available, it returns NULL instead, so always check the return value when using this function.

NULL is not the same as NUL. NULL is the null pointer, an address that points to nothing.

NUL, or \0, is the null character.

function strcpy(destination, source)

The strcpy function copies the source string, including its terminating \0, into the destination.

function free(pointer)

The opposite of malloc(), the free function hands memory allocated by malloc(), calloc(), or realloc() back to the computer, so that your program won’t run out of memory. Always give memory back once you are done with it. Just pass in the address you got from the allocation function, and free() releases that chunk.

valgrind

Another useful debug tool for detecting potential memory problems

It reports problems such as touching memory that you should not (which can also cause a segmentation fault) and memory leaks, where memory from malloc() is never released with free().

How to use valgrind? Write a problematic program first

Compile the program, then call valgrind on the compiled program to detect problems: valgrind ./memory

Errors:

Invalid write of size 1

⇒ A write refers to changing a value. In the above example, we only ask for three bytes, and are not supposed to change the value in the fourth byte.

Invalid read of size 1

⇒ A read refers to reading, using or printing a value. In the above example, we print out the whole string, which includes a memory address that we should not have touched.

LEAK SUMMARY: ==705== definitely lost: 3 bytes in 1 blocks

⇒ We did not free the memory we got from malloc() before the program ended.

Memory layout

Say we write a function called swap, it should swap the values of two variables:

Why does the above program not swap the values of x and y successfully?

It turns out that the computer treats different portions of its memory in different ways. Each kind of data is stored in a standard location.

memory usage in C

Machine code: The 0s and 1s that compose your program

global variables: variables that are declared outside your functions

heap: a chunk of memory that malloc() draws from when you ask for extra space. You need to call free() to release that memory when you are done with it

stack: When calling functions, like main(), strlen(), strcmp(), the program uses stack space to store their local variables and parameters. You don’t need to call free() here, because when a function returns, its portion of the stack is popped off automatically.

As you can see from the picture, the stack and the heap grow toward one another. We typically have enough memory, so these two spaces won’t collide.

In the above program, we call function swap. It does swap values, but the values are just the copies of x and y. The real x and y in the main function remain unchanged.

how stack works

How to fix the problem?

Pass pointers instead so this time we are not just copying x and y’s values and passing them to swap(), but actually going to the addresses of them and making changes.

correct way to swap values

stack overflow: Calling so many functions (for example, through runaway recursion) that the stack runs out of space and overflows into heap memory

heap overflow: A form of buffer overflow. It happens when a chunk of memory is allocated on the heap and data is written into it without any bounds checking, so that critical data structures in the heap are overwritten.

buffer overflow: Buffer is a chunk of memory, an array of memory that temporarily holds data while it is being transferred from one location to another. A buffer overflow (or buffer overrun) occurs when the volume of data exceeds the storage capacity of the memory buffer and thus the program attempts to overwrite adjacent memory locations.

Scanf

Some of the useful functions in CS50 library can actually be implemented using scanf in C’s library, like get_int(), get_string() and so forth.

syntax: scanf("%i", &x); (a format code for the data type of the input, then a pointer to the variable that stores the input)

Possible errors:

  • scanf() won’t complain about the data type of the user’s input. For example, if you type in a word in the above example, nothing is stored in x and no error pops up. The only signal is scanf()’s return value, which is the number of inputs it successfully stored.
  • scanf() won’t check how many bytes can be stored in the variable. If you type in a number too large for an int in the above example, the final x would not be what you expected, and no error is shown.

file I/O

taking input and output from files

file.c

FILE is a data type that represents a file.

fopen(filename, mode) opens the file in a certain mode. Modes include read ("r"), write ("w") and append ("a"). Append mode adds new text to the end of the file.

jpeg.c helps determine whether a file is probably a JPEG, and cp.c is similar to the cp command we use on Linux. Both of them use the data type FILE.

jpeg.c
cp.c

References

CS50 Week 4 lesson: https://cs50.harvard.edu/x/2021/weeks/4/

Slides: https://cdn.cs50.net/2020/fall/lectures/4/lecture4.pdf
