CS50 pset3 Tideman (Merge sort)

8 min readMar 14, 2022

This is a step-by-step tutorial on how I solved the problem Tideman in CS50 week3.

Before you continue, make sure you fully understand the problem, which centers on the Tideman election and is explained carefully here.

The main goal of the problem is to turn the ranked-choice voting system into codes. The main function is already written as the following shows and our tasks are to complete all the functions that have the comments “TO-DO” inside. (It is also allowed to create new functions when necessary.)

#include <cs50.h>
#include <stdio.h>// Max number of candidates
#define MAX 9// preferences[i][j] is number of voters who prefer i over j
int preferences[MAX][MAX];// locked[i][j] means i is locked in over j
bool locked[MAX][MAX];// Each pair has a winner, loser
typedef struct
{
    int winner;
    int loser;
}
pair;// Array of candidates
string candidates[MAX];
pair pairs[MAX * (MAX - 1) / 2];int pair_count;
int candidate_count;// Function prototypes
bool vote(int rank, string name, int ranks[]);
void record_preferences(int ranks[]);
void add_pairs(void);
void sort_pairs(void);
void lock_pairs(void);
void print_winner(void);int main(int argc, string argv[])
{
    // Check for invalid usage
    if (argc < 2)
    {
        printf("Usage: tideman [candidate ...]\n");
        return 1;
    }// Populate array of candidates
    candidate_count = argc - 1;
    if (candidate_count > MAX)
    {
        printf("Maximum number of candidates is %i\n", MAX);
        return 2;
    }
    for (int i = 0; i < candidate_count; i++)
    {
        candidates[i] = argv[i + 1];
    }// Clear graph of locked in pairs
    for (int i = 0; i < candidate_count; i++)
    {
        for (int j = 0; j < candidate_count; j++)
        {
            locked[i][j] = false;
        }
    }    pair_count = 0;
    int voter_count = get_int("Number of voters: ");// Query for votes
    for (int i = 0; i < voter_count; i++)
    {
        // ranks[i] is voter's ith preference
        int ranks[candidate_count];// Query for each rank
        for (int j = 0; j < candidate_count; j++)
        {
            string name = get_string("Rank %i: ", j + 1);            if (!vote(j, name, ranks))
            {
                printf("Invalid vote.\n");
                return 3;
            }
        }        record_preferences(ranks);        printf("\n");
    }    add_pairs();
    sort_pairs();
    lock_pairs();
    print_winner();
    return 0;
}// Update ranks given a new vote
bool vote(int rank, string name, int ranks[])
{
    // TODO
    return false;
}// Update preferences given one voter's ranks
void record_preferences(int ranks[])
{
    // TODO
    return;
}// Record pairs of candidates where one is preferred over the other
void add_pairs(void)
{
    // TODO
    return;
}// Sort pairs in decreasing order by strength of victory
void sort_pairs(void)
{
    // TODO
    return;
}// Lock pairs into the candidate graph in order, without creating cycles
void lock_pairs(void)
{
    // TODO
    return;
}// Print the winner of the election
void print_winner(void)
{
    // TODO
    return;
}

In the following will be explaining the logic behind each function and how to put these ideas into codes.

// Query for votes
    for (int i = 0; i < voter_count; i++)
    {
        // ranks[i] is voter's ith preference
        int ranks[candidate_count];// Query for each rank
        for (int j = 0; j < candidate_count; j++)
        {
            string name = get_string("Rank %i: ", j + 1);            if (!vote(j, name, ranks))
            {
                printf("Invalid vote.\n");
                return 3;
            }
             
            record_preferences(ranks);            printf("\n");        }
      ...

Inside the main function, the first thing is to ask for voter’s preferences and record them by calling the vote function. Let’s say we have three candidates. When voting, each voter will first be asked their favorite candidate and then store the value in ranks, then he or she will be asked the second favorite candidate … the whole process would iterate three times.

vote

Voters give us a list of candidates ordered by their preferences and we store their choices into ranks.

The first parameter, integer rank, means the ith favorite candidate of the voter. Looping through each candidate, when candidates[i] matches the name, we get the index of the candidate and update the ranks. Otherwise, we know the name is not valid and should return false.

// Update ranks given a new vote
bool vote(int rank, string name, int ranks[])
{
    // ranks[i] is the index of the candidate who is the ith preference for the voter
    for (int i = 0; i < candidate_count; i++)
    {
        if (strcmp(candidates[i], name) == 0)
        {
            ranks[rank] = i;
            return true;
        }
    }
    return false;
}

record_preferences

After recording all the preferences of a voter in ranks, we have to update the preferences array. This array is a 2D array that indicates how many voters prefer candidates[i] to candidates[j]. It collects every voters’ choices. For instance, if preferences[2][1] = 5, that means 5 voters like candidates[2] more than candidates[1].

Inside the function, we use a nested loop. Notice the inner loop starts from i + 1 instead of i since we don’t need to compare the preferences between two identical candidates.

// Update preferences given one voter's ranks
void record_preferences(int ranks[])
{
    // TODO
    for (int i = 0; i < candidate_count -1; i++)
    {
        for (int j = i + 1; j < candidate_count; j++)
        {
            {
                preferences[ranks[i]][ranks[j]]++;
            }
        }
    }
    return;
}

add_pairs

Next, we are going to add pairs where one is preferred over the other. This step helps us figure out the winner and loser of every single pair.

To determine the winner and loser of each pair, we use a nested for loop and see if preferences[i][j] > preferences[j][i]. If it does, then candidates[i] wins against candidates[j], and vice versa. Record the winner and loser in pairs and increment pair_count. (Don’t forget pair_count is initialized to 0 in the beginning)

For pairs that have the same amount of votes, like preferences[1][2] = 4 and preferences[2][1] = 4, neglect them.

// Record pairs of candidates where one is preferred over the other
void add_pairs(void)
{
    // if there is a tie between pairs, do not add them
    for (int i = 0; i < candidate_count -1; i++)
    {
        for (int j = i + 1; j < candidate_count; j++)
        {
            {
                if (preferences[i][j] > preferences[j][i])
                {
                   pairs[pair_count].winner = i;
                   pairs[pair_count].loser = j;
                   pair_count++;
                }
                else if (preferences[i][j] < preferences[j][i])
                {
                   pairs[pair_count].winner = j;
                   pairs[pair_count].loser = i;
                   pair_count++;
                }
            }
        }
    }
    return;
}

merge_sort

Now let’s sort pairs according to the strength of victory. Why do we need to sort? That’s because we need to lock the pairs later. To lock the pair, we have to first search for the strongest pair. This step would greatly facilitate the following step.

In CS50 week3, we learned three sorting algorithms: selection sort, bubble sort and merge sort. Feel free to pick the one that you are comfortable with. I picked merge sort just to give myself a challenge.

Because the strength of victory comes from the preferences array, we need to pass both the winner and loser of the pair to preferences. That’s why you see preferences[left_winner][left_loser] in codes. It might look a bit complicated. But as long as you know what parameter you should use, everything makes sense.

void merge_sort(int i, int j, pair pair_example[], pair temp[])
{   
    // i = the first index of array, j = the last index of array
    if (j <= i)
    {
        return;
    }    int mid = (i + j) / 2;    merge_sort(i, mid, pairs, temp);
    merge_sort(mid + 1, j, pairs, temp);    int left_pointer = i;
    int right_pointer = mid + 1;    for (int k = i; k < j + 1; k++) 
        // k is the index of the start of array
    {
        int left_winner = pair_example[left_pointer].winner;
        int left_loser = pair_example[left_pointer].loser;
        int right_winner = pair_example[right_pointer].winner;
        int right_loser = pair_example[right_pointer].loser;        if (left_pointer == mid + 1) // left array reaches its end
        {
            temp[k] = pair_example[right_pointer];
            right_pointer++;
        }
        else if (right_pointer == j + 1) // right array reaches its end
        {
            temp[k] = pair_example[left_pointer];
            left_pointer++;
        }
        else if (preferences[left_winner][left_loser] > preferences[right_winner][right_loser])
        {
            temp[k] = pair_example[left_pointer];
            left_pointer++;
        }
        else
        {
            temp[k] = pair_example[right_pointer];
            right_pointer++;
        }
    }    for (int k = i; k < j + 1; k++)
    {
        // copy sorted array into original pair
        pair_example[k] = temp[k];
    }
}

sort_pairs

Implement merge_sort to sort_pairs.

// Sort pairs in decreasing order by strength of victory
void sort_pairs(void)
{
    pair temp[pair_count];
    merge_sort(0, pair_count - 1, pairs, temp);
    return;
}

has_cycle

This function will be called by locked_pairs. To lock a pair, we start from the strongest pair and check whether this pair would create a cycle. If it does not, then we lock the pair.

If you are not that clear on what is a cycle and why we should avoid that, visit this link first. The author explained the whole concept thoroughly and helped me figure out the flaws in my previous thoughts.

To see if a pair will create a cycle, first you check if locked[cycle_end][cycle_start] equals to true. If it does, that means locking [cycle_start][cycle_end] would definitely create a cycle.

e.g. a => b => a (cycle created)

If it does not, continue to check if cycle_end is the cycle_start of any locked pair, since this condition could lead to a cycle.

e.g. Say we have a sorted pair like this:

[(d, a), (a, b), (b, c), (c, a), (d, b), (d, c)]

The first three pairs are locked successfully without creating any cycle. However, in the fourth pair, a, the current cycle_end, is the cycle_start of (a, b), and b finally points to c, the current cycle_start. A cycle would be created if (c, a) is locked. (d => a => b => c => a)

To avoid this situation, check whether b, the cycle_end of (a, b), would eventually end with c. Here, we use recursion to repeatedly check for potential cycles. We also need to define the base case of recursion, where the cycle is found and recursion stops.

d => a => b => c => a (cycle created)

bool has_cycle(int cycle_start, int cycle_end)
{
    if (locked[cycle_end][cycle_start]) //base case
    {
        return true;
    }for (int i = 0; i < candidate_count; i++)
    {
        if (locked[cycle_end][i] == true && has_cycle(cycle_start, i)) // create a cycle
        {
            return true;
        }
    }
    return false;
}

lock_pairs

If a cycle would not be created, lock the pair.

// Lock pairs into the candidate graph in order, without creating cycles
void lock_pairs(void)
{
    // TODO
    for (int i = 0; i < pair_count; i++)
    {
        if (!has_cycle(pairs[i].winner, pairs[i].loser))
        {
            locked[pairs[i].winner][pairs[i].loser] = true;
        }
    }
}

print_winner

Finally, print out the winner. A winner has no arrows pointing to him/her, suggesting that no one beats him/her. Remember we just lock the pairs. Since winner never lose, there must be no locked pair where the winner is the second index.

e.g. If the winner is candidate[3], then looping through each candidate as i, locked[i][3] must always equal to false.

So what we can do now is looping through all the candidate with a nested loop. Since winner never lose, every candidate[i][winner_index] would produce a false. And when the false_count equals to candidate_count -1 (candidate[i][i] must equal to false, so don’t count it), we found the winner.

// Print the winner of the election
void print_winner(void)
{
    // TODO
    for (int col = 0; col < candidate_count; col++)
    {
        int false_count = 0;
        for (int row = 0; row < candidate_count; row++)
        {
            if (locked[row][col])
            {
                break;
            }
            if (!locked[row][col])
            {
                false_count++;
                if (false_count == candidate_count - 1) // No one has won against the candidate
                {
                    printf("%s\n", candidates[col]);
                    break;
                }
            }
        }
    }
    return;
}

Voila! After submitting the solution to check50, you should be able to see green smiling faces on the screen. It was such a great relief that I finally solved the problem.

To see the full solution to this problem, please refer to this link. If there is any error or anything I missed, feel free to let me know. Hope you find something helpful in this article.