The Secret Santa Problem

In this level-0 post, we figure out how many ways we can organize a gift exchange.

Introduction

In my family, we celebrate Christmas in part with a gift exchange. At an earlier family gathering, everyone draws names from a hat that tells them who they will give a gift to at Christmas. However, there is one rule: you can’t give a gift to someone from your immediate family. One of the downsides to drawing from a hat is that there is a chance that when you get to the last person, the final name in the hat could be someone in their immediate family. If we wanted to stick to true anonymity, we would have to entirely redo the drawing of the names (though in my family someone would just trade names with you). I’ve always wondered how many different ways that there were to have a gift exchange with these rules, and I thought I’d share my notes.

The problem of counting the number of possible gift exchanges is hard. If we only had the rule that each person gives and receives one gift, then we could easily compute that. The first person gives a gift to one of the p people in the exchange, then the next person can give a gift to any of the p-1 people who haven’t yet received a gift and so on until the last person gives a gift to the only person who hasn’t received one yet, for a total of p! possible gift exchanges. This is clearly a bigger number than we would expect from our problem, since many of these arrangements break our rules.

Let’s get on with it, shall we?

Let’s construct some language to talk about this problem that will make our lives easier. An extended family is basically a collection of immediate families, which are themselves collections of people. If we have N immediate families, each with their own size, a_i, we can represent an extended family as a collection of numbers, \{a_1,a_2,\ldots,a_N\}. Then, we can define the Secret Santa number S(\{a_1,a_2,\ldots,a_N\}) to be the number of possible gift exchanges with that extended family. So for example, if there were 2 immediate families of 1 person each, we would write S(\{1,1\})=1 because there is obviously only one option, which is that each person gives a gift to the other. From here, we can get a feel for some of the basic properties of S.

  • If there is only one immediate family, there is no way of giving a gift to someone of another immediate family.
    S(\{p\})=0, for any value of p.
  • It doesn’t matter what order you list the immediate families, you always get the same number of Secret Santa arrangements.
    S(\{a_1,a_2,\ldots,a_N\})=S(\{a_{\sigma(1)},a_{\sigma(2)}, \ldots,a_{\sigma(N)}\}), where \sigma is a function that shuffles the numbers 1 to N.
  • If one of the immediate families contains more than all the other immediate families combined, then there aren’t enough people for that family to give gifts to.
    If a_1>\displaystyle\sum_{n=2}^{N}a_n, then S(\{a_1,a_2,\ldots,a_N\})=0.

So we have some idea of how S should behave, but how could we actually compute the values of this function without just writing down every possible Secret Santa configuration and counting them? As a start, let’s try to compute some edge cases. Looking back to the last property, an immediate family cannot be too big. We can take the case where there is a family with exactly half of the total people in the extended family. Every member of the large family gives a gift to every member of all the other families, and every member of the other families gives a gift to a member of the large family. In this way, the other families are effectively one immediate family. We can visualize this as a picture. Every person is a little circle, and we can color them by which immediate family they come from. We will say the large family is blue to be concrete. Line up the members of the large family in one column, and the members of all the other immediate families in another column. Now, we can draw arrows from one person to another to represent the first person giving a gift to the second. By our rules, the arrows must start and end at members of different immediate families, and every person has exactly one arrow starting from and ending at them.

Two columns of 5 people, the left column is blue, the right column has red and green people. Arrows flow between the left and right column (but not within a column), such that each person has an arrow starting and ending on them.

In this example picture, we can see that there is no room in the diagram for any of the people in the right column to give gifts to each other, since then there wouldn’t be enough people left over to receive gifts from the large family. So in this edge case, the arrows must go from one column to the other. That simplifies the counting considerably! If we say that the total number of people in the extended family is p, then there are p/2 people in each column. Let’s start counting. The first person in the left column has p/2 possible people to give a gift to. Once that person is decided, the next person in the left column has p/2-1 people left over as possibilities. This continues, as the 3rd person has p/2-2 options, on down to the last person, who will have to give a gift to the only person left on the other side. In total, that gives (p/2)(p/2-1)(p/2-2)\ldots(2)(1) = (p/2)! possible ways for the people in the large family to give gifts to the people in all the other immediate families. The people in the right column have to give gifts to the people in the left column as well, which also has (p/2)! possibilities (this is how they act effectively as a single immediate family). All-in-all, we have S(\{p/2,a_2,a_3,\ldots,a_n\})=[(p/2)!]^2. This picture we drew to represent the flow of gifts from person to person is called a directed graph. A graph is a collection of points called nodes (here representing people) and edges that connect a node to another node, or even a node to itself. A directed graph replaces these edges with arrows. Clearly, not every directed graph could represent one of our gift exchanges, since we have extra rules, but the fact that we can use this representation gives us a simple tool that we can play with.

Recursion

Rather than considering how to count all the valid graphs, why don’t we look at the structure of a valid graph and see if it can tell us anything about how it is constructed. In particular, I would like to see what happens if we remove a node from the graph and fuse the two arrows connected to that node. There are two possible outcomes, depending on the situation: either the resulting smaller graph is a valid or invalid gift exchange. It would remain a valid gift exchange if the nodes on either side of the deleted node are different colors, and it would become invalid if the nodes on either side are the same color.

Three cases of deleting a node from a graph. Left: the middle of three nodes is deleted, and the other two are different colors, resulting in a valid graph. Center: the middle of three nodes is deleted, but the other two nodes are the same color, resulting in an invalid graph. Right: two nodes are connected to each other. One node is deleted, leading to the remaining node being connected to itself.

In the graphs which are now invalid, I’d like to point out a couple of things. These graphs are nearly valid, as there is only one point where the rules are broken. In fact, if we remove the rule breaking node, we would get a valid graph, as we can see in this animation:

Two cases of deleting a rule breaker. Left: a rule breaker who is connected to another member of its own family is removed, resulting in a valid graph. Right: a rule breaker who is connected to themself is deleted, leaving the remaining graph valid.

Now that we understand this structure, let’s go the other way and insert a node into a graph, counting along the way. Since the ordering of the families doesn’t matter, we can always reorder them so that we are adding to the last family. Starting with valid graphs, how many ways are there to insert a new member of the last family? The only thing we have to be careful of is placing our new person next to another member of their family. If the family previously had a_N members, and we don’t want our new person to give or receive presents from them, then there are 2a_N forbidden spots to insert our new person into the graph. In other words, there are \left(\displaystyle\sum_{i=1}^{N}a_i\right) - 2a_N=\left(\displaystyle\sum_{i=1}^{N-1}a_i\right) - a_N places to insert our new person. We can do this for every smaller valid graph, so these contribute

\left[\left(\displaystyle\sum_{i=1}^{N-1}a_i\right) - a_N\right]S(\{a_1,\ldots,a_N\})

Now let’s consider graphs that are made by inserting a node into a smaller nearly valid graph. Recall that nearly valid graphs have a single rule breaker from another family. We want to know how many different rule breakers there can be. In family i, there are a_i possible rule breakers, who could be trying to give a gift to anyone in their family, for a total of a_i{}^2 possible rule breaking configurations. We also established earlier that removing a rule breaker makes a valid graph. Within a family, every valid rule breaker/valid smaller graph comes out to a_i{}^2 S(\{a_1,\ldots,a_i -1,\ldots,a_N\}). Then, we have to add up all the contributions from every family (except for the last one of course):

\displaystyle\sum_{i=1}^{N-1}\left[a_i{}^2 S(\{a_1,\ldots,a_{i}-1,\ldots,a_N\})\right].

That takes care of all the graphs that can be made out of smaller valid and invalid graphs, which is all of them! This gives us a recursion relation that we could use to build up Secret Santa numbers from smaller Secret Santa numbers:

S(\{a_1,\ldots,a_{N} +1\})=\left[\left(\displaystyle\sum_{i=1}^{N-1}a_i\right) - a_N\right]S(\{a_1,\ldots,a_N\}) + \displaystyle\sum_{i=1}^{N-1}\left[a_i{}^2 S(\{a_1,\ldots,a_{i}-1,\ldots,a_N\})\right]

It is worth noting that this equation works even when the person you are adding is in a new family. In other words, this still works if a_N=0.

The above formula may look complicated (because it kind of is), but for a computer, this is a cake walk. I wrote some code that computes Secret Santa numbers by checking all possible graphs and counting which ones satisfy the rules, and also computes using the recursion relation. The recursion relation makes things run orders of magnitude faster. That is incredibly important when the number of graphs with p nodes scales like p!. For an extended family \{3,3,3\}, the recursion method was 1,000\times faster than the brute force method. Adding one person to get an extended family of \{3,3,3,1\} made the recursion method 10,000\times faster.

Exploration

Let’s explore this recursion relation a little. Probably the easiest set of extended families to consider are the ones where every immediate family has one member. In this case, we can use the shorthand s_n = S({1,1,\ldots,1}), where there are n 1’s. Then, we just replace all the a_i‘s in the recursion relation with 1, except for a_N, which we take to be 0.

s_{N+1} = N\cdot(s_{N}+s_{N-1})

We know s_1=0 and s_2=1, and then we can use those values to generate s_3=2 and so on. The sequence starting at s_1 looks like 0,1,2,9,44,265,\ldots. If you search for this sequence in the Online Encyclopedia of Integer Sequences (OEIS), it will tell you that s_N is the number of derangements of N objects (OEIS A000166). A derangement of a set of objects is a reordering where no element remains in its original place. We can see the connection with our graphs here. If we lined up all of our people, and instead of having them give gifts, we had them move to the spot of whoever their arrow points to on the graph, we would get a derangement because we are only looking at graphs where no one can have an arrow that points to themself. There are a couple of interesting closed form expressions for s_n,

s_N = N!\displaystyle\sum_{n=0}^{N}\frac{(-1)^n}{n!}\, ,\hspace{3em}\text{and}\hspace{3em} s_N = \left\lfloor\frac{N!+1}{e}\right\rfloor\,,

where e\approx2.7 is Euler’s number, and \lfloor x\rfloor rounds x down to the nearest whole number.

We could continue along this path and talk about the extended families made up of families of 2 people and so on, but it gets quite a bit more complicated because with our recursion relation, we would mix the all-2’s numbers with mixed 2’s-and-1’s numbers. There is an OEIS entry for the case of all 2’s (OEIS A000316), but not for any higher repeated numbers.

A general rule to live by is that if a problem is easy enough to state, then someone has probably thought about it before. This is no different, even down to the “Secret Santa Problem” name. I saw so many different directions that people have taken to look into this problem: efficiently generating valid gift exchange graphs, incorporating more or slightly modifying the rules, and figuring out how to fully anonymize the process, to name a few. I encourage you to play with this problem and see how much you can discover.

Bonus (with calculus)

Since this is a level-0 post, I didn’t want to put this in the main text, as it involves some calculus. While I was researching for this blog post, I came across a general expression for the Secret Santa numbers:

S(\{a_1,\ldots,a_N\})=\displaystyle\int_0^\infty e^{-x}\prod^{N}_{i=1}\left[(-1)^{a_i}\,a_i !\,L_{a_i}(x)\right]\,dx ,

where the L_{n}(x)‘s are Laguerre polynomials, a family of polynomials that have some nice properties. They show up in quantum mechanics when studying the Hydrogen atom. Exactly where this relationship between Secret Santa and Laguerre polynomials comes from is a mystery to me. The paper I found it in only proved that it works, not why it works. Regardless, this gives us more information to work with, since we now have access to all sorts of Laguerre polynomial identities. For example, note that \left(L_{1}(x)\right)^2 = 2L_{2}(x) - 2 L_{1}(x) + L_{0}(x) . If we use t_{n,m} as a shorthand for the Secret Santa number with n families of 2 and m families of 1, we can use the aforementioned identity to show

t_{n,2} = t_{n+1,0} + 2 t_{n,1} + t_{n,0} .

We can also use our main recursion to find

t_{n,2} = (2n+1)t_{n,1}+4nt_{n-1,2}+t_{n,0}\hspace{1em}\text{and}\hspace{1em}t_{n,1}=2nt_{n,0}+4nt_{n-1,1}.

We can combine these to find a recursion relation that only involves families of 2 (i.e. the t_{n,0}‘s):

(2n-1)t_{n+1,0}=2n(2n+1)^2 t_{n,0}+4n(2n-1)t_{n-1,0}-16n(2n+1)(n-1)t_{n-2,0}.

Perhaps something similar works out for the case where all families have 3 people…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: