poliboard

Why this is worth looking at

The motivation for this site was to get a solid answer to the question "What do I need to prepare for at my next regional?" Since the dawn of VGC, conversations about this have generally ended up being pretty subjective, which makes sense given the data most players have available. For example, an observation like "lots of people use Calyrex-Shadow" is correct, and probably useful, but if that was the deepest insight you were armed with on your way to Internats, you might feel a little unprepared.

To get a real answer (or at least as close as I can get), I decided it might be worthwhile to run some clustering algorithms over tournament data and see what showed up. The result of that, plus a bit of extra time over the holidays, is what you're seeing on this page.

The goal here is for you to be able to open up this site, have a quick look at the clusters my code has surfaced, and understand that those are the main archetypes you're most likely to face at your next tournament. Do note that we are working in the territory of "most likely": if the community decides to start running something weird overnight, this site will probably not pick it up.


Hierarchical clustering & why it's useful

Imagine you have a group of friends at a party. Some know each other well, others have just met, and a few are complete strangers. Now, let's say we want to group these friends into clusters based on how similar they are to each other. That's essentially what hierarchical clustering does, except instead of friends, it's working with data points.

In the context of this project, we're clustering Pokemon teams based on the similarities in their compositions. Each team is treated as a unique data point, and the goal is to find groups (or clusters) of teams that share common characteristics.

🔍 How Does Hierarchical Clustering Work?

Hierarchical clustering works by building a tree-like structure called a dendrogram, which shows how data points (in our case, teams) get grouped together step by step. Here's how the process unfolds:

  1. Start with every team as its own cluster. At the beginning, each Pokemon team is considered a separate cluster.
  2. Find the two most similar teams and merge them. The algorithm looks for the two closest teams (based on a similarity measure like shared Pokemon or moves) and groups them together into a single cluster.
  3. Repeat this process until all teams are in one big cluster. The merging continues until every team is part of one giant cluster. The result is a tree that shows how clusters were formed along the way.
  4. Cut the tree to create meaningful clusters. Once the tree is built, you can "cut" it at a certain level to decide how many clusters you want. For example, cutting the tree into 5 clusters might reveal distinct team archetypes. We pick the position of this "cut" wherever it minimizes loss (i.e., wherever gives us the best results, mathematically).
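
The four steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code behind this site: the toy teams, the function names, and the use of average linkage (the distance between two clusters is the mean distance between their teams) are all made up for the example, and the cut is made at a fixed number of clusters rather than at a loss minimum.

```python
from itertools import combinations

def distance(team_a, team_b):
    """1 - cosine similarity, treating each team as a set of Pokemon names."""
    shared = len(team_a & team_b)
    return 1.0 - shared / (len(team_a) * len(team_b)) ** 0.5

def average_linkage(cluster_a, cluster_b, teams):
    """Mean pairwise distance between two clusters (an assumed linkage rule)."""
    pairs = [(i, j) for i in cluster_a for j in cluster_b]
    return sum(distance(teams[i], teams[j]) for i, j in pairs) / len(pairs)

def agglomerate(teams):
    """Steps 1-3: merge the two closest clusters until only one is left."""
    clusters = [(i,) for i in range(len(teams))]  # step 1: one cluster per team
    merges = []
    while len(clusters) > 1:                      # step 3: repeat
        a, b = min(                               # step 2: closest pair
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], teams),
        )
        merges.append((average_linkage(a, b, teams), a, b))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

def cut(teams, merges, n_clusters):
    """Step 4: replay the merge history and stop at n_clusters."""
    clusters = [(i,) for i in range(len(teams))]
    for _, a, b in merges:
        if len(clusters) <= n_clusters:
            break
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return sorted(tuple(sorted(c)) for c in clusters)

# Hypothetical three-Pokemon "teams", shortened to keep the example small.
teams = [
    {"Pelipper", "Ludicolo", "Kingdra"},
    {"Pelipper", "Ludicolo", "Incineroar"},
    {"Dusclops", "Torkoal", "Hatterene"},
    {"Dusclops", "Torkoal", "Amoonguss"},
]
merges = agglomerate(teams)
clusters = cut(teams, merges, n_clusters=2)
print(clusters)
```

Here the merge history plays the role of the dendrogram: each entry records which two clusters were joined and at what distance, so cutting at a different level only requires replaying fewer or more merges.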

🧰 Why Use Hierarchical Clustering for Pokemon Teams?

In this project, we're analyzing Pokemon teams from tournaments. Hierarchical clustering helps us sort those teams into archetypes, and lets us choose how fine-grained those archetypes should be.

📊 Example: Rain Teams vs. Trick Room Teams

Let's say we have 100 teams. Some teams are built around a Rain strategy (using Pokemon like Pelipper and Ludicolo), while others rely on Trick Room. Hierarchical clustering might group these into two separate clusters because their core strategies are quite different.

If we cut the dendrogram at a higher level, we might see these two clusters. If we cut it lower, we might notice that within the Rain cluster, there are further subgroups based on different variations of the strategy (e.g., some use Politoed, others use Pelipper).
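
To make the "higher cut versus lower cut" idea concrete, here is a small sketch. It assumes single linkage, where cutting the dendrogram at a given height is equivalent to grouping together any teams connected by a chain of pairwise distances at or below that height. The site may well use a different linkage rule, and the four-Pokemon teams and the `cut_at` helper are invented purely for illustration.

```python
def distance(team_a, team_b):
    """1 - cosine similarity, treating each team as a set of Pokemon names."""
    shared = len(team_a & team_b)
    return 1.0 - shared / (len(team_a) * len(team_b)) ** 0.5

def cut_at(teams, height):
    """Clusters obtained by cutting a single-linkage dendrogram at `height`."""
    parent = list(range(len(teams)))  # union-find forest, one node per team

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair of teams that is at least as close as the cut height.
    for i in range(len(teams)):
        for j in range(i + 1, len(teams)):
            if distance(teams[i], teams[j]) <= height:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(teams)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical teams: two Pelipper rain, two Politoed rain, two Trick Room.
teams = [
    {"Pelipper", "Ludicolo", "Kingdra", "Tornadus"},
    {"Pelipper", "Ludicolo", "Kingdra", "Rillaboom"},
    {"Politoed", "Ludicolo", "Kingdra", "Amoonguss"},
    {"Politoed", "Ludicolo", "Kingdra", "Incineroar"},
    {"Dusclops", "Torkoal", "Hatterene", "Ursaluna"},
    {"Dusclops", "Torkoal", "Hatterene", "Indeedee"},
]

high_cut = cut_at(teams, 0.6)  # coarse view: Rain vs. Trick Room
low_cut = cut_at(teams, 0.3)   # finer view: Rain splits by weather setter
print(high_cut)
print(low_cut)
```

With these made-up teams, the higher cut keeps all four Rain teams together, while the lower cut separates the Pelipper variants from the Politoed variants.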

🧪 How We Measure Similarity

In our case, similarity is calculated using a method called cosine similarity, which looks at the overlap between teams. The more Pokemon two teams have in common, the more similar they are. This is crucial for finding meaningful clusters that represent real-world team archetypes.
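
As a rough sketch of what that calculation looks like, the snippet below encodes each team as a vector of 0s and 1s (one slot per Pokemon that appears anywhere in the data) and takes the cosine of the angle between two such vectors. The encoding, the helper names, and the example teams are assumptions made for illustration; the site's actual features may also include things like moves or items.

```python
import math

def team_vector(team, vocabulary):
    """1 if the team runs that Pokemon, 0 otherwise, in vocabulary order."""
    return [1 if name in team else 0 for name in vocabulary]

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an empty team as similar to nothing
    return dot / (norm_u * norm_v)

# Hypothetical teams, invented for this example.
team_a = {"Pelipper", "Ludicolo", "Kingdra", "Incineroar", "Amoonguss", "Rillaboom"}
team_b = {"Pelipper", "Ludicolo", "Kingdra", "Incineroar", "Tornadus", "Urshifu"}
team_c = {"Dusclops", "Torkoal", "Hatterene", "Indeedee", "Ursaluna", "Farigiraf"}

vocabulary = sorted(team_a | team_b | team_c)
sim_ab = cosine_similarity(team_vector(team_a, vocabulary), team_vector(team_b, vocabulary))
sim_ac = cosine_similarity(team_vector(team_a, vocabulary), team_vector(team_c, vocabulary))
print(round(sim_ab, 3), sim_ac)
```

With this 0/1 encoding, two full teams of six that share four Pokemon score 4/6, and teams with nothing in common score 0, which matches the intuition that more shared Pokemon means more similar.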

In short, hierarchical clustering is a way to make sense of large amounts of data by organizing it into meaningful groups. And in the context of Pokemon tournaments, it helps us discover which team archetypes are dominating the meta!


Roadmap

Things I'd like to add to this site in the near future:

  1. pokepas.es exportability (this is actually a little harder than it looks);
  2. Ability to filter what tournaments go into the analysis;
  3. Win-rates of each archetype versus all the others;
  4. The holy grail: a data explorer in the style of tactics.tools.