Most music is based on the harmonic series. It is a very simple progression of frequencies that sound good together: whole-number multiples of a starting frequency. For example, let's say a note is played at 1000 Hz. From there, the harmonic series goes up:
- 1000 Hz
- 2000 Hz
- 3000 Hz
- 4000 Hz
- 5000 Hz
- ...
We can also go down instead of up, dividing by whole numbers instead of multiplying:
- 1000 Hz
- 500 Hz
- 333 Hz (approximately)
- 250 Hz
- 200 Hz
- ...
And that's pretty much all we need to know to build a musical scale.
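As a small sketch, the two lists above can be generated in a few lines of Python. The function names `harmonics` and `subharmonics` are just made up for this illustration.

```python
def harmonics(base_hz, count):
    """Going up: base * 1, base * 2, base * 3, ..."""
    return [base_hz * n for n in range(1, count + 1)]


def subharmonics(base_hz, count):
    """Going down: base / 1, base / 2, base / 3, ..."""
    return [base_hz / n for n in range(1, count + 1)]


print(harmonics(1000, 5))                         # [1000, 2000, 3000, 4000, 5000]
print([round(f) for f in subharmonics(1000, 5)])  # [1000, 500, 333, 250, 200]
```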
Here's what happens if we start at an arbitrary note and then apply the harmonic series. The algorithm used is very simple:
- Start with one frequency. Make a dot on the keyboard where that frequency is.
- For each harmonic we care about, and each frequency we've looked at so far, do the following:
  - Multiply or divide that frequency by a simple number, like 2 or 3 or 1/2 or 1/3.
  - Make a dot on the keyboard where that frequency would be.
  - Add this frequency to the list.
- Then move down one row and go back for another iteration.
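Here's one way the steps above might look in Python. This is only a sketch, not the exact code behind the images: the names `expand` and `keyboard_position` are invented for this example, frequencies are stored as exact ratios relative to the starting note (so that dividing by 3 and then multiplying by 3 lands exactly back where it started), and "where the dot goes" is assumed to be the distance from the start in half-steps.

```python
from fractions import Fraction
import math


def expand(ratios, iterations):
    """Return one set of ratios (relative to the start note) per row."""
    seen = {Fraction(1)}           # row 0: just the starting note
    rows = [set(seen)]
    for _ in range(iterations):
        # Apply every ratio to every frequency found so far.
        seen = seen | {f * r for f in seen for r in ratios}
        rows.append(set(seen))     # one row of dots per iteration
    return rows


def keyboard_position(ratio):
    """Distance from the start note in half-steps (12 per octave)."""
    return 12 * math.log2(float(ratio))


rows = expand([Fraction(1, 2), Fraction(1), Fraction(2)], iterations=2)
print([str(r) for r in sorted(rows[-1])])   # ['1/4', '1/2', '1', '2', '4']
print(keyboard_position(Fraction(2)))       # 12.0, exactly one octave up
```

To get an actual frequency, multiply each ratio by the starting frequency. A real drawing would also have to drop anything that falls off the ends of the keyboard, which this sketch doesn't bother with.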
Here's how it looks when we only use "1" as a ratio. There's only one note:
(each dot is one iteration, going from top to bottom)

If we add in another harmonic, the ratios are 1/2, 1, and 2.
This gives us the original frequency plus all the different octaves.

Add one more harmonic, and the ratios are 1/3, 1/2, 1, 2, and 3.
At first it gives us just the original note, the octaves, and the extra note in a "power chord" like what people play in heavy metal. That's called a perfect fifth, and it sits 7 half-steps up from the original tone. But after enough iterations, these ratios produce 12 different clusters of frequencies.

Add one more harmonic, and the ratios are 1/4, 1/3, 1/2, 1, 2, 3, and 4.
This converges to the same results as last time, but it takes fewer iterations to get there. At the end, the numbers land in 12 different clusters. This is why the common scale has 12 notes per octave.
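A quick way to check the "12 clusters" claim, as a rough sketch: multiplying or dividing by 2 only moves a note by whole octaves, so the position inside the octave depends only on how many times we've multiplied or divided by 3. The code below folds a handful of powers of 3 into one octave and rounds each to the nearest half-step. The cutoff of 6 steps in either direction is an arbitrary choice for this example.

```python
import math


def cents_in_octave(ratio):
    """Fold a ratio into one octave, measured in cents (1200 per octave)."""
    return (1200 * math.log2(ratio)) % 1200


# Powers of 3 from 3^-6 up to 3^6; the factors of 2 vanish when folding.
positions = [cents_in_octave(3.0 ** n) for n in range(-6, 7)]

# Round each position to the nearest half-step (100 cents).
clusters = sorted({round(c / 100) % 12 for c in positions})
print(len(clusters))      # 12

# Twelve fifths (3/2 each) overshoot seven octaves by a small amount.
comma = 1200 * math.log2((3 / 2) ** 12 / 2 ** 7)
print(round(comma, 2))    # 23.46
```

That leftover of about 23 cents is why the clusters never collapse into single points: going around all 12 notes doesn't land exactly back on the starting note.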

Add in one more harmonic, so the ratios run from 1/5 up to 5, and then things turn into a huge mess.
Almost nobody writes music this way, because it's too complicated and usually sounds bad.

You may have noticed that the notes aren't perfectly aligned. Here's a closer look.
The blue line in each note represents "equal temperament", or what happens when we space 12 frequencies as evenly as possible throughout the octave. This is used as a common tuning for many instruments because, even though it's not perfect, it's really close... and then it doesn't matter what scale or key you play in. They're all more or less equally in tune (or equally out of tune).
The blue dots, of course, represent perfect intervals relative to the original note of "C".
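To put rough numbers on "really close", here's a sketch that compares a few simple ratios with their nearest equal-temperament notes. The offsets are in cents (hundredths of a half-step). The helper names and the short list of intervals are choices made for this example, not something taken from the images.

```python
import math


def equal_tempered_ratio(half_steps):
    """Ratio of a note `half_steps` above the start in 12-note equal temperament."""
    return 2 ** (half_steps / 12)


def cents(ratio):
    """Size of an interval in cents (1200 per octave)."""
    return 1200 * math.log2(ratio)


# (name, simple ratio, nearest number of half-steps)
intervals = [("fourth", 4 / 3, 5), ("fifth", 3 / 2, 7), ("octave", 2 / 1, 12)]

offsets = {
    name: cents(ratio) - cents(equal_tempered_ratio(steps))
    for name, ratio, steps in intervals
}
for name, offset in offsets.items():
    print(f"{name}: {offset:+.2f} cents away from equal temperament")
```

The fourth and the fifth each come out about 2 cents off, in opposite directions, and the octave matches exactly.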

Sometimes musicians on fretless instruments bend each note slightly up or down to get closer to a perfect interval. Sometimes people tune their instruments to a specific key, to align everything with the dots above... so it sounds better when played in that key, but worse when played in any other key. And sometimes people play microtonal instruments so they can get a perfect and exact pitch every time no matter what key they're playing in. But that requires a really good ear, and it places extra restrictions on composition... and some listeners may think it sounds out of tune because they're accustomed to the evenly-spaced scale.
Zooming in even further, and running the algorithm for more iterations, it becomes clear that most notes have two possible frequencies for perfect intervals... but one is usually more dominant, and the other is only used on rare occasions. Mostly, the choice of which one is better seems to depend on which direction it's approached from, like from above or below, and from how far away.
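One possible explanation for the two candidates, offered as a guess rather than something read off the graph: the same note can be reached by multiplying by 3 a few times or by dividing by 3 many times, and the two routes land slightly apart. The sketch below uses the note 2 half-steps above the start as an example, and the helper names are invented for it.

```python
from fractions import Fraction
import math


def fold_into_octave(ratio):
    """Multiply or divide by 2 until the ratio lies between 1 and 2."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def cents(ratio):
    """Size of an interval in cents (1200 per octave)."""
    return 1200 * math.log2(float(ratio))


up = fold_into_octave(Fraction(3) ** 2)       # multiply by 3 twice
down = fold_into_octave(Fraction(3) ** -10)   # divide by 3 ten times

print(up, round(cents(up), 2))      # 9/8 203.91
print(down, round(cents(down), 2))  # 65536/59049 180.45
```

Both land in the cluster around 200 cents. The short route (9/8) shows up after only a couple of iterations, which would fit with it being the dominant one, while the long route takes many more steps to appear.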

Or, if we use all 12 equal-temperament notes as a starting point instead of just "C", here's how the result looks. It's like the previous image, but with 12 slightly-offset copies all overlaid onto the same graph:

Overall though, the slight detuning isn't usually a problem.