Generating Music from Basics

In this post, I discuss musical notes, their frequencies, and the science behind them. Along the way I will demo what I am talking about. For the demos, I use an Arduino Nano to generate PWM signals (square waves) and a cheap piezoelectric buzzer. It is possible to use headphones or a speaker instead, but that requires a more elaborate amplifier circuit; maybe I will attempt that in a future post. For now, let's stick to the basics.

Sound Waves

As you probably know, sound is just vibrations in the air. As a membrane vibrates at a particular rate, it produces a wave. For example, to produce a wave of 450 Hz, the membrane needs to vibrate 450 times a second (yes, that is really fast). We sapiens can hear sound waves from roughly 20 Hz to 20,000 Hz. Using a tuning fork, it is possible to produce vibrations at a particular frequency. A real source, however, rarely produces a single sine wave; it usually produces a sum of several sinusoids. Different musical instruments produce different combinations of sinusoids, which give each its own distinct sound. A guitar sounds very different from a violin because of the different combinations of sine waves each produces.
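
To make the "sum of sinusoids" idea concrete, here is a minimal Python sketch. The two recipes (`pure_tone` and `rich_tone`), the amplitudes and the sample rate are made-up values for illustration only; they are not measurements of any real instrument.

```python
import math

def sample_wave(partials, t):
    """Return the value at time t (seconds) of a sum of sine waves.

    partials is a list of (frequency_hz, amplitude) pairs.
    """
    return sum(a * math.sin(2 * math.pi * f * t) for f, a in partials)

# Hypothetical recipes: same 450 Hz fundamental, different overtone mixes.
pure_tone = [(450, 1.0)]
rich_tone = [(450, 1.0), (900, 0.5), (1350, 0.25)]

sample_rate = 8000  # samples per second (assumed)
for n in range(5):
    t = n / sample_rate
    print('t=%.6f  pure=%+.4f  rich=%+.4f'
          % (t, sample_wave(pure_tone, t), sample_wave(rich_tone, t)))
```

Both recipes repeat 450 times a second, so they have the same pitch, but the shape of the wave within each cycle differs, and that shape is what we hear as the character of the sound.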

[Image: sound waves]

Demo-1 : Sa Re Ga Ma (Indian Music Notation System) under a Sound Spectrum Analyzer

Let me demonstrate these sine waves as I sing the notes of an octave. In the Indian music system, we use the notation Sa Re Ga Ma. This is almost exactly like the Western system of C, D, E, F, G, etc. A later section goes into a bit more detail. I would like to mention that as a kid (around 12 to 15 years old) I learned Indian classical music. Since a microphone captures sound intensity as time passes, it is often hard to see the sinusoids in the raw recording. However, taking the Fourier transform of this time-series data reveals which frequencies are present in it; this is referred to as the frequency-domain or spectral-domain representation. For readers unfamiliar with the details, it suffices to know that applying the Fourier transform to the sound data gives the constituent frequencies of the signal. On an Android phone, several apps can compute the fast Fourier transform (FFT) of the sound coming in from the phone's microphone.

I use the Spectroid app in the above video. Notice that my Sa is at about 130 Hz and the Sa in the next octave (i.e. the Sa after Re, Ga, Ma, Pa, Dha, Ni) is at about 260 Hz. When I say 130 Hz, I mean the dominant frequency. No sapiens can generate a pure sine wave; notice the other, higher-frequency components, which make my voice less melodious. The third thing to note is that while my Sa is at 130 Hz, a different person can have their Sa at a different frequency. It is the ratios between the frequencies of Sa, Re, Ga, etc. that make them sound like an octave; the absolute frequencies do not matter. When singers perform in a concert they have a reference Sa-Pa frequency, which they often listen to on their headphones.
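
For readers who want to see the "dominant frequency" idea in code, here is a rough Python sketch. It uses a naive discrete Fourier transform rather than a real FFT (an FFT computes the same result much faster), and the signal is synthetic: a 130 Hz tone plus a weaker 260 Hz overtone standing in for a voice. The function name `dominant_frequency` and the sample rate are my own choices for illustration, not anything taken from Spectroid.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest bin of a naive DFT."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    # For real-valued input only the bins below n/2 (the Nyquist limit)
    # carry distinct information; bin 0 is the constant offset, so skip it.
    for k in range(1, n // 2):
        coeff = sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n

sample_rate = 1000  # Hz (assumed); must be more than twice the highest frequency
# One second of a synthetic "voice": strong 130 Hz plus a weaker 260 Hz overtone.
signal = [math.sin(2 * math.pi * 130 * i / sample_rate)
          + 0.3 * math.sin(2 * math.pi * 260 * i / sample_rate)
          for i in range(sample_rate)]
print(dominant_frequency(signal, sample_rate))
```

With one second of samples each bin is 1 Hz wide, so the strongest bin lands on the 130 Hz component even though the overtone is also present.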


Several types of speakers are available on the market. However, all of them have one thing in common: a material that vibrates at the requested frequencies, which in turn produces sound. The difference between expensive and cheap speakers lies in the precision with which they reproduce the requested frequencies and in the exact mechanism of sound generation. I am going to briefly go over the following types: a) electrodynamic speakers, b) piezoelectric speakers, c) rare-earth speakers. The most common type, which the reader is probably familiar with, is the electrodynamic speaker. It has a permanent magnet attached to the base and an electromagnet (the voice coil) attached to the diaphragm. The current through the coil alternates at the requested frequencies, and the resulting force against the permanent magnet vibrates the diaphragm.

[Image: electrodynamic speaker]

Several materials found in nature are piezoelectric. They have a special property: when pressure is applied to them (with a hammer, for example) they produce a voltage; conversely, when a voltage is applied they deform, so an alternating voltage makes them vibrate. Quartz is one such material. Speakers made out of such materials are called piezo speakers, and they are widely used in buzzers. Commercial buzzers can be bought for as little as 10 to 50 US cents. They come in 3-pin and 2-pin packages. I have a 3-pin package: +VCC, GND and Signal. The way to drive this speaker is to send a PWM signal (a square wave) to the signal pin. For example, to produce a 130 Hz sound on the piezo speaker, we need to send it a square wave at 130 Hz (i.e. with a period of about 7692 microseconds).
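
The period figure above is just one second divided by the frequency. A small Python sketch of that arithmetic follows; the helper name `period_us` is my own.

```python
def period_us(freq_hz):
    """Period of one full cycle, in microseconds, for a frequency in Hz."""
    return 1.0e6 / freq_hz

for f in (130, 260, 400):
    p = period_us(f)
    # A symmetric square wave is HIGH for half the period and LOW for the rest.
    print('%d Hz -> period %.1f us, half-period %.1f us' % (f, p, p / 2))
```

The half-period is the number that matters when toggling a pin by hand: the pin stays HIGH for that long, then LOW for that long.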

I am going to use a piezo speaker for the experiments in this post because it needs very little current and can easily be driven directly from an Arduino pin. Driving an electrodynamic speaker requires an amplifier circuit, which I shall explore in a future blog post.

Alloys containing the rare-earth element dysprosium (Terfenol-D, for example) are magnetostrictive: they change shape when an external magnetic field is applied. This is used to construct the so-called rare-earth speaker. These are fairly expensive and a relatively new technology. See the video below by YouTuber Thoisoi on it.

Indian Music Theory and Notation System

Let us understand a bit about how to derive the frequencies of Sa, Re, Ga, etc. Once we know these frequencies we know the ABCD of music, with which we can build up songs. With this end in mind, let's dive in.

There are two major Indian music systems: the Hindustani system and the Carnatic system. For this discussion, let's stick to the Hindustani notation. There are 12 sounds (svaras) in an octave: 7 pure (shuddha) sounds: Sa, Re, Ga, Ma, Pa, Dha, Ni; 4 komal (flat, a half step lower) sounds: komal Re, komal Ga, komal Dha, komal Ni; and 1 tivra (sharp) sound: tivra Ma. Typically in this system we talk about 3 octaves (saptaks), i.e. three repetitions of Sa, Re, Ga, …, Ni. The three saptaks are referred to as a) mandra saptak, b) madhya saptak and c) taar saptak. This is because the human voice usually spans only about 3 octaves. Note that the frequency of Sa in taar saptak is two times the frequency of Sa in madhya saptak. Similarly, the frequency of Pa in taar saptak is two times the frequency of Pa in madhya saptak. The absolute values of the frequencies do not matter; all the svaras are defined relative to a reference. The table below summarizes the information presented in this paragraph.
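
The doubling rule between saptaks can be sketched in a few lines of Python. The 400 Hz reference and the helper name `shift_octaves` are assumptions for illustration; any reference frequency works the same way.

```python
def shift_octaves(freq_hz, octaves):
    """Shift a frequency up (positive) or down (negative) by whole octaves."""
    return freq_hz * 2.0 ** octaves

madhya_sa = 400.0  # Hz, an arbitrary reference (assumed)
saptaks = {'mandra': -1, 'madhya': 0, 'taar': 1}
for name, offset in saptaks.items():
    print('Sa in %-6s saptak = %.1f Hz' % (name, shift_octaves(madhya_sa, offset)))
```

The same function applies to any svara, not just Sa: going up one saptak doubles the frequency, going down one halves it.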

For this calculation, let's set the reference to 400 Hz for the Sa of madhya saptak. The band 400 Hz to 800 Hz is harmonically divided into 12 parts. By harmonically we mean that each sound's frequency is a constant factor times the previous one. Here the constant factor is g = \sqrt[12]{2} (the 12th root of 2), so that twelve steps exactly double the frequency. The svaras are then at 400 Hz, 400 \times g Hz, 400 \times g^2 Hz, …, 400 \times g^{11} Hz. If we instead use, say, 270 Hz as the reference for Sa, the octave will sound the same to a human listener, only at a different pitch; in that case the band 270 Hz to 540 Hz is harmonically divided into 12 parts. The following Python snippet gives the frequencies (and wave periods) of each of the sounds.

ref_madya_sa = r = 400 #hz, change this as needed
g = 2**(1.0/12.0)
Sa = r
Re_k = r * g
Re =   r * g**2
Ga_k = r * g**3
Ga =   r * g**4
Ma =   r * g**5
Ma_t = r * g**6
Pa =   r * g**7
Dha_k =r * g**8
Dha =  r * g**9
Ni_k = r * g**10
Ni =   r * g**11

svara_list = ['Sa', 'Re_k', 'Re', 'Ga_k', 'Ga', 'Ma', 'Ma_t', 'Pa', 'Dha_k', 'Dha', 'Ni_k', 'Ni' ]
svara_freq = [Sa, Re_k, Re, Ga_k, Ga, Ma, Ma_t, Pa, Dha_k, Dha, Ni_k, Ni ]

for i in range( len(svara_list) ):
    print( 'svara=%-5s\tfreq=%4.4f\tperiod(microsecs)=%4.4f' %(svara_list[i], svara_freq[i], 1.0E6/svara_freq[i] ) )
    # print( '#define %s %d //hz=%f' %( svara_list[i], 1.0E6/float(svara_freq[i]), svara_freq[i] ) )

# Result: 
# svara=Sa   	freq=400.0000	period(microsecs)=2500.0000
# svara=Re_k 	freq=423.7852	period(microsecs)=2359.6858
# svara=Re   	freq=448.9848	period(microsecs)=2227.2468
# svara=Ga_k 	freq=475.6828	period(microsecs)=2102.2410
# svara=Ga   	freq=503.9684	period(microsecs)=1984.2513
# svara=Ma   	freq=533.9359	period(microsecs)=1872.8838
# svara=Ma_t 	freq=565.6854	period(microsecs)=1767.7670
# svara=Pa   	freq=599.3228	period(microsecs)=1668.5498
# svara=Dha_k	freq=634.9604	period(microsecs)=1574.9013
# svara=Dha  	freq=672.7171	period(microsecs)=1486.5089
# svara=Ni_k 	freq=712.7190	period(microsecs)=1403.0776
# svara=Ni   	freq=755.0995	period(microsecs)=1324.3289
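
If you want the same numbers as C preprocessor constants for a microcontroller sketch, the commented-out print line in the snippet above hints at how. Here is a self-contained version of that idea; the function name `define_lines` is my own, and the periods are truncated to whole microseconds, which is an assumption about what the target code expects.

```python
svara_list = ['Sa', 'Re_k', 'Re', 'Ga_k', 'Ga', 'Ma',
              'Ma_t', 'Pa', 'Dha_k', 'Dha', 'Ni_k', 'Ni']
g = 2 ** (1.0 / 12.0)  # constant factor between neighbouring svaras

def define_lines(ref_hz):
    """Build '#define NAME period_us //hz=freq' lines for one octave."""
    lines = []
    for i, name in enumerate(svara_list):
        freq = ref_hz * g ** i
        # int() truncates the period to whole microseconds.
        lines.append('#define %s %d //hz=%f' % (name, int(1.0e6 / freq), freq))
    return lines

for line in define_lines(400.0):
    print(line)
```

Changing the argument to `define_lines` moves the whole octave to a different reference Sa without touching anything else.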

Now, if you send square waves at these frequencies to the piezo speaker, it plays the whole sequence, similar to my voice. What is really happening is that the piezoelectric element vibrates at those frequencies, thus producing the sounds. The following Arduino program can do that. Each note is defined by its wave period in microseconds.

#define Sa 2500 //hz=400.000000
#define Re_k 2359 //hz=423.785238
#define Re 2227 //hz=448.984819
#define Ga_k 2102 //hz=475.682846
#define Ga 1984 //hz=503.968420
#define Ma 1872 //hz=533.935942
#define Ma_t 1767 //hz=565.685425
#define Pa 1668 //hz=599.322831
#define Dha_k 1574 //hz=634.960421
#define Dha 1486 //hz=672.717132
#define Ni_k 1403 //hz=712.718975
#define Ni 1324 //hz=755.099450

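// Define a special note, 'R', to represent a rest
#define  R     0
// Sentinel value marking the end of the note list
#define eof    321

#define speaker 4  // digital pin wired to the buzzer's signal pin

// Each entry is a wave period in microseconds; Sa / 2 is Sa one octave up
int list[100] = {Sa, Re, Ga, Ma, Pa, Dha, Ni, Sa / 2, eof };
//int  duration[100] = [100,212
void setup() {
  pinMode(speaker, OUTPUT);
  Serial.begin(9600);
  int k = 0;
  while( list[k] != eof ) { // loop over the array
    Serial.println( list[k] );
    // play 800 cycles of the current note
    for( int del=0 ; del<800 ; del++ ) {
      // HIGH for half the period and LOW for the other half
      digitalWrite(speaker, HIGH);
      delayMicroseconds( list[k] / 2 );
      digitalWrite(speaker, LOW);
      delayMicroseconds( list[k] / 2 );
    }
    k++; // move on to the next note
  }
}

// the loop function runs over and over again forever
void loop() {
  // nothing to do here: the notes are played once in setup()
}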

While the method I described is mathematically sound, there are alternate ways to arrive at the frequencies of the octave. There are rules of thumb such as Sa:Ga:Pa = 4:5:6, etc. I got to know about this through this link or [PDF].
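
As a quick check of how close the 4:5:6 rule of thumb is to the 12th-root-of-2 method, here is a small Python comparison. It assumes the same 400 Hz reference for Sa; the dictionary names `equal` and `ratio` are my own.

```python
sa = 400.0             # Hz, reference Sa (assumed)
g = 2 ** (1.0 / 12.0)  # constant factor from the 12th-root-of-2 method

# Ga is 4 steps above Sa and Pa is 7 steps above Sa in the 12-step octave.
equal = {'Sa': sa, 'Ga': sa * g ** 4, 'Pa': sa * g ** 7}
# Rule of thumb Sa:Ga:Pa = 4:5:6.
ratio = {'Sa': sa * 4 / 4, 'Ga': sa * 5 / 4, 'Pa': sa * 6 / 4}

for name in ('Sa', 'Ga', 'Pa'):
    diff = equal[name] - ratio[name]
    print('%-2s  12th-root=%8.3f Hz  4:5:6=%8.3f Hz  diff=%+.3f Hz'
          % (name, equal[name], ratio[name], diff))
```

The two methods agree on Sa by construction and differ by a few hertz at most on Ga and Pa, which is why both can sound right to a casual listener.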

Demo-2: Sa Re Ga Ma with Arduino+Piezo Speaker

Demo-3: Roja Song Tune

Now that we can generate the constituents of music, we are all set to create a whole song. Although the sound quality of the piezo speaker is very poor, this is a fully working prototype. You can find the notation of your favorite song online with a good search. My favorite song is ‘Dil Hai Chotasa’ from the movie Roja, whose music director was the great AR Rahman. I got the notation for this song from here.

Future Improvements

Although this is a functional prototype, there is a lot to learn about getting the melodies right. For now, although the pitches are correct, the song sounds totally monotonous. Different notes need to play for different durations for the song to make sense to a human. Getting the sound to play on an electrodynamic speaker is also a worthwhile endeavour. Reading and playing sounds encoded as .wav or .midi could help you build your own iPod. I would also like to mention the Teensy Audio Shield (I am not promoting it), available here. The folks at Teensy have put up a tutorial on working with sound on the Teensy board (a better Arduino).
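
One possible way to approach note durations, sketched in Python: instead of repeating every note for a fixed number of cycles (which makes low notes last longer than high ones, since each of their cycles is longer), compute how many cycles fit in the duration you want. The melody below and the helper name `cycles_for` are made up for illustration; they are not the notation of any real song.

```python
def cycles_for(period_us, duration_ms):
    """Number of full square-wave cycles that fit in duration_ms."""
    return int(duration_ms * 1000 // period_us)

# Hypothetical melody: (name, period in microseconds, duration in milliseconds)
melody = [('Sa', 2500, 500), ('Re', 2227, 250), ('Ga', 1984, 250), ('Sa', 2500, 1000)]

for name, period, duration in melody:
    print('%-2s  period=%4d us  duration=%4d ms  cycles=%d'
          % (name, period, duration, cycles_for(period, duration)))
```

On the microcontroller side, the same cycle count would replace the fixed loop bound, so that each note gets its own length.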
