Playing MIDI Files in Windows (Part 3)

Jump to: Part 1, Part 2, Part 3, Part 4, Part 5

In the last article we demonstrated playing a very simple single track MIDI file. In this article we will build on that and add support for multiple tracks. This example will still only support very small MIDI files and is only guaranteed to work correctly for the provided example MIDI file. Related source and example files can be downloaded here: In the previous example, each event was preceded by a delta-time value. This delta-time value is the amount of time to wait before executing the MIDI event. This time is relative to the previous event so a delta-time of say 10 ticks means “wait 10 ticks from the last event before executing the next event”. In the single track example, this was pretty straight forward. In multiple track MIDI files this becomes a little more complicated. The delta-time values are relative to the previous event in the same track. So for example, if one track executes an event every 10 ticks and another simultaneous track executes an event every 25 ticks, they will need to be merged (or mixed) into a single stream and each delta-time will need to be adjusted based on the events from both tracks. For example:

 Ticks   |   Track 1    |   Track 2   ==>   Merged
  0      |   0 - e1     |   0 - e1     | 0  - t1e1
         |              |              | 0  - t2e1
  5      |              |              |
  10     |   10 - e2    |              | 10 - t1e2
  15     |              |              |
  20     |   10 - e3    |              | 10 - t1e3
  25     |              |   25 - e2    | 5  - t2e2
  30     |   10 - e4    |              | 5  - t1e4
  35     |              |              |
  40     |   10 - e5    |              | 10 - t1e5
  45     |              |              |
  50     |   10 - e6    |   25 - e3    | 10 - t1e6
         |              |              | 0  - t2e3

These two tracks begin simultaneously executing their first event. 10 ticks later an event from the first track is executed followed by another after another 10 ticks. Now it starts to get interesting. Because the second event from track 2 is executed after 25 ticks, we must take into account the events from track 1 that have been executed so far and subtract that value from the delta-time for the track 2 event. Therefore, when the tracks are merged, the second track 2 event will need to be executed only 5 ticks after the third track 1 event. Also, since the second event from track 2 is executed 5 ticks after the third event from track 1, the delta-time for the fourth event for track one must be adjusted to account for that and is executed 5 ticks later. At this point in the example things get back on track until the 50 tick mark when two events must be executed simultaneously again. A total of 10 ticks must pass before executing one of the events, and because they are to be executed simultaneously, 0 ticks must pass before executing the second event. This gets seemingly more complicated as more tracks are added, however the algorithm is relatively simple. A counter is needed to keep track of the absolute time from the beginning of the score. Another counter for each track is also needed to keep track of the absolute time processed so far within the track. The algorithm goes something like this:

  1. Loop through each track.
    1. If a track is at the end-of-track marker, skip the track.
    2. If all the tracks are at the end-of-track marker, we are done.
  2. Select the event closest to the current absolute time.
    1. Extract the event.
    2. Advance the track pointer to the next event in the track.
    3. Advance the absolute time for the track by the extracted event’s delta-time.
  3. The difference between the absolute time for the track and the absolute time for the score is used as the new delta-time for the event.
  4. The absolute time for the score is advanced by the new delta-time.
  5. The delta-time and event are added to the stream as in the previous example.
  6. Continue from step 1.

To put this into practice we’ll introduces some new structures and helper functions.

struct trk {
	struct _mid_track* track;
	unsigned char* buf;
	unsigned char last_event;
	unsigned int absolute_time;

The trk structure will keep track of the processing of each MIDI track. The track field is a pointer to the Track header as described in the last article. buf will be initialized to point to the first delta-time of the first event. The absolute_time field will be initialized to 0 and will keep track of the absolute time of all processed events. The last_event field we won’t worry about for now. It will be used in the next article.

struct evt {
	unsigned int absolute_time;
	unsigned char* data;
	unsigned char event;

The evt structure will be used to represent the extracted event data from the track. The absolute_time field will represent the absolute time of the event from the beginning of the track. The data field will point to the beginning of the event (first byte past the delta-time value). The event field will store the event byte.

unsigned short swap_bytes_short(unsigned short in)
	return ((in << 8) | (in >> 8));
unsigned long swap_bytes_long(unsigned long in)
	unsigned short *p;
	p = (unsigned short*)∈
	return (  (((unsigned long)swap_bytes_short(p[0])) << 16) |
				(unsigned long)swap_bytes_short(p[1]));

Because numeric data in MIDI files is stored in big-endian format, swap_bytes_short() and swap_bytes_long() are helper functions used to convert the numeric data to little-endian format.

struct evt get_next_event(const struct trk* track)
	unsigned char* buf;
	struct evt e;
	unsigned int bytesread;
	unsigned int time;
	buf = track->buf;
	time = read_var_long(buf, &bytesread);
	buf += bytesread;
	e.absolute_time = track->absolute_time + time; = buf;
	e.event = *;
	return e;

The get_next_event() helper function simply reads the next event from the track. It does not advance the track pointer or absolute time, it simply returns the next event available. If the returned event is the one that will be selected, the track pointer will have to be advanced later. This function uses read_var_long() as discussed in the previous article to read the delta-time from the track buffer. That value is added to the track’s absolute_time to get the event’s absolute_time value. The data field is set to the next byte past the delta-time in the track buffer and the event byte is copied into the event field.

int is_track_end(const struct evt* e)
	if(e->event == 0xff) // meta-event?
		if(*(e->data + 1) == 0x2f) // track end?
			return 1;
	return 0;

Finally, the is_track_end() helper function determines if the returned event is the end-of-track marker.

unsigned int get_buffer_ex7(unsigned char* buf, unsigned int len, unsigned int** out, unsigned int* outlen)
	struct _mid_header* hdr = NULL;
	struct trk* tracks = NULL;
	unsigned short nTracks = 0;
	unsigned int i;
	unsigned int* streambuf = NULL;
	unsigned int streambuflen = 1024;
	unsigned int streamlen = 0;
	unsigned char* tmp = buf;
	unsigned int currTime = 0;
	streambuf = (unsigned int*)malloc(sizeof(unsigned int) * streambuflen);
	memset(streambuf, 0, sizeof(unsigned int) * streambuflen);
	hdr = (struct _mid_header*)tmp;
	tmp += sizeof(struct _mid_header);
	nTracks = swap_bytes_short(hdr->tracks);
	tracks = (struct trk*)malloc(nTracks * sizeof(struct trk));
	for(i = 0; i < nTracks; i++)
		tracks[i].track = (struct _mid_track*)tmp;
		tracks[i].buf = tmp + sizeof(struct _mid_track);
		tracks[i].absolute_time = 0;
		tracks[i].last_event = 0;
		tmp += sizeof(struct _mid_track) + swap_bytes_long(tracks[i].track->length);
		unsigned int time = (unsigned int)-1;
		unsigned char cmd;
		unsigned int msg = 0;
		unsigned int idx = -1;
		struct evt evt;
		// get the next event
		for(i = 0; i < nTracks; i++)
			evt = get_next_event(&tracks[i]);
			if(!(is_track_end(&evt)) && (evt.absolute_time < time))
				time = evt.absolute_time;
				idx = i;
		if(idx == -1)
			break; // we're done
		evt = get_next_event(&tracks[idx]);
		tracks[idx].absolute_time = evt.absolute_time;
		time = tracks[idx].absolute_time - currTime;
		currTime = tracks[idx].absolute_time;
		cmd = *;
		if((cmd & 0xf0) != 0xf0) // normal command
			msg = ((unsigned long)cmd) |
				  ((unsigned long)* << 8);
			if(!((cmd & 0xf0) == 0xc0 || (cmd & 0xf0) == 0xd0))
				msg |= * << 16;
			streambuf[streamlen++] = time;
			streambuf[streamlen++] = msg;
			tracks[idx].buf =;
	*out = streambuf;
	*outlen = streamlen;
	return 0;

This function first parses the MIDI header to determine how many tracks are in the file. It then allocates an array of that number of trk structures. Each trk structure is initialized with the beginning of the track data for each track within the buffer. Once the trk structures are initialized, processing begins. We loop through each trk structure looking for the next event that is not the end-of-track marker and has the lowest absolute_time value. If more than one event have the same absolute_time, the first event is chosen. (the next iteration will choose the next event, and so on…) If all the tracks are at the end-of-track marker, our stream buffer is filled and we can return to the caller. With the next event in hand, we update the absolute time for the track from which the event came to be the absolute time of the event. We calculate the new delta-time for the event by subtracting the score absolute time value from the track absolute time value. Finally, we update the absolute time for the score to be the absolute time of the event (track). The event is processed in the same way as described in the previous article. However, the data field in the evt structure is used to read from the track buffer to determine where the data bytes are and find the beginning of the delta-time for the next event in the track. Once the event is processed, is pointing to the beginning of the next delta-time. The buf field of the trk structure is advanced to this location.

unsigned int example7()
	unsigned char* midibuf = NULL;
	unsigned int midilen = 0;
	unsigned int* streambuf = NULL;
	unsigned int streamlen = 0;
	unsigned int err;
	const unsigned int PPQN_CLOCK = 5;
	unsigned int i;
	err = midiOutOpen(&out, 0, 0, 0, CALLBACK_NULL);
		printf("error opening default MIDI device: %d\n", err);
		printf("successfully opened default MIDI device\n");
	midibuf = load_file((unsigned char*)"example7.mid", &midilen);
	if(midibuf == NULL)
		printf("could not open example7.mid\n");
		return 0;
	get_buffer_ex7(midibuf, midilen, &streambuf, &streamlen);
	i = 0;
	while(i < streamlen)
		unsigned int time = streambuf[i++];
		Sleep(time * PPQN_CLOCK);
		err = midiOutShortMsg(out, streambuf[i++]);
			printf("error sending command: %d\n", err);
	return 0;

Because the multiple tracks have been merged into a single stream, the code that sends the messages to midiOutShortMsg() is exactly the same as in the previous example. In this article we looked at merging multiple MIDI tracks into a single stream. In the next article we will look at supporting more advanced features such as running mode and META events. We will also look at a more precise timing mechanism for timing MIDI events instead of using the Sleep() function. The result will be a (mostly) complete example capable of playing (most) midi files thrown at it. Related files:

Further reading:


Related Posts

Leave a Reply