Playing MIDI Files in Windows (Part 5)

Jump to: Part 1, Part 2, Part 3, Part 4, Part 5

In the last article we created some code that is capable of playing most (not all) MIDI files out there. We did most of the grunt work ourselves: decoding and playing MIDI events, processing time values, waiting the correct amount of time, and synchronizing events from multiple tracks.

Related source and example files can be downloaded here: mididemo.zip

We created our usleep() function to get mostly accurate MIDI timing, and it seemed to work well. Unfortunately, the usleep() function does not actually sleep but spins in a tight loop. This is a big waste of processor time and still isn’t perfectly accurate. The Windows multimedia API supplies some mid-level functions that can make things easier on us, and possibly on our processor.

The midiStream*() functions take a stream of MIDI messages and time values and take care of processing the time values and playing the messages. This eliminates our need for usleep() and lets Windows handle timing and playing individual messages. We still need to decode the MIDI events and time values and format them into the stream of MIDIEVENT structures the API expects. However, the API will either provide its own timing and message processing or, if the device supports it, hand the buffer off to the MIDI device itself, freeing up our CPU for other tasks (much better than our tight usleep() loop!).

Anyway, our get_buffer() function showed how to decode the MIDI file and pack the events into our own buffer format. The format I chose is not much different from the format that the midiStream*() functions expect. The buffer is an array of MIDIEVENT structures. The MIDIEVENT structure looks like this:

typedef struct {
  DWORD dwDeltaTime;
  DWORD dwStreamID;
  DWORD dwEvent;
  DWORD dwParms[];
} MIDIEVENT;

The first field is the delta-time value. This is the same delta-time value we used in all our previous examples. It is an unsigned 4-byte integer in little-endian format. The second field is a stream ID. What it is for I don’t really know, but MSDN says it is reserved and must be set to 0. The third field is the MIDI event, in the same format as the events we passed to midiOutShortMsg(). The last field is a variable-length field and is only used for long messages such as System Exclusive messages. Up to this point we have had no need for long messages, and we still don’t, so we do not need to do anything with this last field and can essentially eliminate it altogether.

So, our buffer is an array of these structures that looks like this:

delta-time
0
event
delta-time
0
event
...

This is nearly identical to our previous buffer format, except it has an extra value between the delta-time and event.

delta-time
event
delta-time
event
...

Yeah, Microsoft is notorious for adding extra “reserved” fields to its data structures for reasons nobody really knows.
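
To make the layout concrete, here is a minimal sketch (the delta-time, note number and velocity are made-up values, and <mmsystem.h> is assumed to be included as in the rest of the examples) of how a single Note On message would be packed into three DWORDs of the stream buffer:

// Sketch only: pack one Note On (channel 0, note 60, velocity 100) into the
// three-DWORD stream format described above. MEVT_SHORTMSG goes in the high
// byte of the event DWORD; the low three bytes are the same short message we
// would have passed to midiOutShortMsg().
DWORD packed[3];
packed[0] = 48;                              // delta-time in ticks
packed[1] = 0;                               // dwStreamID, always 0
packed[2] = ((DWORD)MEVT_SHORTMSG << 24) |   // event type
            ((DWORD)0x90 << 0)  |            // status: Note On, channel 0
            ((DWORD)60   << 8)  |            // data byte 1: note number
            ((DWORD)100  << 16);             // data byte 2: velocity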

Another important thing to note is that the buffer used by the midiStream*() functions must be smaller than 64 KB. When we are processing the MIDI file and constructing the stream buffer, we must do it in chunks of 64 KB or less. For large MIDI files we may need to call get_buffer() several times.

The get_buffer() function is nearly identical to our previous examples; it now just needs to add that extra 0 and limit the size of the returned buffer. In this example we limit the buffer to a maximum of 512 MIDIEVENT structures:

#define MAX_BUFFER_SIZE (512 * 12) // 512 events * 3 DWORDs * 4 bytes each, well under 64 KB

unsigned int get_buffer( struct trk* tracks, unsigned int ntracks, unsigned int* out, unsigned int* outlen) {
  MIDIEVENT e, *p;
  unsigned int streamlen = 0;
  unsigned int i;
  static unsigned int current_time = 0; // remember the current time from the last time we were called.

  if(tracks == NULL || out == NULL || outlen == NULL)
    return 0;

  *outlen = 0;

  while(TRUE) {
    unsigned int time = (unsigned int)-1;
    unsigned int idx = -1;
    struct evt evt;
    unsigned char c;

    if(((streamlen + 3) * sizeof(unsigned int)) >= MAX_BUFFER_SIZE)
      break;

    // get the next event
    for(i = 0; i < ntracks; i++) {
      evt = get_next_event(&tracks[i]);
      if(!(is_track_end(&evt)) && (evt.absolute_time < time)) {
        time = evt.absolute_time;
        idx = i;
      }
    }

    // if idx == -1 then all the tracks have been read up to the end of track mark
    if(idx == -1)
      break;  // we're  done

    e.dwStreamID = 0; // always 0

    evt = get_next_event(&tracks[idx]);

    tracks[idx].absolute_time = evt.absolute_time;
    e.dwDeltaTime = tracks[idx].absolute_time - current_time;
    current_time = tracks[idx].absolute_time;

    if(!(evt.event & 0x80)) { // running mode
      unsigned char last = tracks[idx].last_event;
      c = *evt.data++; // get the first data byte
      e.dwEvent = ((unsigned long)MEVT_SHORTMSG << 24) |
                  ((unsigned long)last) |
                  ((unsigned long)c << 8);
      if(!((last & 0xf0) == 0xc0 || (last & 0xf0) == 0xd0)) {
        c = *evt.data++; // get the second data byte
        e.dwEvent |= ((unsigned long)c << 16);
      }

      p = (MIDIEVENT*)&out[streamlen];
      *p = e;

      streamlen += 3;

      tracks[idx].buf = evt.data;
    } else if(evt.event == 0xff) { // meta-event
      evt.data++; // skip the event byte
      unsigned char meta = *evt.data++; // read the meta-event byte
      unsigned int len;

      switch(meta) {
      case 0x51: // only care about tempo events
        {
          unsigned char a, b, c;
          len = *evt.data++; // get the length byte, should be 3
          a = *evt.data++;
          b = *evt.data++;
          c = *evt.data++;

          e.dwEvent = ((unsigned long)MEVT_TEMPO << 24) |
                  ((unsigned long)a << 16) |
                  ((unsigned long)b << 8) |
                  ((unsigned long)c << 0);

          p = (MIDIEVENT*)&out[streamlen];
          *p = e;

          streamlen += 3;
        }
        break;
      default: // skip all other meta events
        len = *evt.data++; // get the length byte
        evt.data += len;
        break;
      }

      tracks[idx].buf = evt.data;
    } else if((evt.event & 0xf0) != 0xf0) { // normal command
      tracks[idx].last_event = evt.event;
      evt.data++; // skip the event byte
      c = *evt.data++;  // get the first data byte
      e.dwEvent = ((unsigned long)MEVT_SHORTMSG << 24) |
                ((unsigned long)evt.event << 0) |
                ((unsigned long)c << 8);
      if(!((evt.event & 0xf0) == 0xc0 || (evt.event & 0xf0) == 0xd0)) {
        c = *evt.data++; // get the second data byte
        e.dwEvent |= ((unsigned long)c << 16);
      }

      p = (MIDIEVENT*)&out[streamlen];
      *p = e;

      streamlen += 3;

      tracks[idx].buf = evt.data;
    }
  }

  *outlen = streamlen * sizeof(unsigned int);

  return 1;
}

Just like in the last two examples, we will support multi-track MIDI files. The difference is that now, instead of locating and processing all the tracks inside the get_buffer() function, we set up some variables in our main function to not only locate the individual tracks within the MIDI file but also remember where we left off each time we need to call get_buffer().

HANDLE event;

unsigned int example9() {
  unsigned char* midibuf = NULL;
  unsigned int midilen = 0;

  struct _mid_header* hdr = NULL;

  unsigned int i;

  unsigned short ntracks = 0;
  struct trk* tracks = NULL;

  unsigned int streambufsize = MAX_BUFFER_SIZE;
  unsigned int* streambuf = NULL;
  unsigned int streamlen = 0;

  ...

  hdr = (struct _mid_header*)midibuf;
  midibuf += sizeof(struct _mid_header);
  ntracks = swap_bytes_short(hdr->tracks);

  tracks = (struct trk*)malloc(ntracks * sizeof(struct trk));
  if(tracks == NULL)
    goto error1;

  for(i = 0; i < ntracks; i++) {
    tracks[i].track = (struct _mid_track*)midibuf;
    tracks[i].buf = midibuf + sizeof(struct _mid_track);
    tracks[i].absolute_time = 0;
    tracks[i].last_event = 0;

    midibuf += sizeof(struct _mid_track) + swap_bytes_long(tracks[i].track->length);
  }

  streambuf = (unsigned int *)malloc(sizeof(unsigned int) * streambufsize);
  if(streambuf == NULL)
    goto error2;

  memset(streambuf, 0, sizeof(unsigned int) * streambufsize);

  event = CreateEvent(0, FALSE, FALSE, 0);

Once we have the file open and our track structures set up, we open the MIDI device for streaming by calling midiStreamOpen().

HMIDISTRM out;
unsigned int device = 0;
midiStreamOpen(&out, &device, 1, (DWORD_PTR)example9_callback, 0, CALLBACK_FUNCTION);

The first parameter is a variable to hold the opened MIDI stream handle. The second is a variable that contains the device ID to open. I’m not sure why this needs to be a pointer to a variable holding the ID rather than pass by value like midiOutOpen(). The third parameter is “reserved” and must be 1. (Does anyone actually know why Microsoft does things like that?) The fourth parameter is a pointer to a callback function that will be called during MIDI playback. The fifth parameter is data that is passed to the callback. Our callback doesn’t use any extra data so this parameter is set to 0. The final parameter is a flag that specifies we are using a callback function to receive playback information, as opposed to an event, thread or window.
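
Like the other midiOut*() functions, midiStreamOpen() returns an MMRESULT. The listing above drops the return value to keep things short; a hedged sketch of the check might look like this (the error message and cleanup path are just placeholders):

MMRESULT err = midiStreamOpen(&out, &device, 1, (DWORD_PTR)example9_callback, 0, CALLBACK_FUNCTION);
if(err != MMSYSERR_NOERROR) {
  printf("unable to open MIDI stream on device %u (error %u)\n", device, err);
  // free streambuf and tracks, close the event handle, then bail out
  return 0;
}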

MIDIPROPTIMEDIV prop;
prop.cbStruct = sizeof(MIDIPROPTIMEDIV);
prop.dwTimeDiv = swap_bytes_short(hdr->ticks);
midiStreamProperty(out, (LPBYTE)&prop, MIDIPROP_SET|MIDIPROP_TIMEDIV);

Once the stream is open we need to set the time-division (PPQN) value, which together with the tempo controls playback speed. If not set, the default tempo is 120 beats per minute (500,000 microseconds per quarter note) and the default time division is 96 ticks (pulses) per quarter note. The PPQN value is read from the ticks field of the MIDI file header and set here. Note that we swap the byte order since the ticks value is stored in the file in big-endian format and dwTimeDiv is little-endian.
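
To get a feel for what these two numbers mean together, here is a small sketch of the arithmetic (purely illustrative; the stream driver does this bookkeeping for us once dwTimeDiv and any MEVT_TEMPO events are in place):

// Sketch: how tempo and time division translate a delta-time into real time.
unsigned int us_per_quarter = 500000;                        // default tempo: 120 BPM
unsigned int ppqn           = swap_bytes_short(hdr->ticks);  // ticks per quarter note from the file header
double us_per_tick          = (double)us_per_quarter / ppqn; // e.g. 500000 / 96 = ~5208 microseconds
// a delta-time of 48 ticks at 96 PPQN would then be 48 * ~5208 = ~250,000 microseconds (an eighth note)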

MIDIHDR mhdr;
mhdr.lpData = (char*)streambuf;
mhdr.dwBufferLength = mhdr.dwBytesRecorded = streambufsize;
mhdr.dwFlags = 0;
midiOutPrepareHeader((HMIDIOUT)out, &mhdr, sizeof(MIDIHDR));

The next thing we need to do is prepare the buffer for processing by midiStreamOut(). We set lpData to the buffer we allocated earlier. Many examples I’ve seen elsewhere show this buffer populated with MIDI data before midiOutPrepareHeader() is called, but this is not necessary. Next, we set dwBufferLength to the size of the buffer, and dwBytesRecorded as well. It is not really important to set dwBytesRecorded here since we will be overwriting it later anyway. We clear dwFlags by setting it to 0; Windows will use this field to return state information if we want it. Finally, we call midiOutPrepareHeader(), passing our open stream handle (cast to HMIDIOUT), a pointer to our header structure, and its size. From this point on, we can keep repopulating streambuf and passing this header to midiStreamOut().

That’s almost it. By default, when a stream is opened it is in stopped mode. If we cue a stream it won’t start playing until midiStreamRestart() is called, so we’ll call it here and not have to worry about it later.

midiStreamRestart(out);

We can now start buffering MIDI events and playing them. We’ll grab the first buffer full before we enter our loop. The number of bytes returned will be our exit condition for the loop: once this value is 0 we’ve hit the end of the MIDI score and we can exit.

get_buffer(tracks, ntracks, streambuf, &streamlen);
while(streamlen > 0) {
  mhdr.dwBytesRecorded = streamlen;
  midiStreamOut(out, &mhdr, sizeof(MIDIHDR));
  WaitForSingleObject(event, INFINITE);
  get_buffer(tracks, ntracks, streambuf, &streamlen);
}

Here we call get_buffer(), described above, passing our tracks and the streambuf to read the events into. The streamlen variable receives the number of bytes read into streambuf. Once it comes back as 0 we can exit.

After we have our first buffer and enter the loop, we need to remember to update dwBytesRecorded with the number of bytes in streambuf that are actually valid. If we don’t, midiStreamOut() may try to play garbage (or possibly old events) left at the end of the stream buffer.

I had previously skipped over talking about the event object we created earlier. It is used to pause our loop while the cued buffer is playing. When we opened the stream we passed a pointer to a callback function that will be called when the buffer has finished playing. That callback looks like this:

void CALLBACK example9_callback(HMIDIOUT out, UINT msg, DWORD_PTR dwInstance, DWORD_PTR dwParam1, DWORD_PTR dwParam2) {
  switch(msg) {
  case MOM_DONE:
    SetEvent(event);
    break;
  case MOM_POSITIONCB:
  case MOM_OPEN:
  case MOM_CLOSE:
    break;
  }
}

The callback is fired for various events during playback. The only one we care about is MOM_DONE. This message indicates the cued stream has finished playing. When we receive this message, we signal the loop to continue by calling SetEvent().

In the main loop we pause by calling WaitForSingleObject() on the event. The loop will wait here until the event is signaled. Once signaled, we call get_buffer() to buffer the next chunk of MIDI data and loop again.

  midiOutReset((HMIDIOUT)out);
  midiOutUnprepareHeader((HMIDIOUT)out, &mhdr, sizeof(MIDIHDR));
  midiStreamClose(out);
  CloseHandle(event);

  free(streambuf);
  free(tracks);
  free(hdr);
  return(0);
}

Once the loop exits we are done and can clean up. The midiOutReset() function stops any notes that are still playing and silences the device. The midiOutUnprepareHeader() function cleans up after our header and anything Windows might have been doing with it. Finally, midiStreamClose() closes the stream handle and CloseHandle() disposes of our event object.
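
One loose end: the error1 and error2 labels that the goto statements in example9() jump to are not shown in the listing above. Assuming they simply mirror the normal cleanup path in reverse, a sketch of them might look like this:

  // Sketch only: cleanup labels assumed to mirror the normal path above.
error2:           // reached if the streambuf allocation failed
  free(tracks);
error1:           // reached if the tracks allocation failed
  free(hdr);      // hdr points at the start of the loaded file buffer
  return 0;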

A few things to note here. There is a very small delay between when the previously cued stream stops and when the next one is buffered and starts. It is not really noticeable, but it is obviously there if you look at the code.

midiStreamOut() has the ability to cue multiple streams at a time; when one finishes, the next one starts immediately. It is normal for applications to cue two or more buffers at a time, which completely eliminates the delay.

In the MIDI library I created for some of my projects I use a double buffer. I cue the first buffer and while it is playing I process the next chunk and cue it. Once the first buffer completes the next cued buffer begins playing immediately and the callback is fired, at which time I reuse the first buffer and cue up another chunk.
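
As a rough illustration of that double-buffering idea, here is a sketch reusing the names from example9(). It assumes a second buffer and header (streambuf2 and mhdr2, both made up for this sketch) have been allocated and prepared exactly like the first pair, and that decoding a chunk is always faster than playing one (otherwise a single auto-reset event isn’t quite enough bookkeeping):

// Sketch only: keep two prepared buffers in flight so playback never pauses.
MIDIHDR* headers[2] = { &mhdr, &mhdr2 };
unsigned int* buffers[2] = { streambuf, streambuf2 };
unsigned int cur = 0, pending = 0;

// prime: fill and cue both buffers up front
for(cur = 0; cur < 2; cur++) {
  get_buffer(tracks, ntracks, buffers[cur], &streamlen);
  if(streamlen == 0)
    break;
  headers[cur]->dwBytesRecorded = streamlen;
  midiStreamOut(out, headers[cur], sizeof(MIDIHDR));
  pending++;
}

cur = 0; // the oldest cued buffer
while(pending > 0) {
  WaitForSingleObject(event, INFINITE); // MOM_DONE for buffers[cur]
  pending--;

  // refill the buffer that just finished and cue it behind the one still playing
  get_buffer(tracks, ntracks, buffers[cur], &streamlen);
  if(streamlen > 0) {
    headers[cur]->dwBytesRecorded = streamlen;
    midiStreamOut(out, headers[cur], sizeof(MIDIHDR));
    pending++;
  }
  cur = 1 - cur;
}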

That’s it for our MIDI tutorials unless there is interest in some other topics. When it’s ready I’ll be posting my multimedia library that implements the ideas discussed in these articles. Let me know by email or in the comments below what you think of these articles and if you have any questions or ideas you’d like to learn about.
