Improving regex for parsing YouTube / Vimeo URLs

Daniel picture Daniel · Apr 10, 2011 · Viewed 23.7k times · Source

I've made a function (in JavaScript) that takes an URL from either YouTube or Vimeo. It figures out the provider and ID for that particular video (demo: http://jsfiddle.net/csjwf/).

function parseVideoURL(url) {

    var provider = url.match(/http:\/\/(:?www.)?(\w*)/)[2],
        id;

    if(provider == "youtube") {

        id = url.match(/http:\/\/(?:www.)?(\w*).com\/.*v=(\w*)/)[2];
    } else if (provider == "vimeo") {

        id = url.match(/http:\/\/(?:www.)?(\w*).com\/(\d*)/)[2];
    } else {
        throw new Error("parseVideoURL() takes a YouTube or Vimeo URL");    
    }
    return {
        provider : provider,
        id : id
    }
}

It works, however as a regex Novice, I'm looking for ways to improve it. The input I'm dealing with, typically looks like this:

http://vimeo.com/(id)
http://youtube.com/watch?v=(id)&blahblahblah.....

1) Right now I'm doing three separate matches, would it make sense to try and do everything in one single expression? If so, how?

2) Could the existing matches be more concise? Are they unnecessarily complex? or perhaps insufficient?

3) Are there any YouTube or Vimeo URL's that would fail being parsed? I've tried quite a few and so far it seems to work pretty well.

To summarize: I'm simply looking for ways improve the above function. Any advice is greatly appreciated.

Answer

Yangshun Tay picture Yangshun Tay · Mar 31, 2014

Here's my attempt at the regex, which covers most updated cases:

function parseVideo(url) {
    // - Supported YouTube URL formats:
    //   - http://www.youtube.com/watch?v=My2FRPA3Gf8
    //   - http://youtu.be/My2FRPA3Gf8
    //   - https://youtube.googleapis.com/v/My2FRPA3Gf8
    // - Supported Vimeo URL formats:
    //   - http://vimeo.com/25451551
    //   - http://player.vimeo.com/video/25451551
    // - Also supports relative URLs:
    //   - //player.vimeo.com/video/25451551

    url.match(/(https?\/\/)(player.|www.)?(vimeo\.com|youtu(be\.com|\.be|be\.googleapis\.com))\/(video\/|embed\/|watch\?v=|v\/)?([A-Za-z0-9._%-]*)(\&\S+)?/);
    var type = null;
    if (RegExp.$3.indexOf('youtu') > -1) {
        type = 'youtube';
    } else if (RegExp.$3.indexOf('vimeo') > -1) {
        type = 'vimeo';
    }

    return {
        type: type,
        id: RegExp.$6
    };
}