author | Christopher Vollick <psycotica0@gmail.com> | 2012-10-03 12:26:18 -0400 |
---|---|---|
committer | Christopher Vollick <psycotica0@gmail.com> | 2012-10-03 12:26:18 -0400 |
commit | 20a4b892eef3df7bc50675fb6af40f3927d41583 (patch) | |
tree | ab33223a9850973b6c82e89c78f779790275960b | |
parent | f37d60042b9b53014b4abfbfc9753c4268f770de (diff) | |
download | get-flash-videos-20a4b892eef3df7bc50675fb6af40f3927d41583.tar.gz |
Use YouTube's rel="canonical"
There are a few different kinds of URLs that resolve to the same page.
In this case, "http://www.youtube.com/embed/CODE" is the same as
"http://www.youtube.com/watch?v=CODE".
So, rather than add a special case for it, I just used the tag in
YouTube's markup that says "I'm the same as this URL".
That way the logic is consolidated across the different formats.
I'm almost leaning towards just having the upstream always perform
this... but that might involve a lot more checking than I have time for
right now.
-rw-r--r-- | lib/FlashVideo/Site/Youtube.pm | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/lib/FlashVideo/Site/Youtube.pm b/lib/FlashVideo/Site/Youtube.pm
index 293f3d6..ce84859 100644
--- a/lib/FlashVideo/Site/Youtube.pm
+++ b/lib/FlashVideo/Site/Youtube.pm
@@ -23,6 +23,12 @@ my @formats = (
 sub find_video {
   my ($self, $browser, $embed_url, $prefs) = @_;
 
+  # There are a few different kinds of URLs that end up on the same page
+  # So, let's canonicalize to the "real" one
+  if ($browser->content =~ m!<link *rel=['"]canonical['"] *href=['"]([^'"]*)!) {
+    $embed_url = "http://www.youtube.com$1"
+  }
+
   if($embed_url !~ m!youtube\.com/watch!) {
     $browser->get($embed_url);
     if ($browser->response->header('Location') =~ m!/swf/.*video_id=([^&]+)!
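
To try the canonicalization step outside of get-flash-videos, here is a minimal standalone sketch. It reuses the regular expression from the patch unchanged. The helper name `canonical_url`, the fallback argument, and the sample markup are hypothetical and not part of the project. The handling of an absolute href is also an assumption: the patch always prepends "http://www.youtube.com", which implies the href was site-relative at the time.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of get-flash-videos): return the URL that the
# page declares as canonical, or the URL we were given when no tag is found.
sub canonical_url {
  my ($html, $fallback_url) = @_;

  # Same pattern as the patch: capture the href of <link rel="canonical">.
  if ($html =~ m!<link *rel=['"]canonical['"] *href=['"]([^'"]*)!) {
    my $href = $1;
    # Assumption: the href may be site-relative ("/watch?v=CODE") or absolute.
    return $href =~ m!^https?://! ? $href : "http://www.youtube.com$href";
  }

  return $fallback_url;
}

# Sample markup made up for illustration.
my $page = q{<head><link rel="canonical" href="/watch?v=CODE"></head>};
print canonical_url($page, "http://www.youtube.com/embed/CODE"), "\n";
```

With the sample markup, the embed URL is replaced by the watch URL, so the existing `youtube\.com/watch` check in `find_video` would treat both forms the same way. Note that the pattern only matches when `rel` comes directly before `href` inside the tag.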