Following Redirects with Net/HTTP

April 19, 2009


The web is full of redirects. It isn’t that hard to figure out how to follow them using Ruby, but it always helps to have examples when you are learning. Not too long ago I was hacking on some feed auto discovery code and made a little class that, given a URL, follows any redirects to the final endpoint and returns the response from that endpoint.

I figured in the old spirit of blogging, I would post it here until I have time to package the full feed auto discovery library and release it on GitHub.

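require 'logger'
require 'net/http'

class RedirectFollower
  class TooManyRedirects < StandardError; end

  attr_accessor :url, :body, :redirect_limit, :response

  def initialize(url, limit=5)
    @url, @redirect_limit = url, limit
    logger.level = Logger::INFO
  end

  # Assumed destination: standard output.
  def logger
    @logger ||= Logger.new(STDOUT)
  end

  # Fetches the url, following redirects until a non-redirect response
  # comes back or the limit runs out. Returns self so calls can be chained.
  def resolve
    raise TooManyRedirects if redirect_limit < 0

    self.response = Net::HTTP.get_response(URI.parse(url))

    logger.info "redirect limit: #{redirect_limit}"
    logger.info "response code: #{response.code}"
    logger.debug "response body: #{response.body}"

    if response.kind_of?(Net::HTTPRedirection)
      self.url = redirect_url
      self.redirect_limit -= 1

      logger.info "redirect found, headed to #{url}"
      resolve
    end

    self.body = response.body
    self
  end

  # Uses the Location header when present, otherwise falls back to the
  # first link in the response body.
  def redirect_url
    if response['location'].nil?
      response.body.match(/<a href=\"([^>]+)\">/i)[1]
    else
      response['location']
    end
  end
end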
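The fallback in redirect_url is easier to follow with a concrete body. The sketch below runs the same regular expression against a made-up redirect page. The HTML, the example.com URL, and the link_from_body helper are hypothetical and only there for illustration. It also adds a nil check, which the method above does not have, for bodies that contain no link.

```ruby
# Same pattern that redirect_url uses when there is no Location header.
LINK = /<a href=\"([^>]+)\">/i

# Hypothetical helper: returns the first link target, or nil when none is found.
def link_from_body(body)
  match = body.match(LINK)
  match && match[1]
end

moved = '<html><body>Moved <a href="http://example.com/new">here</a>.</body></html>'
p link_from_body(moved)                   # the href of the first anchor
p link_from_body('<html>no link</html>')  # nil instead of a NoMethodError
```

Without the nil check, a redirect response that has neither a Location header nor a link in the body would raise a NoMethodError on `[1]`.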

You can then follow redirects as easily as this:

google = RedirectFollower.new('').resolve
puts google.body

Which when run will output something like this:

I, [2009-03-04T17:16:58.879672 #69272]  INFO -- : redirect limit: 5
I, [2009-03-04T17:16:58.880669 #69272]  INFO -- : redirect found, headed to
I, [2009-03-04T17:16:58.987963 #69272]  INFO -- : redirect limit: 4

Followed by the HTML of the final page, which I did not include. The logger comes in ridiculously handy when following redirects, as it can be hard to figure out what is going on otherwise. You can also optionally pass in a limit for how many redirects you would like to follow.

RedirectFollower.new('', 3).resolve
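If the INFO lines are not enough, the logger can be switched to DEBUG so that the response body is logged as well. Here is a small sketch using only Ruby's standard Logger. Writing to a StringIO instead of STDOUT is an assumption made here so the output can be inspected, and the log messages are placeholders.

```ruby
require 'logger'
require 'stringio'

out = StringIO.new         # stand-in for STDOUT so the output can be inspected
logger = Logger.new(out)

logger.level = Logger::INFO
logger.info  'redirect limit: 5'     # written
logger.debug 'response body: ...'    # dropped, DEBUG is below INFO

logger.level = Logger::DEBUG
logger.debug 'response body: ...'    # written now

puts out.string
```

With the class itself, the equivalent would be something like `follower.logger.level = Logger::DEBUG` before calling `resolve`, where `follower` is whatever variable holds your RedirectFollower instance.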

This would set the number of redirects to follow to 3, instead of the default 5. You always want to put some kind of limit on the number of redirects to follow or you could end up in infinite redirection.
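To see the limit at work without touching the network, here is a minimal sketch that follows redirects through a hash of canned responses. The PAGES table, the follow helper, and the example.com URLs are all hypothetical stand-ins and not part of RedirectFollower. Only the countdown-and-raise logic mirrors the class.

```ruby
class TooManyRedirects < StandardError; end

# Hypothetical canned responses: each URL maps to either a redirect
# target (:location) or a final body. loop-a and loop-b point at each other.
PAGES = {
  'http://example.com/old'    => { location: 'http://example.com/new' },
  'http://example.com/new'    => { body: 'hello' },
  'http://example.com/loop-a' => { location: 'http://example.com/loop-b' },
  'http://example.com/loop-b' => { location: 'http://example.com/loop-a' }
}

# Follow redirects through PAGES, giving up once the limit is used up.
def follow(url, limit = 5)
  raise TooManyRedirects if limit < 0

  page = PAGES.fetch(url)
  return follow(page[:location], limit - 1) if page[:location]

  page[:body]
end

puts follow('http://example.com/old')   # one redirect, then the body

begin
  follow('http://example.com/loop-a', 3)
rescue TooManyRedirects
  puts 'gave up: too many redirects'
end
```

The two looping URLs would bounce back and forth forever without the limit. With it, the fourth redirect pushes the counter below zero and the error is raised.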

Nothing fancy but it gets the job done. Maybe if I get a chance I’ll post on how to test this by stubbing responses.

