Commit aa5d40e5 authored by Michael Witrant

parse links

parent 5fda2c19
source "http://rubygems.org"
gem 'nokogiri'
GEM
  remote: http://rubygems.org/
  specs:
    nokogiri (1.5.0)

PLATFORMS
  ruby

DEPENDENCIES
  nokogiri
This source diff could not be displayed because it is too large. You can view the blob instead.
require "rubygems"
require "bundler/setup"
require "nokogiri"

# Saved HTML pages to scan; only the first one is processed for now.
files = %w(organisations.html public_authorities.html)
file = files.first
doc = Nokogiri::HTML(File.read(file))

# Collect the href of every anchor that points at a PDF.
links = doc.css("a").map do |link|
  href = link["href"]
  href if href =~ /\.pdf$/i
end.compact

p links
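
The filter above keeps an href only when the whole string ends in ".pdf", so a link such as "report.pdf?download=1" would be skipped. A minimal sketch of a more tolerant check follows, using only the standard library. The helper name pdf_link? and the sample hrefs are hypothetical and not part of this commit; whether the scraped pages actually contain such links is an assumption.

```ruby
require "uri"

# Hypothetical helper (not part of the commit): true when the path
# component of the href ends in .pdf, ignoring query string and fragment.
def pdf_link?(href)
  return false if href.nil? || href.empty?

  path = URI.parse(href).path
  !path.nil? && path.match?(/\.pdf\z/i)
rescue URI::InvalidURIError
  # Fall back to a plain suffix check for hrefs that URI cannot parse.
  href.match?(/\.pdf\z/i)
end

# Made-up sample hrefs, standing in for the values of link["href"].
hrefs = [
  "/docs/report.pdf",
  "/docs/Report.PDF?download=1",
  "/about.html",
  nil,
  "mailto:someone@example.org"
]

pdf_links = hrefs.select { |href| pdf_link?(href) }
p pdf_links
```

In the script this would replace the map/compact pair with a single select over the href values. Anchors without an href attribute yield nil, which the helper rejects.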