LQDN Adminsys / amendments
Commit e6c73c9c
authored Sep 24, 2011 by Michael Witrant
parse generated files
parent 242ea49d
Changes: 1
consultation_ipred/parser.rb @ e6c73c9c
...
@@ -2,24 +2,14 @@
require "rubygems"
require "bundler/setup"
require "nokogiri"
files = %w(organisations.html public_authorities.html)
files = %w(organisations.links public_authorities.links)
file = files.first
doc = Nokogiri::HTML(File.read(file))
links = doc.css("a").map do |link|
  href = link["href"]
  href if href =~ /\.pdf$/i
end.compact.uniq { |url| URI.parse(url).path }
parsed_names = []
links = File.read(file).split("\n")
links.each do |url|
  name = File.basename(URI.parse(url).path, ".pdf")
  next if parsed_names.include?(name)
  parsed_names << name
  names = name.split("_")
  language = names.pop
...
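The name handling in the loop above can be tried in isolation. This is a minimal sketch, not code from the commit: the helper `name_and_language` and the sample URL are hypothetical, and it assumes, as the diff suggests, that each line of a `.links` file is a URL to a PDF whose basename ends in `_<language>`.

```ruby
require "uri"

# Hypothetical helper (not part of the commit): derive the underscore-separated
# name parts and the trailing language code from a PDF URL, using the same
# calls as the loop body in parser.rb.
def name_and_language(url)
  name = File.basename(URI.parse(url).path, ".pdf")
  parts = name.split("_")
  language = parts.pop
  [parts, language]
end

# Made-up sample standing in for the contents of a .links file.
sample_links = "http://example.org/docs/some_organisation_en.pdf\n" \
               "http://example.org/docs/other_authority_fr.pdf"

# Same splitting as the commit: one URL per line.
results = sample_links.split("\n").map { |url| name_and_language(url) }

results.each do |parts, language|
  puts "#{parts.join(' ')} (#{language})"
end
```

Requiring `uri` explicitly keeps the sketch independent of whichever other library happens to load it.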