
art-talk weather forecast rides again!

Posted by rick Mon, 27 Feb 2006 00:17:00 GMT

To keep generating the automated “7 day Art weather forecast” emails from the new AirSet art-talk calendar, I had to update my Ruby script. Here’s the new version (you’ll note that the bulk of the code deals with cleaning up bastardized input):


#!/usr/bin/ruby

require 'open-uri'
require 'rexml/document'
require 'rexml/xpath'
require 'time'

URL = "http://www.airset.com/syndicate/public/1391/week.xml" 

def cleanup(text, keep_stars = false)
  result = text.
    gsub(/&amp;/, '&').
    gsub(/&[lr]?quot;/, '"').
    gsub(/&apos;/, "'").
    gsub(/&#39;/, "'").
    gsub(/&gt;/, '>').
    gsub(/&lt;/, '<').
    gsub(/&nbsp;/, ' ').
    gsub(%r{</?[^>]+>}, '').
    gsub(/\*\s*\*/, '**').
    gsub(/\342\200\235/, '"').
    gsub(/\342\200\234/, '"').
    gsub(/\342\200\231/, "'").
    gsub(/\342\200\223/, " -- "). 
    gsub(/\342\200\224/, " -- "). 
    gsub(/\303\242/, "a").
    gsub(/\303\251/, "e").
    gsub(%r{/+\s*$}, '')

  result.gsub!(/\s*\*\s*/, '') unless keep_stars
  result
end

def wordwrap(text, line_width = 70)
  text.gsub( /\n/, "\n\n" ).gsub( /(.{1,#{line_width}})(\s+|$)/, "\\1\n")
end

def output(title, time, location, link, description)
  title       = cleanup(title)
  time        = cleanup(time)
  location    = cleanup(location)
  link        = cleanup(link)
  description = wordwrap(cleanup(description, true))

  puts "#{time.chomp} - #{title}" 
  puts "    online:  <#{link.chomp}>" 
  puts "  location:  #{location}\n\n" 
  description.split("\n").each {|l| puts "   #{l}"}
  puts "" 
end

def fetch_document(link)
  # URI.open (from open-uri) is required on Ruby 3.0+, where Kernel#open no longer fetches URLs
  URI.open(link) { |f| f.read.split("\n").join(' * ') }
end

def extract_location(doc)
  doc =~ %r{<span\s+class="evDescAndLoc">([^<]+)</span>}
  place  = $1 || ''
  place = place.sub(/^.* at /, '').gsub(/\s+\*\s+/, '')
  doc =~ %r{<span\s+class="evAddress">(.*?)</span>}
  address = $1 || ''
  address = address.gsub(/Get map/, '').gsub(/\s+\*\s+/, '')
  place += " / #{address}" unless address =~ /^\s*$/
  place
end

def extract_description(doc)
  doc =~ %r{<span\s+class="evNote">(.*?)</span>}
  return ($1 || '')
end

def retrieve_data(item)
  # Extract title and time
  if item.elements['title'].text =~ /\(([^)]+)\)\s*$/
    time = $1
  else
    t = Time.parse item.elements['pubDate'].text
    hour = t.hour % 12
    hour = 12 if 0 == hour
    time = "%02d/%02d/%4d (%d:%02d%sM)" % 
      [t.month, t.day, t.year, hour, t.min, t.hour > 11 ? 'P':'A']
  end

  title = cleanup(item.elements['title'].text.sub(/\([^)]+\)\s*$/, ''))
  link = item.elements['link'].text
  doc = fetch_document(link)
  location = extract_location(doc)
  description = extract_description(doc)

  [title, time, location, link, description]  
end

puts "7 Day Art Weather Forecast"
puts
puts "  ... see the Art-Talk Calendar for more events:"
puts
puts "  online at: <http://www.airset.com/Public/Calendars.jsp?id=1391>"
puts
puts " -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --"
puts

# URI.open (from open-uri) is required on Ruby 3.0+, where Kernel#open no longer fetches URLs
URI.open(URL) do |f|
  xml = REXML::Document.new(f.read)
  REXML::XPath.each(xml, '//item') do |item|
    title, time, location, link, description = retrieve_data(item)
    output(title, time, location, link, description)
  end
end
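
A side note on the fallback branch in retrieve_data, which builds the 12-hour timestamp by hand: the same string can probably be produced with Time#strftime. The sketch below is only an illustration; format_event_time is a hypothetical helper name that does not appear in the script above.

```ruby
require 'time'

# Hypothetical helper (not part of the script above): formats a Time the
# same way as the hand-rolled fallback in retrieve_data.
def format_event_time(t)
  # %m/%d/%Y -> zero-padded month and day, four-digit year
  # %-I      -> hour on the 12-hour clock, without zero padding
  # %M, %p   -> zero-padded minutes, then AM or PM
  t.strftime("%m/%d/%Y (%-I:%M%p)")
end

puts format_event_time(Time.utc(2006, 2, 27, 0, 17))  # 02/27/2006 (12:17AM)
puts format_event_time(Time.utc(2006, 2, 27, 15, 5))  # 02/27/2006 (3:05PM)
```

Swapping this in would replace the hour arithmetic and the format string with a single call, at the cost of having to remember what the strftime flags mean.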



banned vocabulary
"+1" (and "-1")
"existential"
"onboarding"
"ferret"
"finesse"
"yeah."
<blank> <units> thin
<blank> warriors
<blank> years young
<blank>'s team
<x> of the moment
(it|)'s all good
(noo|new)b(|ie|y)
I.T.
ROI
P.D.I.
[web] portal
accountab(le|ility)
actuate
advocate (v. and n.)
anyhoo
assessment
belief system
best practice(s)
blog(|ger|ging)
business rules
cautiously optimistic
celebrate
closure
construct (n.)
creative(s) (n.)
dialogue
divers(e|ity)
diversity
document (n.)
emerg(ing|ent)
emoticon(s)
enabler
eponymous
everyday heroes
extreme <blank>
facilitate
faith-based
foment
gestalt
git 'r done
gradation(|s) [sic]
guiding vision
hoi polloi
human drama
ill-fated
incentivise
jejune
kerfuffle
killer app
kudos
leverage
marginalize(d)
matriculate
merch
monetize
mouth-feel
multitask(|ing|er) (n.t.)
n(oo|ew)b(ie|)
network(|ing) (n.t.)
nexus
outsider art
podcast(ing)
proggy
protocol
quantum leap
reflect (v.)
repurpose
revamp
river system
schadenfreude
sea change
shopping (etc.) culture
shout-out
some <blank>-action
sophomore effort
strategic repositioning
synergy
team members / partners
the <blank> arena
the <x> Street
tix
under 30-set
value system
vertical
where('|i)s the outrage?
win-win
winders
Weltanschauung