syntax error, unexpected tLABEL when running a custom rake task

narzero · Jul 1, 2014 · Viewed 7.2k times

I'm working on a scraper that collects real estate data from a certain website. It works as a standalone .rb script that saves the results to a JSON file, but I want it to run on Heroku and save the data to MongoDB.

Problem:

I keep getting the following errors when running the rake task:

rake aborted!
SyntaxError: /Users/user/Dropbox/Development/Rails/booyah/lib/tasks/properties_for_sale.rake:35: syntax error, unexpected tLABEL
            street_name: @street_name,
                        ^
/Users/user/Dropbox/Development/Rails/booyah/lib/tasks/properties_for_sale.rake:41: syntax error, unexpected tLABEL, expecting '='
            bedrooms: @rooms[1],
                     ^
/Users/user/Dropbox/Development/Rails/booyah/lib/tasks/properties_for_sale.rake:42: syntax error, unexpected tLABEL, expecting '='
            number_of_floors: @number_of_floors,
                             ^
/Users/user/.rvm/gems/ruby-2.1.1/gems/railties-4.1.1/lib/rails/engine.rb:654:in `load'
/Users/user/.rvm/gems/ruby-2.1.1/gems/railties-4.1.1/lib/rails/engine.rb:654:in `block in run_tasks_blocks'
/Users/user/.rvm/gems/ruby-2.1.1/gems/railties-4.1.1/lib/rails/engine.rb:654:in `each'
/Users/user/.rvm/gems/ruby-2.1.1/gems/railties-4.1.1/lib/rails/engine.rb:654:in `run_tasks_blocks'
/Users/user/.rvm/gems/ruby-2.1.1/gems/railties-4.1.1/lib/rails/application.rb:362:in `run_tasks_blocks'
/Users/user/.rvm/gems/ruby-2.1.1/gems/railties-4.1.1/lib/rails/engine.rb:449:in `load_tasks'
/Users/user/Dropbox/Development/Rails/booyah/Rakefile:6:in `<top (required)>'
/Users/user/.rvm/gems/ruby-2.1.1/bin/ruby_executable_hooks:15:in `eval'
/Users/user/.rvm/gems/ruby-2.1.1/bin/ruby_executable_hooks:15:in `<main>'

This is the code I'm using:

require 'mechanize'

namespace :properties_for_sale do
  desc "Scrape all properties currently for sale"
  task :start => :environment do

    a = Mechanize.new
    @a2 = Mechanize.new
    @i = 1
    BASE_URL = 'http://www.funda.nl'

    def scrape_objects_on_page(page)
      objects_on_page = page.search('//*[contains(concat( " ", @class, " " ), concat( " ", "object-street", " " ))]')

      objects_on_page.each do |object|

        @a2.get(BASE_URL + object[:href] + 'kenmerken/') do |page_2|
          break if page_2.title == '404 - Pagina niet gevonden'

          @street_name = page_2.search('//*[@id="main"]/div[1]/div/div/div/h1').text.strip
          @price = page_2.search('//*[@id="main"]/div[1]/div/div/div/p[2]/span/span').text.strip.gsub("€ ", "").gsub(".", "").to_i
          @url = page_2.uri.to_s
          @living_area = page_2.search('//*[@id="twwo13"]/td/span[1]').text.strip.gsub(" m²", "").to_i
          @content = page_2.search('//*[@id="twih12"]/td/span[1]').text.strip.gsub(" m³", "").to_i
          @rooms = page_2.search('//*[@id="aaka12"]/td/span[1]').text.strip.scan(/\d/).to_i
          @number_of_floors = page_2.search('//*[@id="twva12"]/td/span[1]').text.strip.to_i
          @year = page_2.search('//*[@id="boja12"]/td/span[1]').text.strip.to_i
          @broker = page_2.search('//*[contains(concat( " ", @class, " " ), concat( " ", "rel-info", " " ))]//h3').text.strip
          @city = page_2.search('//*[@id="nav-path"]/div/p[1]/span[4]/a/span').text.strip
          @district = page_2.search('//*[@id="nav-path"]/div/p[1]/span[5]/a/span').text.strip
          @province = page_2.search('//*[@id="nav-path"]/div/p[1]/span[3]/a/span').text.strip
          @type_of_property = page_2.search('//*[@id="soap12"]/td/span[1] | //*[@id="sowo12"]/td/span[1] | //*[@id="twsp12"]/td/span[1]').text.strip

          Property.create = (
            street_name: @street_name,
            price: @price,
            url: @url,
            living_area: @living_area,
            content: @content,
            rooms: @rooms[0],
            bedrooms: @rooms[1],
            number_of_floors: @number_of_floors,
            year: @year,
            broker: @broker,
            city: @city,
            district: @district,
            province: @province,
            type_of_property: @type_of_property
          )

          puts Property.last
        end
      end
    end

    loop do
      a.get("http://www.funda.nl/koop/rotterdam/sorteer-datum-af/p#{@i}/") do |page|
        @end = page.search('//h3').text == 'Geen koopwoningen gevonden die voldoen aan uw zoekopdracht' ? true : false
        scrape_objects_on_page(page) unless @end == true
        @i = @i + 1
      end

      break if @end
    end

    puts "==================================================================================="
    puts "# Done scraping #{@i - 1} pages and collected #{@all_objects_array.length} objects."
    puts "==================================================================================="



  end
end

This is what my Property model looks like (MongoMapper):

class Property
  include MongoMapper::Document

  key :street_name, String
  key :price, Integer
  key :url, String
  key :living_area, Integer
  key :content, Integer
  key :rooms, Integer
  key :bedrooms, Integer
  key :number_of_floors, Integer
  key :year, Integer
  key :broker, String
  key :city, String
  key :district, String
  key :province, String
  key :type_of_property, String

end

What am I doing wrong?

Answer

Sean Hill · Jul 1, 2014

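You have a typo: remove the equals sign between Property.create and the opening parenthesis. With the equals sign, Ruby parses the line as an assignment to Property.create and treats the parentheses as a grouped expression, so the labels (street_name: and so on) are unexpected there. Without it, they are parsed as hash arguments to the method call: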

Property.create(
  street_name: @street_name,
  price: @price,
  url: @url,
  living_area: @living_area,
  content: @content,
  rooms: @rooms[0],
  bedrooms: @rooms[1],
  number_of_floors: @number_of_floors,
  year: @year,
  broker: @broker,
  city: @city,
  district: @district,
  province: @province,
  type_of_property: @type_of_property
)
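
If you want to confirm the diagnosis without running the whole rake task, one option is to ask Ruby's parser about the two spellings directly. The sketch below uses Ripper from the standard library, whose Ripper.sexp returns nil when the source does not parse. The shortened attribute list, the example values, and the variable names are illustrative assumptions, not part of the original task.

```ruby
require 'ripper'

# Shortened versions of the two spellings; the attribute values are made up.
broken = <<~'RUBY'
  Property.create = (
    street_name: "Example Street 1",
    price: 250_000
  )
RUBY

fixed = <<~'RUBY'
  Property.create(
    street_name: "Example Street 1",
    price: 250_000
  )
RUBY

# Ripper.sexp only parses the source (it never runs it) and returns nil
# when the parser hits a syntax error.
broken_parses = !Ripper.sexp(broken).nil?
fixed_parses  = !Ripper.sexp(fixed).nil?

puts "with '=':    parses? #{broken_parses}"
puts "without '=': parses? #{fixed_parses}"
```

The version with the equals sign should be reported as not parsing, and the version without it as parsing.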

Also, it might be better to store the return value of the #create call in a variable and print that, instead of calling Property.last. That way, you don't have to issue another query.
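
A minimal sketch of that suggestion follows. To keep it runnable without MongoDB, it uses a hypothetical stand-in Property class that only mimics create and last and counts lookups; the query_count helper and the example values are assumptions for illustration. With the real MongoMapper model you would keep only the last two lines.

```ruby
# Hypothetical stand-in for the MongoMapper model, so the sketch runs without
# a database. It stores records in memory and counts calls to .last.
class Property
  attr_reader :attributes

  @records = []
  @query_count = 0

  class << self
    attr_reader :query_count

    # Mimics Model.create: builds the record, stores it, and returns it.
    def create(attributes)
      record = new(attributes)
      @records << record
      record
    end

    # Mimics Model.last: a separate lookup, counted here as one extra query.
    def last
      @query_count += 1
      @records.last
    end
  end

  def initialize(attributes)
    @attributes = attributes
  end
end

# Keep the return value of create instead of asking for Property.last.
property = Property.create(street_name: "Example Street 1", price: 250_000)
puts property.attributes.inspect
```

Because the record is already in the property variable, nothing has to be fetched again just to print it.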