I'm using Heroku to host my Ruby on Rails application and for one reason or another, I may have some duplicate rows.
Is there a way to delete duplicate records based on 2 or more criteria but keep just 1 record of that duplicate collection?
In my use case, I have a Make and Model relationship for cars in my database.
Make    Model
---     ---
Name    Name
        Year
        Trim
        MakeId
I'd like to delete all Model records that share the same Name, Year and Trim, but keep exactly one record from each set of duplicates (meaning, I need the record but only once). I'm using the Heroku console, so I can run Active Record queries easily.
Any suggestions?
class Model
  def self.dedupe
    # find all models and group them on keys which should be common
    grouped = all.group_by { |model| [model.name, model.year, model.trim, model.make_id] }
    grouped.values.each do |duplicates|
      # the first one we want to keep, right?
      first_one = duplicates.shift # or pop for the last one
      # if there are any more left, they are duplicates,
      # so delete all of them
      duplicates.each { |double| double.destroy } # duplicates can now be destroyed
    end
  end
end

Model.dedupe
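
If you'd rather see what would be removed before destroying anything, here is a minimal dry-run sketch of the same grouping idea. The helper name `duplicate_ids` and the `Car` struct are made up for illustration (they are not part of Rails), and the sketch assumes you want to keep the record with the lowest id in each group.

```ruby
# Hypothetical helper (not part of Rails): returns the ids of every record
# that repeats the attribute values of a record with a lower id.
def duplicate_ids(records, *attrs)
  records
    .sort_by(&:id)                                        # lowest id comes first in each group
    .group_by { |r| attrs.map { |a| r.public_send(a) } }  # group on the shared attributes
    .values
    .flat_map { |group| group.drop(1).map(&:id) }         # everything after the first is a duplicate
end

# Stand-in for Model rows so the sketch runs without a database.
Car = Struct.new(:id, :name, :year, :trim, :make_id)

cars = [
  Car.new(1, "Civic", 2010, "LX", 1),
  Car.new(2, "Civic", 2010, "LX", 1),
  Car.new(3, "Civic", 2010, "EX", 1),
  Car.new(4, "Civic", 2010, "LX", 1)
]

ids = duplicate_ids(cars, :name, :year, :trim, :make_id)
```

In the console the same helper should work on `Model.all`, since it only relies on `Enumerable` methods and attribute readers. Once the list of ids looks right, something like `Model.where(id: ids).destroy_all` would remove them.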