interesting. thanks for sharing that experience.
i have not personally looked at posts:remap yet. i actually didn’t know it existed, but for something as finicky as bulk text modification, i’d want to roll my own anyway. and since you say the behavior is goofy, that goes double.
i’ve got some experience with rails now and have written my own rake tasks to do similar things.
i’ve learned it’s almost always a bad idea to deal with the db directly in a rails app. you should, at a minimum, be using methods like .update() on active record objects, because that runs validations and callbacks, which is often what keeps related records in sync.
i think this specific example is not such a hard one to tackle. food for thought…
if i wanted to bulk-replace /rules with /tos in all posts:
i’d start like this using a postgres regex for efficiency (just for initial identification of posts)
# your_regex is a trivial regex to find posts with links to the rules page
# i would use the cooked column here because it mostly guarantees finding real links
# because links in the cooked version are html tags, so you can match on <a href=...
rules_posts = Post.where('cooked ~ ?', your_regex)
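for what it’s worth, here’s a rough idea of what your_regex could look like. this is only a sketch: the pattern, the constant name, and the sample html are my own assumptions, and it only handles relative links (if your cooked html has absolute urls you’d need to allow for the host). i’d test it in plain ruby first. a simple pattern like this should behave the same under postgres `~`, but check it against your own data.

```ruby
# hypothetical pattern: an anchor tag whose href is exactly /rules,
# optionally followed by a query string or fragment
RULES_LINK_PATTERN = '<a [^>]*href="/rules([?#][^"]*)?"'

# made-up cooked html => whether we expect it to match
samples = {
  '<p>see <a href="/rules">the rules</a></p>' => true,
  '<p>see <a class="x" href="/rules#conduct">here</a></p>' => true,
  '<p>read /rules please</p>' => false,
  '<p><a href="/rules-of-thumb">other</a></p>' => false
}

rules_regex = Regexp.new(RULES_LINK_PATTERN)
results = samples.map { |html, _| [html, rules_regex.match?(html)] }.to_h
results.each { |html, hit| puts "#{hit ? 'match' : 'skip '}  #{html}" }
```

once the pattern does what you expect, the same string can be passed as your_regex in the Post.where call.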
before modifying, i’d hold onto the post ids (maybe even write to a file for safekeeping) so you can figure out rebaking later in another step.
# get an array of all the post ids
matching_post_ids = rules_posts.pluck(:id)
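writing the ids to a file could look something like this. it’s a minimal sketch: the helper names, the stand-in ids, and the file path are all hypothetical, so pick whatever suits your setup.

```ruby
require 'tmpdir'

# write one id per line so the file is easy to eyeball or diff
def save_post_ids(ids, path)
  File.write(path, ids.join("\n") + "\n")
end

# read the ids back as integers, skipping blank lines
def load_post_ids(path)
  File.readlines(path, chomp: true).reject(&:empty?).map { |line| Integer(line) }
end

# stand-in for matching_post_ids; the path is a hypothetical choice
example_ids = [101, 57, 2048]
ids_path = File.join(Dir.tmpdir, 'rules_post_ids.txt')

save_post_ids(example_ids, ids_path)
restored_ids = load_post_ids(ids_path)
puts "saved #{restored_ids.length} ids to #{ids_path}"
```

that way the rebake step can run in a separate session (or a day later) by loading the file instead of re-running the query.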
after that, i would iterate over the matching posts and use regex tools in ruby itself (not postgres) to replace the urls in the raw text and update each post like:
rules_posts.each do |p|
  # fix the urls
  old_post_text = p.raw
  # do your string replacements on old_post_text
  # new_post_text = ...some magic
  # update the post (new_post_text has to be assigned above or this line will raise)
  p.update(raw: new_post_text)
end
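as for the string replacement itself, here’s one way it might go. treat it as a sketch under my own assumptions: the helper name and the pattern are hypothetical, and it only rewrites /rules when it sits at the start of a path (after whitespace, a bracket or quote, or a scheme plus host), so /wiki/rules and /rules-of-thumb are left alone. your posts may well need a different rule.

```ruby
# hypothetical pattern: /rules at the start of a path, not followed by
# more word characters or a dash
RULES_PATH = %r{(?<prefix>\A|[\s("'<]|https?://[^/\s]+)/rules(?![\w-])}

# hypothetical helper: returns a copy of raw with /rules swapped for /tos
def replace_rules_links(raw)
  raw.gsub(RULES_PATH) { "#{Regexp.last_match(:prefix)}/tos" }
end

[
  'see [the rules](/rules) first',
  'https://example.com/rules#conduct',
  '/rules-of-thumb and /wiki/rules'
].each do |text|
  puts "#{text}  =>  #{replace_rules_links(text)}"
end
```

inside the loop you’d then set new_post_text = replace_rules_links(old_post_text), and it’s probably worth skipping the update when the text comes back unchanged.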
when all’s good, i would then rebake just those posts at a leisurely pace. no need to scale up your hardware.
# those ids you saved from before :)
matching_post_ids.sort.reverse.each do |id|
  puts "rebaking post #{id}"
  Post.find(id).rebake!
  sleep(1)
end
that will rebake them in descending id order (roughly newest posts first), which i think makes sense. it will also wait a leisurely 1s between posts, which is probably unnecessary. look at matching_post_ids.length and experiment with the delay. maybe just try the first 100 posts to start.
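to make the delay and the “first 100” idea easy to play with, something like this could work. the helper name and its delay/limit parameters are hypothetical, and the demo below uses stand-in ids with no rails involved.

```ruby
# hypothetical helper: walk ids highest-first, optionally only the first
# `limit` of them, pausing `delay` seconds between posts
def each_throttled(ids, delay: 1, limit: nil)
  batch = ids.sort.reverse
  batch = batch.first(limit) if limit
  batch.each_with_index do |id, index|
    yield id
    # no need to sleep after the last one
    sleep(delay) if delay.positive? && index < batch.length - 1
  end
  batch.length
end

# dry run with stand-in ids and no delay; in the real task the block
# would call Post.find(id).rebake! instead
visited = []
count = each_throttled([3, 10, 7, 1], delay: 0, limit: 2) { |id| visited << id }
puts "visited #{visited.inspect} (#{count} posts)"
```

for the real thing you might start with limit: 100 and delay: 1, then loosen both once you see how the server copes.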
btw, you can put your script in a rake task like this:
rules_link_replace.rb
desc 'replace /rules links with /tos'
task :rules_replace => :environment do
  require 'pry'
  #...
  # btw, you can drop into an interactive ruby env at any point in your code
  # just put this line wherever you want:
  binding.pry # debugging happens here
end
then run the script with a helper like this:
run_rules.sh
#!/usr/bin/env bash
container=app
# rails only auto-loads *.rake files from lib/tasks, so rename on copy
docker cp ./rules_link_replace.rb "$container:/var/www/discourse/lib/tasks/rules_link_replace.rake"
docker exec -it "$container" rake rules_replace