Command to get URLs from a sitemap URL:
curl -s sitemap_url | grep "<loc>" | awk -F"<loc>" '{print $2}' | awk -F"</loc>" '{print $1}'
To get these in a bash variable for iteration, use command substitution as shown below:
urls=$(curl -s sitemap_url | grep "<loc>" | awk -F"<loc>" '{print $2}' | awk -F"</loc>" '{print $1}')
for i in $urls
do
echo "$i"
done

Use case
This approach can be used to write a quick cache warming script for a site: requesting every URL listed in the sitemap once populates the cache before real visitors arrive.
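
A minimal sketch of such a cache warming script might look like the following. The function names extract_locs and warm_cache are hypothetical, chosen only for illustration, and the sitemap URL is assumed to be passed as the first argument. Note that it swaps the grep/awk pipeline for grep -o plus sed, so that sitemaps with several <loc> tags on a single line (for example, minified ones) are still split into one URL per line. It does not handle sitemap index files or CDATA-wrapped values.

```shell
#!/usr/bin/env bash
# Sketch of a cache warming script (illustrative, not a hardened tool).
# Usage (assumed): ./warm_cache.sh https://example.com/sitemap.xml

# Hypothetical helper: read sitemap XML on stdin and print each <loc>
# value on its own line. grep -o emits every match separately, even
# when several <loc> tags share one line.
extract_locs() {
  grep -o '<loc>[^<]*</loc>' | sed -e 's/<loc>//' -e 's/<\/loc>//'
}

# Hypothetical helper: fetch the sitemap at $1, request each listed URL,
# discard the body, and print the HTTP status code next to the URL.
warm_cache() {
  curl -s "$1" | extract_locs | while IFS= read -r url; do
    status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    echo "$status $url"
  done
}

# Only hit the network when a sitemap URL is given on the command line.
if [ "$#" -gt 0 ]; then
  warm_cache "$1"
fi
```

Reading the URLs with a while read loop instead of for i in $urls avoids word splitting and glob expansion on the fetched values, and printing the status code makes it easy to spot pages that fail to load during warming.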