InfoHeap
Tech tutorials, tips, tools and more

bash – extract urls from xml sitemap

By admin on Jan 10, 2016

Command to get URLs from a sitemap URL (replace sitemap_url with the address of the sitemap):

curl -s sitemap_url | grep "<loc>" | awk -F"<loc>" '{print $2}' | awk -F"</loc>" '{print $1}'
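
The grep/awk pipeline prints only the first <loc> value on each input line, so it works when the sitemap puts every <loc> element on its own line. For a sitemap that is minified onto a single line, a variant along these lines may help. It is a minimal sketch, not a full XML parser: extract_locs is a hypothetical helper name, it assumes a grep that supports -o (GNU and BSD grep do), and it assumes the <loc> values contain no CDATA wrappers or nested tags.

```shell
# Hypothetical helper: print the text of every <loc> element read from stdin,
# one per line, even when several <loc> elements share a line.
extract_locs() {
  # grep -o emits each match on its own line; sed then strips the tags.
  grep -o '<loc>[^<]*</loc>' | sed -e 's|<loc>||' -e 's|</loc>||'
}

# Usage (sitemap_url is a placeholder for the real sitemap address):
# curl -s "$sitemap_url" | extract_locs
```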


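To get these into a bash variable for iteration, use command substitution as shown below. The unquoted $urls in the for loop is split on whitespace, which is safe here because URLs do not contain spaces:

urls=$(curl -s sitemap_url | grep "<loc>" | awk -F"<loc>" '{print $2}' | awk -F"</loc>" '{print $1}')
for i in $urls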
do
  echo "$i"
done
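
If the URLs are needed more than once, or word splitting on an unquoted variable feels fragile, they can be collected into a bash array instead. The sketch below assumes bash (arrays are not available in plain sh); read_urls and the array name urls are illustrative choices, not something the pipeline above requires.

```shell
# Hypothetical helper: read one URL per line from stdin into the global
# array "urls", skipping empty lines.
read_urls() {
  urls=()
  local line
  # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -n "$line" ]; then
      urls+=("$line")
    fi
  done
  return 0
}

# Usage (sitemap_url is a placeholder):
# read_urls < <(curl -s sitemap_url | grep "<loc>" | awk -F"<loc>" '{print $2}' | awk -F"</loc>" '{print $1}')
# for url in "${urls[@]}"; do echo "$url"; done
```

Feeding the function through process substitution, rather than a pipe, keeps the loop in the current shell so the array is still populated after the function returns.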

Use case

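This approach can be used to write a quick cache warming script for a site: request every URL listed in the sitemap once so that the server-side cache is populated before real visitors arrive.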
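
A cache warming loop could look roughly like the following. This is a sketch under assumptions: warm_urls is a hypothetical function name, it expects one URL per line on stdin, and it only reports the HTTP status code that curl sees for each request. Whether a single GET actually populates the cache depends on the site's caching setup.

```shell
# Hypothetical helper: request every URL read from stdin once and print
# "<http status> <url>" for each, discarding the response body.
warm_urls() {
  while IFS= read -r url; do
    # Skip blank lines.
    [ -n "$url" ] || continue
    # -s: no progress output, -o /dev/null: drop the body,
    # -w '%{http_code}': print only the response status code.
    status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    echo "$status $url"
  done
}

# Usage (sitemap_url is a placeholder):
# curl -s sitemap_url | grep "<loc>" | awk -F"<loc>" '{print $2}' | awk -F"</loc>" '{print $1}' | warm_urls
```

Printing the status next to each URL makes it easy to spot pages that return errors while the cache is being warmed.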


Copyright © 2023 InfoHeap.
