Importing a Sitemap XML into Google Sheets

The Goal In the following short article, we want to import data from an existing sitemap XML file into a new Google Sheet document. The sheet must pull the sitemap via HTTP protocol and extract all the URLs from the sitemap and insert them. into the sheet. Implementation Now let’s implement it .. it only takes 1 minute …​ For demonstration purpose, I’m going to use the sitemap from my old blog, to be found at https://www.hascode.com/sitemap.xml. ...

January 5, 2023 · 2 min · 280 words · Micha Kops

Linux Snippets

These are not only linux snippets but also bash snippets and snippets using tools that run under Linux, *nix or sometimes even MacOSX, I should reorder this article someday ;) Settings for more reliable bash scripts set -euo pipefail this gives us …​ -e: exit script if a single command fails -u: exit script if an unset variable is used -o pipefail: return value of a pipeline is the status of the last command to exit with a non-zero status, or zero if no command exited with a non-zero status ...

March 1, 2010 · 15 min · 3006 words · Micha Kops

XML Snippets

Ignore Namespaces in XPath Query e.g. Query for all xxx nodes ignoring their namespace: xmllint --xpath '//*[local-name()="xxx"]' input.xml An example parsing URLs from a sitemap XML. The URLs are located in //url/loc where all nodes are bound to the namespace http://www.sitemaps.org/schemas/sitemap/0.9. The following query ignores the namespace xmllint --xpath '//*[local-name()="url"]/*[local-name()="loc"]' sitemap.xml Pretty Print XML in the Console using xmllint echo '<blogs><blog url="https://www.hascode.com/">hasCode.com</blog></blogs>' | xmllint --format - <?xml version="1.0"?> <blogs> <blog url="https://www.hascode.com/">hasCode.com</blog> </blogs> ...

March 1, 2010 · 1 min · 99 words · Micha Kops