Once in a while I need to write a shell script that extracts some information from a file and does something with it. This is easy to do with the standard UNIX tools, such as cat, grep, head, and sed, once you've learned their features. Today's example: given an index.html file, we need to extract the CSS files it imports (except the commented-out ones and, say, mobile.css).
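A minimal sketch of what such a multi-command pipeline might look like. It assumes one link tag per line and comments that open on the same line as the tag; the sample file contents and the css_files variable are made up for illustration, and grep -o is a widely supported extension (GNU and BSD grep) rather than strict POSIX:

```shell
#!/bin/sh
# Hypothetical sample page, written to a temporary file for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html>
<head>
  <link rel="stylesheet" href="css/main.css">
  <!-- <link rel="stylesheet" href="css/old.css"> -->
  <link rel="stylesheet" href="css/mobile.css">
  <link rel="stylesheet" href="css/theme.css">
  <script src="js/app.js"></script>
</head>
</html>
EOF

# 1. keep only the lines that mention .css
# 2. drop commented-out lines and mobile.css
# 3. keep only the href="..." part of each line
# 4. cut out the value between the double quotes
css_files=$(
  grep '\.css' "$sample" |
    grep -v -e '<!--' -e 'mobile\.css' |
    grep -o 'href="[^"]*"' |
    cut -d'"' -f2
)

printf '%s\n' "$css_files"
rm -f "$sample"
```

With the sample file above, this prints css/main.css and css/theme.css, one per line.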
A pipeline of several commands works, but it is kind of long: keep only the lines containing .css, drop the ones matching the excluded patterns, remove everything except href="…", and finally cut out the value. It turns out this can be rewritten as a single command, using sed:
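One possible form of that sed command, as a sketch under the same assumptions: one link tag per line, and the excluded patterns being the comment opener and mobile.css. The sample file and the css_files variable are hypothetical:

```shell
#!/bin/sh
# Hypothetical sample page, written to a temporary file for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html>
<head>
  <link rel="stylesheet" href="css/main.css">
  <!-- <link rel="stylesheet" href="css/old.css"> -->
  <link rel="stylesheet" href="css/mobile.css">
  <link rel="stylesheet" href="css/theme.css">
  <script src="js/app.js"></script>
</head>
</html>
EOF

# /\.css/ { ... }            act only on lines containing .css
# /<!--|mobile\.css/ !       ...and skip lines matching an excluded pattern
# s/.*href="([^"]*)".*/\1/p  replace the line with the href value and print it
css_files=$(
  sed -nE '/\.css/{/<!--|mobile\.css/!s/.*href="([^"]*)".*/\1/p;}' "$sample"
)

printf '%s\n' "$css_files"
rm -f "$sample"
```

Because `.*` is greedy, a line holding several href attributes would yield only the last one, so this form suits markup with one link per line.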
Here the -n option suppresses sed's default printing of every input line, and -E enables extended regular expressions. The command reads: for each line containing .css, unless the line matches one of the excluded patterns, replace the whole line with the part inside href="…" and print it. Nice, easy enough, and it should be faster, since a single process replaces the whole pipeline. Enjoy!