Previously, I wrote a small shell script to "retokenize" a file (useful for comparing and sanity checks). I now need to do something similar for a folder instead of one file.
I'm curious if there is an easy way to rework the following method/function, or a way to recursively pass the files in a folder to the method, with the end result being that all files in the folder are "retokenized". I've been doing some googling and playing around, but I want to see if anyone here has a quick/easy/clean solution.
Working version for one file:
#!/bin/bash
date

outputdump="output.txt"
prodpropsfile="input.properties"
prodpropssortedfile="sorted.properties"
temppropsfile="temp.properties"

echo "Removing comments and empty lines from prod properties file"
sed '/^#/d' < "$prodpropsfile" > "$temppropsfile"
sed '/^[[:space:]]*$/d' < "$temppropsfile" > "$prodpropssortedfile"
cp "$prodpropssortedfile" "$temppropsfile"

echo "Sorting prod properties by value length. Don't want double tokenization"
awk -F"=" '{ st = index($0,"="); print length(substr($0,st+1)),$0 }' "$temppropsfile" | sort -rn | cut -d" " -f2- > "$prodpropssortedfile"

echo "Retokenizing."
while IFS== read -r k v; do
    # sed escape /, \, and &. Needed for URLs, JDBC connections, etc.
    escapedv=$(echo "$v" | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')
    # /gi to replace tokens globally and case insensitively, important in case of "HTTP://..." versus "http://...".
    sed -i -- "s/$escapedv/$k/gi" "$outputdump"
done < "$prodpropssortedfile"
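To illustrate why the script sorts by value length first, here is a minimal sketch of just the awk/sort/cut step. The two properties (`%%host%%` and `%%url%%`) are made up for illustration; the point is that the longer value contains the shorter one, so it has to be replaced first.

```shell
#!/bin/bash
# Hypothetical properties: the second value contains the first.
props=$(printf '%s\n' '%%host%%=example.com' '%%url%%=http://example.com/app')

# Prefix each line with the length of its value, sort numerically
# in descending order, then strip the length prefix again.
sorted=$(printf '%s\n' "$props" \
    | awk '{ st = index($0, "="); print length(substr($0, st + 1)), $0 }' \
    | sort -rn \
    | cut -d" " -f2-)

echo "$sorted"
```

With this ordering the `%%url%%` line comes first, so the full URL is tokenized before the bare host name gets a chance to match inside it.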
Example property file:
%%token1%%=value1
%%token2%%=value2
Example input file:
This file has value1 and value2.
Example output file:
This file has %%token1%% and %%token2%%.
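The example above can be reproduced end to end with a small sketch. It assumes a scratch directory created with `mktemp -d` (not part of the original script) and writes through a temporary file instead of using `sed -i`, since the `-i` syntax differs between GNU and BSD sed.

```shell
#!/bin/bash
# Hypothetical demo: recreate the example files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '%%token1%%=value1' '%%token2%%=value2' > "$workdir/sorted.properties"
printf '%s\n' 'This file has value1 and value2.' > "$workdir/output.txt"

# Replace each value with its token, one property at a time.
while IFS== read -r k v; do
    # Same escaping as the original script: \, / and &.
    escapedv=$(printf '%s\n' "$v" | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')
    # Write to a temporary file and move it back, which avoids the
    # differing "sed -i" syntax between GNU and BSD sed.
    sed "s/$escapedv/$k/g" "$workdir/output.txt" > "$workdir/output.tmp"
    mv "$workdir/output.tmp" "$workdir/output.txt"
done < "$workdir/sorted.properties"

result=$(cat "$workdir/output.txt")
echo "$result"
rm -rf "$workdir"
```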
Updated script that works for all files in a folder on a Mac:
#!/bin/bash
date

retokenize() {
    echo "Retokenizing $file"
    while IFS== read -r k v; do
        # sed escape /, \, and &. Needed for URLs, JDBC connections, etc.
        escapedv=$(echo "$v" | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')
        sed -i '' "s/$escapedv/$k/g" "$file"
    done < "$prodpropssortedfile"
}

# Copy our input to an output file to modify, so we don't affect the original.
inputdump="iiqexports"
prodpropsfile="input.properties"
prodpropssortedfile="sorted.properties"
temppropsfile="temp.properties"

echo "Removing comments and empty lines from prod properties file"
sed '/^#/d' < "$prodpropsfile" > "$temppropsfile"
sed '/^[[:space:]]*$/d' < "$temppropsfile" > "$prodpropssortedfile"
cp "$prodpropssortedfile" "$temppropsfile"

echo "Sorting prod properties by length."
awk -F"=" '{ st = index($0,"="); print length(substr($0,st+1)),$0 }' "$temppropsfile" | sort -rn | cut -d" " -f2- > "$prodpropssortedfile"

echo "Retokenizing."
find "./$inputdump/" -type f > foo.txt
# Split the file list on newlines only, so names with spaces stay intact.
IFS=$'\n'; for file in $(cat foo.txt); do retokenize "$file"; done

echo "Done."
date
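One possible variation, sketched under my own assumptions rather than taken from the script above: feed the function from `find -print0` and read NUL-delimited names, which avoids the intermediate `foo.txt` and also copes with spaces or newlines in paths. The scratch tree and file names here are hypothetical, and the sketch requires bash (for `read -d ''` and process substitution).

```shell
#!/bin/bash
# Hypothetical scratch tree with spaces in one path.
workdir=$(mktemp -d)
mkdir -p "$workdir/iiqexports/sub dir"
printf '%s\n' 'uses value1' > "$workdir/iiqexports/a.xml"
printf '%s\n' 'uses value2' > "$workdir/iiqexports/sub dir/b c.xml"
printf '%s\n' '%%token1%%=value1' '%%token2%%=value2' > "$workdir/sorted.properties"

retokenize() {
    local file=$1
    local tmp
    # The temporary file lives outside the tree that find is walking.
    tmp=$(mktemp)
    while IFS== read -r k v; do
        escapedv=$(printf '%s\n' "$v" | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')
        # Rewrite the file contents in place without relying on "sed -i".
        sed "s/$escapedv/$k/g" "$file" > "$tmp" && cat "$tmp" > "$file"
    done < "$workdir/sorted.properties"
    rm -f "$tmp"
}

# NUL-delimited names survive spaces and newlines in paths.
while IFS= read -r -d '' file; do
    retokenize "$file"
done < <(find "$workdir/iiqexports" -type f -print0)

first=$(cat "$workdir/iiqexports/a.xml")
second=$(cat "$workdir/iiqexports/sub dir/b c.xml")
echo "$first"
echo "$second"
rm -rf "$workdir"
```

Passing the file name as an argument (`$1`) instead of relying on the global `$file` keeps the function usable from other loops as well.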