Day 4: data, variables, coercion, strings and maths

except for local vars in user-defined functions, all vars are GLOBAL
unlike other languages, AWK treats data as *BOTH* numbers & strings
  treatment is largely context driven, e.g. if maths => treat as numbers
in AWK 0 or "" = False, everything else = True ; unassigned vars = 0 or ""
  => makes assignment test easy:   if (Var) print "T"; else print "F"
  => same with ternary operator:   print (Var) ? "T" : "F"
  (a var explicitly assigned 0 or "" also tests False)

# ex. print value of 'V' before & after assignment:
#   $ awk 'BEGIN { Fmt = "V = %d (%c)\n"      # format str
#                  printf Fmt, V, (V) ? "T" : "F"
#                  V = 1
#                  printf Fmt, V, (V) ? "T" : "F"
#                }'
#   V = 0 (F)
#   V = 1 (T)
#
#   where,
#     "%d" in Fmt str coerces V's unset value of "" to 0 for visibility
#     (V)?"T":"F" ternary oper. returns "T" if V is true (non-zero, non-empty), "F" otherwise

by default $0 = current record, split into fields $1 - $NF using FS
all record fields are reassignable, e.g. {$1="" ; print $0}
  in above, use 'print substr($0, 2)' to avoid printing leading space
setting RS = "" can select multi-line data separated by blank lines

# ex. print records w/ "--" separator + total record count:
#   $ printf 'a\nb\n\nc\nd\n\ne\nf' | \
#       awk -vRS='' '//;NR>Prev{print "--";Prev=NR};END{print "record cnt =", NR}'
#   a
#   b
#   --
#   c
#   d
#   --
#   e
#   f
#   --
#   record cnt = 3

decrementing NF truncates $0 ; incrementing NF appends empty fields
data can be coerced as needed:  "42" + 0 => number ; 42 "" => string
as previously shown, AWK will try to use data per context of use:
# possibly unexpected behavior due to data coercion:
#
#   $ echo "000" |awk '!$0{print "F"}'              # "000" => 0
#   F
#   $ awk 'BEGIN{print "123" + "zero" + 4}'         # "zero" => 0
#   127
#   $ awk 'BEGIN{print "123" + "4evar"}'            # "4evar" => 4
#   127
#   $ awk 'BEGIN{print "12" 1 + 2 "45"}'            # "12" (1 + 2) "45" => str
#   12345

string concatenation: no special operator, just string them together
#   $ awk 'BEGIN{Str = "A" SUBSEP "wk" FS "is" OFS "odd"; print Str}'
#   Awk is odd
#
#   where,
#     Str is a concatenation of 7 strings
#     FS = OFS = " "
#     SUBSEP = non-printing char "\034" (used w/ arrays)

useful functions for strings:  length(), index(), match(), substr()

AWK has several built-in math functions: see reference materials
a partial sampling..
  int(x)    returns integer part of x, truncates toward zero
  rand()    returns uniformly distributed pseudorandom # r, 0 ≤ r < 1
  srand(x)  sets pseudorandom-number generator seed to x ; returns previous seed
  srand()   uses current time in secs as seed (relative to system epoch)
mawk excepted, AWK uses same default seed on each run if srand() not called

# ex. rand() with & without srand():
#
#   $ for R in {1..3} ;do
#       echo "run #$R:"
#       awk 'BEGIN { while(i++ < 4) { N=rand() ; print "N =", N} }'
#     done
#   run #1:
#   N = 0.924046
#   N = 0.593909
#   N = 0.306394
#   N = 0.578941
#   run #2:
#   N = 0.924046
#   N = 0.593909
#   N = 0.306394
#   N = 0.578941
#   run #3:
#   N = 0.924046
#   N = 0.593909
#   N = 0.306394
#   N = 0.578941
#
#   $ for R in {1..3} ;do
#       echo "run #$R:"
#       awk 'BEGIN { srand() ; while(i++ < 4) { N=rand() ; print "N =", N} }'
#     done
#   run #1:
#   N = 0.142423
#   N = 0.82679
#   N = 0.287402
#   N = 0.186038
#   run #2:
#   N = 0.142423
#   N = 0.82679
#   N = 0.287402
#   N = 0.186038
#   run #3:
#   N = 0.142423
#   N = 0.82679
#   N = 0.287402
#   N = 0.186038
#
#   (sequence differs from the unseeded runs ; the three seeded runs still
#    match each other because all started within the same second and so
#    got the same time-based seed)

=> be sure to call srand() for pseudorandom randomness..
int() can be used with rand() to get random integer ranges
# print random integer between 1 and N, inclusive:
#
#   $ awk -vN=9 'BEGIN { srand() ; print int(N * rand()) + 1 }'
#   7
#   $ awk -vN=9 'BEGIN { srand() ; print int(N * rand()) + 1 }'
#   3
#   $ awk -vN=9 'BEGIN { srand() ; print int(N * rand()) + 1 }'
#   6

# beware potentially unexpected behavior when running in shell..
# calling srand() yet not getting random numbers
# (every awk started within the same second gets the same seed):
#   $ for n in {1..5} ;do                           # bash shell
#       awk -vN=9 'BEGIN {srand() ; print int(N * rand()) + 1 }'
#     done
#   5
#   5
#   5
#   5
#   5
#
# same but with mawk - and NOT calling srand()..
#   $ for n in {1..5} ;do                           # bash shell
#       mawk -vN=9 'BEGIN { print int(N * rand()) + 1 }'
#     done
#   4
#   9
#   8
#   2
#   2
#
# looping within awk instead...
#   $ awk -vN=9 'BEGIN{srand();for(;i<5;i++)print int(N * rand()) + 1}'
#   7
#   4
#   2
#   1
#   8

a bit about operator precedence
# AWK operators in order of precedence (low to high):
#   (from Sed & Awk, 2nd ed.)
#
#   Operators              Description
#   ----------------------------------------------------------------
#   = += -= *= /= %= ^=    Assignment
#   ?:                     C conditional expression
#   ||                     Logical OR
#   &&                     Logical AND
#   ~ !~                   Match regular expression and negation
#   < <= > >= != ==        Relational operators
#   (blank)                Concatenation
#   + -                    Addition, subtraction
#   * / %                  Multiplication, division, and modulus
#   + - !                  Unary plus and minus, and logical negation
#   ^                      Exponentiation
#   ++ --                  Increment & decrement, prefix or postfix
#   $                      Field reference
#   ----------------------------------------------------------------

order of precedence can be modified using parens '()' as needed
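
ex. a minimal sketch of the precedence table in action. The expressions and the wrapper name prec_demo are illustrative assumptions, not taken from Sed & Awk; a POSIX-style awk (gawk, mawk, etc.) is assumed. Expected results are noted per line:

```shell
# prec_demo is a hypothetical helper name used only for this sketch
prec_demo() {
    echo "10 20 30" | awk '{
        print 2 + 3 * 4              # * binds tighter than +          => 14
        print 4 * (2 + 3)            # parens override precedence      => 20
        print -2 ^ 2                 # ^ binds tighter than unary -    => -4
        print 2 ^ 3 ^ 2              # ^ groups right to left: 2^9     => 512
        print 1 " " 2 + 3            # + binds tighter than concat     => 1 5
        print ("a" "b" == "ab")      # concat binds tighter than ==    => 1
        print $NF - 1, $(NF - 1)     # $ binds tightest: ($NF)-1 vs $2 => 29 20
    }'
}
prec_demo
```

when in doubt, add parens: $(NF - 1) and ("a" "b") == "ab" read the same to AWK and to the next person.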