Introduction to AWK and use cases


Introduction to AWK and how to invoke it

Awk is a powerful text-analysis tool. Where grep searches and sed edits, awk is especially strong at data analysis and report generation. In simple terms, awk reads a file line by line, splits each line into fields using whitespace as the default delimiter, and then analyzes and processes the resulting pieces.
Awk has three variants: awk, nawk and gawk. Unless otherwise noted, "awk" generally refers to gawk, the GNU version of AWK.
The name awk comes from the initials of its creators: Alfred Aho, Peter Weinberger and Brian Kernighan. AWK is in fact a language in its own right, the AWK programming language, which its three creators formally describe as a "pattern scanning and processing language".
It lets you write short programs that read input files, sort and process data, perform calculations on the input, generate reports, and much more.
There are three ways to invoke AWK:
1. Command line
awk [-F field-separator] 'commands' input-file(s)
commands is the actual awk program. [-F field-separator] is optional; the default separator is whitespace. input-file(s) are the files to process.
2. Shell-style script
Put all the awk commands into a file and make the file executable, with the awk interpreter named on the first line of the script; the script is then invoked by typing its name.
This is the same idea as a shell script, except that the first line #!/bin/sh is replaced with: #!/bin/awk -f
3. Put all the awk commands into a separate file, then call:
awk -f awk-script-file input-file(s)
The -f option loads the awk script from awk-script-file; input-file(s) is the same as above.
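As a rough sketch of methods 1 and 3 above, the following runs the same one-line program both ways. The file names input.txt and second.awk and the sample data are made up for this illustration. Method 2 is left out because the path to the awk binary differs between systems.

```shell
# Throwaway input and script files (names are hypothetical, for illustration only).
tmp=$(mktemp -d)
printf 'a:1\nb:2\n' > "$tmp/input.txt"

# Method 1: the program is given directly on the command line.
direct=$(awk -F: '{ print $2 }' "$tmp/input.txt")

# Method 3: the same program stored in a file and loaded with -f.
printf '{ print $2 }\n' > "$tmp/second.awk"
from_file=$(awk -F: -f "$tmp/second.awk" "$tmp/input.txt")

echo "$direct"      # second field of every line
echo "$from_file"   # same result, program read from the script file
rm -rf "$tmp"
```

Both calls should print the same two lines, since only the way the program text reaches awk differs.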
##################################################################################################

Awk built-in variables

Built-in variables hold environment information, and their values can be changed. The following are some of the most commonly used.
$0 refers to the whole record. $1 is the first field of the current line, $2 is the second field of the current line, and so on.
ARGC        Number of command-line arguments
ARGV        Array of command-line arguments
ENVIRON     Array of the system environment variables
FILENAME    Name of the file awk is currently reading
FNR         Record number within the current file
FS          Input field separator, equivalent to the -F command-line option
NF          Number of fields in the current record
NR          Number of records read so far
OFS         Output field separator
ORS         Output record separator
RS          Input record separator
The experimental data are as follows: an excerpt from the startup portion of an ORACLE alert log.
[oracle@bys3 ~]$ cat awktest.log      -- a colon was added by hand in the last two lines, to make the experiments easier
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
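
A small sketch of a few of these variables in action, assuming the sample file above is recreated in a temporary directory. OFS is set in BEGIN so that the commas in print are written as " | ".

```shell
# Recreate the sample data in a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/awktest.log" <<'EOF'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
EOF

# NR = record number, NF = number of fields, $1 = first field.
# OFS replaces each comma in the print statement on output.
out=$(awk 'BEGIN { OFS = " | " } { print NR, NF, $1 }' "$tmp/awktest.log")
echo "$out"
rm -rf "$tmp"
```

With the default whitespace delimiter, the last two lines have one field fewer, because "SMON:started" and "RECO:started" each count as a single field.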

Note: this post was written for my own use and does not describe every AWK parameter; for those, see DAVE's blog.
##################################################################################################

Output formats, merging files, converting rows and columns, etc.

Awk provides two functions for printing output: print and printf.
The arguments of print can be variables, numbers or strings. Strings must be enclosed in double quotes, and arguments are separated by commas. Without commas, the arguments are concatenated and cannot be told apart. A comma here plays the same role as the output field separator, which is a space by default.
The printf function works much like printf in C and formats strings. For complex output, printf is easier to use and makes the code easier to understand.
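A minimal sketch of the difference, using a made-up input line: print with a comma, print without a comma, and a printf format with fixed column widths.

```shell
# A made-up three-field input line, for illustration only.
line='LGWR 11 22870'

# With a comma, the arguments are joined by the output field separator (a space).
with_comma=$(echo "$line" | awk '{ print $1, $2 }')

# Without a comma, the arguments are simply concatenated.
without_comma=$(echo "$line" | awk '{ print $1 $2 }')

# printf: %-6s left-justifies in 6 columns, %5d right-justifies an integer in 5.
formatted=$(echo "$line" | awk '{ printf "%-6s|%5d|\n", $1, $3 }')

echo "$with_comma"
echo "$without_comma"
echo "$formatted"
```

Note that printf does not add a newline by itself, so the format string ends with \n.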
Use built-in variables to show the input file name, line number, number of columns, and the content of each line. If the data arrives through a | pipe, FILENAME does not show a real file name (it is displayed as -).
[oracle@bys3 ~]$ awk '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:"$0}' awktest.log
filename:awktest.log,linenumber:1,columns:6,linecontent:MMAN started with pid=9, OS id=22862      -- only the first line is shown; the remaining lines are omitted
Join every 2 lines into one line
[oracle@bys3 ~]$ awk '{if (NR%2==0){print $0} else {printf"%s ",$0}}' awktest.log
MMAN started with pid=9, OS id=22862 DBW0 started with pid=10, OS id=22866
Print every 3rd line:
[oracle@bys3 ~]$ awk '(NR%3==0){print $0}' awktest.log
LGWR started with pid=11, OS id=22870
RECO:started with pid=14, OS id=22882
Merge and split files
[oracle@bys3 ~]$ cat awktest.log >awkt
[oracle@bys3 ~]$ awk '{print FILENAME,$0}' awktest.log awkt >a.log      -- merge awktest.log and awkt into a.log
[oracle@bys3 ~]$ cat a.log      -- partial output
awktest.log MMAN started with pid=9, OS id=22862
awktest.log DBW0 started with pid=10, OS id=22866
awkt MMAN started with pid=9, OS id=22862
awkt DBW0 started with pid=10, OS id=22866
[oracle@bys3 ~]$ rm -rf awkt*
Split the file merged in the previous step back into the two files that existed before the merge. The new file names are taken from the first column of a.log.
[oracle@bys3 ~]$ awk '$1!=fd{close(fd);fd=$1} {print substr($0,index($0," ")+1)>$1}' a.log
[oracle@bys3 ~]$ cat awkt
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
[oracle@bys3 ~]$ cat awktest.log      -- same content as cat awkt
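The split step can be sketched on a smaller, made-up merged file as follows. The names merged.log, one.txt and two.txt and the dir variable are assumptions for this illustration. The output files are written into a temporary directory rather than the current one.

```shell
tmp=$(mktemp -d)
# First column = target file name, rest of the line = original content.
printf 'one.txt alpha beta\none.txt gamma\ntwo.txt delta\n' > "$tmp/merged.log"

awk -v dir="$tmp" '
  # When the name in column 1 changes, close the previous output file.
  $1 != name { if (out != "") close(out); name = $1; out = dir "/" name }
  # Write everything after the first space to the current output file.
  { print substr($0, index($0, " ") + 1) > out }
' "$tmp/merged.log"

one=$(cat "$tmp/one.txt")
two=$(cat "$tmp/two.txt")
echo "$one"
echo "$two"
rm -rf "$tmp"
```

Closing each file when the name changes keeps the number of open files small. This sketch assumes the lines for each file are grouped together, as they are in a.log.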
##################################################################################################

Delimiter usage examples: the default delimiter is a space or tab

[oracle@bys3 ~]$ cat awktest.log |awk '{print $5}'      -- using the default delimiter
OS
OS
OS
OS
id=22878
id=22882
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '{print $4" #@# "$5"\t"$9}'      -- use four separators at once (comma, equals sign, space and colon) and print fields 4, 5 and 9, joined by the specified strings
pid #@# 9 22862
pid #@# 10 22866
pid #@# 11 22870
pid #@# 12 22874
pid #@# 13 22878
pid #@# 14 22882
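
To see why the OS id ends up in field 9 rather than field 8, here is a small sketch that numbers every field of the first sample line under the same -F'[,= :]' setting. The comma and the space after it are two separate delimiters, so there is an empty field between them.

```shell
# Print each field of one sample line with its index.
fields=$(echo 'MMAN started with pid=9, OS id=22862' |
  awk -F'[,= :]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
echo "$fields"
```

Field 6 comes out empty, which shifts OS, id and 22862 to fields 7, 8 and 9.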
##################################################################################################

Using BEGIN and END together

BEGIN: lets the user specify actions to perform before the first input record is processed; global variables are usually set here.
END: lets the user specify actions to perform after the last input record has been read.
The awk workflow is as follows: first execute BEGIN, then read the file. Awk reads one record, delimited by the \n newline character, splits the record into fields using the specified field separator, and fills in the fields: $0 represents the whole record, $1 the first field, and $n the nth field. It then executes the action that corresponds to each matching pattern. Next it reads the second record, and so on until all records have been read. Finally it executes the END action.
Here BEGIN and END each just print a line of text. You can also use only one of BEGIN or END.
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' 'BEGIN {print "header-a#@#bb ospid"} {print $4"#@#"$5"\t"$9} END {print "Hello everone,my name is leifeng!"}'
header-a#@#bb ospid
pid#@#9 22862
pid#@#10 22866
pid#@#11 22870
pid#@#12 22874
pid#@#13 22878
pid#@#14 22882
Hello everone,my name is leifeng!
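
The BEGIN / per-record / END workflow is also handy for accumulating values. A minimal sketch, assuming the same sample file (the variable names count and total are made up): BEGIN sets the field separator, the main action runs once per record, and END prints the totals.

```shell
tmp=$(mktemp -d)
cat > "$tmp/awktest.log" <<'EOF'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
EOF

summary=$(awk '
  BEGIN { FS = "[,= :]"; count = 0; total = 0 }   # runs once, before any input
        { count++; total += $5 }                  # runs for every record
  END   { printf "records=%d pid_sum=%d\n", count, total }   # runs once, after all input
' "$tmp/awktest.log")
echo "$summary"
rm -rf "$tmp"
```

Setting FS in BEGIN has the same effect here as passing -F'[,= :]' on the command line.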
##################################################################################################

Filtering rows:

First use '/MMAN/' to select the lines containing MMAN, then pipe them into a second awk. \n is a newline and \t is a TAB.
[oracle@bys3 ~]$ cat awktest.log |awk -F '[,= :]' '/MMAN/' |awk -F'[,= :]' '{print $9"\n"$1"\t"$0}'
22862
MMAN MMAN started with pid=9, OS id=22862
This can be simplified to:
[oracle@bys3 ~]$ cat awktest.log |awk -F '[,= :]' '/MMAN/{print $9"\n"$1"\t"$0}'      -- show the MMAN line
22862
MMAN MMAN started with pid=9, OS id=22862
Display from the line beginning with LGWR through the line beginning with CKPT. If another line beginning with LGWR appears after the CKPT line, display resumes there and continues to the next line beginning with CKPT; if there is no such CKPT line, display continues to the end of the file.
[oracle@bys3 ~]$ cat awktest.log |awk '/^LGWR/,/^CKPT/'
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
[oracle@bys3 ~]$ cat awktest.log |awk '/^LGWR/,/^CKPq/'      -- no line begins with CKPq, so output continues to the end of the file
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
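
The restart behaviour of a range pattern is easier to see on a made-up input with two start markers. In this sketch the START and STOP lines are invented for illustration. The first range is closed by STOP, while the second one has no closing line and therefore runs to the end of the input.

```shell
# Lines "a" and "c" fall outside any range and are not printed.
ranges=$(printf 'a\nSTART 1\nb\nSTOP 1\nc\nSTART 2\nd\n' | awk '/^START/,/^STOP/')
echo "$ranges"
```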
##################################################################################################

Comparison and arithmetic: greater than, equal to and less than, plus addition, subtraction, multiplication and division

[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '{print $5}'
9
10
11
12
13
14
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5>12 {print $5}'
13
14
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5==10 {print $5 "#\t#" $0}'
10# #DBW0 started with pid=10, OS id=22866
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5<10 {print $5 "\t" $0}'
9 MMAN started with pid=9, OS id=22862
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5<10 {print$5*9 "\t" $0}'      -- $5*9 is 9*9, which gives 81
81 MMAN started with pid=9, OS id=22862
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5<10 {print $5/9 "\t" $0}'      -- $5/9 is 9/9, which gives 1
1 MMAN started with pid=9, OS id=22862
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5<10 {print $5 "\t" $9}'
9 22862
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5*$9<220000 {print $5*$9 "###" $5 "\t" $9}'      -- for lines where $5*$9 is less than 220,000, print the product followed by $5 and $9
205758###9 22862
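
A short sketch combining a comparison with the remaining arithmetic operators (subtraction and the % remainder operator), again assuming the sample file is recreated in a temporary directory:

```shell
tmp=$(mktemp -d)
cat > "$tmp/awktest.log" <<'EOF'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
EOF

# For rows with pid >= 13: print the pid, OS id minus pid, and OS id modulo pid.
calc=$(awk -F'[,= :]' '$5 >= 13 { print $5, $9 - $5, $9 % $5 }' "$tmp/awktest.log")
echo "$calc"
rm -rf "$tmp"
```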
##################################################################################################

Examples of matching with regular expressions:

Search for lines beginning with MM
[oracle@bys3 ~]$ cat awktest.log |awk '/^MM/'
MMAN started with pid=9, OS id=22862
Search for lines beginning with MM, D or L.
[oracle@bys3 ~]$ cat awktest.log |awk '/^(MM|D|L)/'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
Search for lines whose first letter is M, D or L.
[oracle@bys3 ~]$ cat awktest.log |awk '/^[MDL]/'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
Find the lines where the specified field ends in two digits, the first in the range 0-9 and the second in the range 0-2. The $ anchors the match at the end of the field.
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5 ~/[0-9][0-2]$/{print $5}'
10
11
12
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5 ~/[3-4]$/{print $5}'      -- find values ending in the digit 3 or 4
13
14
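
The ~ operator used above has a negated counterpart, !~, which selects the records whose field does not match. A small sketch on the same sample data, anchoring the pattern at both ends with ^ and $ so that only the whole values 10, 11 and 12 match:

```shell
tmp=$(mktemp -d)
cat > "$tmp/awktest.log" <<'EOF'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
EOF

# $5 matches the pattern: exactly "1" followed by one digit in 0-2.
matched=$(awk -F'[,= :]' '$5 ~ /^1[0-2]$/ { print $5 }' "$tmp/awktest.log")
# $5 does not match the pattern.
unmatched=$(awk -F'[,= :]' '$5 !~ /^1[0-2]$/ { print $5 }' "$tmp/awktest.log")

echo "$matched"
echo "$unmatched"
rm -rf "$tmp"
```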
##################################################################################################

Logical operators: greater than, not equal to, AND and OR

[oracle@bys3 ~]$ cat awktest.log
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
Display the rows where $5 is equal to 10 or $9 is greater than 22880
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5==10||$9>22880 {print $5"\t" $9}'
10 22866
14 22882
Display the rows where $5 is greater than 10 and $9 is greater than 22880
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5>10&&$9>22880 {print $5"\t" $9}'
14 22882
Display the rows where $5 is not equal to 10 and not equal to 11
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '$5!=10&&$5!=11 {print $5"\t" $9}'
9 22862
12 22874
13 22878
14 22882
Conditional output with print ($5 > 12 ? ... : ...): the test here is $5 > 12. If it is true, the value before the colon is displayed; if it is not true, the value after the colon is displayed.
[oracle@bys3 ~]$ cat awktest.log |awk -F'[,= :]' '{print ($5 > 12 ? "ok \t"$5: "error\t"$5)}'
error 9
error 10
error 11
error 12
ok 13
ok 14
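
For comparison, the same decision can be written with an explicit if/else. The sketch below runs both forms on the sample data and should produce identical output. The variable name status is made up for this illustration.

```shell
tmp=$(mktemp -d)
cat > "$tmp/awktest.log" <<'EOF'
MMAN started with pid=9, OS id=22862
DBW0 started with pid=10, OS id=22866
LGWR started with pid=11, OS id=22870
CKPT started with pid=12, OS id=22874
SMON:started with pid=13, OS id=22878
RECO:started with pid=14, OS id=22882
EOF

# Conditional (ternary) expression: condition ? value-if-true : value-if-false
ternary=$(awk -F'[,= :]' '{ status = ($5 > 12) ? "ok" : "error"; print status "\t" $5 }' "$tmp/awktest.log")

# The same logic written as an if/else statement.
ifelse=$(awk -F'[,= :]' '{ if ($5 > 12) { print "ok\t" $5 } else { print "error\t" $5 } }' "$tmp/awktest.log")

echo "$ternary"
rm -rf "$tmp"
```

The ternary form is more compact inside a print statement, while if/else reads more easily once each branch does more than choose a value.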

Posted by Lena at December 11, 2013 - 2:58 AM