Saturday, April 26, 2008

Using awk to extract lines in a text file

awk is not an obvious choice for simply extracting rows from a text file; it is better known for manipulating columns (fields). The more obvious choices are sed and perl. You can see how sed does it in my earlier entry.

If you opt for awk, you can use its NR variable, which holds the number of input records (by default, lines) read so far.

Suppose the text file is somefile.txt:
$ cat > somefile.txt
Line 1
Line 2
Line 3
Line 4

To print a single line, say line 2:
$ awk 'NR==2' somefile.txt
Line 2
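
If the line number is held in a shell variable, one way to hand it to awk is the -v option. This is only a minimal sketch: the shell variable lineno and the awk variable n are hypothetical names chosen for illustration, and the sample file is recreated so the snippet runs on its own.

```shell
# Recreate the sample file used in this post.
printf 'Line 1\nLine 2\nLine 3\nLine 4\n' > somefile.txt

# lineno is a hypothetical shell variable holding the wanted line number.
lineno=3

# -v copies the shell value into the awk variable n before any input is read.
picked=$(awk -v n="$lineno" 'NR == n' somefile.txt)
echo "$picked"
```

Writing NR==$lineno inside the single-quoted program would not do the same thing: the shell does not expand variables inside single quotes, and awk treats $ as its field operator.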


If the text file is huge, you can cheat by exiting the program on the first match, so that awk does not read the rest of the file. Note that this hack, as written, only works when a single line is being extracted.
$ awk 'NR==2 {print;exit}' somefile.txt
Line 2


To extract a section of the text file, say lines 2 to 3:
$ awk 'NR==2,NR==3' somefile.txt
Line 2
Line 3
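
The early-exit trick can probably be carried over to a range as well, by exiting only after the last wanted line has been printed. A minimal sketch, assuming the same four-line sample file (recreated here so the snippet is self-contained); the shell variable name range is just for illustration.

```shell
# Recreate the sample file used in this post.
printf 'Line 1\nLine 2\nLine 3\nLine 4\n' > somefile.txt

# Print lines 2 through 3, then exit once line 3 has been printed,
# so awk does not read the rest of the file.
range=$(awk 'NR >= 2 && NR <= 3 {print} NR == 3 {exit}' somefile.txt)
echo "$range"
```

The two rules run in order for each line, so line 3 is printed by the first rule before the second rule exits.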


A more interesting task is to extract every nth line from a text file. I showed previously how to do it using sed and perl.

Using awk, to print every second line counting from line 0 (first printed line is line 2):
$ awk '0 == NR % 2'  somefile.txt
Line 2
Line 4


To print every second line counting from line 1 (first printed line is line 1):
$ awk '0 == (NR + 1) % 2'  somefile.txt
Line 1
Line 3


% is the mod (i.e. remainder) operator.
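
The same remainder test can be generalised to every nth line starting from a chosen line. The following is a sketch under assumptions: the awk variables n (step) and first (first line to print) and the file name sevenlines.txt are hypothetical, not part of the examples above.

```shell
# A hypothetical seven-line sample file.
printf 'Line 1\nLine 2\nLine 3\nLine 4\nLine 5\nLine 6\nLine 7\n' > sevenlines.txt

# Print every n-th line, starting at line number "first".
# The NR >= first guard skips earlier lines whose offset is a negative multiple of n.
every_third=$(awk -v n=3 -v first=2 'NR >= first && (NR - first) % n == 0' sevenlines.txt)
echo "$every_third"
```

With n=3 and first=2 this selects lines 2 and 5 of the seven-line file.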

7 comments:

Unknown said...

I'm having trouble using the print every nth command inside an actual script, basically I'm using awk to split a long string of characters into having each be its own line using the split "" command, then I wanna pull out every 1001-1003 character, any suggestions?

sgruenwald said...

want to pull a number of lines from a long file using a specific formula

n (number of iterations in a loop)
a (offset number)
b (stretch factor)

example with n {1..100}

for (( n=1; n<101; n++ )); do awk -v n="$n" 'NR==a+(b*n)' a=0 b=1 inputfile >>outputfile; done

marwa said...

is it possible to make NR equal variable I entered before ?!!

like this example :
awk -F: '{if(NR==$no) {print $0}}' /etc/passwd

it is right or not : NR==$no

Alexandrescu Sergiu said...

@marwa: no, this is a correct example:
COUNT=1;
for i in `nvram show | grep traff- | cut -f1 -d=""`;
do
NEW=` nvram show | grep traff- | awk 'NR == a {print}' a=$COUNT`
echo $NEW
COUNT=$((COUNT + 1))
done

Anonymous said...

What if you would like only the first word from a line (or for the same matter the second word). Words are separated by single space.

Anonymous said...

Much appreciated, especially for a Linux newbie! Glad I found this blog :-)

thota said...

file.txt

line1
line2
line3

i need to assign each line to a variable like

a=line1
b=line2