Saturday, May 31, 2008

How to disable vim syntax highlighting and coloring

Syntax highlighting is my top annoyance when using vi/vim. It is just a fancy term meaning that the text editor automatically colors parts of a text file according to rules that make sense to it, using some default color scheme.

To be precise, only vim, not vi, has syntax highlighting. vi has only a two-color scheme: background and foreground. Yet, on my CentOS 4 system (and many other distros), the vi command is just a soft link to vim.

Syntax highlighting is useful, and usually nothing to complain about. However, I find the default vim color scheme to be an eye-killer for me.

If you are already in vi/vim, you can disable it by typing
:syntax off (and press the return key).

To re-enable coloring, type
:syntax on (and press the return key).

If you want to permanently disable syntax highlighting, insert this in your ~/.vimrc file:
syntax off

Note that even with vim, there can be different versions. On my Debian Etch system, the vim is vim.tiny, and it does not support syntax highlighting. So, you don't need to explicitly disable syntax highlighting.

Tuesday, May 27, 2008

Use the OR operator in grep to search for words and phrases

grep is a very powerful command-line search program in the Linux world. In this article, I will cover how to use OR in the grep command to search for words and phrases in a text file.

Suppose you want to find all lines containing the word "apples" or "oranges" in the text file named fruits.txt.

$ cat fruits.txt
yellow bananas
green apples
red oranges
red apples


$ grep 'apples\|oranges' fruits.txt
green apples
red oranges
red apples


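Note that in grep's default (basic) regular expression syntax, you must use the backslash \ to escape the OR operator (|). The \| form is a GNU extension and may not work in other grep implementations.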

Using the OR operator, you can also search for phrases like "green apples" or "red oranges". Because the whole pattern is enclosed in single quotes, the spaces in a phrase need no escaping; only the OR operator does.
$ grep 'green apples\|red oranges' fruits.txt
green apples
red oranges


If you use the extended regular expression notation (-E), the | operator needs no backslash.
$ grep -E 'green apples|red oranges' fruits.txt
green apples
red oranges


egrep is a variant of grep that is equivalent to grep -E.
$ egrep 'green apples|red oranges' fruits.txt
green apples
red oranges
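
Another way to express OR, sketched below, is to give grep several patterns with repeated -e options: a line is printed if it matches any of them. The sample file is recreated in a temporary location (the mktemp path and the variable names are illustrative, not part of the original example).

```shell
# Recreate the sample file from the article in a temporary location.
fruits=$(mktemp)
printf '%s\n' 'yellow bananas' 'green apples' 'red oranges' 'red apples' > "$fruits"

# Each -e adds one pattern; grep prints lines matching ANY of them.
matches=$(grep -e 'green apples' -e 'red oranges' "$fruits")
printf '%s\n' "$matches"

rm -f "$fruits"
```

Because no alternation operator is involved, this form needs neither the backslash nor the -E option.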



Sunday, May 25, 2008

Root edit a file using emacs in the same session

We know that we should always log in using our regular non-root account, and only sudo in when necessary to do things that only root can do. Most of the time, you are logged in as a regular user, and you have your emacs editor open.

Now, you realize that you need to edit a file which is only writable by root (say /etc/hosts.allow).

What you can always do is to open up another emacs session with the right credential, and edit the file there:
$ sudo emacs /etc/hosts.allow


This becomes a little tedious, doesn't it?

A nifty little trick is to use tramp, an emacs package for transparent remote editing of files using a secure protocol like ssh. You then use tramp to ssh into localhost as root, and modify the target file.

tramp comes pre-packaged with GNU emacs 22+. This is pretty handy, especially if you have already configured it and are using it to edit files remotely.

If you are new to tramp, insert the following lines into ~/.emacs (your emacs configuration file), and restart emacs:
(require 'tramp)
(setq tramp-default-method "scp")


The scp method for tramp uses ssh to connect to the remote host, which keeps remote editing sessions secure. Note, however, that the /su:: prefix used below names the su method explicitly: tramp then runs su on the local machine to become root, rather than connecting through ssh.

Note that if you are also using the emacs package recentf (for remembering the most recently opened files), insert the following line as well. Otherwise, when you restart emacs in subsequent sessions, it will prompt you for the root password.
(setq recentf-auto-cleanup 'never) 


That is it for configuring tramp for use in emacs.

With this setup, you can use the same emacs session you opened as a non-root user to edit a root-only writable file.

To edit the target file, press Ctrl-x followed by Ctrl-f (C-x C-f), and enter the following before hitting return:
/su::/etc/hosts.allow


When prompted, enter the password for root.

After you finish editing, save the file as you normally do in emacs.

A final note is that you need to be aware of the side-effects of using tramp to edit a file while the auto backup feature of emacs is enabled. Specifically, make sure that the backup file is saved in an expected safe location. See this article for more details.

Thursday, May 22, 2008

Delete Windows/DOS carriage return characters from text files

Different operating systems may use different characters to indicate a line break. Unix/Linux uses a single Line Feed (LF) character. Windows/DOS uses 2 characters: Carriage Return followed by Line Feed (CR/LF). Classic Mac OS (before OS X) used a single CR.

Nowadays, it is a reality that we operate on multiple platforms. If you transfer a text file created on a Windows machine to a Linux machine, the file will contain those extra Carriage Return characters. Some Linux programs run just fine with those characters in their input, but some are less forgiving.

Below are various ways to remove the Carriage Return characters from each line of a text file:

  • dos2unix
    $ dos2unix input.txt 
    dos2unix: converting file input.txt to UNIX format ...


    dos2unix will convert and overwrite the input file by removing the CR characters.

    Be warned that dos2unix is not pre-installed by default in all Linux distributions. If you have a RedHat-based distribution (e.g., CentOS), you are safe.

    On my Debian Etch system, you need to install a package named tofrodos, and even then, dos2unix is just a soft link to another program, fromdos. See the next command.

  • fromdos
    fromdos and the corresponding todos reside in a package named tofrodos.

    To install,
    $ sudo apt-get install tofrodos


    To run fromdos,
     $ fromdos input.txt 


    Note that fromdos will overwrite the input.txt file.

  • tr

    $ tr -d '\r' < input.txt > output.txt
    $ cp output.txt input.txt


    \r is the carriage return character.

    tr -d removes the specified character (\r in this case) from the standard input.

    tr deals with the standard input and standard output only. So, tr cannot write directly to the original input file (input.txt): an intermediate file (output.txt) is needed.

  • sed
    $ sed -i -e 's/\r//g' input.txt


    The advantage of sed over tr is that you can do the substitution in place, with no need to create an intermediate file yourself. This is done by the -i option. Note that both -i and the \r escape, as used here, assume GNU sed.

    If you want to make a backup of the original input.txt, attach a suffix to the -i option like this:
    $ sed -i.bak -e 's/\r//g' input.txt 


    -i.bak will make a backup file by appending the suffix .bak to your original file name, resulting in something like input.txt.bak

  • perl
    $ perl -i.bak -pe 's/\r//g' input.txt



If your system has dos2unix or fromdos installed, then using either one is probably the simplest. Otherwise, tr seems like a safe bet, and it is available on all Linux systems, if you don't mind the extra step of copying the intermediate file. If you absolutely want a one-liner to do the job, then either sed or perl with their in-place editing will satisfy you.
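
To check that a conversion actually worked, you can count the lines that still contain a CR. The following is a minimal sketch using the tr method; the sample file is generated with mktemp and the variable names are made up for illustration.

```shell
# Build a small sample file with DOS (CR/LF) line endings.
input=$(mktemp)
printf 'line one\r\nline two\r\n' > "$input"

# Hold a literal carriage return so grep can search for it.
cr=$(printf '\r')
before=$(grep -c "$cr" "$input")

# Strip every CR into a temporary file, then move it over the original.
tmp=$(mktemp)
tr -d '\r' < "$input" > "$tmp" && mv "$tmp" "$input"

# grep -c prints 0 (and returns non-zero) when nothing matches.
after=$(grep -c "$cr" "$input" || true)
echo "lines with CR: before=$before after=$after"

rm -f "$input"
```

Using mv instead of cp avoids leaving the intermediate file behind.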

Tuesday, May 20, 2008

Run ifconfig as non-root user for read-only access to network interfaces

It is a frequent scenario that you are logged in to the console of a Linux system, and you need to know its IP address.

If you are the root user, that is easy:
$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:0B:6B:E1:BC:14
inet addr:192.168.0.103 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::20b:6aff:fed0:bb04/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8100 errors:0 dropped:0 overruns:0 frame:0
TX packets:7727 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:5385440 (5.1 MiB) TX bytes:1454259 (1.3 MiB)
Interrupt:177 Base address:0xdc00

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:68 errors:0 dropped:0 overruns:0 frame:0
TX packets:68 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:5204 (5.0 KiB) TX bytes:5204 (5.0 KiB)


However, if you are not root....
$ ifconfig
bash: ifconfig: command not found


At this point, you are probably ready to give up. Don't: there is always hope.

A little-publicized fact is that the ifconfig command is executable by anyone: it is just NOT on the default PATH for non-root users.

To find out where ifconfig is:
$ whereis ifconfig
ifconfig: /sbin/ifconfig /usr/share/man/man8/ifconfig.8.gz


Is it true that anyone can run ifconfig?
$ ls -l /sbin/ifconfig
-rwxr-xr-x 1 root root 66024 Aug 12 2006 /sbin/ifconfig

The answer is yes.

To run ifconfig by name, /sbin needs to be on your PATH. Is it?
$ echo $PATH
/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/cdk4msp/bin:/home/peter/bin

No, afraid not. No wonder you cannot run the ifconfig command.

It is straightforward to append /sbin to your PATH.
$ export PATH=$PATH:/sbin
$ echo $PATH
/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/cdk4msp/bin:/home/peter/bin:/sbin


Let's give ifconfig another try.
$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:0B:6B:E1:BC:14
inet addr:192.168.0.103 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::20b:6aff:fed0:bb04/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8590 errors:0 dropped:0 overruns:0 frame:0
TX packets:8218 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:5530730 (5.2 MiB) TX bytes:1509759 (1.4 MiB)
Interrupt:177 Base address:0xdc00

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:68 errors:0 dropped:0 overruns:0 frame:0
TX packets:68 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:5204 (5.0 KiB) TX bytes:5204 (5.0 KiB)


To save some typing, you can combine the setting of the PATH and the ifconfig command as follows:
$ PATH=$PATH:/sbin ifconfig


Now, non-root users are happy.

Note that non-root users can only get/read interface data, but not set/write it. Setting interface parameters as a non-root user will generate errors:

$ ifconfig eth0 192.168.0.155
SIOCSIFADDR: Permission denied
SIOCSIFFLAGS: Permission denied

Monday, May 19, 2008

Ping or nmap to identify machines on the LAN

You can use ping or nmap to find out what machines are currently on the local network.

The first method involves pinging the LAN broadcast address.

To find out the broadcast address of the local network:
$ ifconfig eth0
eth0 Link encap:Ethernet HWaddr 01:1B:6B:D8:B1:26
inet addr:192.168.0.103 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::20b:6aff:fed0:bb04/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:70324 errors:0 dropped:0 overruns:0 frame:0
TX packets:69429 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:28758708 (27.4 MiB) TX bytes:9680092 (9.2 MiB)
Interrupt:177 Base address:0xdc00


From the ifconfig output, we determine that the broadcast address is 192.168.0.255. Now, we ping the broadcast address.

$ ping -b -c 3 -i 20 192.168.0.255
WARNING: pinging broadcast address
PING 192.168.0.255 (192.168.0.255) 56(84) bytes of data.
64 bytes from 192.168.0.100: icmp_seq=1 ttl=64 time=0.208 ms
64 bytes from 192.168.0.1: icmp_seq=1 ttl=150 time=0.625 ms (DUP!)
64 bytes from 192.168.0.100: icmp_seq=2 ttl=64 time=0.218 ms
64 bytes from 192.168.0.1: icmp_seq=2 ttl=150 time=0.646 ms (DUP!)
64 bytes from 192.168.0.100: icmp_seq=3 ttl=64 time=0.217 ms

--- 192.168.0.255 ping statistics ---
3 packets transmitted, 3 received, +2 duplicates, 0% packet loss, time 39998ms
rtt min/avg/max/mdev = 0.208/0.382/0.646/0.207 ms


Note that:
-b is required in order to ping a broadcast address.
-c is the count (3) of echo requests (pings) it will send.
-i specifies the interval in seconds between sending each packet. You need to specify an interval long enough to give all the hosts in your LAN enough time to respond.

The ping method does not guarantee that all systems connected to the LAN will be found. This is because some computers may be configured NOT to reply to broadcast queries, or to ping queries altogether.
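
If the Bcast field is not at hand, the broadcast address pinged above can be worked out from the IP address and netmask: keep the network bits and set every host bit to 1. Below is a rough sketch of that arithmetic; broadcast is a hypothetical helper name, and it handles dotted-quad IPv4 values only.

```shell
# Hypothetical helper: derive the broadcast address from an IPv4
# address and netmask by setting every host bit to 1, octet by octet.
broadcast() {
    ip=$1
    mask=$2
    out=""
    for n in 1 2 3 4; do
        i=$(echo "$ip" | cut -d. -f"$n")
        m=$(echo "$mask" | cut -d. -f"$n")
        # (i & m) keeps the network bits; (255 - m) fills the host bits.
        o=$(( (i & m) | (255 - m) ))
        out="${out:+$out.}$o"
    done
    echo "$out"
}

bcast=$(broadcast 192.168.0.103 255.255.255.0)
echo "$bcast"
```

With the address and mask from the ifconfig output, this yields the same 192.168.0.255 that ifconfig reports.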

The second method uses nmap. While nmap is better known for its port scanning capabilities, nmap is also very dependable for host discovery.

You can run nmap as either a non-root user, or root. nmap will only give non-root users the IP address of any host found.
$ nmap -sP 192.168.0.1-254

Starting Nmap 4.11 ( http://www.insecure.org/nmap/ ) at 2008-05-19 17:02 PDT
Host 192.168.0.1 appears to be up.
Host 192.168.0.100 appears to be up.
Host 192.168.0.103 appears to be up.
Nmap finished: 254 IP addresses (3 hosts up) scanned in 2.507 seconds


If you run nmap as root, you will also get the MAC address:
$ sudo nmap -sP 192.168.0.1-254

Starting Nmap 4.11 ( http://www.insecure.org/nmap/ ) at 2008-05-19 18:06 PDT
Host 192.168.0.1 appears to be up.
MAC Address: 03:05:6D:2D:87:B3 (The Linksys Group)
Host 192.168.0.100 appears to be up.
MAC Address: 00:07:95:A9:3A:77 (Elitegroup Computer System Co. (ECS))
Host 192.168.0.103 appears to be up.
Nmap finished: 254 IP addresses (3 hosts up) scanned in 5.900 seconds



-sP instructs nmap to only perform a ping scan to determine if the target host is up; no port scanning or operating system detection is performed.
By default, the -sP option causes nmap to send an ICMP echo request and a TCP packet to port 80.

Using either ping or nmap, you can find out what machines are connected to your LAN.

Saturday, May 17, 2008

How to indent lines in text files using sed, awk, perl

Suppose I have a text file named input.txt, and I want to indent each line in the file by 5 spaces.
$ cat input.txt
12
34
56
78


The following are different ways to do the same thing:
  • sed
    $ sed  's/^/     /'  input.txt
    12
    34
    56
    78

    s/^/     / searches for the beginning of a line (^), and "replaces" it with 5 spaces.

  • awk
    $ awk  '{ print "     " $0 }'  input.txt
    12
    34
    56
    78


  • perl
    $ perl -pe  's/^/     /' input.txt
    12
    34
    56
    78

To indent a range of lines, say lines 1 to 3, inclusive:
  • sed
    $ sed  '1,3s/^/     /' input.txt
    12
    34
    56
    78

    Note that the comma specifies a range (from the line number before the comma to the line number after it).

  • awk
    $  awk  'NR==1,NR==3 {$0 = "     "$0} {print }' input.txt
    12
    34
    56
    78

    An awk program consists of condition/action pairs. The first pair has the condition "if NR (current line number) is from 1 to 3", and the action is to prepend 5 spaces to $0 (the current line). The second pair has a null condition which means it applies to every line, and the action is just to print the line.

  • perl
    $ perl  -pe '$_ = "     " . $_ if  1 <= $. and $. <=3' input.txt
    12
    34
    56
    78
    $. is the current input line number. $_ is the current line. . (dot) is the concatenate operator.
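
If the indent width varies, the sed approach can be wrapped in a small function that builds the padding on the fly. This is a sketch only; indent is a hypothetical helper name, and the default of 5 spaces simply mirrors the examples above.

```shell
# Hypothetical helper: indent standard input by N spaces (default 5).
indent() {
    n=${1:-5}
    # printf with a field width of N and an empty string yields N spaces.
    pad=$(printf "%${n}s" "")
    sed "s/^/$pad/"
}

indented=$(printf '12\n34\n' | indent 5)
printf '%s\n' "$indented"
```

The function reads standard input, so a file can be indented with indent 5 < input.txt.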

Click here for another post on sed tricks.

Sunday, May 11, 2008

How to convert text files to all upper or lower case

How can you convert a text file to all lower case or all upper case?

As usual, in Linux, there is more than one way to accomplish a task.

To convert a file (input.txt) to all lower case (output.txt), choose any ONE of the following:

  • dd
    $ dd if=input.txt of=output.txt conv=lcase

  • tr
    $ tr '[:upper:]' '[:lower:]' < input.txt > output.txt

  • awk
    $ awk '{ print tolower($0) }' input.txt > output.txt

  • perl
    $ perl -pe '$_= lc($_)' input.txt > output.txt

  • sed
    $ sed -e 's/\(.*\)/\L\1/' input.txt > output.txt

    We use the backreference \1 to refer to the entire line and \L to convert it to lower case. Note that \L (and \U, used below) are GNU sed extensions.


To convert a file (input.txt) to all upper case (output.txt):

  • dd
    $ dd if=input.txt of=output.txt conv=ucase

  • tr
    $ tr '[:lower:]' '[:upper:]' < input.txt > output.txt

  • awk
    $ awk '{ print toupper($0) }' input.txt > output.txt

  • perl
    $ perl -pe '$_= uc($_)' input.txt > output.txt

  • sed
    $ sed -e 's/\(.*\)/\U\1/' input.txt > output.txt
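
The same conversions also work on strings inside a script, not just on files. A minimal sketch using the tr method follows; to_lower and to_upper are hypothetical helper names.

```shell
# Hypothetical helpers wrapping the tr commands shown above.
to_lower() { tr '[:upper:]' '[:lower:]'; }
to_upper() { tr '[:lower:]' '[:upper:]'; }

lower=$(echo 'Hello Linux' | to_lower)
upper=$(echo 'Hello Linux' | to_upper)
echo "$lower / $upper"
```

Both helpers read standard input, so they can sit anywhere in a pipeline.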


Thursday, May 8, 2008

How to prevent Linux man pages from clearing after you quit reading

Man pages are excellent resources for learning the specifics of a Linux command. After all, who can remember all the nitty gritty of a command?

One annoyance of reading man pages on some Linux distributions is that after you quit reading it, the contents are cleared off screen. The man page contents simply don't stay around after you quit man. If that happens to you, it means that the default pager for viewing man pages is the less command, and that is how less behaves.

Wiping man contents out or not is a personal preference. Some may like the man stuff being wiped out because it won't clutter up the command window. However, there are times when you want the man contents to be visible after you finish reading it. You will have that information in front of you when you enter the next command.

The good news is that you can change the man page behavior. This is done by changing the default pager from less to something like more.

[beranger-org has a great/better suggestion: instead of changing to more, use less -X]
You can change it permanently or on demand, and for everyone or just individual users. Because it is a personal preference, I would recommend changing it only for yourself.

First, you need to find out where more is.
$ which more
/bin/more


Add the following line in the .bashrc file in your home directory:
export PAGER=/bin/more 


If you want to stick with less, add this line instead.
export PAGER='less -X' 


That customizes the PAGER environment variable every time a shell process is started.

Beware that the default pager is also used in commands other than man. less is a more powerful pager than more. You may wish to change the PAGER just for the current shell session. This can be done by typing the same export statement at the command line.

If you want to revert to the default PAGER (i.e., less), enter this:
$ unset PAGER


If you really insist on changing the default man behavior permanently for everyone on your system, edit the file /etc/man.config (on RedHat-based systems) or /etc/manpath.config (on Debian-based systems), and change the line for PAGER. This affects only man page viewing. Alternatively, you can run update-alternatives --config pager (for Debian) or alternatives --config pager (for RedHat) to change the system-wide default pager for all applications that rely on it.



This is how you can control man when it exits.
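
If you want the new behavior for man only, and not for the other commands that consult PAGER, one option is the MANPAGER variable. The lines below are a sketch for ~/.bashrc, assuming your man implementation honors MANPAGER (man-db does).

```shell
# Assumption: your man implementation honors MANPAGER, as man-db does.
# MANPAGER overrides PAGER for man only, so other commands keep using
# the pager named in PAGER (or their built-in default).
export MANPAGER='less -X'

# One-off alternative, without exporting anything:
#   PAGER='less -X' man ls
echo "man will page with: $MANPAGER"
```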

Tuesday, May 6, 2008

bash quicksand 2: Quotes needed in string tests

This is part 2 of the bash quicksand series. Part 1 is about whitespace in variable assignment. When you write bash shell scripts, you want to avoid these innocent-looking mistakes. We all fall into these traps at some point, especially when we first write bash scripts.

You define a bash variable, say $somevar. Later, you need to test if it is equal to some string, say "linux".

1  #!/bin/bash
...
5 if [ $somevar == "linux" ]
6 then ...
9 fi
...
25 exit 0


Depending on the value of $somevar at the time of the test, you can see some puzzling errors:

  • If $somevar is null (""), then you will see this error:
    ./myscript.sh: line 5: [: ==: unary operator expected

  • If $somevar has embedded spaces, e.g., "tux is great", you will see this error:
    ./myscript.sh: line 5: [: too many arguments



Now that you know this trap, don't fall into it.

A good defence is to always enclose the bash variable in quotes:
if [ "$somevar" == "linux" ]
then ...
fi
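
To see the quoted form cope with the two troublesome values, here is a small runnable sketch. is_linux is a hypothetical helper, and it uses the single = operator, which is the portable spelling of the string comparison inside [ ].

```shell
# Hypothetical helper: succeed only if the argument is exactly "linux".
is_linux() {
    # The quotes keep "$1" a single word even when it is empty or
    # contains spaces, so [ always sees exactly three arguments.
    [ "$1" = "linux" ]
}

for value in "" "tux is great" "linux"; do
    if is_linux "$value"; then
        echo "'$value' matches"
    else
        echo "'$value' does not match"
    fi
done
```

Neither the empty string nor the value with embedded spaces produces an error; both are simply reported as not matching.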

Monday, May 5, 2008

How to Display Routing Table

To display the kernel routing table, you can use any of the following methods:

  • route
    $ sudo route -n
    Kernel IP routing table
    Destination Gateway Genmask Flags Metric Ref Use Iface
    192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
    0.0.0.0 192.168.0.1 0.0.0.0 UG 0 0 0 eth0

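    On many systems route resides in /sbin, which is not on the default PATH of non-root users, hence the sudo. Displaying the table does not itself require root privileges, so /sbin/route -n works as well.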

    The -n option means that you want numerical IP addresses displayed, instead of the corresponding host names.
  • netstat
    $ netstat -rn
    Kernel IP routing table
    Destination Gateway Genmask Flags MSS Window irtt Iface
    192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
    0.0.0.0 192.168.0.1 0.0.0.0 UG 0 0 0 eth0

    The -r option specifies that you want the routing table. The -n option is similar to that of the route command.
  • ip
    $ ip route list
    192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.103
    default via 192.168.0.1 dev eth0
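
A common reason to display the routing table is to find the default gateway. The sketch below pulls it out of the ip output with awk. To stay self-contained it parses the sample text shown above rather than a live system; on a real machine you would pipe ip route list into the same awk program.

```shell
# Sample text copied from the "ip route list" output above; on a live
# system you would pipe the real command into awk instead.
sample='192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.103
default via 192.168.0.1 dev eth0'

# On the line that starts with "default", the third field is the gateway.
gateway=$(printf '%s\n' "$sample" | awk '$1 == "default" { print $3 }')
echo "default gateway: $gateway"
```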


Sunday, May 4, 2008

Compare Directories using Diff in Linux

To compare 2 files, we use the diff command. How do we compare 2 directories? Specifically, we want to know which files/subdirectories are common to both, and which exist in only 1 directory but not the other.

Unix old-timers may remember the dircmp command. Alas, that command is not available in Linux. In Linux, we use the same diff command to compare directories as well as files.

$ diff  ~peter ~george
Only in /home/peter: announce.doc
diff /home/peter/.bashrc /home/george/.bashrc
76,83d72
<
< # Customization by Peter
< export LESS=-m
< export GREP_OPTIONS='--color=always'
< shopt -s histappend
< shopt -s cmdhist
< export PROMPT_COMMAND="history -a;$PROMPT_COMMAND"
< #echo keycode 58 = Escape |loadkeys -
Only in /home/george: .mcoprc
Only in /home/peter: .metacity
Only in /home/george: .newsticker-images
Only in /home/peter: .notifier.conf
Only in /home/george: targets.txt
Only in /home/peter: .xsession-errors


Without any option, diffing 2 directories will tell you which files only exist in 1 directory and not the other, and which are common files. Files that are common in both directories (e.g., .bashrc in the above listing) are diffed to see if and how the file contents differ.

If you are NOT interested in file differences, just add the -q (or --brief) option.

$ diff -q ~peter ~george | sort
Files /home/peter/.bashrc and /home/george/.bashrc differ
Only in /home/george: .mcoprc
Only in /home/george: .newsticker-images
Only in /home/george: targets.txt
Only in /home/peter: .metacity
Only in /home/peter: .notifier.conf
Only in /home/peter: .xsession-errors
Only in /home/peter: announce.doc


diff orders its output alphabetically by file/subdirectory name. I prefer to group them by whether they are common, and whether they only exist in the first or second directory. That is why I piped the output of diff through sort in the above command.

Note that by default diff does not reach into the subdirectories to compare the files and subdirectories at that level. To change its behavior to recursively go down subdirectories, add -r.

$ diff -qr ~peter ~george | sort
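
To try the recursive comparison without touching real home directories, the following sketch builds two throwaway trees and compares them. All directory and file names are invented for illustration, and LC_ALL=C is set only to keep diff's messages in English.

```shell
# Build two small directory trees in a temporary location.
work=$(mktemp -d)
mkdir -p "$work/a/sub" "$work/b/sub"
echo same > "$work/a/common.txt"
echo same > "$work/b/common.txt"
echo one > "$work/a/sub/conf"
echo two > "$work/b/sub/conf"
echo extra > "$work/a/only-a.txt"

# diff exits with status 1 when it finds differences, so "|| true"
# keeps a script running under "set -e" from stopping here.
report=$( (cd "$work" && LC_ALL=C diff -qr a b || true) | sort )
printf '%s\n' "$report"

rm -rf "$work"
```

The identical common.txt is not mentioned at all, the differing sub/conf is found only because of -r, and only-a.txt is reported as existing in one directory.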