Shell Parse Order: Breaking the rules

June 21, 2010

Rules are meant to be broken, and the shell parsing order is no exception.

Sometimes the need arises to skip some of the steps in the shell parsing order described in my previous post. Quotes (single and double) let us skip steps, and the ‘eval’ command lets us redo them.

Double quotes (“):

Text inside double quotes goes through only steps 1, 6, 7 and 8 of the parsing order before execution, i.e.

1. Tokenize

6. Parameter Substitution

7. Command Substitution

8. Arithmetic Expression
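
A minimal sketch of the double-quote behaviour listed above, assuming bash or another POSIX-style shell. The function name double_quote_demo and the values in it are made up for illustration.

```shell
double_quote_demo() {
    name="joe"      # hypothetical values, for illustration only
    count=3
    # Steps 6, 7 and 8 still run inside double quotes:
    # parameter, command and arithmetic substitution.
    echo "$name has $(echo two) files, $((count + 1)) with the new one"
    # Tilde expansion, word splitting and pathname expansion do not run.
    echo "~ and * stay   as typed"
}
double_quote_demo
```

The second line comes out exactly as written, including the run of spaces, because the quoted text is never split or matched against file names.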

Single Quotes:

Text inside single quotes (‘) skips every step after Tokenize and is handed to the command as it is.
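
For comparison, a small sketch of single quotes, assuming a POSIX-style shell; single_quote_demo is a hypothetical name.

```shell
single_quote_demo() {
    name="joe"   # set, but never substituted below
    # Nothing inside single quotes is expanded:
    # $name, $(...), ~ and * all reach the command literally.
    printf '%s\n' '$name ran $(date) in ~ on *'
}
single_quote_demo
```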

eval:

While quotes are used to skip steps, the ‘eval’ command is used to run the entire parsing sequence a second time, i.e. the words that reach eval at step 11 are fed back to step 1 and parsed all over again.
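
One common use of that second pass is reading a variable whose name is stored in another variable. The sketch below assumes a POSIX-style shell; eval_demo, target and HOME_DIR are hypothetical names.

```shell
eval_demo() {
    target="HOME_DIR"        # hypothetical: holds the NAME of another variable
    HOME_DIR="/home/joe"     # hypothetical value

    # First pass: \$ survives as a literal $ and $target becomes HOME_DIR,
    # so eval receives the text: value=$HOME_DIR
    # Second pass (inside eval): $HOME_DIR is substituted.
    eval "value=\$$target"
    printf '%s\n' "$value"
}
eval_demo
```

Because eval parses its input as shell code, it should only be given text you control.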


Understanding the Shell Parsing Order

February 9, 2010

The shell reads the command line as pipelines and lists,

* A pipeline is a sequence of one or more commands separated by the character |
* A list is a sequence of one or more pipelines separated by one of the operators: ; , & , && , ||

For each pipeline it performs the following steps before executing the command,
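
To make the two terms concrete, here is a short sketch assuming a POSIX-style shell; list_demo is a hypothetical name.

```shell
list_demo() {
    # One pipeline: two commands joined by |
    printf 'b\na\n' | sort
    # A list: pipelines joined by && and ||
    false && echo "not printed" || echo "fallback"
    true && echo "and-branch"
}
list_demo
```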

1. Tokenize

Splits the command into tokens that are separated by a fixed set of metacharacters.
Tokens => words, keywords, I/O redirectors and semicolons
Metacharacters => space, tab, newline, ; ( ) < > | and &

2. Compound commands
Checks whether the first token is a keyword with no quotes or backslashes. It can be an opening keyword (such as if, { or (, which starts a compound command), a control-structure middle (such as then, else, do) or an end (fi, done).

3. Aliases
Checks the first word of each command against the list of aliases.
3.i) If a match is found, substitutes the alias’s definition and goes back to Step 1
eg. ll becomes ‘ls -l *’ ### where alias ll="ls -l *" is already defined
3.ii) Otherwise goes to Step 4

4. Brace expansion
Similar to pathname expansion, except that the file names generated need not exist. It is a mechanism by which arbitrary strings may be generated.

eg. a{b,c} becomes ab ac.

5. Tilde expansion
If a word begins with an unquoted tilde character ( ~ ), the characters following the tilde are treated as a possible login name.
eg. cd ~joe becomes cd ‘/home/joe’ ### where /home/joe is the $HOME of user joe
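
The ~joe form needs a real user named joe, so the sketch below uses a bare ~ instead, which expands to $HOME. It assumes a POSIX-style shell; tilde_demo is a hypothetical name and HOME is set to a made-up value inside a subshell so the result is predictable.

```shell
tilde_demo() (
    HOME="/home/joe"     # hypothetical home directory, local to this subshell
    unquoted=~           # unquoted tilde: expanded to $HOME
    quoted="~"           # quoted tilde: left alone
    printf '%s\n%s\n' "$unquoted" "$quoted"
)
tilde_demo
```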

6. Parameter substitution
The parameter is replaced with its value.
eg. # echo ${HOME} becomes echo ‘/home/joe’

7. Command substitution
The command is replaced with its standard output. It takes one of two forms: $(string) or `string`.
eg. # `which find` or $(which find) becomes ‘/usr/bin/find’

8. Arithmetic expression
Evaluates an arithmetic expression and substitutes the result. It is of the form $((string)).
eg.1 # $(($a+$b)) becomes ‘9’ ### where a=5; b=4

eg.2 # $(($(cat /tmp/file1 | wc -l) + $b)) becomes 7 ### calculate no. of lines in /tmp/file1 ( 3 lines ) and add it to the value of $b ( 4 )
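
Both examples can be tried without creating /tmp/file1. The sketch below assumes a POSIX-style shell and uses printf to stand in for a three-line file; arith_demo is a hypothetical name.

```shell
arith_demo() {
    a=5; b=4
    echo "$(($a + $b))"
    # Command substitution runs first, then the arithmetic is evaluated.
    lines=$(printf 'one\ntwo\nthree\n' | wc -l)
    echo "$((lines + b))"
}
arith_demo
```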

9. Word splitting
Takes the parts of the line that resulted from parameter substitution, command substitution and arithmetic expansion and splits them into words again, using the characters in $IFS as delimiters instead of the metacharacters used in step 1.

The default value of IFS is exactly [space][tab][newline]
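
A small sketch of how $IFS changes the split, assuming a POSIX-style shell; split_demo and the colon-separated record are hypothetical.

```shell
split_demo() {
    record="joe:x:1001"      # hypothetical colon-separated record
    # Default IFS: the unquoted expansion contains no blanks, so it stays one word.
    set -- $record
    echo "$#"
    # IFS=: makes the same expansion split into three words.
    old_ifs=$IFS
    IFS=:
    set -- $record
    IFS=$old_ifs
    echo "$# $1 $3"
}
split_demo
```

Quoting the expansion ("$record") would suppress the split in both cases.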

10. Pathname expansion
Also known as filename generation or wildcard expansion. The shell scans each word for the characters *, ? and [ (i.e. a pattern) and replaces the word with an alphabetically sorted list of file names matching the pattern.
eg.1 # file* becomes ‘file1 file1.txt file2 file2.txt file3 file3.txt’

eg.2 # file? becomes ‘file1 file2 file3’
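
The sketch below tries these patterns in a throwaway directory, assuming a POSIX-style shell with default options and the mktemp utility; glob_demo and the file names are hypothetical.

```shell
glob_demo() (
    # Work in a temporary directory so the result is predictable.
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch file1 file2 file3 file1.txt
    echo file?          # exactly one character after "file"
    echo file*          # anything after "file", sorted
    echo nomatch*       # nothing matches: the pattern is left as typed
    echo "file*"        # quoted: no expansion at all
    cd / && rm -rf "$dir"
)
glob_demo
```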

11. Functions, Built-ins and $PATH

At this stage the command has already been split into words: a simple command name followed by an optional list of arguments. If that command name, i.e. the first word, contains no slashes (so it is not a path), the shell tries to locate it in the following order,

11.i) Function
If a function by that name exists, it will be invoked
11.ii) Built-in commands
If a built-in command by that name exists, it will be invoked
11.iii) Search $PATH
Finally, the shell searches each directory listed in $PATH for a file with the command’s name. The first one found is chosen.
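
The search order can be observed by defining a function with the same name as a built-in. This is a sketch assuming a POSIX-style shell; lookup_demo is a hypothetical name, and the function runs in a subshell so the redefinition does not leak.

```shell
lookup_demo() (
    # A function named like a built-in is found first ...
    pwd() { echo "function pwd"; }
    pwd
    # ... unless "command" tells the shell to skip the function lookup.
    command pwd >/dev/null && echo "real pwd still reachable"
)
lookup_demo
```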

12. Redirection
Before a command is executed, its input and output may be redirected. The redirection operators include < , > , << and >>
eg. cat /tmp/file1 > /tmp/file2
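
A short sketch of the four operators, assuming a POSIX-style shell and the mktemp utility; redirect_demo and the temporary file are hypothetical.

```shell
redirect_demo() (
    tmp=$(mktemp) || exit 1
    echo "first" > "$tmp"      # > creates or truncates the file
    echo "second" >> "$tmp"    # >> appends to it
    cat < "$tmp"               # < feeds the file to standard input
    # << (here-document) feeds the lines up to EOF to standard input
    cat <<EOF
third
EOF
    rm -f "$tmp"
)
redirect_demo
```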


Introducing patternchecker

January 24, 2010

Often I need a small utility script/program to check what a regex pattern matches against known text, so that I don’t have to run my lengthy main script/program multiple times to get the pattern right.

I’ve used the following patternchecker script (Perl) and program (Java) while coming up with patterns and for the demonstrations in my previous posts.

Usage:
# java patterncheck
Enter pattern : (x*)(x)(x+)
Enter string:foxxxxx
Matched : fo<<< xxxxx >>>
group 1 = xxx
group 2 = x
group 3 = x

Perl and Java use very similar backtracking regex engines with largely compatible syntax, so they behave alike when matching these patterns. I’ve listed the source code in both Perl and Java out of curiosity, although you need only one of them,

i) perl

#!/usr/bin/perl
### read the regex pattern
print "Enter pattern :";
chomp($pattern=<STDIN>);

### count the opening parentheses (an upper bound on the capture groups)
$cnt = 0;
$cnt++ while( $pattern =~ /\(/g );

### read the string to be tested against
print "Enter string :";
chomp($input=<STDIN>);

if ( $input =~ /$pattern/ ){

      ### print prematch, match, postmatch
      print "Matched: $`<<< $& >>> $'\n";

      ### print $1, $2, etc
      foreach $i (1..${cnt}) { print "\$",$i," = ",${$i},"\n";}
} else {
      print "No Match\n";
}

ii) java

import java.io.*;
import java.util.regex.*;

public class patterncheck{
	public static void main(String args[]){
		String pattern, str;
		int groups;

		//Read pattern and string from the console
		Console c = System.console();
		if (c == null) {
			// System.console() returns null when no interactive console is attached
			System.err.println("No console available");
			return;
		}
		pattern=c.readLine("%s","Enter pattern :");
		str=c.readLine("%s","Enter string:");

		//compile and associate the pattern with the text
		Pattern p = Pattern.compile(pattern);
		Matcher m = p.matcher(str);

		//if there is a match
		if (m.find()){
			// print prematch, match & postmatch
			System.out.println("Matched : "+str.substring(0,m.start())+"<<< "+ m.group() +" >>>"+str.substring(m.end()));

			// print what was captured in the parentheses
			if (m.groupCount() != 0) {
				groups=m.groupCount();
				for (int i=1;i<=groups;i++){
					System.out.println("group "+i+" = "+m.group(i));
				}
			}
		}
		else {
			System.out.println("No Match");
		}
	}
}


Regular Expressions: Icebreakers

January 24, 2010

A few days back, there was a knowledge session on regular expressions within my team. After discussing the usual topics like greedy and lazy quantifiers, backreferences, etc., we started analyzing the match results for a few expressions. I am familiar with regex and have used it in the majority of my throw-away scripts, and I thought I knew regex until I was baffled by simple questions from the team. Below I list a few of those simplest-of-the-simple patterns, what they match and why (which is what actually led me to learn the rules of the game),

Before looking at them, did I mention earlier that the regex engine starts its search just before the first character of the string? If not, let me tell you now: it needs to start before the first character because, if the pattern contains anchors ( ^, \b, etc ), it has to check them too. The search also goes beyond the last character of the string, and now you know why ( to match $, \b, etc ).

(i) x*

pattern : x*

string :foxxx

Matched: <<<  >>> foxxx

Explanation: As mentioned in rule 2 of ‘Regular Expressions: The Rules’, greediness always tries to match more, so read the pattern ‘x*’ as ‘match as many occurrences of x as possible, or nothing’. The engine works through the string character by character. Since it cannot find any ‘x’ to match at the starting position, it tries its other choice, ‘match nothing’, and succeeds.

(ii) .*

pattern : .*

string :foxxx

Matched: <<< foxxx >>>

Explanation: ‘.’ matches anything other than ‘\n’. Although the pattern ‘.*’ can be read as ‘match as many characters other than ‘\n’ as possible, or nothing’, the rule of greediness gives preference to matching more characters.

(iii) x*

pattern : x*

string : xxxfoxxx

Matched: <<< xxx >>> foxxx

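Explanation: The same greediness favors matching more. The x* takes all three leading ‘x’ characters at the first position, and since the leftmost match wins, the ‘xxx’ at the end is never considered.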


Regular Expressions: The Rules

January 24, 2010

The following are the rules that a non-POSIX regular expression engine (such as the ones in Perl, Java, etc.) adheres to while attempting to match a string,

Notation: each example lists the given regex (pattern), the string tested against (string) and the actual match in the string, shown between ‘<<<’ and ‘>>>’.

1. The match that begins earliest/leftmost wins.

The intention was to match the ‘cat’ at the end, but the ‘cat’ in ‘catalogue’ won the match because it appears leftmost in the string.

pattern : cat

string :This catalogue has the names of different species of cat.

Matched: This <<< cat >>> alogue has the names of different species of cat.

1a. The leftmost match in the string wins, irrespective of the order in which the patterns appear in the alternation.

Though last in the alternation, ‘catalogue’ got the match as it appeared leftmost among the patterns in the alternation.

pattern :species|names|catalogue

string :This catalogue has the names of different species of cat.

Matched: This <<< catalogue >>>  has the names of different species of cat.

1b. If more than one plausible match occurs at the same position, then the order of the matching patterns in the alternation counts.

All three patterns have a possible match at the same position, but ‘over’ is successful as it appeared first in the alternation.

pattern : over|o|overnight

string :Actually, I’m an overnight success. But it took twenty years.

Matched: Actually, I’m an <<< over >>> night success. But it took twenty years.


2. The standard quantifiers (*, +, ? and {m,n}) are greedy

Greedy quantifiers (*, +, ?) always try to match as much as possible before falling back to the minimum number of characters needed for the match to succeed ( ‘0’ for * and ? ; ‘1’ for + )

The intention is to match “Joy is prayer”, but .* ran past all the inner double quotes, grabbed the rest of the string and gave back only enough to match the last double quote.

pattern :".*"

string :"Joy is prayer"."Joy is strength"."Joy is Love".

Matched: <<< "Joy is prayer"."Joy is strength"."Joy is Love" >>> .

2a. Lazy quantifiers favor the minimum match

Lazy quantifiers (*?, +?, ??) always try to settle for the minimum number of characters needed for the match to succeed before they try to match more.

With the lazy quantifier, the match ends at the first closing double quote.

pattern :".*?"

string :"Joy is prayer"."Joy is strength"."Joy is Love".

Matched: <<< "Joy is prayer" >>> ."Joy is strength"."Joy is Love".

2b. The only time greedy quantifiers give up what they matched earlier and settle for less is ‘when matching too much ends up causing some later part of the regex to fail’.

The \w* initially matches the whole word ‘regular_expressions’. That leaves the ‘s’ in the pattern with nothing to match, and the impending failure forces \w* to backtrack and match one character less. The final ‘s’ then matches the ‘s’ just released by \w* and the whole match succeeds.

Note: Though the pattern would work the same way without the parentheses, I’ve used them to show the individual matches in $1, $2, etc.

pattern : (\w*)(s)

string :regular_expressions

Matched: <<< regular_expressions >>>

$1 = regular_expression

$2 = s

Similarly, the ‘x’ initially matched by ‘x*’ is given up later in favor of the last ‘x’ in the pattern.

pattern : (x*)(x)

string : ox

Matched: o<<< x >>>

$1 =

$2 = x


2c. When more than one greedy quantifier appears in a pattern, the first one gets the preference.

The .* initially matches the whole string, so [0-9]+ is able to take back just one digit, ‘5’, from the .*, and it settles for that since one digit satisfies its minimum match criterion. Note that ‘+’ is also a greedy quantifier, but here it cannot grab beyond its minimum requirement because an earlier greedy quantifier has already claimed the rest of the match.

pattern : (.*)([0-9]+)

string : Bangalore-560025

Matched: <<< Bangalore-560025 >>>

$1 = Bangalore-56002

$2 = 5


3. Overall match takes precedence.

The ability to report a successful match takes precedence. As shown in the previous example, if it is necessary for a successful match, the quantifiers ( greedy or lazy ) work in harmony with the rest of the pattern.


Modify attribute properties in Active Directory

October 5, 2009

Long, long ago, I did an Active Directory configuration with a minor (and not widely documented) tweak for converting a single-valued attribute to a multi-valued attribute. I automated a couple of test cases for adding multiple values to an attribute using the LDAP eWay, and was living happily till last week.

Last week one of those unfortunate events happened, i.e. the automated tests were failing, and a few minutes of investigation revealed that I needed to redo the ADS setup on a different machine for multiple reasons.

Everything went well (duplicating the domain data, the SSL configuration) till I tried to add multiple values to the attribute, only to see the error below in the LDAP client,


the attribute cannot be modified because its owned by the system

I searched my documents in vain for the tiny secret formula which had helped me long back. I struggled, googled and bing(l)ed for hours, and desperately installed a few third-party tools hoping they would do the job, until I finally stumbled on it.

Here are the steps to convert a single-valued attribute to a multi-valued attribute,

  1. Log in as a member of Schema Admins
  2. Launch LDP.EXE
  3. Connect to the Schema Master using LDP.EXE
  4. Bind to the Schema Master using an account with Schema Admin permissions.
  5. From the Browse menu, choose Modify
  6. In the Modify dialog box, leave the DN field blank, and type schemaUpgradeInProgress in the Attribute field. In the Value field, enter the number 1. Click the Enter button, then click the Run button.
  7. Close the Modify dialog box.
  8. Launch ADSIEDIT.MSC and go to the properties of the attribute (here it is ‘sn’)
  9. Check for the property name ‘isSingleValued’ and change the value to False. Click on Apply and close the property window.
  10. Run LDP again, and change the value of schemaUpgradeInProgress from 1 to 0.
  11. From the Active Directory Schema console, right click on the console and choose "Reload the Schema"

Finally, check with your favorite LDAP client that multiple values can be added for the same attribute.


GlassFish ESB v2.1 AMI for EC2

June 23, 2009

GlassFish ESB v2.1 is now available via Amazon Elastic Compute Cloud ( Amazon EC2 ). To help build and secure cloud applications easily, the GlassFish ESB v2.1 AMI is created on hardened OpenSolaris.

This AMI is installed with only the runtime component of GlassFish ESB v2.1, i.e. the GlassFish 2.1 server plus the JBI components, with default configurations. For details on port numbers and credentials, check /root/ec2sun/GFESBV21_README. Applications created using the design-time component of GlassFish ESB v2.1 can be deployed to the EC2 instance using the admin console. Port ‘4848’ has to be opened on the EC2 instance to access the admin console.

AMI ID: ami-5347a13a

AMI Manifest: opensolaris-2008.11-gfesb-v2.1.img.manifest.xml


Need for Speed: Undercover

April 26, 2009
If you are a racing-game freak, you could never have missed NFS. I’ve been playing NFS since the Hot Pursuit days, and here is my review of the latest version, Undercover.

I vaguely remember playing NFS for the first time around 2000. If I recall correctly, I was stunned by the introduction video of NFS: Hot Pursuit, which captivated me into playing the game till the end, spending countless hours and sleepless nights. Since then my love for the game has never faded; even after almost a decade, NFS stays at the top of my skimpy list of PC games.

One thing about NFS: from Hot Pursuit to Underground, Carbon Canyon to ProStreet, they all resemble a flick, be it a ‘The Fast and the Furious’ sequel, ‘Gone in Sixty Seconds’ or you name it. That’s what makes it more interesting.

About NFS Undercover:

The cops are back, and this time ‘You’re the Cop’. Hey, I’m not trying to confuse you: remember, though a cop, you’re undercover, trying to be one of those bad gangs. The graphics, the plot and the choice of cars are all superb as usual. I’ve already finished the game once and started all over again. Until NFS: Shift is released, I guess I will play Undercover a couple more times.

Let’s take the wheel, infiltrate them and take them out one by one.


My daughter’s graduation@Playschool

March 5, 2009

It’s a wonderful experience watching your kid’s first graduation, and even more so watching her perform on stage. I’ve started a blog of her own here, where I post all the wonderful happenings in her life.

