The lexical analyzer

A colleague asked me today how to use regular expressions. While looking through my old material I found a lexical analyzer that I wrote during my university years, so I am sharing it with you. If it helps anyone learn this topic, I will be very happy.

The purpose of the experiment:

1, By designing, writing and debugging a concrete lexical analysis program, deepen the understanding of the principles of lexical analysis, and learn how programming language source code is broken down into words of different categories during the scanning process.

2, Write a routine that reads one word at a time: from the input source program, identify every word that has independent meaning, in five categories: reserved words, identifiers, constants, operators and separators. For every word, output its internal code and the value of the word symbol itself.

The experimental requirements

The program must be written with a visual programming tool such as C++ Builder, Delphi, VB, VC or JBuilder, and it must have a graphical interface (i.e. a general Windows application interface).



The experimental principle and approach

1, Definition: define the constants, variables and data structures.

2, Initialization: read the source program from the input file into a character buffer.

3, Read a word: skip the extra blanks.

4, Assemble characters into words: the word scanning procedure is built from the transition diagram in the textbook (p. 90, Fig. 4.5).

5, Look up the category code of the word.

6, Display (export) the results.
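The article does not show the driver loop that ties these steps together, so here is a minimal sketch of steps 3 to 6 under my own assumptions. The class name ScanLoopSketch and the labels "word", "number" and "symbol" are hypothetical; in the real program the three branches would call modules such as isIdentifier, isDigit and isOther instead of the simplified inline loops used here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver loop: skips blanks, then dispatches on the first
// character of each word. Not the author's original driver.
final class ScanLoopSketch {
    static List<String> scan(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char ch = src.charAt(i);
            if (Character.isWhitespace(ch)) {   // step 3: skip extra blanks
                i++;
                continue;
            }
            int start = i;
            String kind;
            if (Character.isLetter(ch) || ch == '_') {
                // step 4: letters, digits and '_' continue a word
                while (i < src.length()
                        && (Character.isLetterOrDigit(src.charAt(i)) || src.charAt(i) == '_')) {
                    i++;
                }
                kind = "word";
            } else if (Character.isDigit(ch)) {
                while (i < src.length() && Character.isDigit(src.charAt(i))) {
                    i++;
                }
                kind = "number";
            } else {
                i++;                            // any other single character
                kind = "symbol";
            }
            // steps 5 and 6: label the word and record it for display
            out.add(src.substring(start, i) + "\t" + kind);
        }
        return out;
    }
}
```

Calling ScanLoopSketch.scan("int x1 = 42;") yields five entries: int, x1, =, 42 and ;, each followed by a tab and its label. Every loop checks the index against src.length(), so the sketch does not need a terminator character at the end of the input.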


Reserved words: if, int, for, while, do, return, break, continue and so on; the category code is 1.

Every other word is recognized as an identifier; the category code is 2.

Constants are unsigned numbers; the category code is 3.

Operators include: +, -, *, /, =, >, <, etc.; more complex ones such as <=, >=, == and != can also be considered; the category code is 4.

Separators include: "," ";" "(" ")" "{" "}"; the category code is 5.
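The five category codes above can be captured in one small type. This is only a sketch: the enum name TokenKind, its constant names and the describe helper are hypothetical and do not appear in the original program, which prints a text label instead of the numeric code.

```java
// Hypothetical enum for the five word categories and their codes.
enum TokenKind {
    RESERVED_WORD(1),
    IDENTIFIER(2),
    CONSTANT(3),
    OPERATOR(4),
    SEPARATOR(5);

    final int code;

    TokenKind(int code) {
        this.code = code;
    }

    /** Formats one output line as the word, a tab, then the category code. */
    String describe(String lexeme) {
        return lexeme + "\t" + code;
    }
}
```

For example, TokenKind.IDENTIFIER.describe("count") gives the word count, a tab, and the code 2.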


In the source code, the keywords, delimiters and operators are defined as follows:

String[] key={"auto","break","case","char","const","continue","default","do","double",
"else","enum","extern","float","for","goto","if","int","long","register","return","short","signed","static","struct","switch","typedef","union","unsigned","void","volatile","while"}; //Keywords

String[] delimiter={"(",")","[","]",",",";","{","}","#"}; //Delimiters

String[] operation1={"*","%","&","+","-","<","=",">","!","|"}; //One-character operators

String[] operation2={"++","--","&&","==","!=","||","<=",">=","<<",">>"}; //Two-character operators
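As an alternative to searching the key array in a loop, the keywords can be placed in a set. The sketch below assumes the same keyword list as above; the class name KeywordTable and its two methods are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical keyword lookup backed by a hash set.
final class KeywordTable {
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "auto", "break", "case", "char", "const", "continue", "default", "do", "double",
        "else", "enum", "extern", "float", "for", "goto", "if", "int", "long", "register",
        "return", "short", "signed", "static", "struct", "switch", "typedef", "union",
        "unsigned", "void", "volatile", "while"));

    /** True when the word is one of the reserved words (case-sensitive). */
    static boolean isKeyword(String word) {
        return KEYWORDS.contains(word);
    }

    /** Category code from the experiment: 1 for reserved words, 2 for identifiers. */
    static int categoryCode(String word) {
        return isKeyword(word) ? 1 : 2;
    }
}
```

With this table, KeywordTable.categoryCode("while") is 1 and KeywordTable.categoryCode("count") is 2. The lookup is case-sensitive, so "While" counts as an identifier.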



Code


Identifier recognition module:
private char isIdentifier(char ch)
{
	boolean isKey=false;
	String str="";
	str+=ch;
	count++;
	ch=allchar.charAt(count);       //Read the next character from the source text area
	while(Character.isLetter(ch)||Character.isDigit(ch)||ch=='_')//Keep reading while the character can continue an identifier
	{
		str+=ch;
		count++;
		ch=allchar.charAt(count);
	}
	for(int j=0;j<key.length;j++)     //Compare with the known keywords
	{
		if(key[j].equals(str))
		{
			isKey=true;        //It is a keyword, so stop searching
			break;
		}
	}
	if(isKey)
		result.append(str+"\treserved word\n");
	else
		result.append(str+"\tidentifier\n");     //Any other word is an identifier
	return ch;
}




Constant recognition module (including floating point numbers)
private char isDigit(char ch)
{
	String str="";
	str+=ch;
	count++;
	ch=allchar.charAt(count);
	while(Character.isDigit(ch))     //Collect the integer part
	{
		str+=ch;
		count++;
		ch=allchar.charAt(count);
	}
	if(ch=='.')     //A decimal point follows the digits
	{
		str+=ch;     //Append the point
		count++;
		ch=allchar.charAt(count);     //Read the next character
		while(Character.isDigit(ch))     //Collect the fractional digits
		{
			str+=ch;
			count++;
			ch=allchar.charAt(count);
		}
		//No digit after the point, or a letter or a second point after the digits, is reported as an error
		if(str.endsWith(".")||Character.isLetter(ch)||ch=='.')
		{
			while(Character.isDigit(ch)||Character.isLetter(ch)||ch=='.')
			{
				str+=ch;
				count++;
				ch=allchar.charAt(count);
			}
			result.append(str+"\tidentifier error\n");
		}
		else
			result.append(str+"\tfloating point\n");
	}
	else if(Character.isLetter(ch))     //A letter directly after the digits makes an invalid identifier
	{
		while(Character.isDigit(ch)||Character.isLetter(ch)||ch=='.')
		{
			str+=ch;
			count++;
			ch=allchar.charAt(count);
		}
		result.append(str+"\tidentifier error\n");
	}
	else     //A word terminator follows, so this is an integer constant
	{
		result.append(str+"\tinteger\n");
	}
	return ch;
}
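Since the original question was about regular expressions, the same three outcomes (integer, floating point, error) can also be expressed with java.util.regex once the whole word has been collected. This is a sketch under my own assumptions, not part of the original program; the class name NumberClassifier is hypothetical, and a word such as "1." is treated as an error here.

```java
import java.util.regex.Pattern;

// Hypothetical regex-based classifier for a word that starts with a digit.
final class NumberClassifier {
    private static final Pattern INTEGER = Pattern.compile("\\d+");
    private static final Pattern FLOAT = Pattern.compile("\\d+\\.\\d+");

    /** Returns "integer", "floating point" or "error" for the whole word. */
    static String classify(String lexeme) {
        if (INTEGER.matcher(lexeme).matches()) {
            return "integer";
        }
        if (FLOAT.matcher(lexeme).matches()) {
            return "floating point";
        }
        return "error";   // e.g. "12ab", "3.14.1" or "1."
    }
}
```

Matcher.matches() requires the entire word to match the pattern, which is why "3.14.1" and "12ab" fall through to the error case.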

Comment recognition module (both // comments and /* */ comments)
private char isComment(char ch)
{
	String str="";
	str+=ch;
	count++;
	ch=allchar.charAt(count);
	if(ch=='/')     //"//" comment: skip to the end of the line
	{
		while(ch!='\n')
		{
			str+=ch;
			count++;
			ch=allchar.charAt(count);
		}
	}
	else if(ch=='*')     //"/* */" comment
	{
		str+=ch;
		count++;
		ch=allchar.charAt(count);
		//Stop at the closing "*/"; the length check keeps "/*/" from counting as a complete comment
		while(!(str.length()>2&&str.charAt(str.length()-1)=='*'&&ch=='/'))
		{
			str+=ch;
			count++;
			ch=allchar.charAt(count);
		}
		count++;     //Step past the closing '/'
		ch=allchar.charAt(count);
	}
	else     //Otherwise it is the division operator
		result.append(str+"\tdivision operator\n");
	return ch;
}
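The comment module above reads with charAt(count) and therefore relies on the input containing a newline or a closing "*/" before the text ends. A bounds-checked variant can be sketched with String.startsWith and String.indexOf. The class name CommentSkipper and the decision to treat an unterminated comment as running to the end of the input are my assumptions.

```java
// Hypothetical helper that returns the index just past a comment.
final class CommentSkipper {
    /**
     * If a comment starts at pos, returns the index where scanning should
     * resume; otherwise returns pos unchanged (a lone '/' is division).
     */
    static int skip(String src, int pos) {
        if (src.startsWith("//", pos)) {
            int eol = src.indexOf('\n', pos);
            return eol < 0 ? src.length() : eol;         // stop at the newline
        }
        if (src.startsWith("/*", pos)) {
            int close = src.indexOf("*/", pos + 2);      // search after the opener
            return close < 0 ? src.length() : close + 2; // resume after "*/"
        }
        return pos;
    }
}
```

Searching for "*/" from pos + 2 means that the input "/*/" is not mistaken for a complete comment.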

Operator and delimiter recognition module
private char isOther(char ch)
{
	String str="";
	str+=ch;
	count++;
	for(int j=0;j<delimiter.length;j++)//Compare with the known delimiters
	{
		if(delimiter[j].equals(str))
		{
			result.append(str+"\tdelimiter\n");
			ch=allchar.charAt(count);
			return ch;
		}
	}
	for(int i=0;i<operation1.length;i++)//First match against the one-character operators
	{
		if(operation1[i].equals(str))
		{
			ch=allchar.charAt(count);
			for(int k=0;k<operation2.length;k++)
			{
				if(operation2[k].equals(str+ch))//Both characters must form a known two-character operator
				{
					str+=ch;
					result.append(str+"\toperator\n"); //Two-character operator
					count++;
					ch=allchar.charAt(count);
					return ch;
				}
			}
			result.append(str+"\toperator\n"); //One-character operator
			return ch;
		}
	}
	if(ch=='"')//String literal; str already holds the opening quote
	{
		ch=allchar.charAt(count);
		while(ch!='"'&&ch!='\n')//Read up to the closing quote or the end of the line
		{
			str+=ch;
			count++;
			ch=allchar.charAt(count);
		}
		if(ch=='"')
		{
			count++;
			str+=ch;
			result.append(str+"\tstring\n");
		}
		else
		{
			result.append(str+"\tstring error\n");
		}
	}
	ch=allchar.charAt(count);
	return ch;
}
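The operator matching can also be written as a longest-match lookup: try every two-character operator at the current position first, and fall back to the one-character operators only if none fits. The sketch below reuses the two operator tables from the article; the class name OperatorMatcher and the null return for "no operator here" are my assumptions.

```java
// Hypothetical longest-match lookup over the article's operator tables.
final class OperatorMatcher {
    private static final String[] TWO_CHAR =
        {"++", "--", "&&", "==", "!=", "||", "<=", ">=", "<<", ">>"};
    private static final String[] ONE_CHAR =
        {"*", "%", "&", "+", "-", "<", "=", ">", "!", "|"};

    /** Returns the operator starting at pos, or null if there is none. */
    static String match(String src, int pos) {
        for (String op : TWO_CHAR) {
            if (src.startsWith(op, pos)) {   // both characters must match
                return op;
            }
        }
        for (String op : ONE_CHAR) {
            if (src.startsWith(op, pos)) {
                return op;
            }
        }
        return null;
    }
}
```

For the input "a<=b", OperatorMatcher.match(src, 1) returns "<=", while for "+&" at position 0 it returns "+", because "+&" is not in the two-character table.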

Posted by Regina at December 01, 2013 - 12:10 PM