This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: semantic error: cannot expand unknown type


Frank Ch. Eigler wrote:

Hi -



- The parser does not recognize keywords as such. [...]


[...]
We've got several cases of a single token serving multiple purposes:



Yikes. Thanks for giving it a start.




The token "function" - it has 2 different uses:
- as a keyword, as in 'function foo()'
- as an identifier, as in 'probe kernel.function("sys_read")'
The token "return" - it has 2 different uses:
- as a keyword, as in "function foo() { return; }"
- as an identifier, as in 'probe kernel.function("sys_read").return'
The above two seem reasonable and I've worked around them (not very
elegantly).



A more elegant approach for these may be to explicitly accept either
tok_identifier or tok_keyword elements in parse_probe_point(). I
would not hard-code "return" and "function" in this way.



Yeah, it is a bit ugly. However, unless we accept only "return" and "function" here, the user is free to write 'probe kernel.while("sys_read").if'.


Perhaps we could take a hybrid approach. Instead of the tok_types being mutually exclusive, what if they were a set of flags? Then the "return" and "function" tokens could be both identifiers and keywords at the same time.
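
To make the flag idea concrete, here is a minimal standalone sketch. It is not part of the attached patch, which keeps token_type as an exclusive enum. The names token_flags, tf_identifier, tf_keyword, classify_word and ok_in_probe_point are invented for illustration; only the keyword list is taken from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical flag-style token classes. The names below are illustrative;
// the patch in this message keeps token_type as a plain exclusive enum.
enum token_flags : unsigned {
  tf_identifier = 1u << 0,
  tf_keyword    = 1u << 1
};

// Classify a word scanned by the lexer. Every reserved word gets tf_keyword;
// only "function" and "return" also keep tf_identifier, so they remain
// usable as probe point components.
inline unsigned classify_word (const std::string& w)
{
  static const char* const keywords[] = {
    "probe", "global", "function", "if", "else", "for", "foreach", "in",
    "return", "delete", "while", "break", "continue", "next", "string", "long"
  };
  for (const char* k : keywords)
    if (w == k)
      {
        if (w == "function" || w == "return")
          return tf_keyword | tf_identifier;
        return tf_keyword;
      }
  return tf_identifier;
}

// What a parse_probe_point()-like check would ask: is this word allowed
// as a probe point component?
inline bool ok_in_probe_point (const std::string& w)
{
  return (classify_word (w) & tf_identifier) != 0;
}

// Small self-check of the intended behavior.
inline void example_usage ()
{
  assert (ok_in_probe_point ("function"));  // kernel.function("...")
  assert (ok_in_probe_point ("return"));    // ....return
  assert (! ok_in_probe_point ("while"));   // kernel.while("...") is rejected
  assert (ok_in_probe_point ("kernel"));    // ordinary identifier
}
```

With this shape, parse_probe_point() would not need to name "function" and "return" itself; the decision about which keywords double as identifiers would live in one place in the lexer.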



Then there are the odd cases, like:
The token "string" - it has (at least) 3 different uses:
- as a keyword, as in "function foo(a:string)"
- as an identifier as in the name of a function, like "function string:string(num:long)" (as is done in conversions.stp)
- as an identifier as in the name of a variable, like 'string = "abc"'



We should outlaw the latter two (and rename or remove the current
string() function in conversions.stp).


Glad we agree. What do we want to rename the string() function to? ltoa()? long2str()? long_to_string()? Or something else? I'd probably pick "long_to_string()".

Note that there are other keywords that could be used similarly.
The keyword "long" could be used as a function name or variable
name. The keywords "if", "while", "foreach", etc. can be used as
function names (of course they will never get called, but still).
The keyword "global" can be used as a variable name (and is in the
testsuite that way). The keyword "probe" could be used as a
variable name.



These will all go away if the parse functions related to
expressions/symbols refuse to accept tok_keyword. That "global
global" test case would have to be changed. The parser should end up
with no strange hacks.


Glad we agree here too. I was just trying to point out some of the existing oddities.

I've removed all the hacks (except for the one in parse_probe_point() that we were discussing above).

(e.g., expect_unknown() should not need to be changed at all; "string"
and "global" indeed shouldn't show up in parse_statement() or
parse_functiondecl() or so on.)


When I removed all the hacks, I didn't need to change "expect_unknown()" at all. I've attached a new patch.



My suggestion would be to "reserve" keywords, so that using keywords as
function names, parameter names, or variable names isn't allowed. [...]



That is in effect what would happen if the new tok_keyword token class is not permitted in most terminals.
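
As a rough illustration of "not permitted in most terminals", the sketch below uses miniature stand-in types rather than the real parser classes. mini_token, expect_name and rejected_as_name are hypothetical names, and std::runtime_error stands in for parse_error. The point is only that a terminal which demands a plain identifier rejects keywords without any per-keyword special cases.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Miniature stand-ins for the parser's token types (hypothetical names).
enum mini_token_type { mini_identifier, mini_keyword, mini_operator };

struct mini_token
{
  mini_token_type type;
  std::string content;
};

// Hypothetical helper for places that need a function, parameter, or
// variable name: only plain identifiers pass, so keywords are reserved.
inline std::string expect_name (const mini_token& t)
{
  if (t.type != mini_identifier)
    throw std::runtime_error ("expected identifier, saw '" + t.content + "'");
  return t.content;
}

// Returns true if the token would be rejected as a name.
inline bool rejected_as_name (const mini_token& t)
{
  try { expect_name (t); }
  catch (const std::runtime_error&) { return true; }
  return false;
}

inline void example_usage2 ()
{
  assert (expect_name (mini_token{mini_identifier, "app_r"}) == "app_r");
  assert (rejected_as_name (mini_token{mini_keyword, "global"}));  // "global global"
  assert (rejected_as_name (mini_token{mini_keyword, "string"}));  // string = "abc"
}
```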



[...] Finally note that this doesn't actually solve my original
problem of using "return" instead of "next" in a probe but is a step
in that direction.



What is the parse tree (-p1) that results from your test case, after the parser changes?

- FChE


I've attached the whole thing, but the relevant part is here:

if (!(ok)) return (value) = (retval())

When the parser sees the return, it isn't smart enough to realize that a return keyword isn't valid in a probe body, only in a function.
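
One conceivable way to close that gap, which the attached patch does not attempt, is to tell the statement parser what kind of body it is in. The sketch below is standalone and purely illustrative: body_context, check_keyword_context and allowed_in are invented names, and std::runtime_error stands in for parse_error.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical marker for what kind of body is being parsed.
enum body_context { ctx_probe, ctx_function };

// Hypothetical check a parse_statement()-like dispatcher could run before
// handing a keyword to its statement parser.
inline void check_keyword_context (const std::string& keyword, body_context ctx)
{
  if (keyword == "return" && ctx == ctx_probe)
    throw std::runtime_error ("'return' is only valid in a function; "
                              "use 'next' to leave a probe early");
}

// Returns true if the keyword may start a statement in the given context.
inline bool allowed_in (const std::string& keyword, body_context ctx)
{
  try { check_keyword_context (keyword, ctx); }
  catch (const std::runtime_error&) { return false; }
  return true;
}

inline void example_usage3 ()
{
  assert (! allowed_in ("return", ctx_probe));
  assert (allowed_in ("return", ctx_function));
  assert (allowed_in ("next", ctx_probe));
}
```

With a check like this, the 'if (! ok) return' in the test script would be reported as an error at the return, instead of being parsed as 'return (value) = (retval())'.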


Index: parse.cxx
===================================================================
RCS file: /cvs/systemtap/src/parse.cxx,v
retrieving revision 1.45
diff -u -p -r1.45 parse.cxx
--- parse.cxx	9 May 2006 12:55:57 -0000	1.45
+++ parse.cxx	12 May 2006 22:23:04 -0000
@@ -72,6 +72,7 @@ tt2str(token_type tt)
     case tok_string: return "string";
     case tok_number: return "number";
     case tok_embedded: return "embedded-code";
+    case tok_keyword: return "keyword";
     }
   return "unknown token";
 }
@@ -91,7 +92,7 @@ operator << (ostream& o, const token& t)
 {
   o << tt2str(t.type);
 
-  if (t.type != tok_embedded) // XXX: other types?
+  if (t.type != tok_embedded && t.type != tok_keyword) // XXX: other types?
     {
       o << " '";
       for (unsigned i=0; i<t.content.length(); i++)
@@ -505,6 +506,26 @@ lexer::scan ()
               n->content = arg;
             }
         }
+      else
+        {
+	  if (n->content    == "probe"
+	      || n->content == "global"
+	      || n->content == "function"
+	      || n->content == "if"
+	      || n->content == "else"
+	      || n->content == "for"
+	      || n->content == "foreach"
+	      || n->content == "in"
+	      || n->content == "return"
+	      || n->content == "delete"
+	      || n->content == "while"
+	      || n->content == "break"
+	      || n->content == "continue"
+	      || n->content == "next"
+	      || n->content == "string"
+	      || n->content == "long")
+	    n->type = tok_keyword;
+        }
 
       return n;
     }
@@ -725,11 +746,11 @@ parser::parse ()
 	    break;
 
           empty = false;
-	  if (t->type == tok_identifier && t->content == "probe")
+	  if (t->type == tok_keyword && t->content == "probe")
             parse_probe (f->probes, f->aliases);
-	  else if (t->type == tok_identifier && t->content == "global")
+	  else if (t->type == tok_keyword && t->content == "global")
 	    parse_global (f->globals);
-	  else if (t->type == tok_identifier && t->content == "function")
+	  else if (t->type == tok_keyword && t->content == "function")
             parse_functiondecl (f->functions);
           else if (t->type == tok_embedded)
             f->embeds.push_back (parse_embeddedcode ());
@@ -782,7 +803,7 @@ parser::parse_probe (std::vector<probe *
 		     std::vector<probe_alias *> & alias_ret)
 {
   const token* t0 = next ();
-  if (! (t0->type == tok_identifier && t0->content == "probe"))
+  if (! (t0->type == tok_keyword && t0->content == "probe"))
     throw parse_error ("expected 'probe'");
 
   vector<probe_point *> aliases;
@@ -926,23 +947,23 @@ parser::parse_statement ()
     }
   else if (t && t->type == tok_operator && t->content == "{")  
     return parse_stmt_block ();
-  else if (t && t->type == tok_identifier && t->content == "if")
+  else if (t && t->type == tok_keyword && t->content == "if")
     return parse_if_statement ();
-  else if (t && t->type == tok_identifier && t->content == "for")
+  else if (t && t->type == tok_keyword && t->content == "for")
     return parse_for_loop ();
-  else if (t && t->type == tok_identifier && t->content == "foreach")
+  else if (t && t->type == tok_keyword && t->content == "foreach")
     return parse_foreach_loop ();
-  else if (t && t->type == tok_identifier && t->content == "return")
+  else if (t && t->type == tok_keyword && t->content == "return")
     return parse_return_statement ();
-  else if (t && t->type == tok_identifier && t->content == "delete")
+  else if (t && t->type == tok_keyword && t->content == "delete")
     return parse_delete_statement ();
-  else if (t && t->type == tok_identifier && t->content == "while")
+  else if (t && t->type == tok_keyword && t->content == "while")
     return parse_while_loop ();
-  else if (t && t->type == tok_identifier && t->content == "break")
+  else if (t && t->type == tok_keyword && t->content == "break")
     return parse_break_statement ();
-  else if (t && t->type == tok_identifier && t->content == "continue")
+  else if (t && t->type == tok_keyword && t->content == "continue")
     return parse_continue_statement ();
-  else if (t && t->type == tok_identifier && t->content == "next")
+  else if (t && t->type == tok_keyword && t->content == "next")
     return parse_next_statement ();
   // XXX: "do/while" statement?
   else if (t && (t->type == tok_operator || // expressions are flexible
@@ -960,7 +981,7 @@ void
 parser::parse_global (vector <vardecl*>& globals)
 {
   const token* t0 = next ();
-  if (! (t0->type == tok_identifier && t0->content == "global"))
+  if (! (t0->type == tok_keyword && t0->content == "global"))
     throw parse_error ("expected 'global'");
 
   while (1)
@@ -994,12 +1015,14 @@ void
 parser::parse_functiondecl (std::vector<functiondecl*>& functions)
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "function"))
+  if (! (t->type == tok_keyword && t->content == "function"))
     throw parse_error ("expected 'function'");
 
 
   t = next ();
-  if (! (t->type == tok_identifier))
+  if (! (t->type == tok_identifier)
+      && ! (t->type == tok_keyword
+	    && (t->content == "string" || t->content == "long")))
     throw parse_error ("expected identifier");
 
   for (unsigned i=0; i<functions.size(); i++)
@@ -1014,9 +1037,9 @@ parser::parse_functiondecl (std::vector<
   if (t->type == tok_operator && t->content == ":")
     {
       t = next ();
-      if (t->type == tok_identifier && t->content == "string")
+      if (t->type == tok_keyword && t->content == "string")
 	fd->type = pe_string;
-      else if (t->type == tok_identifier && t->content == "long")
+      else if (t->type == tok_keyword && t->content == "long")
 	fd->type = pe_long;
       else throw parse_error ("expected 'string' or 'long'");
 
@@ -1044,9 +1067,9 @@ parser::parse_functiondecl (std::vector<
       if (t->type == tok_operator && t->content == ":")
 	{
 	  t = next ();
-	  if (t->type == tok_identifier && t->content == "string")
+	  if (t->type == tok_keyword && t->content == "string")
 	    vd->type = pe_string;
-	  else if (t->type == tok_identifier && t->content == "long")
+	  else if (t->type == tok_keyword && t->content == "long")
 	    vd->type = pe_long;
 	  else throw parse_error ("expected 'string' or 'long'");
 	  
@@ -1078,8 +1101,10 @@ parser::parse_probe_point ()
   while (1)
     {
       const token* t = next ();
-      if (! (t->type == tok_identifier ||
-             (t->type == tok_operator && t->content == "*")))
+      if (! (t->type == tok_identifier
+	     || (t->type == tok_keyword
+		 && (t->content == "function" || t->content == "return"))
+	     || (t->type == tok_operator && t->content == "*")))
         throw parse_error ("expected identifier or '*'");
 
       if (pl->tok == 0) pl->tok = t;
@@ -1160,7 +1185,7 @@ if_statement*
 parser::parse_if_statement ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "if"))
+  if (! (t->type == tok_keyword && t->content == "if"))
     throw parse_error ("expected 'if'");
   if_statement* s = new if_statement;
   s->tok = t;
@@ -1178,7 +1203,7 @@ parser::parse_if_statement ()
   s->thenblock = parse_statement ();
 
   t = peek ();
-  if (t && t->type == tok_identifier && t->content == "else")
+  if (t && t->type == tok_keyword && t->content == "else")
     {
       next ();
       s->elseblock = parse_statement ();
@@ -1205,7 +1230,7 @@ return_statement*
 parser::parse_return_statement ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "return"))
+  if (! (t->type == tok_keyword && t->content == "return"))
     throw parse_error ("expected 'return'");
   return_statement* s = new return_statement;
   s->tok = t;
@@ -1218,7 +1243,7 @@ delete_statement*
 parser::parse_delete_statement ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "delete"))
+  if (! (t->type == tok_keyword && t->content == "delete"))
     throw parse_error ("expected 'delete'");
   delete_statement* s = new delete_statement;
   s->tok = t;
@@ -1231,7 +1256,7 @@ next_statement*
 parser::parse_next_statement ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "next"))
+  if (! (t->type == tok_keyword && t->content == "next"))
     throw parse_error ("expected 'next'");
   next_statement* s = new next_statement;
   s->tok = t;
@@ -1243,7 +1268,7 @@ break_statement*
 parser::parse_break_statement ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "break"))
+  if (! (t->type == tok_keyword && t->content == "break"))
     throw parse_error ("expected 'break'");
   break_statement* s = new break_statement;
   s->tok = t;
@@ -1255,7 +1280,7 @@ continue_statement*
 parser::parse_continue_statement ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "continue"))
+  if (! (t->type == tok_keyword && t->content == "continue"))
     throw parse_error ("expected 'continue'");
   continue_statement* s = new continue_statement;
   s->tok = t;
@@ -1267,7 +1292,7 @@ for_loop*
 parser::parse_for_loop ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "for"))
+  if (! (t->type == tok_keyword && t->content == "for"))
     throw parse_error ("expected 'for'");
   for_loop* s = new for_loop;
   s->tok = t;
@@ -1333,7 +1358,7 @@ for_loop*
 parser::parse_while_loop ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "while"))
+  if (! (t->type == tok_keyword && t->content == "while"))
     throw parse_error ("expected 'while'");
   for_loop* s = new for_loop;
   s->tok = t;
@@ -1364,7 +1389,7 @@ foreach_loop*
 parser::parse_foreach_loop ()
 {
   const token* t = next ();
-  if (! (t->type == tok_identifier && t->content == "foreach"))
+  if (! (t->type == tok_keyword && t->content == "foreach"))
     throw parse_error ("expected 'foreach'");
   foreach_loop* s = new foreach_loop;
   s->tok = t;
@@ -1426,7 +1451,7 @@ parser::parse_foreach_loop ()
     }
 
   t = next ();
-  if (! (t->type == tok_identifier && t->content == "in"))
+  if (! (t->type == tok_keyword && t->content == "in"))
     throw parse_error ("expected 'in'");
  
   s->base = parse_indexable();
@@ -1672,7 +1697,7 @@ parser::parse_array_in ()
     }
 
   t = peek ();
-  if (t && t->type == tok_identifier && t->content == "in")
+  if (t && t->type == tok_keyword && t->content == "in")
     {
       array_in *e = new array_in;
       e->tok = t;
# parse tree dump
# file rwtop_fail.stp
global OPT_name
global OPT_pid
global PID
global NAME
global app_r
global app_w
probe begin{
(OPT_name) = (0)
(OPT_pid) = (0)
(PID) = (0)
(NAME) = (".")
(app_r) = (0)
;
(app_w) = (0)
;
printf("Tracing... Please wait.\n")
;
}
probe syscall.read.return,
syscall.write.return{
(ok) = (0)
(((OPT_name) == (1)) && ((NAME) == (execname())))?((ok) = (1)):(1)
(((OPT_pid) == (1)) && ((PID) == (pid())))?((ok) = (1)):(1)
if (!(ok)) return (value) = (retval())

if ((value) > (0)) {
if ((name) == ("read")) {
(app_r) += (value)
}
else {
(app_w) += (value)
}

}

}

/*
 * Command line arguments
 */
global OPT_name
global OPT_pid
global PID
global NAME

global app_r
global app_w

 
/*
 * Print header
 */
probe begin 
{
    OPT_name 	= 0
    OPT_pid 	= 0
    PID		= 0
    NAME 	= "."

    /* starting values */
    app_r = 0;
    app_w = 0;

    printf("Tracing... Please wait.\n");
}

/*
 * Check event is being traced
 */
probe syscall.read.return, syscall.write.return
{
    ok = 0

    /* check each filter, */
    (OPT_name == 1 && NAME == execname())? ok = 1 : 1
    (OPT_pid == 1 && PID == pid()) ? ok = 1 : 1

    if (! ok)
	return

    /*
     * Increment tallys
     */
    value = retval()
    if (value > 0)
    {
	if (name == "read")
	{
	    app_r += value
	}
	else
	{
	    app_w += value
	}
    }
}

