A horrible bug....


77 replies to this topic

#46 Tim Woodall

Tim Woodall

    Member

  • Members
  • PipPip
  • 20 posts

Posted 19 July 2001 - 01:27 PM

>Show me a case where it makes sense for a literal with a value of 1 to be 0 then I'll eat my shorts.

Consider a global logic with 6 parameters. The function returns from 1 to 5 values and returns them in the variables 2, 3, 4...

Variable 1 is the number of values to return.

Now, if you only want to return 1 parameter wouldn't it be nice if you could put zeros into the other 'O' parameters.

I don't think it ever makes sense to set 1 to 0 (unless we are considering starting the Obfuscated ProIV Competition :-) )

#47 Mike Nicholson

Mike Nicholson

    Expert

  • Members
  • PipPipPipPip
  • 196 posts
  • Gender:Male
  • Location:Stockholm, Sweden

Posted 19 July 2001 - 01:29 PM

> 'If VAR2 is an alias to a literal and VAR1 was retrieved from a database then should this generate a gen error?'

No, because it's not a literal, it's a scratch and you can change the value of a scratch.

> 'What about if the database has a constraint such that VAR1 can never equal 1?'

Then the IF statement will never be executed, what's the problem there?

Let's rephrase this as:

IF $OUT.MODE = 'O', 'B' THEN
1 = #OUTPUT.VARIABLE
ENDIF

Which is really what we're talking about. You wouldn't be able to put this in logic - the gen would fail - so why is it allowed in this form??

We're not really asking for brain surgery here, just take the check that puts a warning out in DS and put it in the gen routine.

Alternatively I'd love to hear the example where a developer actually needs to change the value of a literal.

Cheers

Mike (who wasn't going to post again, damn!! ;-) )

#48 Mike Nicholson

Posted 19 July 2001 - 01:35 PM

<drill sergeant mode>

ProIV team, about face!!

</drill sergeant mode>

#49 Stuart Burton

Stuart Burton

    Advanced

  • Members
  • PipPipPip
  • 71 posts
  • Gender:Male
  • Location:Luton, United Kingdom

Posted 19 July 2001 - 01:45 PM

Tim, you seem to have got a little confused here mate! Your global logic example is 100% correct. You should be able to change the values of variables. It's literals that should be prevented from being amended.

#50 Richard Bassett

Richard Bassett

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 696 posts
  • Location:Rural France

Posted 24 July 2001 - 06:57 PM

WARNING: This is a very long post. If you need a coffee or other stimulant go and get it now :-)

I've been away and I see activity in this topic tailed off with no really satisfactory conclusion.

It bothers me that despite a lot of discussion, no real progress seemed to be made in the sense that no common understanding seemed to result and what to me is a serious problem didn't seem likely to be fixed even though it could be (at least to my satisfaction as explained later).

One reason it bothers me is because it indicates that an important aspect of this forum - the potential ability to communicate and resolve serious technical issues - is getting devalued. So, just this once, I'm taking the trouble to set out my view in painful detail. I think most if not all the relevant issues came up in previous postings. Maybe it'll help to put it in one place or maybe not.

Maybe we all need to reflect on the rather adversarial nature of the discussion. If that's part of the problem (and really it shouldn't be for techies should it?) we need to do something because no way can I afford the time to do this regularly and nor I imagine can anyone else.


Formal and Actual parameters
----------------------------
A formal parameter is the variable in the called subprogram (global logic or global function)
An actual parameter is the corresponding variable, literal or expression in the calling code.


Semantic models of parameter passing
------------------------------------
The behaviour of most formal parameters can be categorized in one of three ways:
1/ They (only) receive data from the corresponding actual parameter.
2/ They (only) transmit data to the corresponding actual parameter.
3/ They do both.

These three 'models' are often referred to as IN, OUT and INOUT respectively (and these are near as dammit the keywords actually used in Ada). I'll use these as they map reasonably nicely to the ProIV (I)nput, (O)utput and (B)oth parameter types for global functions.


Implementations of parameter passing
------------------------------------
As far as I can see there are four methods of passing parameters relevant to this discussion (I'm not going to waste my time on esoterica like call-by-name in Algol):

1/ Pass by Value - suitable for IN parameters and sometimes referred to as 'copy-in semantics'. The formal parameter is (initialized with) a copy of the value of the actual parameter. The formal parameter can be safely modified. There is no attempt to modify the actual parameter when the subprogram returns. The actual parameter can safely be a variable, constant or an expression.

2/ Pass by Result - suitable for OUT parameters and sometimes referred to as 'copy-out semantics'. The formal parameter starts out uninitialized and can be modified by the subprogram. The final value of the formal parameter when the subprogram returns is copied into the actual parameter. For safety, the actual parameter must be a variable.

3/ Pass by Value-Result - suitable for INOUT parameters and sometimes referred to as 'copy-in/out semantics' or 'pass by copy'. The formal parameter is (initialized with) a copy of the value of the actual parameter. The formal parameter can be modified. The final value of the formal parameter when the subprogram returns is copied into the actual parameter. For safety, the actual parameter must be a variable.

4/ Pass by Reference - suitable for INOUT parameters and sometimes referred to as 'call by reference' or 'reference semantics'. The formal parameter is an ALIAS for the actual parameter. The formal parameter can be modified and the modifications are made directly to the actual parameter. For safety the actual parameter must be a variable.

NOTE that at this point I have merely tried to say that in the GENERAL case, pass by reference and pass by value-result are suitable for INOUT parameters (only) and that, consequently, the actual parameter should be a variable for safety.

The SPECIFIC problem we are interested in (see below) is the use of any of the mechanisms other than pass by value for IN parameters where actual parameters such as literals or named constants MUST not be modified.
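
To make the four mechanisms concrete, here is a minimal C sketch. C itself only passes by value, so result, value-result and reference passing are simulated here with pointers and local copies. All function names are made up for illustration and nothing here reflects how the ProIV kernel is actually written.

```c
#include <assert.h>

/* 1/ Pass by value: the callee works on its own copy. */
static int by_value(int formal) {
    formal = formal + 1;        /* the caller's actual parameter is untouched */
    return formal;
}

/* 2/ Pass by result (simulated): the formal takes nothing from the
      caller; its final value is copied out on return. */
static void by_result(int *actual) {
    int formal;                 /* not initialised from the caller */
    formal = 42;
    *actual = formal;           /* copy-out */
}

/* 3/ Pass by value-result (simulated): copy in, work locally, copy out. */
static void by_value_result(int *actual) {
    int formal = *actual;       /* copy-in */
    formal = formal * 2;
    *actual = formal;           /* copy-out */
}

/* 4/ Pass by reference (simulated with a pointer): every write goes
      straight to the caller's variable. */
static void by_reference(int *formal) {
    *formal = *formal * 2;
}

/* Returns 1 if every mechanism behaves as described above. */
int demo_mechanisms(void) {
    int a = 10, b = 0, c = 10, d = 10;
    int r = by_value(a);
    by_result(&b);
    by_value_result(&c);
    by_reference(&d);
    assert(a == 10);            /* only by_value leaves its actual alone */
    return r == 11 && b == 42 && c == 20 && d == 20;
}
```

Note that only by_value could safely be handed a literal: the other three all end by writing into whatever the caller supplied.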


The problem
-----------
I quote from 'Concepts of Programming Languages' by Robert Sebesta because he puts it very nicely:
'A subtle but fatal error can occur with pass by reference parameters. Suppose a program contains two references to the constant 10, the first being an actual parameter in a call to a subprogram. Further suppose that the subprogram mistakenly changes the formal parameter that corresponds to the 10 to the value 5. The compiler for this program may have built a single location for the value 10 during compilation, as compilers often do, and may use that location for all references to the constant 10 in the program. But after the return from the subprogram, all subsequent occurrences of 10 will actually be references to the value 5. This can happen in some systems, and it creates a programming problem that is very difficult to diagnose. This did in fact happen with many implementations of Fortran IV.'

Please note that although the above example and forum discussion have focused on numeric literals, the problem applies equally to string literals (indeed any data intended to be constant). Note also that the problem applies for pass by result and value-result as well as pass by reference.
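
The scenario Sebesta describes can be imitated in a few lines of C. This is only a sketch: pool_ten is a hypothetical stand-in for the single storage location a compiler might build for the constant 10, and buggy() plays the subprogram that mistakenly assigns to its formal parameter.

```c
#include <assert.h>

/* Hypothetical literal pool: one shared slot for every use of "10". */
static int pool_ten = 10;

/* Buggy subprogram: mistakenly assigns to its formal parameter. */
static void buggy(int *formal) {
    *formal = 5;
}

/* Returns what "10 + 10" evaluates to after the buggy call. */
int ten_plus_ten_after_call(void) {
    pool_ten = 10;                 /* reset so the demo is repeatable */
    buggy(&pool_ten);              /* first use of 10: passed by reference */
    return pool_ten + pool_ten;    /* later uses of "10" now read 5 */
}
```

Calling ten_plus_ten_after_call() yields 10 rather than 20, and nothing at the point of the addition hints at why.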


Call by reference does NOT have to be unsafe
--------------------------------------------
It has been suggested several times in the forum that where the call by reference mechanism is used, this will OF NECESSITY bring with it the problem of constants/literals potentially being modified.

This is not true. PROVIDED a language requires that all formal parameters are classified as IN, OUT or INOUT, the language disallows modifications to IN parameters and the language ensures constants/literals/expressions are ONLY passed as IN parameters THEN the problem cannot arise. Ada has been doing this since 1983.

However, as we all know, most languages (including ProIV) were not designed to be as safe as Ada. Thus in most languages using call by reference this problem does arise. Even then it is not a disaster, PROVIDED that the runtime implementation/environment (whatever form that takes) raises an error when any attempt is made to modify data that should be immutable (for example memory known to be read-only).
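
As a rough illustration of the idea (not of Ada itself), C's const qualifier gives a weak form of the IN rule: the compiler rejects writes through a const parameter. The sketch below uses made-up function names. It is only a partial analogue, because C will still accept a string literal as the argument to shout() without complaint, and writing to it would then be undefined behaviour, which is exactly the gap Ada closes.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* IN parameter: const lets the callee read but makes the compiler
   reject any write through in_text. */
static size_t count_chars(const char *in_text) {
    /* in_text[0] = 'X';   <- would not compile */
    return strlen(in_text);
}

/* INOUT parameter: writable, so the caller should pass a variable. */
static void shout(char *inout_text) {
    for (size_t i = 0; inout_text[i] != '\0'; i++) {
        inout_text[i] = (char)toupper((unsigned char)inout_text[i]);
    }
}

/* Returns 1 if both calls behave as described. */
int demo_in_inout(void) {
    char buf[] = "hello";                            /* a real variable   */
    size_t n = count_chars("literal is fine here");  /* literal as IN     */
    shout(buf);                                      /* variable as INOUT */
    assert(n == 20);
    return strcmp(buf, "HELLO") == 0;
}
```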

What mechanisms are actually being used in ProIV?
-------------------------------------------------
The documentation (NativeDevGuide46r204_20000308.pdf page 160) suggests that the parameter passing for global functions is as follows:
(I)nput parameters - pass by value
(O)utput parameters - pass by result
(B)oth parameters - pass by value-result.

So, interestingly despite the extent of discussion on the topic, none of these are actually passed by reference in the sense I'm used to. However, since the problem we are interested in arises equally using pass by result or value-result that does not matter much.

Incidentally, it is lucky that (I)nput parameters are passed by value rather than by reference. If they were passed by reference, it would be necessary to check that the (I)nput formal parameters in a global function were never passed to other global functions except as (I)nput parameters also.

The documentation (LogicGuide_46r204_20000308.pdf page 124) states that Global logic parameters are passed by reference. It also of course cautions 'For best results, do not change the value of a literal' (no 'you see' comments please - the whole point is no one would do this knowingly, only inadvertently).

Pragmatic solutions
-------------------
It has been pointed out that later development environments (DS, VIP) will warn the programmer when a literal is passed as a global FUNCTION parameter of type (O)utput or (B)oth.

Personally, I agree with the suggestions that this should be a fatal error not a warning and that the check should have been built into the gen process for safety regardless of the development environment used. This is important in solution 'A' suggested below but cannot solve the whole problem.

The more serious problem is with global LOGIC. ProIV has no syntax for global logic parameters to indicate whether a parameter is IN, OUT or INOUT and so all parameters are treated as INOUT.

To be able to spot the problem at gen time, ProIV PARMS syntax would have to be extended and the gen process would have to check that global logic 'IN' parameters were never modified and were only passed to another global logic or to a global function as 'IN' parameters.

Even then this would not help with the millions of lines of existing code, some of which undoubtedly contain this error. Also, because it cannot realistically be made mandatory (backward compatibility with those millions of existing lines) it isn't guaranteed to prevent the problem in new code.

SO, pragmatically, I don't think we can fix the whole problem with gen-time errors because the stable door is open and the horse bolted long ago. Instead, I can see two options for modifying the runtime behaviour of the kernel - either to prevent literals being damaged inadvertently or to detect damage at the point it occurs. I'll try to explain these below:

A/ As has been suggested already, a copy could be made (internally) whenever a literal is passed to a global logic and a reference to the copy then used. This would give literals the 'copy-in semantics' of pass by value and solve the problem for global logic (I'm sure the performance hit would be negligible). In this case, a 'full' solution would still really require that the global FUNCTION parameter check currently performed by DS should be part of the gen process and raise a fatal error.

B/ The kernel could be modified so that literals were marked as such in memory and that any attempt to modify the value of a literal would raise a runtime error. Doubtless someone will say such a runtime check is 'expensive'. I'm not sure it HAS to be, but assuming it were it could be made switchable and switched on for coding/testing/debugging and off in production. Personally I like this kind of solution as it clearly must catch all cases - even those we may have overlooked or which might be introduced later by new features.
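
Options A and B can be sketched side by side in C. Everything here is hypothetical: the Cell struct, the is_literal flag and the function names are invented for the illustration and say nothing about how the ProIV kernel really stores its data.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical runtime cell: a value plus a "this is a literal" mark. */
typedef struct {
    int  value;
    bool is_literal;
} Cell;

/* Option B: every store is checked. Returns false where a real
   kernel would raise a runtime error. */
static bool cell_store(Cell *c, int v) {
    if (c->is_literal) {
        return false;           /* attempt to modify a literal */
    }
    c->value = v;
    return true;
}

/* A callee that (wrongly) writes to its parameter. */
static bool callee(Cell *parm) {
    return cell_store(parm, 0);
}

/* Option A: hand the callee a private, writable copy of the literal. */
static bool call_with_copy(const Cell *actual) {
    Cell copy = *actual;
    copy.is_literal = false;
    return callee(&copy);       /* the callee can only scribble on the copy */
}

/* Returns 1 if the literal survives both options intact. */
int demo_options(void) {
    Cell one = { 1, true };
    bool a_ok = call_with_copy(&one);   /* write succeeds, on the copy  */
    int after_a = one.value;            /* still 1                      */
    bool b_ok = callee(&one);           /* write refused and reported   */
    return a_ok && after_a == 1 && !b_ok && one.value == 1;
}
```

Under A the bad write goes unnoticed but does no harm. Under B it is refused and reported at the point it happens, at the cost of one flag test per store.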


Right, I need a beer.
Nothing's as simple as you think

#51 Richard Bassett

Posted 24 July 2001 - 07:06 PM

For the record, previous posts have said that C does not have 'call by reference' and that there are no 'references' in C. While it may be an atypical part of C, arrays in C (including character strings) nevertheless are passed by reference and do exhibit reference-like behaviour.

Consider the following code fragments (yes the mistake is deliberate because it nicely illustrates the main problem we are discussing).

    int j;
    char greeting[] = "Hello World";
    j = strlen(greeting);

    int strlen(char str[])
    {
        int i = 0;
        while (!(str[i] = '\0'))
            i = i + 1;
        return i;
    }

This does not mean I think C's idiosyncrasies should have any bearing on what ProIV ought to do :-)

#52 Richard Bassett

Posted 24 July 2001 - 07:09 PM

Tim,

I didn't mean call by reference and 'call by pointer' were the same. By 'simulate' I meant you can achieve the same effects as call by reference (ie: reference semantics in a general sense) not that you could achieve it with the same syntax. 'Equivalent' pointer syntax will always be more verbose I guess because of the explicit dereferencing required - hence the mention of Cfront because it must have been doing that for you.

#53 Dan Shannon

Dan Shannon

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 374 posts
  • Gender:Male
  • Location:Australia

Posted 24 July 2001 - 09:03 PM

Richard,

1. Marvellous, some sensible semantic argument.

2. I need a beer after reading all that too.

3. I agree with solution 'A'.

4. In extension to solution 'A', and more specifically with reference to calls made to global functions, it is easy to spot, at gen time, whether the 'INOUT' or 'OUT' parameter is in fact a constant, i.e. a literal in the case of PRO-IV. Therefore it shouldn't be too hard to modify the gen process to implement a copy of the literal, allow the subprogram (global function) to modify it, but then restore the original value (or refuse to overwrite the initial value, which is the same thing) on return to the calling program. The alternative fix would be to enforce the use of genuine variables in a call (otherwise resulting in a gen error), which would effectively require a logic execution point before call to global function via a given interface, so that literals can be manually coded into variables. This second alternative, however, would surely require a great deal more coding in the kernel, as well as in all of our applications.

5. I'd STILL like to hear of ANY example in an existing application where this behaviour is used deliberately to provide functionality - my guess is that there isn't one. If this is the case (there's no extant deliberate use of this feature of the language) and the development community genuinely doesn't want this feature to operate in the way it does, then by force of sheer pragmatism, and totally ignoring (computer) scientific correctness, I'd say that the way forward is to change the functionality that has been suggested.

6. I found a beer. Happiness is mine.

Dan Shannon

#54 Tim Woodall

Posted 25 July 2001 - 09:21 AM

This is getting tedious.

C DOES NOT HAVE PASS BY REFERENCE EVEN FOR ARRAYS.

#include <stdio.h>

int main(void)
{
    char ptr[] = "Hello world";
    printf("%p\n", (void *)ptr);
    callbyref(ptr);
    printf("%p\n", (void *)ptr);
    return 0;
}

You can write whatever you like for callbyref. The value of ptr WILL NOT CHANGE

C does have call by reference.

void callbyref(char* const &ptr)
{
    char* &hack = const_cast<char* &>(ptr);  // strip the const so the reference can be written through
    // hack += 1;
}

If you uncomment the hack += 1 the code will have undefined results. In all probability one of two things will happen:

It will run but the value of ptr will not change - the compiler is doing a copy of the constant before calling callbyref. This is how ProIV would address the issue but it does incur some runtime consequences. I would expect that with today's processors the consequences will be minimal to undetectable but this hasn't always been the case.

It will crash with an access violation because the hack += 1 attempts to write to memory marked as read only. ProIV could also implement this model but might break code that used to work. Note that this is badly written code but all the same ...

However, it is possible for the value of the 'constant' ptr to change in a fully standards compliant implementation of C . This is because you are invoking undefined behaviour.

Will people please find out what the difference between call by reference and call by value is and how you can use pointers to simulate call by reference in a call by value model.

Note that in a call by reference model you cannot pass constants. You have to simulate call by value by passing a reference to a copy of the constant.

#55 Rob Donovan

Rob Donovan

    rob@proivrc.com

  • Admin
  • 1,640 posts
  • Gender:Male
  • Location:Spain

Posted 25 July 2001 - 09:59 AM

Whether it is or not, it's irrelevant.

1 should not equal 0.

I don't care about other languages.

This was a simple request that PROIV should not allow this, to help us while developing applications, and I think everyone apart from PROIV LTD agrees.

Rob D

#56 Tim Woodall

Posted 25 July 2001 - 10:30 AM

> In this case, a 'full' solution would still really require that the global FUNCTION parameter check currently performed by DS should be part of the gen process and raise a fatal error.

It can't be a fatal error: consider existing code something like this

PARAMETERS (#N,#V1,#V2,#V3)

CASE #N
WHEN 3:
#V3=3
#V2=2
#V1=1
WHEN 2:
#V2=2
#V1=1
WHEN 1:
#V1=1
OTHERWISE:
#N=0
ENDCASE

Called with the values

1,#R1,0,0
3,#R1,#R2,#R3

Maybe no one has written code like this but if it is changed there is bound to be someone somewhere who has a global function that is referenced half a million times using exactly this trick :-( This change would force all those dummy constants that are never used to be changed to dummy scratch variables.

I think the solution is to ALWAYS do a copy for a literal, not just for global logics. I don't think there will be any performance issues. However, when ProIV was originally written not wasting machine cycles was much more important.


Finally, there is a difference between call by reference and call by value-result.

If you pass the same parameter into a function twice then modify one of those input values inside the function, both will immediately change in the call by reference model and the value at return will be defined. For call by value-result only one will change inside the function and the return value is undefined. (unclear is probably a better description. It can, of course be defined to be the first or second parameter arbitrarily)
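
The difference shows up as soon as the same variable is passed twice. The C sketch below simulates both models with pointers. The function names are invented, and the copy-out order in value_result_model (first parameter, then second) is an arbitrary choice made for the example, which is the very thing that makes the result 'unclear'.

```c
#include <assert.h>

/* Call by reference (simulated): p and q alias the same variable. */
static int ref_model(int *p, int *q) {
    *p = *p + 1;
    return *q;              /* sees the change at once when p == q */
}

/* Call by value-result (simulated): copy in, work on locals, copy out. */
static int value_result_model(int *p, int *q) {
    int fp = *p, fq = *q;   /* copy-in */
    fp = fp + 1;
    int seen = fq;          /* the second formal still holds the old value */
    *p = fp;                /* copy-out, first parameter */
    *q = fq;                /* copy-out, second parameter: overwrites the first */
    return seen;
}

/* Each demo packs (value seen inside the call) * 10 + (value after return). */
int demo_ref(void) {
    int x = 1;
    int seen = ref_model(&x, &x);
    return seen * 10 + x;
}

int demo_value_result(void) {
    int y = 1;
    int seen = value_result_model(&y, &y);
    return seen * 10 + y;
}
```

With x and y both starting at 1, demo_ref() gives 22 (the change is visible inside the call and afterwards) while demo_value_result() gives 11 (the increment is not seen inside and is then lost on return). Swapping the two copy-out lines would make the increment survive instead.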

I have already suggested (not here) that for a global function a gen error (or possibly a warning) should be raised if you attempt to pass the same variable in more than one B and/or O parameter.

Finally, ProIV allows you to use the same name for two 'target' variables of a call. This is a minefield. I can only assume that this allows B parameters to be implemented as I O parameters or maybe B was added later and there is some legacy code that uses I O to achieve the same result but I suspect that other queer results can be achieved. 1=2? You ain't seen nothing yet :-)

So the problem/issue is much bigger than this thread would suggest. Fixing some bits of it should be simple but there are fundamental issues of support for legacy apps that need to be addressed.

#57 Glenn Meyers

Glenn Meyers

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 221 posts
  • Gender:Male
  • Location:St. Louis, MO, United States
  • Interests:I also raise African Gray Parrots and build hot rod automobiles.

Posted 25 July 2001 - 12:49 PM

I haven't recovered from Political Correctness yet.
Now we have Computer Correctness and Scientific Correctness??


Where's my Lemming suit?? I feel the need to hurl myself off a tall cliff!

#58 Richard Bassett

Posted 25 July 2001 - 01:17 PM

You are absolutely correct in saying my 'full' solution would break code of the type you describe.

Personally, I think the caller should abide by the parameter types.
For example (assume a global logic for ease of illustration):

Where definition of MYGLOBAL has PARMS(#N, #V1, #V2, #V3)

Then I think it better practice for a caller to say:
MYGLOBAL(1, #R1, #unused, #unused)
than to say
MYGLOBAL(1, #R1, 0, 0)

Nevertheless, I accept that people may well have done this.

As you and Dan suggest, the solution is to copy the literal to provide call by value behaviour.

As I said in reply to Dan, the only reason I don't like this is that it makes the global function parameter types a bit of a misnomer in the case of literals. As a 'strictness' enthusiast, I would be prepared to correct existing usage which I considered bad practice but I accept that not everybody will.

>I have already suggested (not here) that for a global function a gen error (or possibly a warning) should be raised if you attempt to pass the same variable in more than one B and/or O parameter.
Absolutely agree.

ER.. So as far as I can see, implementing call by value semantics for literals in all cases for global logics and global functions will solve the problem to everyone's (almost complete) satisfaction. And a warning for using the same variable twice as a 'target' argument would be icing on the cake.

Any chance of this happening then?

#59 Richard Bassett

Posted 25 July 2001 - 01:22 PM

Your point 4 is valid, not least because copying the literal is the way to address Tim's objection regarding backward compatibility. The reason I didn't like it is because it makes the (O)utput and (B)oth parameter types 'misleading' in the case of literals. For example, if they were OUT and INOUT keywords in a 3GL they would no longer always 'mean what they said' because literals would always have call-by-value (IN) semantics regardless of the coded parameter type.

This is a case where I think there is conflict between backward compatibility and what I would call keeping the language 'honest'. Nevertheless I wouldn't have a problem with backward compatibility winning out, it usually does :-)

I certainly wouldn't want to see your alternative (more logic entry points).

#60 Richard Bassett

Posted 25 July 2001 - 01:31 PM

> This is getting tedious.
I guess so for those of us who think they have nothing to learn.
Seriously, I don't want to get into a pissing contest here, but I get the impression you're about to explode with righteous anger and I really don't think it's justified.

>C DOES NOT HAVE PASS BY REFERENCE EVEN FOR ARRAYS.
'C arrays are always passed by reference'
p.371 Programming Language Pragmatics, Michael L Scott, Morgan Kaufmann, 2000

> char ptr[] = "Hello world";
There is an important difference between these definitions:
char amessage[] = "now is the time"; /* an array */
char *pmessage = "now is the time"; /* a pointer */
p.104 The C Programming Language 2nd Ed, Kernighan & Ritchie, Prentice Hall, 1988
Your choice of the variable name ptr is misleading.

> You can write whatever you like for callbyref. The value of ptr WILL NOT CHANGE
You missed the whole point. No one is interested in the value of ptr changing, only in the value of the array (ie: 'Hello World'). If you look at my previous posting the variable str behaves precisely as an alias for the variable greeting - with reference semantics. The deliberate mistake (the classic = for ==) made it clear that modifying the string was the issue.

> C does have call by reference.
PlusPlus eaten by Rob I presume (he must be getting full by now).

>Will people please find out what the difference between call by reference and call by value is and how you can use pointers to simulate call by reference in a call by value model.
I thought my 'very long' post set out the difference pretty clearly as I understand it, which bit didn't you like? Also when I previously said 'You can simulate call-by-reference in any language which has pointers' you dissed me saying 'Call by reference and call by pointer are NOT the same' so I'm not clear what you're trying to say here really.

>Note that in a call by reference model you cannot pass constants. You have to simulate call by value by passing a reference to a copy of the constant.
Possibly you are remaking your (valid) point about numeric literals which might be difficult to access 'by reference' (in registers maybe?). However for constants in the general sense I disagree. You don't need a copy if the language is safe enough to guarantee the constant will not be modified. This was in my 'very long' post so I'll just copy it:

Call by reference does NOT have to be unsafe
--------------------------------------------
It has been suggested several times in the forum that where the call by reference mechanism is used, this will OF NECESSITY bring with it the problem of constants/literals potentially being modified.
This is not true. PROVIDED a language requires that all formal parameters are classified as IN, OUT or INOUT, the language disallows modifications to IN parameters and the language ensures constants/literals/expressions are ONLY passed as IN parameters THEN the problem cannot arise. Ada has been doing this since 1983.


