Entries in language (4)

Saturday
Mar 20, 2010

Google Translate: from here to there and back (I do not speak Русский)

round-trip n.: from here to there and back again

The live translation feature in Google Translate works quite well, but it would be even better with live round-trip translation. The state of the art in automatic translation is very good now, but I find that the round trip lets me refine the original text and reduce the “information loss” in the translation.

Information loss in this context can be considered the difference (or delta) between the source and the round-trip. Making the complete round-trip interactive makes re-phrasing and re-wording easier, and I can easily adjust what I am trying to say based on the results. 

The results are improved because adjusting the source either

  1. makes it less ambiguous or more similar to the target language, or
  2. brings it closer to a well-translated or common phrase in the target language.

The linked text and images take you right to Google Translate with this example data pre-loaded.

Source:

Sudden Internet fame is unpredictable. What is predictable is the temptation to use his fame to push some political or nationalist message. Poor Mr. Trololo!

Translation:

Внезапная слава Интернету непредсказуемо. Что является вполне предсказуемой соблазн использовать свою славу интернет доставить некоторые политические или националистические сообщения. Бедный мистер Trololo!

Round-trip:

Sudden fame internet unpredictably. What is predictable temptation to use his fame to bring the Internet, some political or nationalistic message. Poor Mr. Trololo!

Before the final two illustrations, I will break to introduce Mr. Trololo. He discusses his Internet fame here (in Русский, of course).

Video:

 

Illustrations:

figure 1

figure 2

Sunday
Dec 06, 2009

Google's Go language: multi-value return vs. exceptions (C++)

The following is a question I asked on Stack Overflow, along with some of the answers. There is more interesting material on the site. This question is tricky in that I don’t want to start a religious debate about exceptions and error handling, but I do find this feature interesting in light of Google’s “exception” prohibition.

I do think it is worthwhile to start with a “blank slate” and bring together the best features of other languages to address a specific need. I think the Go language shows promise; the feature set, syntax and run-time properties are very interesting. Apparently, the decision to support the C object file linkage standard means it begins life with a large set of libraries. Another interesting question is whether the language can be implemented on the JVM runtime.

Is Google’s “Go” language multi-value return statement an alternative to exceptions?

It seems to me Google’s alternatives to exceptions are

  • Go: multi-value return “return val, err;”
  • Go, C++: nil checks (early return)
  • Go, C++: “handle the damn error” (my term)
  • C++: assert(expression)

Is multi-value return useful enough to act as an alternative? Why are “asserts” considered alternatives? Does Google think it O.K. if a program halts if an error occurs that is not handled correctly?

Effective Go: Multiple return values

One of Go’s unusual features is that functions and methods can return multiple values. This can be used to improve on a couple of clumsy idioms in C programs: in-band error returns (such as -1 for EOF) and modifying an argument.

In C, a write error is signaled by a negative count with the error code secreted away in a volatile location. In Go, Write can return a count and an error: “Yes, you wrote some bytes but not all of them because you filled the device”. The signature of *File.Write in package os is:

func (file *File) Write(b []byte) (n int, err Error)

and as the documentation says, it returns the number of bytes written and a non-nil Error when n != len(b). This is a common style; see the section on error handling for more examples.

Effective Go: Named result parameters

The return or result “parameters” of a Go function can be given names and used as regular variables, just like the incoming parameters. When named, they are initialized to the zero values for their types when the function begins; if the function executes a return statement with no arguments, the current values of the result parameters are used as the returned values.

The names are not mandatory but they can make code shorter and clearer: they’re documentation. If we name the results of nextInt it becomes obvious which returned int is which.

func nextInt(b []byte, pos int) (value, nextPos int) {

Because named results are initialized and tied to an unadorned return, they can simplify as well as clarify. Here’s a version of io.ReadFull that uses them well:

func ReadFull(r Reader, buf []byte) (n int, err os.Error) {
    for len(buf) > 0 && err == nil {
        var nr int;
        nr, err = r.Read(buf);
        n += nr;
        buf = buf[nr:len(buf)];
    }
    return;
}

Why does Go not have exceptions?

Exceptions are a similar story. A number of designs for exceptions have been proposed but each adds significant complexity to the language and run-time. By their very nature, exceptions span functions and perhaps even goroutines; they have wide-ranging implications. There is also concern about the effect they would have on the libraries. They are, by definition, exceptional yet experience with other languages that support them show they have profound effect on library and interface specification. It would be nice to find a design that allows them to be truly exceptional without encouraging common errors to turn into special control flow that requires every programmer to compensate.

Like generics, exceptions remain an open issue.

Google C++ Style Guide: Exceptions

Decision:

On their face, the benefits of using exceptions outweigh the costs, especially in new projects. However, for existing code, the introduction of exceptions has implications on all dependent code. If exceptions can be propagated beyond a new project, it also becomes problematic to integrate the new project into existing exception-free code. Because most existing C++ code at Google is not prepared to deal with exceptions, it is comparatively difficult to adopt new code that generates exceptions.

Given that Google’s existing code is not exception-tolerant, the costs of using exceptions are somewhat greater than the costs in a new project. The conversion process would be slow and error-prone. We don’t believe that the available alternatives to exceptions, such as error codes and assertions, introduce a significant burden.

Our advice against using exceptions is not predicated on philosophical or moral grounds, but practical ones. Because we’d like to use our open-source projects at Google and it’s difficult to do so if those projects use exceptions, we need to advise against exceptions in Google open-source projects as well. Things would probably be different if we had to do it all over again from scratch.

 

Answers I liked

Steve Jessop

Multiple returns are not unique to Go, and they’re not a substitute for exceptions. In C (or C++) terms, they are a concise and user-friendly substitute for returning a struct (object) containing multiple values.

They do provide a convenient means of indicating errors, if that’s all you mean.

Why are “asserts” considered alternatives?

Asserts are initially for debugging. They halt the program in situations where it is in an “impossible” state, one that the design says should not happen, but which has anyway. Returning an error is unlikely to help much. The code base obviously doesn’t work yet, so how on earth can it successfully recover? Why would you even want it to, when there’s a bug that needs attention?

Using asserts in production code is a bit of a different matter - obviously there are performance and code size concerns, so the usual approach is to remove them once your code analysis and tests have convinced you that the “impossible” situations really are impossible. But, if you’re running code at this level of paranoia, that it’s auditing itself, then you’re probably also paranoid that if you let it carry on running in an “impossible” state, then it might do something dangerously broken: corrupting valuable data, overflowing a stack and perhaps creating security vulnerabilities. So again, you just want to shut down as soon as possible.

The stuff you use asserts for really isn’t the same as the stuff you use exceptions for: when programming languages like C++ and Java provide exceptions for “impossible” situations (logic_error, ArrayOutOfBoundsException), they unintentionally encourage some programmers to think that their programs should attempt to recover from situations where really they’re out of control. Sometimes that is appropriate, but the Java advice not to catch RuntimeExceptions is there for a good reason. Very occasionally it’s a good idea to catch one, which is why they exist. Almost always it’s not a good idea to catch them, meaning that they amount to halting the program (or at least the thread) anyway.
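
The distinction Steve draws can be sketched in Go terms: expected failures come back as error values, while an “impossible” state halts the program. Go has no built-in assert, so the `assert` helper below is a hypothetical stand-in built on `panic`; `parsePort` and `portSlot` are likewise invented for this illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort treats bad input as an expected failure and reports it as an error.
func parsePort(s string) (port int, err error) {
	port, err = strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("port %d out of range", port)
	}
	return port, nil
}

// assert is a hypothetical helper: it halts on a state the design says
// cannot happen, instead of returning an error nobody can act on.
func assert(cond bool, msg string) {
	if !cond {
		panic("assertion failed: " + msg)
	}
}

// portSlot maps an already validated port to a zero-based table index.
func portSlot(port int) int {
	assert(port >= 1 && port <= 65535, "port must come from parsePort")
	return port - 1
}

func main() {
	for _, s := range []string{"8080", "http", "70000"} {
		port, err := parsePort(s)
		if err != nil {
			fmt.Println("error:", err) // user input is allowed to be wrong
			continue
		}
		fmt.Println("slot:", portSlot(port))
	}
}
```

Bad user input never reaches the assert; if it ever fires, the bug is in the calling code, which is exactly the case where carrying on is unlikely to help.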

 

dribeas

It’s not Go, but in Lua, multiple return is an extremely common idiom for handling exceptions.

If you had a function like

function divide(top, bottom)
    if bottom == 0 then
        error("cannot divide by zero")
    else
        return top / bottom
    end
end

Then when bottom was 0, an exception would be raised and the program’s execution would halt, unless you wrapped the function divide in a pcall (or protected call).

pcall always returns two values: the first result is a boolean telling whether the function returned successfully, and the second result is either the return value or the error message.

The following (contrived) Lua snippet shows this in use:

local top, bottom = get_numbers_from_user()
local status, retval = pcall(divide, top, bottom)
if not status then
    show_message(retval)
else
    show_message(top .. " divided by " .. bottom .. " is " .. retval)
end

Of course, you don’t have to use pcall, if the function you’re calling already returns in the form of status, value_or_error.

Multiple return has been good enough for Lua for several years, so while that doesn’t ensure that it’s good enough for Go, it is supportive of the idea.

 

Neil Butterworth

Here’s an example of how multiple return values might work in C++. I wouldn’t write this code myself, but I don’t think it is entirely out of the question to use such an approach.

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

// return value type
template <typename T>
struct RV {
    int mStatus;
    T mValue;

    RV( int status, const T & rv )
        : mStatus( status ), mValue( rv ) {}
    int Status() const { return mStatus; }
    const T & Value() const { return mValue; }
};

// example of possible use
RV <string> ReadFirstLine( const string & fname ) {
    ifstream ifs( fname.c_str() );
    string line;
    if ( ! ifs ) {
        return RV <string>( -1, "" );
    }
    else if ( getline( ifs, line ) ) {
        return RV <string>( 0, line );
    }
    else {
        return RV <string>( -2, "" );
    }
}

// in use
int main() {
    RV <string> r = ReadFirstLine( "stuff.txt" );
    if ( r.Status() == 0 ) {
        cout << "Read: " << r.Value() << endl;
    }
    else {
        cout << "Error: " << r.Status() << endl;
    }
}

Monday
Oct 05, 2009

OMGWTFBBQ: if CONC trees replace CONS lists, then mapreduce can be parallel

Alternate title:  Mary had a little λ (kidding!)

Guy Steele: Organizing Functional Code for Parallel Execution

CONS (or “Lisp”) lists are inherently linear and sequential. One of CAR, CDR is constant-time, and the other is linear (on the length of the list). For a given list, there is only one CONS representation.

CONC primitives are null <>, singleton <42>, and “concatenation” a || b

CONC trees can represent CONS style lists, or balanced binary trees, or sparse trees, or other structures.



  • CAR, CDR and CONS are compared to the CONC primitives… and rather than use the accessors left and right, Guy prefers a SPLIT functional accessor, which calls its second argument (a function) with the left and right parts as parameters.
  • Guy says this somewhere later on, and it’s easy to miss:

        “We are going to use CONC trees to optimize delay”
  • An important goal is a corollary to the equivalence 

    (cons (car xs) (cdr xs)) = xs

  • It’s not this:

    (conc (left xs) (right xs)) = xs

  • but this functional gem:

    (split xs (λ (ys zs) (conc ys zs))) = xs

  • Think of the lambda λ as a continuation, and a way to keep the left and right together. In practice, split does not have to behave uniformly, but for the sake of this talk, consider split to be purely functional with no side effects. You will notice we now have a corollary to the most low-level CONS equivalence, but one which is expressed functionally, recursively, by “binary decomposition and reassembly”.

Having taken care of the basics, Guy breaks down the implementations of MAP, REDUCE, MAPREDUCE, LENGTH, FILTER, QUICKSORT and MERGESORT for both CONS lists and the new CONC trees.

Here’s the recursive mapreduce implementation with the Opportunity for Parallelism.

 

(define (mapreduce f g id xs)                  ; Logarithmic in (length xs)??
  (cond ((null? xs) id)
        ((singleton? xs) (f (item xs)))
        (else (split xs (λ (ys zs)
                (g (mapreduce f g id ys)       ; OMG Opportunity for
                   (mapreduce f g id zs))))))) ; WTF Parallelism
                                               ; BBQ (mmmm, barbecue… <gurgle>)

 

…. Almost done! ….

 

RANT ON

If you are on Windows searching the Unicode characters looking for “LAMBDA”, you should STOP IMMEDIATELY. Search for “LAMDA” (no B); the character is listed as GREEK SMALL LETTER LAMDA (U+03BB), which is the official Unicode character name. The name is misspelled, and it says nothing about lambda’s purpose either: it binds new variables, so the lambda (and its children) have a protected scope that hides variables in outer scopes until the lambda completes and its scope goes away. Anyway, the HTML for the λ is “&lambda;”: that’s ampersand, lambda, semicolon (no spaces).
RANT OFF

 

Wednesday
Mar 25, 2009

Announcement: SmartGWT Enterprise Edition Release - Google Web Toolkit News - onGWT.com

Moments ago Isomorphic, creators of SmartGWT, announced SmartGWT Enterprise Edition. SmartGWT Enterprise Edition (SmartGWT EE for short) is a commercially licensed version of SmartGWT that includes Java Server side functionality, additional tools, and a classic commercial license in lieu of the LGPL. For teams with existing Java functionality, SmartGWT EE provides greatly accelerated integration with SmartGWT’s visual components. In many cases it is possible to take existing Java methods in an application and bind a SmartGWT grid or form to those methods without writing any new code, without the need for redundant DTOs (data transfer objects), simply by specifying what method to call in a DataSource XML file. SmartGWT EE also provides wizards that generate DataSources which immediately provide full read-write binding to any Hibernate entity or SQL database table, including the ability to search, update, delete and add new records. You can easily add Java business logic that runs before or after the Hibernate or SQL binding, which can modify the request before it executes, modify the output, or take any other action