Entries in concurrency (5)

Monday
05Oct2009

OMGWTFBBQ: if CONC trees replace CONS lists, then mapreduce can be parallel

Alternate title:  Mary had a little λ (kidding!)

Guy Steele: Organizing Functional Code for Parallel Execution

CONS (or “Lisp”) lists are inherently linear and sequential. One of CAR, CDR is constant-time, and the other is linear (on the length of the list). For a given list, there is only one CONS representation.

CONC primitives are null <>, singleton <42>, and “concatenation” a || b

CONC trees can represent CONS style lists, or balanced binary trees, or sparse trees, or other structures.



  • CAR CDR and CONS are compared to CONC primitives… and rather than use accessors left, right Guy prefers a SPLIT functional accessor which calls the second argument (a function) with the left and right parts as parameters.
  • Guy says this somewhere later on, and it’s easy to miss:

        “We are going to use CONC trees to optimize delay”
  • An important goal is a corollary to the equivalence 

    (cons (car xs (cdr xs)) = xs

  • It’s not this:

    (conc (left xs) (right xs)) = xs

  • but this functional gem:

    (split xs (λ (ys zs) (conc ys zs))) = xs

  • Think of the lambda λ as a continuation, and a way to keep the left and right together. In practice, split does not have to behave uniformly, but for the sake of this talk,  consider split to be purely functional with no side effects. You will notice we now have a corollary to the most low-level CONS equivalence, but one which is expressed functionally, recursively, by “binary decomposition and reassembly”

Having taken care of the basics, Guy breaks down the implementations of  MAP, REDUCE, MAPREDUCE, LENGTH, FILTER, QUICKSORT and MERGESORT for both CONS lists and the new CONC trees.

Heres the recursive mapreduce implementation with the Opportunity for Parallelism.

 

(define (mapreduce f g id xs)     ; Logarithmic in (length xs)??
  (cond ((null? xs) id)
            ((singleton? xs) (f (item xs)))
            (else (split xs (λ (ys zs)
                 (g (mapreduce f g id ys)       ; OMG Opportunity for
                     (mapreduce f g id zs))))))) ; WTF Parallelism
                                                             ; BBQ (mmmm, barbecue… <gurgle>)

 

…. Almost done! ….

 

RANT ON

If you are on Windows searching the Unicode characters looking for “LAMBDA”, you should STOP IMMEDIATELY. Search for “LAMDA” (no B). Microsoft doesn’t know how to spell “LAMBDA” much less describe its purpose in binding new variables dynamically, so the lambda (and its children) have a protected scope that hides variables in outer scopes, until the lambda completes and its scope goes away. Anyway HTML for the λ is ” &lambda; ” — that’s ampersand lambda semicolon  (no spaces)
RANT OFF

 

Sunday
05Apr2009

Just Software Solutions series on concurrency in C++0x

Tuesday
31Mar2009

A Curious Course on Coroutines and Concurrency (Python)

A Curious Course on Coroutines and Concurrency - www.dabeaz.com

Shameless Plug I teach Python courses for programmers of all levels. If you like this tutorial, consider coming to one of my public classes. -Dave.Introduction to Python, May 11-13, 2009. Python Concurrency Workshop, May 14-15, 2009.
Copyright (C) 2009, All Rights Reserved David Beazley http://www.dabeaz.com Presented at PyCon 2009, March 25, 2009.

Introduction

This tutorial is a practical exploration of using Python coroutines (extended generators) for solving problems in data processing, event handling, and concurrent programming. The material starts off with generators and builds to writing a complete multitasking environment that can run thousands of concurrent tasks without using threads or using code based on event-driven callbacks (i.e., the “reactor” model).
Wednesday
25Mar2009

Dr. Dobb's | Use Threads Correctly = Isolation + Asynchronous Messages | March 16, 2009

Where Threads Fit (and Thread Pools, Sometimes) In [1], I described the three pillars of concurrency. The first two pillars summarize the two main kinds of concurrency we need to be able to express: 1. Keep things separate, so that independent parts of the program can run asynchronously. 2. Use more cores to get the answer faster using data-parallel and similar techniques. (The third pillar is about controlling concurrency once it has been expressed, using tools like locks and atomics.) Table 1 summarizes these two pillars, and also summarizes how well each is served by four major tools at our disposal today for expressing concurrency: threads, thread pools, work stealing runtimes, and data-parallel facilities like OpenMP.

via Dr. Dobb’s | Use Threads Correctly = Isolation + Asynchronous Messages | March 16, 2009.

Friday
13Mar2009

Suggested reading and topics for future discussion