Behind the connection: generics

Showing posts with label generics. Show all posts

January 2, 2016

DirWatch: Delphi 10 in action

In this article, I will show you how I used Delphi 10 Seattle to create a nice application aimed at watching a directory tree. Source code provided, see at the end.

Background:

I frequently use shared directories as repository for files. Some user on the network just drops the file in the directory where I can grab it. The user is supposed to notify me when he has dropped a file. Unfortunately some user forgets to notify e and the file stay there for a long time before I notice it myself. In other situation, I like to know which files are affected by a given application or by a setup.

DirWatch will be of great help in those situation and many others. DirWatch will monitor a whole directory tree, even a full disk, for changes and send me an email when a change is detected. A change means a file or subdirectory is added / renamed / deleted / modified.

How it works:

Part 1: Getting notifications

Using Windows, it is actually very easy to be notified of changes in the file system. There is a specific and simple API for the purpose. The main API functions are FindFirstChangeNotification, FindNextChangeNotification and ReadDirectoryChanges. The actual work is done by Windows!

This notification API works much like listing a directory: you do a FindFirst, then if successful, you read the changes that occurred and then call FindNext to be notified for the next change. As simple as that.

OK, things can be a little bit more complex if you want your application to stay responsive you should resort to asynchronous Input/Output and multithreading. That code already exists on the web in numerous instances. I selected to use DWScript implementation because it is simple and efficient and also because I use DWScript for other project and find this open source product excellent. DWScript is available under Mozilla Public License Version 1.1.

dwsDirectoryNotifier is available from DWScript project (https://bitbucket.org/egrange/dwscript). Direct link to the source file at the time of writing is

https://bitbucket.org/egrange/dwscript/raw/5514c091096d0ca0217a2154a7ec8f5d95de443e/Libraries/SimpleServer/dwsDirectoryNotifier.pas

Part 2: Sending an email

To programmatically send an email, you have many solutions available. In my opinion, the best solution is to directly address the email server, either company email server or Internet Service Provider email server. This may sound difficult but it is not much. An email server makes use of a protocol named SMTP, just like your browser make use of a protocol name HTTP. SMTP stands for Simple Mail Transfer Protocol and as its name implies, it is simple! SMTP is a text based protocol and involves the exchange of a few messages to send an email.

SMTP becomes a little bit more complex when you start talking authentication, compression, and security but it basically remains simple.

I have implemented the SMTP protocol in an open source freeware I published nearly 20 year ago. I mean the Internet Component Suite (ICS for short). See http://wiki.overbyte.be. ICS is a huge component library. We will only use the SMTP component.

ICS (Internet Component Suite) is available from http://wiki.overbyte.be/wiki/index.php/ICS_Download and also can be installed directly in Delphi 10 Seattle using Embarcadero GetIT (Delphi menu Tools / GetIi Package Manager, type ICS into the search box and select "ICS for VCL").

Part 3: be discreet

When I use a utility which I don't have to interact with, I like to have it very discreet. I don't even want to see it on my screen. There are two good solutions for this: write a service application or write a "tray-icon" application. I selected the second which is slightly easier to write and debug.

A tray-icon application is a normal Windows application which once minimized is only noticeable by its icon displayed in the systray (You know, this little area usually on the bottom right of the screen where the time is displayed).

Using Delphi 10, you create an application as a "VCL form application" and drop a TTrayIcon component on it. You have just two event handler to write: FormResize and TrayIconDoubleClick. FormResize is used to hide the form once it is minimized and TrayIconDoubleClick is used to make the form visible again when the user double clicks on the small icon.

There is a side problem when building a tray-icon application: you need a small 16x16 icon. If you don't create your own 16x16 icon, Windows will shrink the large icon to create the small one. The result is frequently horrible.

To create the multiple resolution icon file, I used IconMagick (http://www.imagemagick.org) command line utility "convert" to combine icons created using my favorite paint application.

convert DirWatchIcon32x32.ico DirWatchIcon16x16.ico DirWatchIcon.ico

Once DirWatchIcon.ico was created, I associated it with the application using Delphi project options dialog.

Please note that TrayIcon component dropped on the form must be loaded with the DirWatchIcon16x16.ico file otherwise window will use the 32x32 icon file resized to 16x16 which gives an horrible result.

Part 4: Work offline

I wanted to be notified even for changes occurring while DirWatch was not running. This is especially useful if the watched directory is not local but a shared resource on a file server.

When DirWatch starts, I create notifications for the changes occurred since last run by comparing the list of files actually on disk with the list of files that where on disk when DirWatch was stopped.

This means that at program startup, DirWatch enumerates all the files and directories in the watched directory. This could take some time if the directory contains millions of files! On my hard drive with about one million files, it takes approximately 15 seconds to enumerate the files, compare that list with the previous list and send the email for the changes.

The implementation I made makes use of Delphi 10 Seattle generics. Internally the list of files is not a list but a dictionary. This gives fast access to each file data. Each file (Or directory entry) is stored with his name, attributes, size and timestamp. This is used to find out what has changed with a very low chance to miss something. Of course it could be possible to write a program which saves the timestamp of a file, update some bytes in the file without changing the size and then restore the timestamp. Actually this never occurs in real life!

It is interesting to look at the code I wrote for the dictionary. There are a few involved classes: TDirDictionary, TDirEntry, TDirDictionaryPair, TDirDictionaryObject and TDirDictionaryEnumerator.

I implemented TDirDictionary by delegation instead of inheritance. I frequently do that to have complete freedom about the interface I expose. In most cases, delegation is a better encapsulation that inheritance. Look at the source code for details and you'll quickly see the code is really very simple.

Part 5: Data persistence

DirWatch must be configured with some parameters. Of course, the directory which must be watched but also the SMTP server information (Those information are the same that Outlook uses for a classical POP3/SMTP account).

I also like that my applications remember where their window was and which size they had when I run the application again.

For all those purposes, I like to use the good old INI file. It is easy to use within the program and easy you me to edit with my favorite text editor. I store the INI file in the "Loacl AppData" folder in the user profile. The Windows API function SHGetFolderPath is used to query the exact location of the "Local AppData" folder.

Part 6: Hide sensitive data

Sending an email may sometimes involve a usercode and password that the DirWatch must know. As explained above, those values are stored in the INI file and this is a problem if they are saved in clear text.

I wrote two functions Decrypt and EncryptRestore for the purpose. One trick is used to allow encryption without using a special program. When the data is enclosed in angle bracket in reverse order, then it's assumed not encrypted but needing to be encrypted. Then once encrypted the resulting value is saved between angle brackets. So "]MyName[" will result in "MyName" and a flag is set so that EncryptRestore will produce "[Qyaze2==]" which is an encrypted value. If no angle bracket is used at all, then the value is assumed not encrypted. Here in this implementation I used very weak encryption: base64 encoding. In a fully secured application you should use a real encryption mechanism providing strong security.

Part 7: Command line options

Some information cannot be stored in the INI file and yet be changed at run time. For example the name and location of the INI file itself can be changed.

-i IniFileName Select INI file

-h Show this help

-m Start minimized

-a AppName Select application name

-c CompanyName Select company name

To parse command line argument, I used code largely inspired by by code from Stehan Huberdoc (http://stefan.huberdoc.at/comp/software/delphi/sandkasten.html) which is an implementation of the well-known GNU GetOpt command line parsing function (https://en.wikipedia.org/wiki/Getopt).

Part 8: Source code

Source code is split into two units: DirWatchMain.pas which is the main and only form of the application and DirWatchProcess.pas which contains the code which does the actual work without any user interface. Of course other units are used as explained above: some from ICS, some from DWScript and GetOpt.

Download links:

DirWatch: http://www.overbyte.be/eng/blog_source_code.html

ICS: http://wiki.overbyte.be/wiki/index.php/ICS_Download

DWScript: https://bitbucket.org/egrange/dwscript

Follow me on Twitter
Follow me on LinkedIn
Follow me on Google+
Visit my website: http://www.overbyte.be

December 15, 2013

FireMonkey, Android, Windows and PostMessage

FireMonkey framework (FMX for short) is definitely able to use custom [Windows] messages much like we have always done with the VCL. And this is also true when using FireMonkey to build Android applications.

Both Windows and Android support a messaging system. It is well known by Windows developers who use it with PostMessage, GetMessage, PeekMessage and similar Windows API call. It is much less known by Android developers. Android has a “looper” API which has the same purpose as Windows own messaging system although it is implemented differently and has somewhat more features.

Often, we use FireMonkey framework to build multi-platform applications. Thanks to Delphi XE5, we can build an application for different targets such as Win32, Win64, Android, iOS and MAC OSx. If correctly written, the same application source code can be recompiled for different target and run unchanged. Embarcadero made a lot of efforts to hide differences between the supported platforms.

Speaking about the messaging system, it must admit that Embarcadero forgot to write the abstraction layer required for the platforms. They made some work but it is incomplete and undocumented. This is why I wrote it. At least for Win32, Win64 and Android which are the 3 platforms I currently use.

The layer I wrote is made of a single class I named “TMessagingSystem”. I made two different implementations: one for Android and one for Win 32/64. TMessagingSystem class allows you to register any number of custom messages to a form and associate a custom message handler. Of course it also allows you to call PostMessage to put a message into the message queue.

At the application level, you use the exact same code for Windows or Android. You just have to make use of one of the implementations. You’ll do that using a conditional compilation.

Before showing the implementation details, I will present a demo application. That you can target for Windows or Android without changing a single line.

Demo application for Windows and Android

I built a simple application to emphasize how to use TMessagingSystem. Actually it does not do anything very interesting. It is made of a single form having a button and a memo. When you click on the button, it starts a new thread which will periodically PostMessage a custom message to the main form. You can click many times on the button to start many threads. Each thread will do the same.

The image above shows on the left a screen dump of the application running under Win7 and on the right, the same application running on my Nexus7.

All you see is a memo with messages. Nevertheless, this is really one of the main usages of a messaging system: organize asynchronous operation between threads.

Each line looks like this:

8380] Thread=2 Count=8 ThreadID=7528

“8380” is the thread ID of the thread doing the display. This is always the same and is the main thread ID. “Thread=2” is the sequential thread number having generated the message, “Count=8” is the number of messages generated by this thread and finally, “ThreadID=7528” is the thread ID of the thread generating the message. The later change according to each started thread.

Demo application source code

unit FmxMultiplatformPostMessageDemoMain;

interface

uses
    System.SysUtils, System.Types, System.UITypes, System.Classes,
    Generics.Collections,
    FMX.Types, FMX.Controls, FMX.Forms, FMX.Graphics, FMX.Dialogs,
    FMX.StdCtrls, FMX.Layouts, FMX.Memo,
    FMX.Overbyte.MessageHandling;

const
    WM_SHOW_MESSAGE = WM_USER + 1;

type
    TWorkerThread = class(TThread)
    public
        MsgSys : TMessagingSystem;
        Id     : Integer;
        procedure Execute; override;
    end;

    TForm1 = class(TForm)
        RunThreadButton: TButton;
        DisplayMemo : TMemo;
        ToolPanel: TPanel;
        procedure RunThreadButtonClick(Sender: TObject);
    private
        FMsgSys      : TMessagingSystem;
        FThreadCount : Integer;
        procedure Display(const Msg: String);
        procedure WorkerThreadTerminate(Sender: TObject);
        procedure WMShowMessage(var Msg: TMessage);
    protected
        procedure CreateHandle; override;
        procedure DestroyHandle; override;
    end;

var
  Form1: TForm1;

implementation

{$R *.fmx}

{ TForm1 }

procedure TForm1.CreateHandle;
begin
    inherited CreateHandle;
    FMsgSys := TMessagingSystem.Create(Self);
    FMsgSys.RegisterMessageHandler(WM_SHOW_MESSAGE, WMShowMessage);
end;

procedure TForm1.DestroyHandle;
begin
    FreeAndNil(FMsgSys);
    inherited DestroyHandle;
end;
 
procedure TForm1.RunThreadButtonClick(Sender: TObject);
var
    WorkerThread : TWorkerThread;
begin
    Inc(FThreadCount);
    Display('Start thread ' + IntToStr(FThreadCount));
    WorkerThread                 := TWorkerThread.Create(TRUE);
    WorkerThread.MsgSys          := FMsgSys;
    WorkerThread.Id              := FThreadCount;
    WorkerThread.FreeOnTerminate := TRUE;
    WorkerThread.OnTerminate     := WorkerThreadTerminate;
    WorkerThread.Start;
end;

procedure TForm1.WorkerThreadTerminate(Sender: TObject);
begin
    Display('Thread ' +
            IntToStr((Sender as TWorkerThread).Id) +
            ' terminated');
end;

procedure TForm1.WMShowMessage(var Msg: TMessage);
var
    Buffer : PChar;
begin
    Buffer := PChar(Msg.LParam);
    Display(Buffer);
    FreeMem(Buffer);
end;

procedure TForm1.Display(const Msg: String);
begin
    Displaymemo.Lines.Add(IntToStr(GetCurrentThreadID) + '] ' + Msg);
end;

{ TWorkerThread }

procedure TWorkerThread.Execute;
var
    I      : Integer;
    Buffer : PChar;
const
    MaxLen = 100;
begin
    // For demo, let's do it 10 times
    for I := 1 to 10 do begin
        // Simulate some processing time by sleeping
        Sleep(1000);

        // Allocate memory to hold a message, take care of the ending nul char
        GetMem(Buffer, SizeOf(Char) * (MaxLen + 1));
        // Copy message to allocated memory, protecting overflow
        StrLCopy(Buffer,
                 PChar('Thread=' + IntToStr(Id) +
                       ' Count=' + IntToStr(I) +
                       ' ThreadID=' + IntToStr(GetCurrentThreadID)),
                 MaxLen);
        // Force a nul char at the end of buffer
        Buffer[MaxLen] := #0;
        // Post a message to the main thread which will display
        // the message and then free memory
        MsgSys.PostMessage(WM_SHOW_MESSAGE, I, LParam(Buffer));
    end;
end;

end.

This source code is really simple, isn’t it? The beauty is that it can be compiled for Win32, Win64 and Android targets without changing anything.

All the code depending on the platform has been moved to “FMX.Overbyte.MessageHandling” unit. That one takes care of calling the correct API function according to the compiler used. This is the power of OOP.

There is nothing special in the demo application except one thing: The worker thread generates messages to be displayed by the main thread. We have to take care of what happens with the storage used for the message. We cannot simply pass a string because messages are limited to two parameters of type WParam and LParam, both mapped to NativeInt. We can neither pass a reference to a string variable because it is possible a new message is generated before the previous is consumed (This happens if the main thread is heavily busy while the worker thread runs at full speed). We have to dynamically allocate storage for the message and pass the reference thru one of the message parameters. I’ve chosen to use a simple memory block allocated by GetMem and freed by FreeMem. The pointer is then passed thru the LParam parameter. The thread allocates the memory and the main thread frees it. The same allocation size is always used regardless of the message length. It is better for the memory allocator, limiting memory fragmentation.

How to use it?

TMessagingSystem class must be instantiated when the form is allocated a handle. It must be freed when the form’s handle is destroyed. After instantiation, or at any point in time, RegisterMessageHandler must be called for each custom message. That’s all!

Single unit, multiple platforms

We have seen in the demo code that the same unit to “FMX.Overbyte.MessageHandling” is used whatever the target platform is. The magic is in that unit. Here is very short source code:

unit FMX.Overbyte.MessageHandling;
{$DEFINE OVERBYTE_INCLUDE_MODE}
{$IFDEF ANDROID}
    {$I FMX.Overbyte.Android.MessageHandling.pas}
{$ENDIF}
{$IFDEF MSWINDOWS}
    {$I FMX.Overbyte.Windows.MessageHandling.pas}
{$ENDIF}

The magic is into the conditional compilation. Symbols ANDROID and MSWINDOWS are automatically defined by the compiler according to the target platform you compile for. So that small unit actually includes the Android or the Windows specific unit depending on the compiler target platform.

The two included units are just normal unit, well almost. You cannot include a unit into another one without having a problem with the “unit” line. You cannot have two such lines. This is why the symbol “OVERBYTE_INCLUDE_MODE” is defined. In the two included units, this symbol is used to conditionally compile the “unit” line.

Implementation for Android

Messaging system on Android platform is hidden in the “Looper” API. Basically, the idea is simple: Android monitors a list of handle for data availability. The list of handles is maintained by the API. You can add a new handle using ALooper_addFd API function. Each handle is associated with a callback function that Android calls when data is available.

As a handle, I use the read side of a pipe. A pipe, under Android as well as other operating systems, is like a first-in first-out queue. It has two ends identified by two handles. One is the writing end; the other is the reading end. What you write at one end is available for reading at the other end. Between both ends is a buffer. Reads and writes are asynchronous. If writing is faster than reading, the buffer is filled and nothing is lost.

This pipe is used here is the message queue. When PostMessage is called, a record with the parameters is written to the pipe. When data is available for reading, the looper API will call the LooperCallBack function we registered. From this callback, we read the pipe to remove one record at a time. When a record is read, the message number written in it is used to fetch the message handler to be executed.

{$IFNDEF OVERBYTE_INCLUDE_MODE}
unit FMX.Overbyte.Android.MessageHandling;
{$ENDIF}

interface

uses
    System.SysUtils, System.Types, System.Classes, System.SyncObjs,
    Generics.Collections,
    FMX.Platform.Android,
    Androidapi.AppGlue, Androidapi.Looper,
    Posix.UniStd, Posix.Errno, Posix.StrOpts, Posix.PThread;

const
    WM_USER         = 1024;

type
    LPARAM  = NativeInt;
    WPARAM  = NativeInt;
    LRESULT = NativeInt;

    TMessage = record
        Msg    : NativeInt;
        WParam : WPARAM;
        LParam : LPARAM;
        Result : LRESULT;
    end;
    TMessageHandler = procedure (var Msg: TMessage) of object;

    TMessagingSystem = class(TComponent)
    protected
        FPipeFD    : TPipeDescriptors;
        FData      : Byte;
        FHandlers  : TDictionary;
        FLastError : String;
        FCritSect  : TCriticalSection;
        procedure HandleMessage(var Msg : TMessage);
        function  CreatePipe: Integer;
        procedure ClosePipe;
        procedure InstallEventHandler;
        procedure UninstallEventHandler;
    public
        constructor Create(AOwner : TComponent); override;
        destructor  Destroy; override;
        function RegisterMessageHandler(uMsg    : NativeInt;
                                        Handler : TMessageHandler) : Boolean;
        function PostMessage(uMsg   : NativeInt;
                             WParam : WPARAM;
                             LParam : LPARAM) : Boolean;
        property LastError : String read FLastError;
    end;

    HWND   = TMessagingSystem;

function GetCurrentThreadID : TThreadID;

implementation

function LooperCallback(
    FileDescriptor : Integer;
    Events         : Integer;
    Data           : Pointer): Integer; cdecl;
var
    Len : Integer;
    Msg : TMessage;
    Obj : TMessagingSystem;
begin
    Result := 1;
    // Data contains a reference to our class
    if Data = nil then
        Exit;
    // Ready to cast to our class
    Obj := TMessagingSystem(Data);
    // Check if it's our ReadDes
    Obj.FCritSect.Enter;
    try
        if FileDescriptor <> Obj.FPipeFD.ReadDes then
            Exit;
    finally
        Obj.FCritSect.Leave;
    end;

    while TRUE do begin
        Len := __read(FileDescriptor, @Msg, SizeOf(Msg));
        if Len <= 0 then
            break;
        Obj.HandleMessage(Msg);
    end;
end;

{ TMessagingSystem }

constructor TMessagingSystem.Create(AOwner: TComponent);
begin
    inherited Create(AOwner);
    FCritSect  := TCriticalSection.Create;
    FHandlers  := TDictionary.Create;
    CreatePipe;
    InstallEventHandler;
end;

destructor TMessagingSystem.Destroy;
begin
    UninstallEventHandler;
    ClosePipe;
    FreeAndNil(FCritSect);
    inherited Destroy;
end;

function TMessagingSystem.CreatePipe: Integer;
var
    Status  : Integer;
    Val     : Integer;
const
    FIONBIO = $5421;
begin
    FCritSect.Enter;
    try
        if (FPipeFD.ReadDes <> 0) or (FPipeFD.WriteDes <> 0) then begin
            FLastError := 'Pipe already created';
            Result := -1;
            Exit;
        end;
        Status := Pipe(FPipeFD);
        if Status = -1 then begin
            Result := errno;
            FLastError := 'Pipe() failed. Error #' + IntToStr(Result);
        end
        else begin
            Result := 0;
            Val := 1;
            if ioctl(FPipeFD.ReadDes, FIONBIO, @Val) = -1 then begin
                Result := errno;
                FLastError := 'ioctl(FIONBIO) failed. Error #' + IntToStr(Result);
                Exit;
            end;
        end;
    finally
        FCritSect.Leave;
    end;
end;

procedure TMessagingSystem.ClosePipe;
begin
    FCritSect.Enter;
    try
        if FPipeFD.ReadDes <> 0 then begin
            __close(FPipeFD.ReadDes);
            FPipeFD.ReadDes  := 0;
        end;
        if FPipeFD.WriteDes <> 0 then begin
            __close(FPipeFD.WriteDes);
            FPipeFD.WriteDes := 0;
        end;
    finally
        FCritSect.Leave;
    end;
end;

procedure TMessagingSystem.InstallEventHandler;
var
    AndroidApp : PAndroid_app;
    Data       : Pointer;
const
    LOOPER_ID_MESSAGE_OVERBYTE = LOOPER_ID_USER;
begin
    AndroidApp := GetAndroidApp;

    Data := Self;
    ALooper_addFd(AndroidApp.looper,
                  FPipeFD.ReadDes,
                  LOOPER_ID_MESSAGE_OVERBYTE,
                  ALOOPER_EVENT_INPUT,
                  LooperCallback,
                  Data);
end;

procedure TMessagingSystem.UninstallEventHandler;
var
    AndroidApp : PAndroid_app;
begin
    FCritSect.Enter;
    try
        if FPipeFD.ReadDes <> 0 then begin
            AndroidApp := GetAndroidApp;
            ALooper_removeFd(AndroidApp.looper, FPipeFD.ReadDes);
        end;
    finally
        FCritSect.Leave;
    end;
end;

function TMessagingSystem.RegisterMessageHandler(
    uMsg    : NativeInt;
    Handler : TMessageHandler): Boolean;
begin
    FCritSect.Enter;
    try
        FHandlers.AddOrSetValue(uMsg, Handler);
    finally
        FCritSect.Leave;
    end;
    Result := TRUE;
end;

function TMessagingSystem.PostMessage(
    uMsg   : NativeInt;
    WParam : WParam;
    LParam : LParam): Boolean;
var
    Msg : TMessage;
begin
    Result := FALSE;
    FCritSect.Enter;
    try
        if FPipeFD.WriteDes = 0 then begin
            FLastError := 'Pipe is not open';
            Exit;
        end;
        Msg.Msg    := uMsg;
        Msg.WParam := WParam;
        Msg.LParam := LParam;
        Msg.Result := 0;

        if __write(FPipeFD.WriteDes, @Msg, SizeOf(Msg)) = -1 then begin
            FLastError := 'write() failed. ErrCode=' + IntToStr(errno);
            Exit;
        end;
    finally
        FCritSect.Leave;
    end;
    Result := TRUE;
end;

procedure TMessagingSystem.HandleMessage(var Msg: TMessage);
var
    Handler : TMessageHandler;
    Status  : Boolean;
begin
    FCritSect.Enter;
    try
        Status := FHandlers.TryGetValue(Msg.Msg, Handler);
    finally
        FCritSect.Leave;
    end;
    if Status then
        Handler(Msg);
end;

function GetCurrentThreadID : TThreadID;
begin
    Result := Posix.PThread.GetCurrentThreadID;
end;

end.

In that code, you’ll find a few data types frequently used in Windows applications. I used the same data types for compatibility with existing code.

TMessagingSystem class is very simple. Basically, it registers a pipe read handle with the looper API with an associated callback function. It also maintains a dictionary of message handlers. The key is the message number. The looper API also carries one pointer for you. It will give it back as an argument of the callback function. Here the pointer is used as a reference to the class instance, making is available when the callback function is called.

A critical section is used to avoid problems accessing the class data from several threads at the same time. Using this critical section makes the class fully thread safe.

Implementation for Windows

The Windows implementation makes obviously use of Windows own messaging API. There is no queue in the class because Windows queue is used.

FireMonkey forms does not provide any support for custom messages. This is not really a problem because a FireMonkey forms are just a Windows window. As any window, a FireMonkey form running on Windows has a HWND (Handle of WiNDow) and a window procedure handling all messages for the window.

To hook into this system, we must use standard Windows programming. By standard I mean it has always existed as far as I remember. What we need is to “subclass” the window. And surprisingly, this is very easy!

Windows internally maintain a structure for each window. In that structure you have all informations required for Windows to handle the window. This includes the pointer to the window procedure.

And Windows provides a function to access his internal structure. Our problem is just to get the current pointer to the window procedure and replace it with a pointer to our own procedure. From our own procedure, we will call the original procedure, or not. Our own window procedure has access to all messages sent/posted to the window, including those we add.

We have just one small problem: Windows does not know anything about a Delphi class instance. A window procedure is a simple procedure, not an object method. The problem is to get hand on our TMessagingSystem class instance from our own window procedure.

Fortunately Windows is incredibly well designed. We, as developer, can associate with any window a small piece of data called an “Atom” in Windows terminology. Once an “Atom” is created (It just has a name), you can associate the atom with any window along with a piece of data. That piece of data will be the reference to our TMessagingSystem class instance.

When called by Windows, our window procedure receives the handle of the window. We use it to fetch the piece of data we associated using the atom. From there we have access to TMessagingSystem class instance and check for the message to handle. if it is one of our registered messages, we just call the handler. If not one of our messages, the the original window procedure is called.

Here is the source code:

{$IFNDEF OVERBYTE_INCLUDE_MODE}
unit FMX.Overbyte.Windows.MessageHandling;
{$ENDIF}

interface

uses
    WinApi.Windows, WinApi.Messages,
    System.Classes, System.SysUtils, System.SyncObjs,
    Generics.Collections,
    FMX.Forms, FMX.Platform.Win;

const
    WM_USER = WinApi.Messages.WM_USER;

type
    TMessage        = WinApi.Messages.TMessage;
    WPARAM          = WinApi.Windows.WPARAM;
    LPARAM          = WinApi.Windows.LPARAM;
    TMessageHandler = procedure (var Msg: TMessage) of object;
    TWndProc        = function (hwnd   : HWND;
                                uMsg   : UINT;
                                wParam : WPARAM;
                                lParam : LPARAM): LRESULT; stdcall;

    TMessagingSystem = class(TComponent)
    protected
        FHWnd             : HWND;
        FHandlers         : TDictionary;
        FOriginalWndProc  : TWndProc;
        FLastError        : String;
        FCritSect         : TCriticalSection;
    public
        constructor Create(AOwner : TComponent); override;
        destructor  Destroy; override;
        function RegisterMessageHandler(uMsg    : NativeInt;
                                        Handler : TMessageHandler) : Boolean;
        function PostMessage(uMsg   : NativeInt;
                             WParam : WPARAM;
                             LParam : LPARAM) : Boolean;
        property LastError : String read FLastError;
    end;

function GetCurrentThreadId: DWORD; stdcall;

implementation

var
  MsgSysAtom       : TAtom;
  MsgSysAtomString : String;


function WndProc(hwnd: HWND; uMsg: UINT; wParam: WPARAM; lParam: LPARAM): LRESULT; stdcall;
var
    Msg     : TMessage;
    MsgSys  : TMessagingSystem;
    Handler : TMessageHandler;
    Status  : Boolean;
begin
    // Search if the window handle is associated with TMessageingInstance
    // We know this because we registered an atom for that purpose
    if GlobalFindAtomW(PChar(MsgSysAtomString)) <> MsgSysAtom then begin
        // Not found, just do default processing
        Result := DefWindowProc(hwnd, uMsg, wParam, lParam);
        Exit;
    end;
    // Fetch the atom property and cast it to a TMessagingSystem class
    MsgSys := TMessagingSystem(GetProp(hwnd, MakeIntAtom(MsgSysAtom)));

    // Now use the dictionary to see if the message is one we'll handle
    MsgSys.FCritSect.Enter;
    try
        Status := MsgSys.FHandlers.TryGetValue(uMsg, Handler);
    finally
        MsgSys.FCritSect.Leave;
    end;
    if Status then begin
        // Found the message and his message handler. Call it using
        // the TMessage record to hold the values
        Msg.Msg    := uMsg;
        Msg.WParam := wParam;
        Msg.LParam := lParam;
        Msg.Result := 0;
        Handler(Msg);
        Result := Msg.Result;
    end
    else begin
        // Not one of our messages, just execute original window procedure
        Result := MsgSys.FOriginalWndProc(hwnd, uMsg, wParam, lParam);
    end;
end;

{ TMessagingSystem }

constructor TMessagingSystem.Create(AOwner: TComponent);
begin
    if not (AOwner is TCommonCustomForm) then
        raise Exception.Create('TMessagingSystem.Create failed. Invalid owner');
    inherited Create(AOwner);
    FCritSect  := TCriticalSection.Create;
    FHandlers  := TDictionary.Create;

    // Find window handle corresponding to the owner form
    FHWnd := WindowHandleToPlatform(TCommonCustomForm(AOwner).Handle).Wnd;

    // If not already done, register the atom we'll use to associate
    // our messaging system with the window handle
    if MsgSysAtom = 0 then begin
        MsgSysAtomString := 'OverbyteMessagingSystem' +
                                     IntToHex(GetCurrentProcessID, 8);
        MsgSysAtom       := GlobalAddAtomW(PChar(MsgSysAtomString));
    end;

    // Associate our messaging system with the window handle
    SetProp(FHWnd, MakeIntAtom(MsgSysAtom), THandle(Self));

    // Subclass the form. That is change his handling procedure
    FOriginalWndProc := TWndProc(GetWindowLongPtr(FHWnd, GWLP_WNDPROC));
    SetWindowLongPtr(FHWnd, GWLP_WNDPROC, NativeInt(@WndProc));
end;

destructor TMessagingSystem.Destroy;
begin
    if Assigned(FOriginalWndProc) then begin
        SetWindowLongPtr(FHWnd, GWLP_WNDPROC, NativeInt(@FOriginalWndProc));
        FOriginalWndProc := nil;
    end;
    FreeAndNil(FHandlers);
    FreeAndNil(FCritSect);
    inherited Destroy;
end;

function TMessagingSystem.RegisterMessageHandler(
    uMsg    : NativeInt;
    Handler : TMessageHandler): Boolean;
begin
    FCritSect.Enter;
    try
        FHandlers.AddOrSetValue(uMsg, Handler);
    finally
        FCritSect.Leave;
    end;
    Result := TRUE;
end;

function TMessagingSystem.PostMessage(
    uMsg   : NativeInt;
    WParam : WPARAM;
    LParam : LPARAM): Boolean;
begin
    Result := WinApi.Windows.PostMessage(FHWnd, uMsg, WParam, LParam);
end;

function GetCurrentThreadId: DWORD; stdcall;
begin
    Result := WinApi.Windows.GetCurrentThreadId;
end;

end.

All the code is shown above. If you are interested by the complete project as source code, just drop me a private email.

Follow me on Twitter
Follow me on LinkedIn
Follow me on Google+
Visit my website: http://www.overbyte.be
This article is available from http://francois-piette.blogspot.be

December 9, 2013

Practical Parallel Computing

Parallel computing is a form of computation in which many calculations are carried out simultaneously. This is almost the only way of taking full power of today’s computers.

CPU have reached the point where it is not possible to increase the frequency of operation. The only way to increase computation power is to multiply the CPU. Either fully using multi-CPU computer, or partially using multi-core and hyper threading.

Most of today’s PC are equipped with 4 cores CPU. High end workstations have multiple CPU each with multiple cores and even hyper threading. For example, my own computer is a HP Z600 workstation having two Intel Xeon processors each one having 4 cores and hyper threading. Using Windows task manager, you actually see 16 processors.

Not an easy task

To benefit from multi cores computers, you must write programs so that the problem is divided into smaller ones which are then executed in parallel.

Not all problems can be divided in smaller ones for parallel execution. Numerical computation is one domain where parallelism is often possible. In this article, I will show the programming for a heavy computing problem: exploring the Mandelbrot set. Mandelbrot set is well known. It is mostly shown as a beautiful colorful picture which may take a very long time to compute. Using parallel computing will result in a high performance gain provided you run on a modern computer.

Parallel computer programs are more difficult to write than sequential ones, because concurrency introduces several new classes of potential software bugs.

The most common concurrency problem is when one task writes some shared data which is read by another task. If nothing special is done, then the reader may read partial data! The writer starts writing following data while the reader has not finished reading previous data. This kind of bug is difficult to find because it does not always occur. It depends on how one task runs compared to other task. This is referred as a race condition. Most of the time, it works because the reading and writing are occurring at different times. Sometimes it does not work because reading and writing overlaps.

Multithreading

Multithreading is the way you write parallel programs using traditional languages such as C/C++, Delphi or C# running on traditional (PC) computers. Other languages or other hardware architecture exists, but this is beyond of this article which is related to PC programming.

A thread is an execution path managed by an operating system. Multiple threads are executed simultaneously by a processor when the processor has multiple cores. With a single core processor, you still have multithreading, but they are executed one after the other. Execution switches quickly between each thread, making the human eye see all executing in parallel. They are really executed in parallel with multi core CPU or when multiple CPU are available.

It is the operating system and the language run time that take care of the required housekeeping. As seen from the programming language, a thread looks mostly like a simple function call (or an object’s method call when using an object oriented language). The difference with a normal function call is that when using multithreading, the function returns immediately while the function’s code is executed in parallel with the caller’s code.

Synchronization

To avoid messing everything, the work accomplished by different threads has to be synchronized.

For example, data cannot be read while it is being written by another thread. Several threads can simultaneously read the same data, but when one is writing data, it must be the only one accessing the data.

Another example is computing. Adding a value to a variable must be executed by a thread having exclusive access to that data. This is, at another level, the same issue as we find in multiple user database application. In such application, we use a “record lock” so that a record can be read, some computation done and then written back in an atomic operation. That is, no other user can have access to the data while the first reads-computes-writes.

The very same problem exists with multithreading. Simply the data is not in a database but in memory. Every single variable has to be protected from simultaneous reading and writing. And some operations, such as adding a variable to another one must be done atomically. And by the way, if multiple threads are doing database access, we have to use the exact same “record lock” as we do for a multiple user application.

A last example: frequently, when a problem is divided in multiple parts for parallel execution, something must be done at one point in time to be sure that all parts are done before continuing. Imagine you use parallel processing to create a dataset in memory. You must wait until all threads are done before writing the result to the file.

Operating system and language runtime provide many synchronization mechanisms. For example: critical section, semaphore, mutex, queues and others.

Introduction to Mandelbrot set

Benoît Mandelbrot, who died on 10/14/2010, was a famous mathematician. He studied fractal geometry. He found that chaos can emerge from a very simple equation:

This equation is iteration where c is a constant. We start from a first value X0 and compute X1 using the formula. Using X1 we then compute X2 and so on. This is no fun but easy. Where it becomes interesting is when we do this calculation using complex numbers and when we look what happens after a lot of iterations. Depending on the constant c, we can see that the Xn value increases quickly or not. Sometimes, the value change but never become very large. Even if computing a lot of iterations, we cannot decide whether the value will become very large.

Complex numbers can be represented by two numbers forming the coordinates X and Y on a plane. This is exactly what we use to draw a colorful representation of the Mandelbrot set. Each pixel on the image is one value of the constant c in the iteration above. The color of each pixel is given by the number of iterations computed so that the Xn value becomes large. The number of iterations is limited to a maximum. If the maximum is reached, the pixel is colored in a specific color, usually black or white. The result looks like this:

Using Delphi, the code to compute the value of a given point is as follow:

function ComputeMandelPoint(P, Q : Extended; MaxIter : Integer) : Integer;
var
    X0, Y0, X, Y : Extended;
    R, M         : Extended;
    K            : Integer;
begin
    M  := 100;
    K  := 0;
    X0 := 0;
    Y0 := 0;

    while TRUE do begin
        X  := X0 * X0 - Y0 * Y0 + p;
        Y  := 2.0 * X0 * Y0 + q;
        K  := K + 1;

        R := X * X + Y * Y;
        if R > M then
            break
        else if K >= MaxIter then begin
            K := MaxIter + 1;
            break;
        end;
        X0 := X;
        Y0 := Y;
    end;
    Result := K;
end;

In this code, P and Q are the coordinates of one point in the complex plane. The image is computed for P in the range [-2.4 … 1.0] and Q in the range [-1.3 … 1.3] and a maximum number of iterations of 1000. Coloring is done using a continuous palette composed of 48 shades of blue like this:

Since the value of each pixel is 0 to the maximum number of iteration, the palette is used cyclically. An offset of 25 is given so that the white limit is positioned nicely.

The code here above implements the formula we saw earlier. The code layout may looks strange. It has been done like that for speed optimization.

Building an image from the Mandelbrot set

The image showed above is quite easy to obtain. Here is the set of operations
1. Create a bitmap of a given resolution (1024 x 768 pixel for example)
2. Associate each bitmap pixel to the couple P, Q according to the area you want to display. In the image above, X on the bitmap (0..1023) is mapped to P = [-2.4 … 1.0] and Y on the bitmap (0..767) is mapped to Q = [-1.3 … 1.3].
3. Iterate all X, Y on the bitmap to compute the value of the pixel as a number of iterations.
4. Map the pixel value in iteration to the color value according to the color palette you like.

Mapping bitmap pixel to Mandelbrot set point

The process of step 2 above is named “mapping”. In math language, it is a simple coordinate transformation from one coordinate system (bitmap) to another one (Mandelbrot set).

Graphically, we can represent a graphic with one system on the abscissa (horizontal axis) and on the ordinate (vertical axis) we have the other system:

We can express the slope of the line as seen from X, X1, Y, Y1 and from X2, X1, Y2, Y1 which are obviously the same:

We can rewrite this equation to have the Y value based on all others:

Let’s say we want to transform the X coordinate on a bitmap to the equivalent Y coordinate in the Mandelbrot set. For example, on the bitmap shown above we go from pixel 0 to pixel 1023 and on the Mandelbrot set we go from -2.4 on the left to 1.0 on the right. This gives:

X1 = 0, X2 = 1023, Y1 = -2.4, Y2 = 1.0

Now applying our equation, given an arbitrary pixel coordinate X, we can compute the equivalent value in the Mandelbrot set:

Mandel := -2.4 + (Bitmap – 0) * (1.0 - -2.4) / (1023 – 0)

Note that (Y2 – Y1) / (X2 – X1) which is the slope of the line in the above diagram is frequently named zoom factor.

We can do similar computation for the pixel coordinate Y, using the same zoom factor so that the image is not distorted.

If we put all this in a simple Delphi program, we will get the following code:

procedure TMandelSimpleForm.FormShow(Sender: TObject);
var
    Bitmap : TBitmap;
    X, Y   : Integer;
    P, Q   : Extended;
    V      : Integer;
    Slope  : Extended;
    Colors : array of TColor;
    T0     : Cardinal;
const
    MandelLeft   = -2.4;
    MandelRight  = 1.0;
    MandelBottom = -1.3;
    MaxIter      = 1000;
begin
    SetLength(Colors, 11);
    Colors[ 0] := RGB( 47,  47, 255);
    Colors[ 1] := RGB( 21,  21, 255);
    Colors[ 2] := RGB(  0,   0, 255);
    Colors[ 3] := RGB(255, 255, 255);
    Colors[ 4] := RGB(229, 229, 255);
    Colors[ 5] := RGB(203, 203, 255);
    Colors[ 6] := RGB(177, 177, 255);
    Colors[ 7] := RGB(151, 151, 255);
    Colors[ 8] := RGB(125, 125, 255);
    Colors[ 9] := RGB( 99,  99, 255);
    Colors[10] := RGB( 73,  73, 255);

    Bitmap             := TBitmap.Create;
    Bitmap.PixelFormat := TPixelFormat.pf24bit;
    Bitmap.Width       := Image1.Width;
    Bitmap.Height      := Image1.Height;

    Slope := (MandelRight - MandelLeft) / (Bitmap.Width - 1);

    T0 := GetTickCount;
    for Y := 0 to Bitmap.Height - 1 do begin
        for X := 0 to Bitmap.Width - 1 do begin
            P := MandelLeft   + X * Slope;
            Q := MandelBottom + Y * Slope;
            V := ComputeMandelPoint(P, Q, MaxIter);
            Bitmap.Canvas.Pixels[X, Y] := Colors[V mod Length(Colors)];
        end;
    end;
    Image1.Picture.Bitmap := Bitmap;
    T0 := GetTickCount - T0;
    Bitmap.Free;
    ShowMessage('Computed in ' + IntToStr(T0) + ' mS');
end;

This code is really complete. We don’t need more to have a nice Mandelbrot set shown on screen. To create the application, use the following steps:

Create a new VCL forms application
Enlarge the form so that a 1024x768 bitmap will fit
Drop a TImage on the form, make it 1024x768
Add the code above to the form’s OnShow even.
Save, compile and run.

On my computer, it takes 4400 mS to build the bitmap. The parallel computing version will run in 690 mS for the same bitmap. This is a 6 fold improvement. It deserves the effort. Those numbers are for a Win32 application. When rebuilt as a Win64 application, times are divided by 2.

Parallel computing of a Mandelbrot set

Overview of required steps:

Create a large bitmap to hold the full final image of the Mandelbrot set
Divide the total area into a lot of smaller areas
Start a number threads giving each one area to compute
When a thread finishes his area, submit the area for “painting” and give a new area, if any remains
When there is no more area to give to a thread, terminate the thread
When all threads are terminated, all areas are computed.
Since painting is incredibly faster than computing, short after the last thread finishes, the last area is painted and the full bit map is ready

Dividing Mandelbrot set computation in smaller chunks is easy: divide the bitmap into areas and delegate the computing for those areas to a number of threads.

In my Mandelbrot Explorer, I divided the bitmap as 128x128 pixel areas. If the image size is not a multiple of 128x128, then some areas are smaller and may even be a single pixel.

The areas are described by a class instance with all required parameters. All areas are kept in an object list.

Then a number of threads are created, for example 10 threads, and kept in an object list. Each thread receives one area to process. When it has finished processing the assigned area, the next available area is given to the thread. If no more area is available, the thread is terminated, removed from the list and freed.

Each thread computes the number of iterations for each pixel. Coloring is done by the main thread after computation is done. This is interesting because you can change color mapping without computing again. Coloring is quite fast: on my machine, it is almost instantaneous.

Submitting an area for coloring is done using a custom message posted to the message queue. We need to use that technique because submission is executed in the thread’s context and we want the main thread to do the coloring. Using custom messages is an easy way of inter-thread communication.

The code

The class describing each area is as follow:

TMandelArea = class
public
    Rect     : TGPRect;           // Area rectangle in full bitmap
    PixValue : array of Integer;  // Computed pixel value
    Zoom     : Extended;          // To compute (P, Q) from bitmap coord
    XOffset  : Extended;          // To compute (P, Q) from bitmap coord
    YOffset  : Extended;          // To compute (P, Q) from bitmap coord
    Colors   : TList;     // The list of colors for coloring
    MaxIter  : Integer;           // Max iterations when computing
    Index    : Integer;           // Index in areas list
    procedure Compute;            // Compte all pixels in area
    // Colorize the pixel in the area and draw the area into full bitmap
    procedure Colorize(FullBitmap  : IGPBitmap;
                       ColorInside : TColor;
                       ColorDiv    : Integer;
                       ColorOff    : Integer);
end;

TMandelArea is a very simple class. It has all the fields with parameters we talked about and two methods. One for computing the pixel values and one for colorize the pixels and transfer the result in the full bitmap.

What I named “FullBitmap” is an interface coming from GDI+. Windows GDI+ is a class-based API that provides two-dimensional vector graphics, imaging, and typography. It is faster than Windows GDI API which is used by Delphi runtime.

TMandelArea.Compute is quite trivial. It looks much like the method we have seen above in the simple Mandelbrot application. Here it is:

procedure TMandelArea.Compute;
var
    X, Y       : Integer;
    P, Q       : Extended;
    Index      : Integer;
begin
    SetLength(PixValue, Rect.Width * Rect.Height);
    Index := 0;
    for Y := 0 to Rect.Height - 1 do begin
        Q := (Y + Rect.Y) / Zoom + YOffset;
        for X := 0 to Rect.Width - 1 do begin
            P := (X + Rect.X) / Zoom + XOffset;
            PixValue[Index] := ComputeMandelPoint(P, Q, MaxIter);
            Inc(Index);
        end;
    end;
end;

Basically, the method just iterates thru all pixels in the area, transforms the coordinates from bitmap system to Mandelbrot set system and calls the function we have seen above to compute the pixel value. That value is stored in an array of integer. PixValue array is dimensioned before computing.

Computation has nothing to do with creating the result bitmap and coloring its pixels. The resulting bitmap, which will be shown on screen and saved to disk as JPEG, is created before anything else. I use a GDI+ bitmap which is handled thru in interface.

The programmer can access the data of a GDI+ bitmap. For that, it has to call “LockBits” which returns a record having among other things the pointer to actual bitmap data. For better performance, the lines of a bitmap data is always aligned to memory locations. See how Row is calculated below to see how to handle that.

Bitmap data is organized as RGB value, that is 3 bytes for red, green and blue components of each pixel color. In Delphi, a TColor is simply an integer value with the 3 bytes. This is why you see code like: Row^ := Byte(ColorValue shr 16) to extract one of the bytes and copy it to the bitmap memory.

procedure TMandelArea.Colorize(
    FullBitmap  : IGPBitmap;
    ColorInside : TColor;
    ColorDiv    : Integer;
    ColorOff    : Integer);
var
    BitmapData : TGPBitmapData;
    X, Y       : Integer;
    Row        : PByte;
    ColorValue : TColor;
    PixIndex   : Integer;
begin
    BitmapData := FullBitmap.LockBits(Rect, [ImageLockModeWrite],
                                      FullBitmap.PixelFormat);
    PixIndex := 0;
    for Y := 0 to Rect.Height - 1 do begin
        Row := PByte(BitmapData.Scan0) + (Y * BitmapData.Stride);
        for X := 0 to Rect.Width - 1 do begin
            ColorValue := ColorMandelPoint(PixValue[PixIndex],
                                           MaxIter,
                                           ColorInside,
                                           FALSE, 0,
                                           TRUE,
                                           ColorDiv,
                                           ColorOff,
                                           Colors);
            Row^ := Byte(ColorValue shr 16);
            Inc(Row);
            Row^ := Byte(ColorValue shr 8);
            Inc(Row);
            Row^ := Byte(ColorValue);
            Inc(Row);
            Inc(PixIndex);
        end;
    end;
    FullBitmap.UnlockBits(BitmapData);
end;

The thread class:

TMandelThread = class(TThread)
strict private
    FArea       : TMandelArea;
    FNewArea    : TMandelArea;
    FOnComputed : TComputedEvent;
public
    procedure Execute; override;
    property Area       : TMandelArea    read  FArea
                                         write FArea;
    property OnComputed : TComputedEvent read  FOnComputed
                                         write FOnComputed;
end;

implementation

procedure TMandelThread.Execute;
begin
    while Assigned(FArea) do begin
        FArea.Compute;
        FNewArea := nil;
        if Assigned(FOnComputed) then
            FOnComputed(Self, FArea, FNewArea);
        FArea := FNewArea;
    end;
end;

This thread class is incredibly straightforward. It has a single property “Area” which is the area to be computed, a single event “OnComputed” to be called when computation is done and one method “Execute” which is the thread’s entry point.

Execute method is a loop calling Compute method on behalf of the area to be computed and then trigger the OnComputed event to request the next area to compute. The event passes the current computed area (which will be submitted for coloring) and receives, as a var parameter, the new area to compute. If no new area is available, it receives a nil value and the thread terminates the loop.

The main code

The main code must create all the areas, all the threads, allocate areas to threads, colorize areas when computed by each thread and render the result on screen optionally saving it to a jpeg file.

Of course the main code also handles the user interface. I built a nice user interface to help the user explore the Mandelbrot set. I will not describe it here because it is just simple and easy user interface programming.

I will just show and describe the user interface:

The tool bar has edit fields to select the bitmap size, the maximum number of iteration, the number of threads to use for computation and color divisor and offset for coloring.

The buttons are to start computing, navigate to previous position, navigate to the next position, show the full Mandelbrot set, change the color palette, save the bitmap to disk file, change the inside color, zoom in and zoom out.

Using the mouse, the user may draw a rectangular area which is then computed. A double click anywhere on the bitmap recomputes a new bitmap centered on the double click point.

On the status bar, you see the time required computing the displayed image, the zoom factor, the coordinates of the image center and the cursor position coordinates with number of iterations at that point.

When the window is resized, the bitmap is as well. The screen always shows the complete bitmap, even if you select a very large bitmap.

If you load a new color palette, the bitmap is colored without computing again. This is almost instantaneous, even on large bitmaps.

The most interesting parts of the code are the method “MandelCreateImage” and the two event handlers for the thread.

procedure TMandelbrotExplorerForm.MandelCreateImage;
var
    Graphics  : IGPGraphics;
    Area      : TMandelArea;
    X, Y      : Cardinal;
    I         : Integer;
    NewThread : TMandelThread;
begin
    // Just do nothing if already terminating
    if FTerminate then
        Exit;

    // We cannot create a new image if threads are still computing
    // We set a flag so that no new area are given to thread and
    // defer processing until all threads are done
    if FThread.Count > 0 then begin
        FTerminate := TRUE;
        PostMessage(Handle, WM_TERMINATING, 0, 0);
        Exit;
    end;

    // Get parameters from user interface, use existing if invalid values
    FMaxIter         := StrToIntDef(MaxIterEdit.Text,  FMaxIter);
    FMandelWidth     := StrToIntDef(WidthEdit.Text,    FMandelWidth);
    FMandelHeight    := StrToIntDef(HeightEdit.Text,   FMandelHeight);
    FThreadCount     := StrToIntDef(ThreadsEdit.Text,  FThreadCount);

    // Validate all parameters
    if FMaxIter < 1 then
        FMaxIter := 1;
    if FMandelWidth < 1 then
        FMandelWidth := 1;
    if FMandelHeight < 1 then
        FMandelHeight := 1;
    if FThreadCount < 1 then
        FThreadCount := 1;

    // Send values back to user interface to replace invalid values
    MaxIterEdit.Text  := IntToStr(FMaxIter);
    WidthEdit.Text    := IntToStr(FMandelWidth);
    HeightEdit.Text   := IntToStr(FMandelHeight);
    ThreadsEdit.Text  := IntToStr(FThreadCount);

    FXOffset          := FCenter.X - (FMandelWidth div 2) / FZoom;
    FYOffset          := FCenter.Y - (FMandelHeight div 2) / FZoom;

    FFullBitmap       := TGPBitmap.Create(FMandelWidth, FMandelHeight,
                                          PixelFormat24bppRGB);
    FBackBrush        := TGPSolidBrush.Create(
                                 TGPColor.Create(GetRValue(clSilver),
                                                 GetGValue(clSilver),
                                                 GetBValue(clSilver)));
    Graphics          := TGPGraphics.FromImage(FFullBitmap);
    Graphics.FillRectangle(FBackBrush,
                           0, 0,
                           FFullBitmap.Width, FFullBitmap.Height);
    // Divide the image into AreaSize x AreaSize pixels
    FAreas.Clear;
    Y := 0;
    while Y < FFullBitmap.Height do begin
        X := 0;
        while X < FFullBitmap.Width do begin
            Area             := TMandelArea.Create;
            Area.Index       := FAreas.Add(Area);
            Area.Colors      := FColors;
            Area.MaxIter     := FMaxIter;
            Area.Zoom        := FZoom;
            Area.XOffset     := FXOffset;
            Area.YOffset     := FYOffset;
            Area.Rect.X      := X;
            Area.Rect.Y      := Y;
            Area.Rect.Width  := Min(AREA_SIZE, FFullBitmap.Width - X);
            Area.Rect.Height := Min(AREA_SIZE, FFullBitmap.Height - Y);
            // Make last rectangle on row or column has correct size
            if (Area.Rect.X + Area.Rect.Width) >= Integer(FFullBitmap.Width) then
                 Area.Rect.Width := Integer(FFullBitmap.Width) - Area.Rect.X;
            if (Area.Rect.Y + Area.Rect.Height) >= Integer(FFullBitmap.Height) then
                 Area.Rect.Height := Integer(FFullBitmap.Height) - Area.Rect.Y;
            Inc(X, AREA_SIZE);
        end;
        Inc(Y, AREA_SIZE);
    end;

    // Create all threads, but don't start them
    // Thread cannot be started because it would interfere with area allocation
    FAreaIndex := 0;
    for I := 1 to FThreadCount do begin
        NewThread                 := TMandelThread.Create(TRUE);
        NewThread.Area            := FAreas[FAreaIndex];
        NewThread.OnComputed      := MandelThreadComputed;
        NewThread.OnTerminate     := MandelThreadTerminate;
        NewThread.FreeOnTerminate := TRUE;
        FThread.Add(NewThread);
        Inc(FAreaIndex);
        if FAreaIndex >= FAreas.Count then
            break;
    end;

    // Set the counter pf to be colorized areas
    FColorizeCount := FAreas.Count;

    // Now start all thread, in reverse order. Reverse order is required
    // because threads may terminate and be removed from list, messing the
    // indexing.
    FStartTime := GetTickCount;
    for I := FThread.Count - 1 downto 0 do
        FThread[I].Start;
end;


// Warning: this is called in the context of the threads
procedure TMandelbrotExplorerForm.MandelThreadComputed(
    Sender        : TObject;
    const CurArea : TMandelArea;
    var   NewArea : TMandelArea);
begin
    if FTerminate then
        Exit;
    // We must pass the index instead of the pointer because the message
    // handling can be done much later when the area has been reallocated
    PostMessage(Handle, WM_COLORIZE, 0, CurArea.Index);
    FCritSect.Enter;
    if FAreaIndex < FAreas.Count then begin
        NewArea := FAreas[FAreaIndex];
        Inc(FAreaIndex);
    end;
    FCritSect.Leave;
end;


procedure TMandelbrotExplorerForm.MandelThreadTerminate(Sender : TObject);
begin
    // Simply remove the terminated thread from the thread list.
    FThread.Remove(Sender as TMandelThread);
end;

The FCritSect variable is very important. This is one of the synchronization mechanisms I talked above. A critical section works like this: only one thread at a time can execute the “Enter” method. Once the “Enter” method has been executed by a thread, any other thread calling “Enter” is made sleeping. When the “Leave” method is called, one and only one of the sleeping threads is awaken and continue execution. When this thread executes the “Leave” method, another thread is awaken and so forth.

A critical section MUST be used here because we increment FAreaIndex and use it to get the next area for the thread calling MandeThreadComputed.

Another part of the code requires some explanation. A flag “FTerminate” is used to signal the user wants to terminate the current computing. It is set my MandelCreateImage itself when called before all threads are done.

Since we cannot stop the thread immediately (It takes some time to stop a thread), we use another custom message WM_TERMINATING. The message handler will check if all thread are done. If not, it sleeps 0.25 sec and then post the message again. If all threads are done, it calls MandelCreateImage which will start a new image creation.

Here the message handler code:

procedure TMandelbrotExplorerForm.WMTerminating(var Msg: TMessage);
var
    TmpMsg : TMsg;
begin
    if FThread.Count <= 0 then begin
        // No more threads are computing.
        // Remove any WM_COLORIZE from message queue
        while PeekMessage(TmpMsg, Handle, WM_COLORIZE, WM_COLORIZE, 1) do;
        FTerminate := FALSE;
        MandelCreateImage;
    end
    else begin
        Sleep(250);
        PostMessage(Handle, WM_TERMINATING, 0, 0);
    end;
end;

The final word

I hope you enjoyed reading this article. All in all, the code is fairly simple. Actually the code for the user interface is longer!

If you are interested by the source code (Delphi XE5), just drop me a private email.

References:

Parallel computing: http://en.wikipedia.org/wiki/Parallel_computing
Multithreading: http://en.wikipedia.org/wiki/Multithreading_(software)#Multithreading
Synchronization objects: http://msdn.microsoft.com/en-us/library/windows/desktop/ms686364(v=vs.85).aspx
Mandelbrot set: http://en.wikipedia.org/wiki/Mandelbrot_set
GDI+: http://msdn.microsoft.com/en-us/library/windows/desktop/ms533798(v=vs.85).aspx
GDI+ for Delphi : http://www.bilsen.com/gdiplus

Follow me on Twitter
Follow me on LinkedIn
Follow me on Google+
Visit my website: http://www.overbyte.be
This article is available from http://francois-piette.blogspot.be

December 2, 2013

Mandelbrot Fractal Explorer

I wrote a nice Mandelbrot fractal explorer application. It is written using Delphi XE5 and use extended precision floating point calculation to be able to produce ultra high zoom images.

You can create bitmaps much larger than the screen size. This way you can have a high resolution printout.

The program use multithreading to take advantage for multicore CPU. Computation is quite fast.

You can download the utility from http://www.overbyte.be/arch/dump/MandelbrotExplorer.zip The zip file contains the executable as well as a number of color palette files (You can edit it with notepad if you like).

If there is some interest, I could write another blog article about the design of the application. Just add a comment to this blog post if you are interested.

The program make use of multithreading to compute several parts of the image simultaneously. This makes huge speed improvement for multicore CPU.

Also, GDI plus is used for graphic rendering which make the display very fast, even for ultra high bitmap resolution.

Follow me on Twitter
Follow me on LinkedIn
Follow me on Google+
Visit my website: http://www.overbyte.be
This article is available from http://francois-piette.blogspot.be

February 23, 2013

High speed generic queue class

This article presents a generic class implementing a high speed queue. It has been specially designed for communication use but can be used for anything else.

Reading this article, you'll learn:

How to design a generic class
How to use nested data type
How to implement an enumerator
How to make a class thread safe
How to use variant record
How to use methods in a record

The code itself is quite short! Thanks to the efficient Delphi language.

High Performance

About high speed: Initially, I had an issue with performance in a communication system where a process was producing messages to be handled and another process was processing the messages. The two processes are totally asynchronous. I use the term "process" not to refer to different programs, but to different units of a single program.

The bottle neck was coming from memory allocation. Record where allocated by the producer process and freed by the processing processes. This resulted in a large amount of memory being allocated freed.

I designed this class to avoid as much as possible memory allocation. The design makes use of a several doubly linked lists of "buffers". The buffers are actually the data type specified by the generic type. Usually it is a record with a structure specific to the messages exchanged between the two process. If the application requires several types of buffers, then you should use a variant record to hold all types. We will see an example later.

High speed comes also from the fact that you don't copy data except when really forced to. To avoid copying data, we use pointers. Data stay where it is, we access it thru his address. The compiler takes care of the details for us.

Doubly linked list as a queue

A queue is a data structure in which you can append items (aka "node") at one end and remove items at the other end. This scheme is frequently named "First in - First out" or FIFO for short. You have another variant which append and remove items at the same end. That one is named LIFO which stands for "Last in - First out". In the application I described above, I need a FIFO queue.

FIFO queues can be implemented in a lot of different ways. Here I selected a doubly linked list because in my context this is the fastest way doing it.

A doubly linked list is a very classic data structure. Basically a doubly linked list is a variable number of items (or nodes) of any data type. One item is linked to the next using a pointer and linked to the previous using another pointer. There are two more separate pointers: one point to the first item and one point to the last item. To append items, we use the pointer to the last item and to remove items, we use the pointer to the first item.

You can find a lot of articles about it everywhere on the internet. So I will only describe what make my implementation specific.

High performance revisited

Now that we have a doubly linked list in hand, we will use it to achieve high performance. We will use 3 linked lists to avoid memory allocation:

Linked list of active items
Linked list of inactive items
Linked list of acquired items

Active items are those referring to the items added to the queue and not removed yet from the queue. They are actively waiting to be processed.

Inactive items are those having already be processed. Instead of freeing the processed items, we put it in the inactive queue where they are waiting for reuse. When a new active item has to be created, before allocating one new item, the inactive linked list is checked and if not empty, an item is removed from the list and reused, that is moved to the end of the active list. If the inactive list is empty, then a new item is allocated. This has a tremendous impact on the performance: adding a large number of items take 30 mS the first time when they need to be allocated. Later, when they can be reused, it only takes 17 mS!

Finally, the acquired items linked list is used when there are several processes consuming the same queue. For example when we have a multithread application and several threads are fetching messages from the same active queue. The third list is used to store one item extracted from the active list while it is processed so that another thread won't take the same item. After it has been processed, the "acquired" item is moved to the inactive list using a method I named Release.

Generic class use

A generic class is a class which has a variable part not known when designing the class. In our case, the variable part is a record. I named the generic class TCommQueue.

To use TCommQueue, you have to declare a record type and a variable. Let's assume the record type is:

  TCommFrame = packed record
    Name : array [0..50] of Char;
    Size : Integer;
  end;

This is a totally arbitrary record made simple for this article. In a real application, this record will be much more complex with all the members required for your application.

The variable representing the queue is declared as follow:

    FQueue : TCommQueue<TCommFrame>;

It is probably a member variable of a form in your application or in a component, data module or whatever you need. It doesn't matters in fact. TCommQueue inherit from TObject so before use, you must create it and after use you must free it. In most applications, you will do that in the constructor and destructor of your form, data module or component. A more complete source code is as follow:

type
    TForm1 = class(TForm)   
    protected
        FQueue : TCommQueue<TCommFrame>;
    public
        constructor Create(OAwner: TComponent); override;
        destructor Destroy; override
    end;

implementation

constructor TForm1.Create(AOwner : TComponent);
begin
    inherited;
    FQueue := TCommQueue<TCommFrame>.Create;
end;

destructor TForm1.Destroy;
begin
    FreeAndNil(FQueue);
    inherited;
end;

You see that using a generic type is very easy: you use the generic type name (TCommQueue here) and append your own data type in angle brackets ( here).

The compiler will compile the generic class by replacing the parameter type with the one you supplied.

Generic class declared

Now it's time to see the declaration of TCommQueue generic class. Remember we are using doubly linked list which are defined here as nested data type.

Nested data types have been introduced with Delphi XE3. See online help at http://docwiki.embarcadero.com/RADStudio/XE3/en/Nested_Type_Declarations

Nested data types are not related to generics, but are very handy in that case because nested data types within a generic class may make use on the parameter data type.

type
    // Forward declarations
    TCommQueue<T : record>     = class;
    TCommQueueEnum<T : record> = class;

    TInitCallBack = procedure (Sender : TObject;
                               Item   : Pointer) of object;

    TCommQueue<T : record> = class(TObject)
    // This nested type actually defines a doubly linked list of <T>
    type
        TItemPtr = ^T;          // Pointer to <T>
        PLLNode  = ^TLLNode;    // Pointer to a linked list node
        TLLNode  = record       // Doubly linked list cell
            Item : T;           // Item /must/ be the first node member
            Next : PLLNode;     // Pointer to the next node
            Prev : PLLNode;     // Pointer to the previous node
        end;
        TLinkedList = record    // Doubly linked list
            First : PLLNode;    // Pointer to the first node in the list
            Last  : PLLNode;    // Pointer to the last node in the list
            Count : Integer;    // Number of nodes in the list
            procedure Add(Item : PLLNode);
            function  ExtractLast : PLLNode;
            procedure Extract(Item : PLLNode);
            procedure FreeAllItems; inline;
        end;
    protected
        FActiveList   : TLinkedList;      // List of active items
        FInactiveList : TLinkedList;      // List of inactive items
        FAcquiredList : TLinkedList;      // List of acquired items
        FCurrent      : TItemPtr;         // The current item
        FCritSection  : TCriticalSection; // For multithread support
        // Enum is called back from the enumerator class
        // Give an item pointer, it returns the next item pointer or nil
        function  Enum(Item : Pointer) : Pointer;
        // GetItem is called back from the enumerator class
        // Get an item from the list. The item is COPYED while all other
        // functions make use of pointers.
        function  GetItem(ItemPt : Pointer) : T;
    public
        constructor Create;
        destructor Destroy; override;
        // Required method to supprt for..in contruct (not thread safe)
        function  GetEnumerator : TCommQueueEnum;
        // Add a new item, reusing an old one of any and return a pointer to it
        function  Add : TItemPtr;
        function  Add(InitCallBack : TInitCallBack) : TItemPtr; overload;
        // Remove an item and return pointer to the next
        // Current is not affected unless it is the one removed
        function  Remove(Item : TItemPtr) : TItemPtr; overload;
        // Remove current item and return a pointer to the next
        // The current becomes the next
        function  Remove : TItemPtr; overload;
        // Return pointer to first item. Update current.
        function  First : TItemPtr;
        // Return pointer to last item. Update current.
        function  Last : TItemPtr;
        // Return pointer to next item. Update current.
        function  Next : TItemPtr; overload;
        // Return a pointer to the item after a given one. Update current.
        function  Next(Item : TItemPtr) : TItemPtr; overload;
        // Return pointer to previous item. Update current.
        function  Previous : TItemPtr; overload;
        // Return a pointer to the item before a given one. Update current.
        function  Previous(Item : TItemPtr) : TItemPtr; overload;
        // Acquire the first item, if any and remove it from the list
        // it must be later be released by calling ReleaseItem
        // Acquire Item doesn't affect current unless it is the one acquired
        function  AcquireItem : TItemPtr;
        // Release an item previously acuired. The item is moved to the
        // inactive list for later reuse
        procedure ReleaseItem(Item : TItemPtr);
        // Free all items in the inactive List
        procedure FreeAllInactiveItems;

        property Current       : TItemPtr      read FCurrent;
        property Count         : Integer       read FActiveList.Count;
        property CountInactive : Integer       read FInactiveList.Count;
        property CountAcquired : Integer       read FAcquiredList.Count;
    end;

    // Class to support for..in construct with the main TCommQueue<T> class
    // Warning: for..in is not really thread safe
    TCommQueueEnum<T : Record> = class
    protected
        FQueue : TCommQueue<T>;
        FIndex : Pointer;
    public
        constructor Create(AQueue : TCommQueue<T>);
        function GetCurrent: T;
        function MoveNext: Boolean;
        property Current: T read GetCurrent;
    end;

The generic class TCommQueue takes one parameter which is the data type you want to use for the items. Instead of I could have used. The difference is that the later allow any data type to be used for T while specifying "record" impose the constraint of have a non nullable data type, mostly a record but also a scalar type such as integer, char and the likes. You cannot use string, array, class nor interface. I imposed such restriction because memory is allocated/freed using new/dispose.

The doubly linked list is declared as a nested data type at line 19. It is a record named TLinkedList. It contains two pointers to each end of the linked list, and a count of items in the list. I added a count because counting elements in a linked list is expensive: you have to iterate thru entire list to count how many node it has. Maintaining a counter is a useful optimization which doesn't cost much.

The nodes or items of the linked list are made of a record named TLLNode defined. As you can see at line 15, it makes use of the parameter type to build the record. Remember, in a doubly linked list, a node is made of the actual data you want in the list plus two pointers to the next and previous node in the list.

Lines 23 to 26 are method declarations for the record. Their implementation handles adding (append) or extracting (remove) nodes. This is only a matter of manipulation the pointers, no data is actually allocated, moved nor freed. This is really fast!

At line 29, 30 and 31 are declared the 3 doubly linked lists we talked before. Since TLinkedList are records, there is no need to allocate/free memory.

Line 31 declares FCurrent a pointer to an item. It is used in conjunction with the methods First, Next, Previous and Last that I implemented to iterate thru the list. Please note that those are NOT thread safe.

For the fun, I implemented an enumerator for the queue. Please refer to my article "Writing an iterator for a container" for details: http://francois-piette.blogspot.be/2012/12/writing-iterator-for-container.html
The enumerator is NOT thread safe and copies the data which is bad for performance.

The real interesting stuff is in methods Add, AcquireItem and ReleaseItem. They are all fully thread safe.

Add will add an item to the queue (Active linked list), reusing an inactive one or allocating a new item. Add returns a pointer to the item. There are two overloaded versions of method add. One without argument and one with a method pointer argument.

The version with the method pointer argument (line 48) is required to be fully thread safe. Allocating a new item in the queue may require initialization of the item. But initialization must take place before the item is visible in the queue otherwise another thread could remove it before it is initialized. The method pointer passed as argument is used as a callback (a kind of event if you like) which will be called within add method just after the new item is allocated or extracted from inactive list and just before it is actually appended to the active list.

A last note about multithreading: we handle linked list pointers in the implementation. To be thread safe, the handling must make sure only one thread at a time can update any of the linked lists. This is why I used a critical section declared at line 33.

Implementation

The implementation is rather boring. It is actually very basic Delphi programming. I give the source below to be complete. Drop a message if you need some more explanations.
Complete source code is available at http://www.overbyte.be/eng/blog_source_code.html

Follow me on Twitter
Follow me on LinkedIn
Follow me on Google+
Visit my website: http://www.overbyte.be

The article is available at:
http://francois-piette.blogspot.com/2013/02/high-speed-generic-queue-class.html