The language support for this is:
    exec = thread(callable, args...)
Starts a new thread that calls callable (e.g. a function or method), passing args. Returns an execution context. The thread runs until callable returns.
    critsect statement
Executes statement indivisibly with respect to other threads. E.g.:
    critsect ++shared_counter;
    critsect { item = shared_list; shared_list = item.next; }
More importantly:
    waitfor (wait-condition; wait-object) statement
Waits until wait-condition is true, sleeping until wait-object is woken up before each re-test of the condition. Once wait-condition is true, statement is executed. All of this is done under a critical section except for the actual sleep on the wait-object. For example, suppose jobs is an array to which things to do get added occasionally by some other thread:
    waitfor (nels(jobs) > 0; jobs) job = rpop(jobs); /* rpop? see below. */
Some other thread might have code:
    push(jobs, job); wakeup(jobs);
The wakeup function wakes up all threads waiting on the given object, which makes them re-evaluate their wait condition. The object can be anything: an integer, a string, an array (as in this example). For example, a wakeup is done on the execution context object of a thread when it exits.
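For readers more at home in C, the waitfor/wakeup pairing above behaves much like a mutex plus a condition variable. The following is only a rough sketch of that analogy using POSIX threads, not how the interpreter implements it. The names big_lock, jobs_woken, add_job and take_job are made up for illustration, the job queue is a fixed-size int array, and a single condition variable stands in for "the wait-object".

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins: one lock and one condition variable. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  jobs_woken = PTHREAD_COND_INITIALIZER;

#define MAX_JOBS 16
static int jobs[MAX_JOBS];
static int njobs;               /* Number queued; jobs[0] is the oldest. */

/* Roughly: push(jobs, job); wakeup(jobs); */
void
add_job(int job)
{
    pthread_mutex_lock(&big_lock);
    assert(njobs < MAX_JOBS);
    jobs[njobs++] = job;
    /* Wake every waiter so each one re-tests its condition. */
    pthread_cond_broadcast(&jobs_woken);
    pthread_mutex_unlock(&big_lock);
}

/* Roughly: waitfor (nels(jobs) > 0; jobs) job = rpop(jobs); */
int
take_job(void)
{
    int job;
    int i;

    pthread_mutex_lock(&big_lock);
    /* The lock is released only while asleep inside pthread_cond_wait. */
    while (njobs == 0)
        pthread_cond_wait(&jobs_woken, &big_lock);
    job = jobs[0];              /* Take the oldest item (assumed rpop end). */
    for (i = 1; i < njobs; ++i)
        jobs[i - 1] = jobs[i];
    --njobs;
    pthread_mutex_unlock(&big_lock);
    return job;
}
```

The loop around the wait is the important part: being woken only means "test again", which matches the re-evaluation of the wait condition described above.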
It was tricky to add multiple execution contexts to the execution engine without slowing things down. Adding a single indirection in the top-of-stack references added 10..20% to the execution time of some programs. In the end I devised a method that didn't involve any additional indirection. I also reduced the use of macros in this area (which I think makes it clearer and easier to debug).
On the C side there is not much impact. Current C code should not notice any difference - it will just run with the global mutex taken and be indivisible with respect to other ICI threads. You can call ici_leave() to release the mutex. It gives you a pointer that you pass to ici_enter() to re-acquire the mutex. For example:
    {
        exec_t *x;

        x = ici_leave();
        ...read file or something...
        ici_enter(x);
    }
To solve this, internally arrays conceptually have two forms: pure stacks, and the general case of being a queue. As long as no rpush or rpop operations have been done, they have the same internal operation they always did. All the important stacks used internally have this property. Once an rpush or rpop has been done, an extra pointer introduced into arrays might be different from the base; the array is now a circular buffer. In the general case you have to use new rules for accessing the contents of arrays from C. This was probably the hardest change to make. Arrays are used everywhere.
rpush() and rpop() seem like a small additional feature. But after using it just a few times I think it was worth it. Lots of things become easier to do.
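To make the stack versus circular buffer distinction concrete, here is a small self-contained sketch in C. It is not the interpreter's array code. The ring_t type and the ring_* functions are invented for illustration, the capacity is fixed (the real arrays grow), and the elements are plain ints rather than objects.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAP 8              /* Fixed capacity, for the sketch only. */

typedef struct
{
    int     slots[RING_CAP];
    size_t  bot;                /* Index of the first (bottom) element. */
    size_t  n;                  /* Number of elements in use. */
}
ring_t;

/* Add at the top end, like push(). Returns 0 on success, 1 if full. */
int
ring_push(ring_t *r, int v)
{
    assert(r->n <= RING_CAP);
    if (r->n == RING_CAP)
        return 1;
    r->slots[(r->bot + r->n) % RING_CAP] = v;
    ++r->n;
    return 0;
}

/* Remove from the top end, like pop(). Returns 1 if empty. */
int
ring_pop(ring_t *r, int *v)
{
    if (r->n == 0)
        return 1;
    --r->n;
    *v = r->slots[(r->bot + r->n) % RING_CAP];
    return 0;
}

/* Add at the bottom end, like rpush(). The start moves back, wrapping. */
int
ring_rpush(ring_t *r, int v)
{
    if (r->n == RING_CAP)
        return 1;
    r->bot = (r->bot + RING_CAP - 1) % RING_CAP;
    r->slots[r->bot] = v;
    ++r->n;
    return 0;
}

/* Remove from the bottom end, like rpop(). The start moves forward. */
int
ring_rpop(ring_t *r, int *v)
{
    if (r->n == 0)
        return 1;
    *v = r->slots[r->bot];
    r->bot = (r->bot + 1) % RING_CAP;
    --r->n;
    return 0;
}

/* True while the contents still begin at the base of the storage. */
int
ring_starts_at_base(const ring_t *r)
{
    return r->bot == 0;
}
```

While only ring_push and ring_pop are used, bot never moves and slots[0..n-1] can be indexed directly, which is the old stack behaviour. After an rpush or rpop, element i lives at (bot + i) % RING_CAP, which is why C code that does not know an array's history has to go through accessors.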
    x = ici_talloc(type); ... ici_tfree(x, type);
and
    x = ici_nalloc(size); ... ici_nfree(x, size);
98% of the time this is really easy because the place you free the data knows exactly how big it is. For the occasions where this is not convenient, you can use the completely malloc/free equivalent:
    x = ici_alloc(size); ... ici_free(x);
Unfortunately, in making this change I have lost all that beautiful debug support that was put into the old allocator. I might go back and try to retro-fit it sometime.
The net effect of this is an improvement in CPU time and memory usage. The improved CPU time comes mostly (I think) from a large reduction in total memory bandwidth into the processor cache, especially during garbage collection. (Before, objects were so big they would just about fill a whole cache line by themselves. Now a whole bunch come in together.) A "small" object is one of 64 bytes or less. Things over 64 bytes go straight to malloc without further memory overhead.
The down-side of this change (and the reason I didn't do it years ago) is that the dense allocation can't be freed until you shut the interpreter down with ici_uninit(). But I figured that in most applications you want ICI to go faster and have lower peak memory usage, rather than have it reduce malloc heap usage between tasks.
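The idea of a caller-sized free can be shown in a few lines of C. This is a sketch of the general technique under stated assumptions, not the allocator in the release. The sk_nalloc and sk_nfree names are hypothetical, as are the 8-byte size classes and the 4096-byte chunk size; only the 64-byte "small" limit comes from the text above.

```c
#include <assert.h>
#include <stdlib.h>

#define SMALL_MAX   64          /* From the text: small means <= 64 bytes. */
#define GRAIN       8           /* Assumed size-class granularity. */
#define NCLASSES    (SMALL_MAX / GRAIN)
#define CHUNK_SIZE  4096        /* Assumed size of each dense chunk. */

typedef union free_item { union free_item *next; } free_item_t;

static free_item_t *free_lists[NCLASSES];   /* One fast free list per class. */
static char        *chunk;                  /* Chunk currently being carved. */
static size_t       chunk_left;

static size_t
size_class(size_t size)
{
    return size == 0 ? 0 : (size - 1) / GRAIN;
}

void *
sk_nalloc(size_t size)
{
    size_t  c;
    size_t  rounded;
    void    *p;

    if (size > SMALL_MAX)
        return malloc(size);    /* Big things go straight to malloc. */
    c = size_class(size);
    if (free_lists[c] != NULL)
    {
        p = free_lists[c];
        free_lists[c] = free_lists[c]->next;
        return p;
    }
    rounded = (c + 1) * GRAIN;
    if (chunk_left < rounded)
    {
        /* Start a new chunk. Chunks are never returned to malloc here. */
        char *fresh = malloc(CHUNK_SIZE);

        if (fresh == NULL)
            return NULL;
        chunk = fresh;
        chunk_left = CHUNK_SIZE;
    }
    p = chunk;                  /* No boundary word precedes the block. */
    chunk += rounded;
    chunk_left -= rounded;
    return p;
}

/* The caller must pass the same size it gave to sk_nalloc. */
void
sk_nfree(void *p, size_t size)
{
    free_item_t *f;

    assert(sizeof(free_item_t) <= GRAIN);
    if (p == NULL)
        return;
    if (size > SMALL_MAX)
    {
        free(p);
        return;
    }
    f = p;
    f->next = free_lists[size_class(size)];
    free_lists[size_class(size)] = f;
}
```

Because the size arrives with the free call, nothing has to be stored next to each block, so consecutive small allocations sit directly against each other in the chunk. The cost is the one described above: the chunks themselves stay allocated until shutdown.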
* Added an ici_pcre() function to avoid exposing internals of PCRE in the ici.h include file.
* Changed the definition of the struct lookup look-aside cache stored in strings. It used to apply only to variables, but now it applies to all struct lookups. That meant it could be cleaned up and the number of times it is invalidated (by incrementing ici_vsver) greatly reduced. This makes a good improvement in execution speed.
* Generalised the super mechanism. Objects that want to support a super (still only structs in the core language) use the new type objwsup_t (object-with-super) instead of object_t as their header. This includes the super pointer. They must then also set the O_SUPER flag in their header. They must also support some extra fetch/assign functions. There are quite a few places where struct_t types became objwsup_t types as a consequence of this.
* Removed the version number from the naming of auto-loading ICI modules (but not native code modules). Thus, for example, the version 3 startup file was called:
      ici3core.ici
  but it will now be called:
      icicore.ici
  I think the ICI language (as opposed to its internal APIs) is sufficiently stable that it is not really required. I found it an unnecessary inconvenience.
* Changed chkbuf() to ici_chkbuf().
* Added a new basic type "handle". This is not accessible from the core language, but C code can use it to return generic references to C data objects. It supports a super pointer, so C code can associate a class (i.e. a struct full of intrinsic methods) with it to allow it to be used as an OO object (which can also identify it as an object of the expected type when passed back from ICI code). It also allows a type name to be associated with the handle which will appear in diagnostics.
* Changed 'error' to 'ici_error'.
* Changed ici_evaluate() to use a catch object on its C stack as its frame marker on its ICI execution stack. This avoids an object allocation on each ici_evaluate call. The arguments to ici_evaluate have changed slightly as a consequence.
* Removed syscall functions from the core. They will only be accessible through the sys module in future.
* Changed the allocation routines to allocate small objects densely (no boundary words) out of larger chunks. The technique for eliminating boundary words, and keeping the fast free lists, is to have alloc/free routines where the caller is required to tell the free how much memory it asked for on the alloc. Thus we have:
      x = ici_talloc(type); ... ici_tfree(x, type);
  and
      x = ici_nalloc(size); ... ici_nfree(x, size);
  98% of the time this is really easy because the place you free the data knows exactly how big it is. For the occasions where this is not convenient, you can use the completely malloc/free equivalent:
      x = ici_alloc(size); ... ici_free(x);
* Added a small array (32) of pre-generated small ints to allow a quick check and use of these very common numbers.
* Changed the internal ICI calling convention. The call operator object used to store the number of actual parameters to a function. Now a separate int is pushed onto the operand stack. There is now just a single static call operator. Because ints are now heavily optimised, a call from C to ICI now, typically, does no allocation until it's in the main execution engine.
* Moved the lib curses based text window feature out of the core. Will put it in an extension module soon.
* Changed new_array() to take an int argument being the initial number of slots for the array to have. The caller can assume that that many items can be pushed on. Use 0 for the default value.
* Changed arrays so that they can be efficiently push()ed and pop()ed at *both* ends. Thus they can be used to form efficient queues. Although apparently a small feature, queues are something that I've always felt were important and missing from ICI. However this was a *big* change (much harder than the object header change). The parser and execution engine rely heavily on arrays for their efficiency. To prevent an impact on them we distinguish arrays that have never been used as a queue (never had the new functions rpush() or rpop() done on them) from the general case. Virgin arrays are referred to as stacks and have all the old semantics. But in the general case arrays are now growable circular buffers. Where you don't know the origin or history of an array, you must assume the general case and use some new knowledge, functions and macros to access it.
* Removed the feature of binary << that allowed "array << int". This has been flagged for removal in the documentation for a long time, and became difficult to support.
* Changed the universal object header(!) from 2 x 32 bit words to 1 x 32 bit word. Theoretically this is a huge change, but it was actually pretty easy. It requires recompilation of external modules, and some changes to their source. Basically the type is now completely indicated by the small int o_tcode field of the header. To find a pointer to the type structure you must index an array of pointers to them. Use ici_typeof(o) for this. Types must now register their type_t structure to obtain their small int type code, which they should remember and use when making new objects. After this change, the next release will move to version 4 to keep extension modules with the new smaller objects separate. The overall effect on CPU time seems to be neutral or a slight improvement.
* Added multi-threading. This is based on native machine threads, but the whole mass of ICI objects and static data is gated through a single mutex. So it works fine except threads competing for the ICI execution engine will not take advantage of multiple processors (but they will if they spend their time in functions that release the mutex while running). It was a little difficult to achieve this without slowing things down. Introducing a single extra indirection in top-of-stack accesses (the obvious way) adds up to 20% to the execution time of some programs. But I managed it. New language constructs "waitfor (expr; obj) stmt" and "critsect stmt" have been added, as well as the "wakeup(obj)" and "sleep(num)" functions. All I/O routines in the core release the mutex around the low level I/O - except the parser. (See new documentation.) In the process, o_top, x_top and v_top macros got removed. Use ici_os.a_top instead.
* The ICI Technical Description has been updated to FrameMaker 6 format and split into separate chapters, each in a separate source file.
* Removed the obsolete function ici_op_offsq() and operator o_offsq from array.c, ici.def, and fwd.h.
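The type-code scheme from the object header item above (a small int in the header, looked up in a table of type pointers) can be pictured with the following C sketch. It only illustrates the idea. The sk_type_t, sk_object_t, sk_register_type and sk_typeof names are invented here, the table size is arbitrary, and reserving code 0 as "unregistered" is an assumption, not something the release notes state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TYPES 64            /* Arbitrary table size for the sketch. */

typedef struct
{
    const char  *name;          /* Stand-in for a full type structure. */
}
sk_type_t;

typedef struct
{
    unsigned char   tcode;      /* Small int type code; no type pointer. */
}
sk_object_t;

static sk_type_t    *types[MAX_TYPES];
static int          ntypes = 1; /* Assumption: code 0 means unregistered. */

/*
 * Register a type structure and return its small int code, or 0 if the
 * table is full. The caller remembers the code for making new objects.
 */
int
sk_register_type(sk_type_t *t)
{
    assert(t != NULL);
    if (ntypes >= MAX_TYPES)
        return 0;
    types[ntypes] = t;
    return ntypes++;
}

/* Find the type structure of an object by indexing the table. */
#define sk_typeof(o)    (types[(o)->tcode])
```

Replacing a full pointer with a one-byte code is what lets the header shrink, at the cost of one table index whenever the type structure is needed.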